4d "timbre" macro-descriptor

So in example 11 there’s a processing chain that takes 13 MFCCs and a variety of weighted stats, and reduces them down to 4d via PCA.
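For reference, this is roughly the shape of the chain I mean, written as a Python stand-in (librosa + scikit-learn rather than the actual FluCoMa objects; the RMS weighting, the choice of stats, and the standardisation step here are my guesses, not necessarily what the patch does):

```python
import glob
import numpy as np
import librosa
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

def describe(path, n_mfcc=13):
    """One feature vector per sound: weighted mean + std of each MFCC."""
    y, sr = librosa.load(path, sr=None, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)   # (n_mfcc, n_frames)
    w = librosa.feature.rms(y=y)[0]                          # RMS as a loudness stand-in
    n = min(mfcc.shape[1], w.shape[0])                       # align frame counts
    mfcc, w = mfcc[:, :n], w[:n]
    w = w / (w.sum() + 1e-12)                                # normalise weights
    mean = mfcc @ w                                          # weighted mean per coefficient
    std = np.sqrt(((mfcc - mean[:, None]) ** 2) @ w)         # weighted std per coefficient
    return np.concatenate([mean, std])                       # 2 * n_mfcc values per sound

files = sorted(glob.glob("corpus/*.wav"))                    # hypothetical corpus folder
X = np.vstack([describe(f) for f in files])
X4 = PCA(n_components=4).fit_transform(StandardScaler().fit_transform(X))
# X4 is the 4d "timbre" macro-descriptor, one row per sound
```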

I’m curious what the methodology was to arrive at that number of MFCCs, those stats, that DR algorithm, and that number of output dimensions.

In my thorough, but not comprehensive, testing I consistently got significantly worse results (in terms of “raw accuracy”) when I PCA’d the MFCCs.
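By “raw accuracy” I mean this sort of test: same features, same simple matcher, with and without PCA in front (a sketch only; X is whatever the analysis produced and y the class labels):

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import cross_val_score

def raw_accuracy(X, y, n_components=None):
    """Cross-validated 1-NN matching accuracy, optionally with PCA in the chain."""
    steps = [StandardScaler()]
    if n_components is not None:
        steps.append(PCA(n_components=n_components))
    steps.append(KNeighborsClassifier(n_neighbors=1))
    return cross_val_score(make_pipeline(*steps), X, y, cv=5).mean()

# print(raw_accuracy(X, y))                   # features as-is
# print(raw_accuracy(X, y, n_components=4))   # same features through 4d PCA
```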

So I’m wondering if these values, stats, and dimensions were arrived at via testing, intuition, or both, etc…

I guess in the context of LPT, as long as there are four dimensions that represent timbre, the “raw accuracy” (as in, being able to correctly match a trained class consistently) isn’t as important?

Same goes for the 13 vs 20 MFCCs, and what appears to be the use of all of the MFCCs (inclusive of the 0th coefficient), as far as I can tell from the code.

This is just part of an ongoing experimentation. I found out in example 10a that standardisation, using derivatives or not, and loudness weighting each make enough of a difference that I am still not settled on which I think is best. All of them are better than just centroid so far :slight_smile: but this is composerly work in progress.
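In pseudo-Python terms (my own approximation, not the actual 10a patch), the switches being compared amount to something like this:

```python
import numpy as np

def stats(frames, w=None):
    """Weighted mean and std across frames; frames is (n_coeffs, n_frames)."""
    if w is None:
        w = np.full(frames.shape[1], 1.0 / frames.shape[1])
    mean = frames @ w
    std = np.sqrt(((frames - mean[:, None]) ** 2) @ w)
    return np.concatenate([mean, std])

def descriptor(mfcc, loudness=None, derivatives=False, weighted=False):
    """Toggle derivative stats and loudness weighting independently."""
    w = None
    if weighted and loudness is not None:
        w = loudness / (loudness.sum() + 1e-12)
    feats = [stats(mfcc, w)]
    if derivatives:
        d = np.diff(mfcc, axis=1)                                   # frame-to-frame derivative
        dw = None if w is None else w[:-1] / (w[:-1].sum() + 1e-12)
        feats.append(stats(d, dw))
    return np.concatenate(feats)

# standardisation would then happen across the whole dataset, per column,
# before any PCA or matching
```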

I’ll respond in here first to avoid overlap in the other thread.

In my testing I found a big improvement going from 13 MFCCs to 20 MFCCs (with a drop again at 24 MFCCs), but that was for the kind of drum stuff I was doing with short drum hits.
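The comparison I ran was basically this kind of sweep (reusing the describe() and raw_accuracy() sketches from earlier in the thread; files and labels are whatever the drum-hit corpus provides):

```python
import numpy as np

def sweep_mfcc_counts(files, labels, counts=(13, 20, 24)):
    """Compare matching accuracy for different numbers of MFCCs."""
    for n in counts:
        X = np.vstack([describe(f, n_mfcc=n) for f in files])
        print(n, raw_accuracy(X, np.asarray(labels)))
```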

I’m curious about the variables with and without derivatives (I found I got good results without), but also to see how this works with some PCA in the mix. As much as I’d like to assume that if you whittle down the features pre-PCA you can expect the output of PCA to be better than if you just fed everything into PCA, I have to imagine that wouldn’t be the case, since “you” and “the algorithm” are looking at different things.

I wrote down in my to-do list the idea of building a thing that analyzes everything, then runs iterative permutations to calculate which combinations of features pre/post PCA give me the best matching, but that’s kind of a nightmare to build in Max…
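Outside of Max it would look something like this (a sketch, assuming the “everything” analysis is one big matrix and each feature group is a known set of columns; raw_accuracy() as in my earlier sketch):

```python
from itertools import combinations

def search_feature_groups(X, y, groups, pca_dims=4):
    """Try every combination of feature groups, with and without PCA, and rank them.

    groups: dict of name -> list of column indices in the full analysis matrix.
    """
    results = []
    names = list(groups)
    for r in range(1, len(names) + 1):
        for combo in combinations(names, r):
            cols = sum((groups[g] for g in combo), [])
            acc_raw = raw_accuracy(X[:, cols], y)
            acc_pca = raw_accuracy(X[:, cols], y, n_components=min(pca_dims, len(cols)))
            results.append((acc_raw, acc_pca, combo))
    return sorted(results, reverse=True)

# e.g. (hypothetical column layout of the "everything" analysis):
# groups = {"mfcc_mean": list(range(0, 13)),
#           "mfcc_std": list(range(13, 26)),
#           "deriv_mean": list(range(26, 39))}
```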

At a minimum I’ll try doing something like in this patch, where you analyze “everything” and then create sub-datasets with different permutations for easier testing (rather than my previous approach, where I’d create a different analysis file per feature set I wanted).


You can try weighting with mfcc0 instead of loudness (extracting it with bufscale).
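Conceptually it is something like this (a numpy sketch of the idea, not the actual bufscale chain in Max):

```python
import numpy as np

def mfcc0_weighted_stats(mfcc):
    """mfcc: (n_coeffs, n_frames). Weight the stats by a rescaled MFCC 0."""
    w = mfcc[0] - mfcc[0].min()          # rescale coefficient 0 into a positive range
    w = w / (w.sum() + 1e-12)            # normalise to use as per-frame weights
    rest = mfcc[1:]                      # keep the remaining coefficients as features
    mean = rest @ w
    std = np.sqrt(((rest - mean[:, None]) ** 2) @ w)
    return np.concatenate([mean, std])
```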

PCA is always better once you have selected your dims. Garbage in, garbage out, as they say :smiley:

Have fun, and I’m sure you’ll let us know what is best. With the new sexy compositing, you can build a few datasets and replace/assemble them in a patch (which should make this kind of research into how it behaves for you much easier to do, I feel…)

I guess that’s the thing. Things I may think are redundant may be useful to PCA to divide up the space. Or at least, that’s my second “gut” reaction.

Hehe, kind of.

I will test, however.