So after getting something kind of useful out of fluid.knn~ in the other thread (I’m going to do some more testing tomorrow and see how I get on with some additional stuff @tremblap mentioned offline), I wanted to see what else I can make sense of from TB2.
In seeing how useful the MFCCs are for timbral matching, I was thinking about incorporating them into my onset descriptors thing when working with corpus navigation, but as a single “thing”. Meaning if I’m primarily using loudness, centroid, and flatness, I can’t just chuck 13 new numbers in there, as those would be disproportionately weighted during the matching/querying process, not to mention that the scale of things would be all off (as per @tremblap’s APT (team ATP here!)).
Ideally I’d be able to do something like weight loudness and centroid equally, then perhaps consider a bunch of the other spectral descriptors as a combined “item”, and then MFCCs as another “item”, plus fitting a bunch of statistics in there too.
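To make the “item” idea concrete, here’s a rough sketch in Python (not Max, and not anything from the toolbox). It assumes every descriptor column gets standardized first, and that each group of columns is then scaled by sqrt(weight / number of dimensions), so that a 13-dimension MFCC group contributes about as much to a squared Euclidean distance as a single loudness value. The function names and the group layout are made up for illustration.

```python
import math
from statistics import mean, pstdev

def standardize(rows):
    """Z-score every column so all descriptors share a comparable scale."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # constant columns: avoid dividing by 0
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def group_scales(groups):
    """groups is a list of (n_dims, weight); returns one scale factor per column."""
    scales = []
    for n_dims, weight in groups:
        # Squared distances add up per dimension, so dividing the weight by
        # n_dims keeps a wide group from dominating just because it is wide.
        scales.extend([math.sqrt(weight / n_dims)] * n_dims)
    return scales

def apply_scales(rows, scales):
    """Multiply every column by its scale factor."""
    return [[v * s for v, s in zip(r, scales)] for r in rows]

# Hypothetical layout: loudness (1), centroid (1), other spectral stuff (3), MFCCs (13)
GROUPS = [(1, 1.0), (1, 1.0), (3, 1.0), (13, 1.0)]
scales = group_scales(GROUPS)
```

The scaled rows would then go into whatever does the matching, and the same standardization and scaling would have to be applied to the incoming query too.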
From this I remember @tremblap’s patch from the plenary day that did some AudioGuide-type things. In looking back at the patch, it’s not terribly clear what exactly is happening at each stage (although I did manage to get it to work!), but the core nuggets of it seem like they could be useful for “generic” querying/matching stuff.
So from the looks of it, points are shoved into fluid.dataset~, which needs a name and a size (à la entrymatcher~). This could presumably be something like this:
1, "METAL RESONANCE DROPS Hard concrete ANGLE IRON 01 - 01.wav" 1956.507937 1023.095805 4 -17.815588 -0.042764 3.267899 -17.815588 2205. 0.303731 538.381897 1474.484863 7241.071289 2.690974 351.409302 7550.347168 -7.610556 0.003646 0.61819 -9.106574 20943.193359 5.410414 1321.936401 19468.511719 5551.549805 0.94741 138.995499 5434.257324;
2, "METAL RESONANCE DROPS Hard concrete ANGLE IRON 01 - 02.wav" 2202.721088 1181.382368 2 -28.928728 -0.037673 2.115414 -28.928728 1284.415771 0.42789 1061.948853 539.157104 6360.745117 1.902401 195.140411 5399.84082 -10.174833 0.002133 0.359874 -10.33946 20943.193359 4.831788 780.422302 10516.023438 4918.262207 0.614422 69.877556 4570.10791;
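For poking at entries like these outside of Max, here’s a small Python sketch that splits one line into its index, filename, and list of descriptor values. The layout (index, comma, quoted filename, space-separated numbers, closing semicolon) is only what I’m assuming from the two lines above, and parse_entry is a made-up name.

```python
import re

# Assumed layout: index, comma, quoted filename, numbers, closing semicolon
LINE = re.compile(r'^\s*(\d+),\s*"([^"]*)"\s+(.*?);\s*$')

def parse_entry(line):
    """Return (index, filename, values) for one coll-style line."""
    m = LINE.match(line)
    if m is None:
        raise ValueError("line does not match the assumed layout")
    index, filename, rest = m.groups()
    # float() accepts Max-style numbers with a trailing dot, such as "2205."
    return int(index), filename, [float(x) for x in rest.split()]
```

The values list is what would become a single point, with the index (or filename) as its label.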
Then the named dataset is passed into a fluid.kdtree~, and then I can simply query it for kNearest [buffer/dataset] 1 and kNearestDist [buffer/dataset] 1 and it should return the index(?) (which I would then use to play back the matched sample/entry).
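Just to check my understanding of the query step, this is roughly what I imagine happening, sketched in Python. It’s a brute-force search standing in for the k-d tree (a tree should give the same answer, only faster), and the dataset, labels, and k_nearest function are all made up for illustration rather than being the actual fluid.kdtree~ interface.

```python
import math

def k_nearest(dataset, query, k=1):
    """Return the k closest entries as (label, distance) pairs, nearest first."""
    scored = [(math.dist(point, query), label) for label, point in dataset.items()]
    scored.sort()
    return [(label, dist) for dist, label in scored[:k]]

# Toy 2-D dataset; the labels play the role of the entry index
dataset = {"1": [0.0, 0.0], "2": [3.0, 4.0], "3": [1.0, 1.0]}
matches = k_nearest(dataset, [0.9, 1.2], k=2)
```

Here the labels in matches are what I’d use to trigger playback, and the distances tell me how good each match is.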
Am I on the right track here?
I guess this wouldn’t necessarily solve the problem of over-weighting something like MFCCs, as they would take up the same number of dimensions in a fluid.kdtree~ as they would in an entrymatcher~, which is, I suppose, where data reduction stuff will come in handy.
So is what I’m after possible at the moment, and/or is it just a question of data reduction which is independent of querying/matching?