This is super cool in general. I specifically like the travelling salesman restriction.
This is super interesting as well, and though we ended up discussing it in last Thursday's chat, I wanted to bump the thread version of it.
I’ve been chasing the dragon of having conceptually meaningful sub-groups of descriptors (LTEp/LTP), but that gets problematic when you’re combining sources for which some characteristics aren’t as relevant/meaningful. In my case this is often pitch, but if maximizing variance is what one is after, it may not make sense, or may be sub-optimal, to conceptually group things ahead of time.
Maybe some kind of hybrid approach where you have a reduced descriptor/statistics space (similar to what you have here), where it doesn’t matter what’s in it, it’s just “a good representation of the stuff”, and then, alongside that, some perceptually meaningful descriptors (loudness, pitch, centroid, etc.) that can be used to bias/skew the query (e.g. return the nearest match from the pile of goop that also has loudness > -6).
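To make that concrete, here's a minimal sketch of the hybrid idea, assuming a toy dataset where the first 8 columns are the reduced "goop" space and the last column is loudness (all names, column layout, and data here are hypothetical, just to illustrate the filter-then-match shape):

```python
import numpy as np

# Hypothetical dataset: each row is a sound slice.
# Columns 0..7: reduced "goop" representation (e.g. a PCA of many stats).
# Column 8: a perceptually meaningful descriptor (loudness, say).
rng = np.random.default_rng(0)
goop = rng.standard_normal((100, 8))
loudness = rng.uniform(-24.0, 0.0, size=(100, 1))
data = np.hstack([goop, loudness])

def biased_nearest(query, data, loudness_floor=-6.0):
    """Nearest neighbour in the reduced space, restricted to rows
    whose loudness exceeds loudness_floor (the 'bias' predicate)."""
    mask = data[:, 8] > loudness_floor        # perceptual bias first...
    idx = np.flatnonzero(mask)
    candidates = data[mask, :8]
    dists = np.linalg.norm(candidates - query, axis=1)
    return idx[np.argmin(dists)]              # ...then nearest match in the goop

best = biased_nearest(rng.standard_normal(8), data)
```

Brute force like this is obviously not what you'd want at scale, but it shows the two-stage shape: predicate on the meaningful columns, distance on the opaque ones.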
That’s some (more) functionality that isn’t presently possible with the querying tools we have, but it would mean having a kdtree that can also resolve parallel logical queries (e.g. find the nearest neighbor from columns 5-20, which have been pre-fit, && some other query on column 2).
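One way such a combined query could be resolved, sketched here with SciPy's `cKDTree` (the column numbers, predicate, and over-fetch strategy are all assumptions for illustration): fit the tree on the distance columns only, then over-fetch k candidates and widen the search until one satisfies the logical predicate on the other column.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
data = rng.standard_normal((200, 21))   # hypothetical: 200 points, 21 columns

tree = cKDTree(data[:, 5:21])           # kdtree pre-fit on columns 5-20 only

def nn_with_predicate(query, predicate, k=8):
    """Nearest neighbour in columns 5-20 that also passes a logical
    predicate on the full row; over-fetches k candidates and doubles
    k until a candidate passes (or the whole set is exhausted)."""
    n = len(data)
    while k <= n:
        _, idx = tree.query(query, k=k)
        for i in np.atleast_1d(idx):
            if predicate(data[i]):
                return int(i)
        k = min(k * 2, n) if k < n else n + 1
    return None

# e.g. nearest in the pre-fit space && column 2 above zero
hit = nn_with_predicate(rng.standard_normal(16), lambda row: row[2] > 0.0)
```

The over-fetch-and-filter approach keeps the kdtree untouched, at the cost of re-querying when the predicate is very restrictive; a tree that filtered during traversal would avoid that, but that's exactly the functionality that doesn't exist yet.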