Not entirely sure what to call this, but I got the idea today that it would be interesting to do the short-term matching I’m presently doing/planning (e.g. a 256-sample analysis/matching window) while still retaining a semblance of the morphology of the samples being matched.
The way I’ve done this up to this point is to use additional descriptors/metadata to bias the query where I can match the initial window, but then use something else to choose “sample length” or other such variables.
This works super well, but essentially gives me only two modalities: either vanilla mosaicking, where I query/match grain-per-grain and get a granular sound, or short-to-long matching, where I use a tiny window to play back an entire single file.
I’m now thinking about doing something that’s a bit in between both. That is, doing something mosaic-like but retaining as much morphology as possible from the sample being played. So what I’m thinking is a sloppy/“fuzzy” version of what AudioGuide does with its frame-by-frame matching: I would match 256 samples to 256 samples, then apply an increasing radius to subsequent searches, so if the next bit of my realtime input is “close enough” to the next bit of the chosen sample, I carry on playing it, and continue in this way. It would be kind of like mosaicking but prioritizing continuity over accuracy.
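To make the increasing-radius idea concrete, here’s a minimal sketch in Python/numpy. Everything here is hypothetical scaffolding (the `fuzzy_match` name, the `base_radius`/`growth` parameters, plain Euclidean distance on per-frame descriptor vectors, and brute-force nearest-neighbour re-matching), not an actual implementation of AudioGuide or anything else:

```python
import numpy as np

def fuzzy_match(input_frames, corpus, base_radius=1.0, growth=0.5):
    """Mosaic-like matching that prioritizes continuity over accuracy.

    input_frames: (n, d) array of per-frame descriptors from the live input.
    corpus: list of (m_i, d) arrays, one per corpus file.
    Returns one (file_index, frame_index) pair per input frame.
    """
    out = []
    cur_file = cur_frame = None
    radius = base_radius
    for frame in input_frames:
        nxt = cur_frame + 1 if cur_file is not None else None
        if (cur_file is not None and nxt < len(corpus[cur_file])
                and np.linalg.norm(frame - corpus[cur_file][nxt]) <= radius):
            # "close enough": keep playing the same file, and loosen the
            # radius so the match gets increasingly sloppy over time
            cur_frame = nxt
            radius += growth
        else:
            # diverged (or first frame): exhaustive nearest-neighbour
            # re-match over every frame of every corpus file
            cur_file, cur_frame = min(
                ((f, i) for f, arr in enumerate(corpus)
                 for i in range(len(arr))),
                key=lambda fi: np.linalg.norm(frame - corpus[fi[0]][fi[1]]))
            radius = base_radius
        out.append((cur_file, cur_frame))
    return out
```

The key design choice is that the radius resets to `base_radius` on every re-match, so each newly chosen sample starts from an accurate match and only then drifts toward continuity.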
That’s also where I thought novelty might fit in: rather than using increasingly permissive matching criteria as time progresses, I can run parallel novelty segmentation so that I keep playing the initially matched sample until a novelty slice is found in one or both of the processes (e.g. there’s a novelty slice in my incoming audio cuz I changed my playing, and there’s also a novelty slice in the sample because it suddenly gets really loud).
So this second idea is more like “carry on until things diverge greatly” vs the “carry on matching, but increasingly sloppily” of the first idea.
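The second, novelty-gated variant could be sketched like this. The `novelty_slices` detector here is a stand-in (just flagging frames where the descriptor jumps by more than a threshold) for whatever real novelty slicer would actually be used; the function names and threshold are all made up for illustration:

```python
import numpy as np

def novelty_slices(frames, threshold=1.0):
    """Placeholder novelty detector: flag frame indices where the
    descriptor jumps by more than `threshold` from the previous frame."""
    diffs = np.linalg.norm(np.diff(frames, axis=0), axis=1)
    return set(np.flatnonzero(diffs > threshold) + 1)

def play_until_divergence(input_frames, sample_frames, threshold=1.0):
    """Keep playing the matched sample until a novelty slice lands in
    either the live input or the sample itself; return the frame index
    at which to stop and re-match."""
    in_slices = novelty_slices(input_frames, threshold)
    smp_slices = novelty_slices(sample_frames, threshold)
    for t in range(1, min(len(input_frames), len(sample_frames))):
        if t in in_slices or t in smp_slices:
            return t  # divergence: stop here and query again
    return min(len(input_frames), len(sample_frames))  # played through
```

In practice the input-side segmentation would run in realtime while the sample-side slices could be precomputed offline for the whole corpus.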
Has anyone done anything like this?