Biasing a query

So jumping on the discussion from the LPT thread and the concept from the hybrid synthesis thread, I got to thinking about how to bias a query in the newschool ML context.

For example, I want to create a multi-stage analysis similar to the LPT and hybrid approach, but a bit simpler this time: an ‘initial’ stage, and ‘the rest’. I then want to do the apples-to-orange-shaped-apples thing and match incoming onset-based descriptor analysis against the corpus to find a sample. So far so good. But what I’m thinking will be useful now is a parameter that lets me bias the querying/matching towards being more accurate in the short term vs being more accurate in the long term. Or more specifically, weighting the initial time window more, or the full sample more.

In the context of entrymatcher this would be a matter of adjusting the matching criteria by increasing/decreasing the distance for each of the associated descriptors/statistics. But with the ML stuff, it seems to me (with my limited/poor understanding at least) that the paradigm is “give me the closest n matches”, and that’s basically it. No way to bias that query (other than generic normalization/standardization stuff).

I guess I could do some of the logical database subset stuff, but that seems like it would only offer binary decision making (including or excluding either time frame).

Is there a way to do this in the newschool stuff? Or is there another solution to a similar problem/intention?

So I went ahead and created a patch that does some of what we talked about in the last chat and made a quick video demo-ing it.

So I’ve analyzed a mix of a bunch of different samples for 4 descriptors with some statistics each, and then analyzed three time scales from the file.

The descriptors/stats are:


And the time scales analyzed are 0-512 samples, 0-4410 samples, and the entire file.

I chose these as they more closely correspond with my “real-time” analysis window of 512 samples (so I can do like-for-like matching there), and because for the Kaizo Snare performance I matched against the first 100ms of each file, which provided musically useful results.

The results for each matching window are quite different, not surprisingly. I think given the material I like the 512 and 4410 options the most. The 512 gives the most range and surprise as it’s matching the initial 11ms very well, and the rest is a “surprise”. The whole-file matching is dogshit, particularly with these samples, as there’s so much fading out during the file that you don’t get anything useful here.

For the variable part of the video I’m using the % matcher in entrymatcher here and crossfading between putting all the weight on the 512 query, or the 4410 query.

So based on this, I think that it would indeed be useful to be able to bias a query between multiple versions of the same n-dimensional space. Because everything is in the same scaling/dimensions/space, nothing gets fucked around like what would happen when trying to bias things towards “caring more about loudness” or “caring more about spectral shape”.
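To make the idea concrete, here is a minimal sketch (plain Python, not FluCoMa code) of what I mean by biasing between two versions of the same space: the corpus is analyzed twice in the same 8d descriptor space (512-sample and 4410-sample windows), and one parameter crossfades the weight between the two distance terms. All names here are made up for illustration.

```python
# Hypothetical sketch: crossfade a match between two time-scale analyses
# of the same descriptor space. bias = 0.0 -> only the short (512-sample)
# window counts; bias = 1.0 -> only the long (4410-sample) window counts.

def weighted_distance(query_short, query_long, entry_short, entry_long, bias):
    d_short = sum((q - e) ** 2 for q, e in zip(query_short, entry_short))
    d_long = sum((q - e) ** 2 for q, e in zip(query_long, entry_long))
    return (1.0 - bias) * d_short + bias * d_long

def best_match(query_short, query_long, corpus_short, corpus_long, bias):
    # corpus_short[i] and corpus_long[i] describe the same sample i and sit
    # in the same scaling/space, so the crossfade stays meaningful.
    return min(range(len(corpus_short)),
               key=lambda i: weighted_distance(query_short, query_long,
                                               corpus_short[i], corpus_long[i],
                                               bias))
```

Because both blocks share the same dimensions and scaling, the bias parameter only shifts which time window dominates; it never distorts the space the way weighting “loudness vs spectral shape” would.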

Musically, this makes a lot of sense to me, to be able to nudge a query, or match, in a direction that is musically grounded, as opposed to selecting/massaging algorithms based on data science stuff.

I don’t really know what this would mean in practice, however. Simply exposing multipliers pre-matching would add some usefulness, like @danieleghisi mentioned here, though things get complicated really quickly once you have a lot of dimensions and/or data scaling/sanitizing.

I’m going to try to move my querying/matching setup into the ML world (gonna have a jitsi geekout with @jamesbradbury tomorrow), and in the lead up to that I’ve been trying to think of other use cases that I need to satisfy.

Beyond biasing a query, being able to single out individual parameters would still be incredibly useful. I’m specifically thinking of metadata-esque stuff like overall duration, or time centroid, or amount of onsets within a file etc… Things that wouldn’t really be useful to use in a ML context, but would still be incredibly useful in terms of nudging along queries in a certain direction.

I remember seeing some database-esque logical/boolean queries, but I think that was primarily for creating subsets, not for doing individual queries.

With the tools as they stand (or how they plan on standing for the bits that are still in motion), is there a solution or workflow for queries like this?

A naive way for me to conceive it would be where you would ask for the n-nearest neighbors && timeCentroid > 3000.0 or something like that.
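As a toy sketch of that combined query (hypothetical names throughout): filter the corpus on a metadata column first, then take the n nearest neighbours among the survivors.

```python
# Naive "n-nearest && timeCentroid > threshold" query, sketched in
# plain Python. features[i] is the descriptor vector for entry i,
# time_centroids[i] its metadata column (names are made up).

def filtered_nearest(query, features, time_centroids, n, min_centroid):
    # keep only entries that pass the metadata test...
    candidates = [i for i, tc in enumerate(time_centroids) if tc > min_centroid]
    # ...then rank the survivors by squared Euclidean distance
    sq = lambda i: sum((q - f) ** 2 for q, f in zip(query, features[i]))
    return sorted(candidates, key=sq)[:n]
```

The ordering matters: filtering first guarantees you always get n valid results (if they exist), whereas taking n neighbours first and then filtering can return fewer than n, or none.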

Found and played with the 7-making-subsets-of-datasets.maxpat example and this looks like it would kind of do what I want.

I don’t really understand the syntax (or intended use case actually).

Say I have a fluid.dataset~ that has 5000 points (rows) with 300 features (columns), and I want to filter and create a subset of those (something like filter 0 > 0.128). Meaning, I want to keep the number of columns intact.

My intended use case here would be to create a subset based on some metadata/criteria which I would then query/match against. In this case I want all the actual features to stay intact, so I can query them. I don’t want to end up with just a single column’s worth of stuff.
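In pseudo-Python terms (toy dict-shaped dataset, identifiers mapping to rows), the behaviour I’m after is just this: drop the rows that fail the test on one column, but keep every column of the rows that survive.

```python
# Sketch of the intended "filter 0 > 0.128" behaviour: row filtering
# that retains all columns of the surviving rows.

def filter_rows(dataset, column, threshold):
    return {key: row for key, row in dataset.items() if row[column] > threshold}

subset = filter_rows({"a": [0.2, 9.0], "b": [0.1, 3.0]}, 0, 0.128)
# "a" survives with both of its columns intact; "b" is dropped entirely.
```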

I don’t understand what addcolumn (or addrange) is supposed to do. I played with the messages a bit, but none of the examples show the dataset retaining its number of columns.

The process also isn’t terribly fast. Even with just 100 points like in the example, it takes around 0.5ms to transform a dataset into another one. If queries are chained together, that can start adding up.

Granted this process wouldn’t be happening per grain/query, but it may need to happen often enough to be fluid (say if I’m modulating the filter criteria by an envelope follower or something where the louder I’m playing, the longer the samples I’m playing back are etc…).

Ok, I set up a speed test with a “real world” amount of data (10k points, 300 columns) and I get around 29-32ms per dump of the process.

This is, perhaps, not the way to go about doing what I want to do, but given the current tools I don’t know how else to go about doing something like this. (as in, I don’t want or need a new dataset, I just want to filter through the dataset as part of the query itself)


you actually do every time you change the query parameters. But I don’t see in your patch what you are trying to do… you’re making a 300 dim 10000 entry dataset and you want to get a subset? Why do you time the dump which has an overhead? I have here 30ms for the first query, then 18ms. If I keep all the columns (addrange 0 300) it gets to 77ms which I presume is the overhead of copying more, although @weefuzzy and @groma will be able to confirm…

In my intended use case I’ve got a descriptor space with 10k samples (or however many), each with 300 descriptors (vanilla descriptors, stats, mfccs, stats of mfccs etc…).

I would like to use one column of that (timecentroid for the sake of simplicity here), and bias a query based on that single number. As in, return me the nearest match (along all the dimensions of the descriptors) that has a time centroid above a certain value. Or find me the nearest match where the loudness is below a certain value (this gets weirder).

But the overall idea is something along these lines. Where some columns in the thin flat buffer correspond to data that I’d like to use to query against in a more direct way.

That’s how addrange works! I kept trying stuff and couldn’t get the default example to return every column it started with. (I could have sworn I tried addrange 0 4.) What happens when addrange is backwards, like in the example? (addrange 2 1)

it’s not backward, it is like all our other range interfaces: start and num
0 300: start at 0 and gimme 300
2 1: start at 2 and gimme 1

In your case, if you’re going to bias a query on a subset, then query in RT: you make the subset, you then make a kdtree of that subset, and query that kdtree. The query should be faster than on the full dataset since you have fewer values. What is even better is to query only what you care about, so you could dismiss the columns you don’t care about and make the subset smaller in that dimension too (nb of dims)
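That workflow, sketched in plain Python: rebuild the subset and its index only when the filter criterion changes, then run the per-grain queries against the index. The BruteIndex class below is a stand-in for a kdtree fitted on the subset (hypothetical names, brute-force search for brevity).

```python
# Subset-then-index workflow: the expensive part (filtering + refitting)
# runs only when the criterion moves; realtime queries hit the index.

class BruteIndex:
    def __init__(self, points):          # points: {id: feature vector}
        self.points = points

    def nearest(self, query):
        sq = lambda item: sum((q - x) ** 2 for q, x in zip(query, item[1]))
        return min(self.points.items(), key=sq)[0]

def rebuild(dataset, column, threshold):
    subset = {k: v for k, v in dataset.items() if v[column] > threshold}
    return BruteIndex(subset)            # refit only when the filter changes
```

The per-query cost then depends only on the subset size, not on the full dataset, which is the point being made above.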

Right. It’s just visually confusing because there is no associated message that goes with them, and addrange 0 5 returns the data I want but for misleading reasons (it looks like it goes from 0 to 5, inclusive). I’ve mentioned this already, but this <dataset> <dataset> <number> <number> syntax is a lot more confusing overall.

I tried to make a dataset that would have around the amount of numbers I’d be dealing with in a real-world context. I can easily get up to 10k samples/grains/bits, and 300 descriptors/stats is about par for the course these days. Granted my real-world numbers may be slightly smaller, but I wanted to test it with numbers greater than 100 entries and 5 features.

I perhaps wouldn’t need all the columns for each query type, specifically if each column corresponds with a different time series and I’m choosing between the initial 256samples, or the initial 4410 samples etc…, but if I have a single value in that column that could potentially update on every query, I’d have to do this process per query.

To further clarify: it wouldn’t always need to update ‘per query’, but if I’m doing the thing you (@tremblap) suggested ages ago, where I have a slower envelope follower going and use its output weighted against the current analysis frame to decide how “long” (or higher ‘timeness’) a sample to play back, that means I would need to update things per query.

Kind of an old bump here, but I think this thread is the most relevant to this discussion/idea.

After getting some ok results using a regressor to use a short analysis window to predict a longer one I was wondering how to best implement this in a patch.

Ideally I would take the realtime audio input (analysis window of 256 samples) and compare this to the corresponding window in the corpus,

and also

use the regressor on the realtime input to predict a longer window and then use that longer window to query the longer window in the corpus… with less weight.

I was thinking that a “lofi” solution to this would be to concatenate the realtime input descriptors/stats (8d) with the predicted/regressed one (also 8d) into a 16d point which would get compared to a concatenated 16d for the corpus (8d of the first 256 samples, and 8d of the first 4410 samples).

But this would give me a 50/50 weighting between the “real” data, and the predicted one.

So I was wondering if I could just double up on the initial 8d by duplicating them so each point would have:

8d of first 256 samples
8d of first 256 samples (a literal duplicate)
8d of predicted 4410 samples

and that would be used to query:

8d of first 256 samples (of corpus)
8d of first 256 samples (of corpus, a literal duplicate)
8d of first 4410 samples (of corpus)
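For what it’s worth, the arithmetic behind the duplication trick checks out: squared Euclidean distance over a concatenated point is just the sum of the per-block distances, so repeating the “real” block counts it twice, i.e. a 2:1 weight over the predicted block. (Toy 2d stand-ins below instead of 8d; scaling a block by sqrt(w) instead of duplicating gives the same weight w without growing the point.)

```python
# Duplicating a block of dimensions == integer-weighting that block
# in squared Euclidean distance.

def sqdist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

real_q, pred_q = [0.2, 0.4], [0.9, 0.1]   # realtime + predicted query
real_e, pred_e = [0.3, 0.5], [0.1, 0.6]   # corpus entry, both windows

duplicated = sqdist(real_q + real_q + pred_q, real_e + real_e + pred_e)
weighted = 2.0 * sqdist(real_q, real_e) + 1.0 * sqdist(pred_q, pred_e)
# duplicated == weighted
```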

I was thinking I could also achieve similar results by scaling the 8d of the 4410 windows down, but there aren’t simple ways to transform a buffer this way on the fly (as far as I know). I was thinking I could perhaps do something weird like creating a fit using [fluid.normalize~ @min 0. @max 0.5] and then use that to transformpoint the corresponding realtime input, but not sure if that would behave as I would expect.

Any thoughts on this kind of “lofi” weighting?

I think that at that point, you might overload the fluid.normalise object by loading your own scaling per dimension. To check the format, dump it first into a dict: each value of bias is added and each value of scale is multiplied.
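In other words, assuming a per-dimension transform of the form x * scale + bias (the actual dict field names in the object may differ), the operation being loaded is just this:

```python
# Sketch of per-dimension scaling loaded into a normalizer-style object:
# each dimension gets its own multiplier and offset.

def transform_point(point, scale, bias):
    return [x * s + b for x, s, b in zip(point, scale, bias)]

# e.g. leave dimension 0 alone, halve the weight of dimension 1:
transform_point([1.0, 1.0], scale=[1.0, 0.5], bias=[0.0, 0.0])
```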

Hmm, you lost me there.

What I was thinking was that as part of the normal processing chain pre-regressor (robustscale->PCA->UMAP->normalize) I would set @min 0. @max 0.5 such that the whole of this dataset would be scaled down, with the corresponding realtime version being treated the same.

I guess I could just keep things 0.->1. for the regression step but then normalize the output afterwards? Does transformpoint allow you to apply separate @min @max attributes to the output or does it literally only apply what has been fit?


Is it bad to send fluid.normalize~ @min 0. @max 0.5 data into fluid.mlpregressor~ @activation 1?

no. but you want to bias the query, so you might as well bias the relative weight of each dimension.

check this example. it’s not about fitting a standardizer but using it as a scaler - subversion is fun :slight_smile:

standardize-hack.maxpat (15.7 KB)

I was planning on applying this transformation to only the regressed/predicted dimensions, and leaving the analyzed ones intact.

So effectively doing:

8d of first 256samples
8d of predicted 4410samples (scaled down)

Would the fact that it’s standardized (vs normalized) really matter here if it’s just changing the min/max?

And correspondingly I suppose this would have to be with a different @activation in fluid.mlpregressor~.

no problem - use my hacked standardiser (you will notice it is not ‘fitted’ and is independent) with transformpoint for your query… but using that in anything else than a kdtree is definitely not going to work - this is a way to skew the Euclidean distances…

it is like changing the range manually for each dimension. I use standardize because I know how it works under the hood.
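Numerically, what the hack amounts to (toy values, names made up): dividing dimension j by a chosen sigma_j before indexing makes plain Euclidean distance in the scaled space equal a weighted distance (weights 1/sigma_j^2) in the original space, which is exactly the per-dimension biasing discussed above.

```python
# Pre-scaling dimensions == weighting them in the distance a kdtree sees.

def scale(point, sigmas):
    return [x / s for x, s in zip(point, sigmas)]

a, b, sigmas = [1.0, 2.0], [0.0, 0.0], [1.0, 2.0]

d_scaled = sum((x - y) ** 2
               for x, y in zip(scale(a, sigmas), scale(b, sigmas)))
d_weighted = sum(((x - y) / s) ** 2 for x, y, s in zip(a, b, sigmas))
# d_scaled == d_weighted
```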