Weighing descriptors in a query

I’ll give that a spin, as what I originally did is what I posted above (some straight multiplication stuff).

Actually, on that note: if I wanted to completely remove a column from consideration in the distance matching, but didn’t want to actually remove it from the dataset (because I hate fluid.datasetquery~…), can I just set the respective columns in the dataset and the incoming buffer to 0?

As in, if I don’t want to care about centroid at all for a given query, can I set that column in the dataset to 0 and set the same index in the incoming buffer to 0, and it would then not be considered at all?

yes. if you think about it, Euclidean distance does that (in theory, summing an infinite number of 0-squared terms plus the dimensions you cared about, then taking the root of that sum). that is one of the things that made me re-think everything yesterday.
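A quick way to convince yourself of this outside of Max - a numpy stand-in for a dataset point and the incoming buffer~, with made-up values:

import numpy as np

point_a = np.array([0.3, 0.8, 0.5])   # a dataset entry
point_b = np.array([0.9, 0.2, 0.1])   # the incoming query point

# distance over all three columns
full = np.sqrt(np.sum((point_a - point_b) ** 2))

# zero out column 2 (say, centroid) in BOTH the dataset point and the query
a_masked, b_masked = point_a.copy(), point_b.copy()
a_masked[2] = b_masked[2] = 0.0
masked = np.sqrt(np.sum((a_masked - b_masked) ** 2))

# masked equals the distance computed over columns 0 and 1 only,
# since (0 - 0)^2 contributes nothing to the sum
print(full, masked)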

try it and let me know. it should be the same, or you found a bug (again :slight_smile: )


In the end this seems to work out the best.

Basically going with your exponential approach but with a 100% = 1 paradigm. Then a function to scale it so at 0% it gives me 100k (which seems to zero any values out, giving it “no weight”).

The curve actually looks exponential, but my maths aren’t good enough to turn a 0–1000% range into a 100k–0.001 one, hence the function crutch.

Sounds pretty good!

Also makes for some cool functionality (like in the real-world example above) where I can spoof a descriptor (pitch, most commonly/usefully) if I have a corpus with a lot of pitch but a non-pitched input. This way I can give pitch (as a column) more weight, making the overall effect more apparent.

----------begin_max5_patcher----------
695.3ocuVEtahCCC92sOEQQ2Ot6TGpokBr6UYaBEZCrrqjTkj1wzzd2uDmVf
NJnNX5jfDhiis+r+hCuGFfWI2wzXzePOfBBdOLH.D4DDztN.uktKujpA0vaY
ZMcCCG42yv1Y.44kLpJBEitiLOBQrSw1o3XjeJ1NzclRtfkKqEvASZEJqMkL
i4sJlOZvXzSsaUQM4OyEaVpX4F+tjrjIVCmLyMRll4lRsi6OindKWXMn9XeX
E5cCHkzJkW..Pt5k6HoDrS3GggtgnQlSJ4MrIEbZYGBanJAcKq+lOPl+D9rn
MBgWWJolKB6o2C3EFmRfwEmCzjAAcx.fNIAu2kJaXaXpkLAcUI6X6noMrhkT
iQwWUaXG9ktMyzlZbnurlIW2ItSdOyuqRJXdFP5jzrnSUgK3FWJ0Wui2ixgT
5zvsmVkRwlApGIckidJu0VZACYIs1L7oJneVpL6M25RpQXuSLjo5puCYkZaj
qMu4C5L+9.sqa5KxAWWKxMbo3rLLO6JxS0beWQEaNv1.KwyWlWqZ5U2OKM79
X3V2rE.mbdxQDQk0zs901N.1O9qwSmNDOM6R7z1jLlVTTI4BSaWMHHaCBWb3
WF2sL1+qCKi6hVWurmthBw4ZPRlfF5C9p5+4q.S8QaZJzFLI4l6+Maw0z9Sv
d0d3S.bi8RtBUIe8mIO9XzOVSP+1l4+0s.3TemdRR7kALYr.dN4aDvUz+ZYR
C94V.cx89t9oYWFzKFKnWzCzfwf2j+z+C.BHm79YBsrVk2E96e+.cHnJXZCW
PglQGqUF97WoFuuxFgur.7avUoj+avZ9Xbk8xo6OVcadZwX7z7SyedVBsppg
ozsZCNwdy3EoZO+CaIj9kyfkJVCuSenCMlprzbikiWq7ufta1Tr+nxBlRTyA
ZanCdgsuK4dsUWQ8HAtbF9Q3+vGSA5C
-----------end_max5_patcher-----------

your code does a fake exponential (like pots do :slight_smile: ) to then go into my exponential… not ideal :slight_smile:

if you want 0->1000 to become exponentially 100k->0.001 you can do this:

[expr pow(10, 5-(8*$f1/1000.0))]
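For reference, the same curve sketched in Python (assuming the percentage comes in on a 0–1000 range):

import math

def weight_divisor(pct):
    # same mapping as [expr pow(10, 5-(8*$f1/1000.0))]
    return math.pow(10, 5 - (8 * pct / 1000.0))

print(weight_divisor(0))     # 100000.0 at 0%
print(weight_divisor(1000))  # 0.001 at 1000%
print(weight_divisor(100))   # ~15848.9 at 100% (the ~15k mentioned below)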

It’s definitely ugly, but it does center the important things. With the pure exponential version, when you’re at 100% (“no weighting”) your expr gives a crazy high value of around 15k.

that’s because you’re not symmetrical (you have 0 to 100, which is a 10th of the rotation, then 100 to 1000 above)

so I think you should center it for a good GUI, and make sure your users get the right result if they do 50% of one, or 200% of another - which should be equivalent…

but hey, it’s your users who will complain :slight_smile:

I’m trying to think, but I believe in most places I use 100% = default/normal (as opposed to 0), so I would keep it consistent either way.

And indeed, not symmetrical. At “1%” I found your maxed-out (minimum) range good at 1024, and then to properly zero it out I found 100k worked there, hence that part of the angle. The other two bits were just to put 100% at 1.
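For what it’s worth, a centered version of that mapping could be a straight power law - in Max terms something like [expr pow(100./$f1, 2.5)] - where 50% and 200% give reciprocal values. A Python sketch, with an assumed exponent k (and 0% needing special-casing, since it divides by zero):

import math

def symmetric_divisor(pct, k=2.5):
    # centered power law: 100% -> 1, and equal rotations either side
    # of center give reciprocal weights (pct must be > 0)
    return math.pow(100.0 / pct, k)

print(symmetric_divisor(100))                         # 1.0
print(symmetric_divisor(1))                           # 100000.0 (10^(2k))
print(symmetric_divisor(1000))                        # ~0.0032 (10^-k)
print(symmetric_divisor(50), symmetric_divisor(200))  # ~5.66 and ~0.177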

Revisiting this now in the context of fluid.normalize~ (and preset regression).

I have a bunch of canned synthesis algorithms where I’ve pre-baked the ranges, names, and number of parameters into a dict to make plugging them into a regressor fast/easy.

So far I’ve formatted things like this:

{
	"resonance" : [ 0.0, 100.0, 9 ],
	"decay" : [ 0.0, 1000.0, 54 ],
	"brightness" : [ 0.0, 100.0, 100 ],
	"sharpness" : [ 0.0, 100.0, 3 ],
	"position" : [ 0.0, 100.0, 4 ],
	"impulse" : [ 0.0, 100.0, 100 ],
	"input" : [ 0.0, 100.0, 50 ],
	"impulseout" : [ 0.0, 100.0, 50 ],
	"reflect" : [ 0.0, 100.0, 99 ],
	"reflectmode" : [ 0, 3, 1 ],
	"damping" : [ 0.0, 100.0, 47 ],
	"a1" : [ 0.0, 100.0, 4 ],
	"a2" : [ 0.0, 100.0, 1 ],
	"node1" : [ 0.0, 600.0, 61.428570000000001 ],
	"node2" : [ 0.0, 600.0, 127.142859999999999 ],
	"node3" : [ 0.0, 600.0, 194.285720999999995 ]
}

Where I have {paramName : [rangeLow, rangeHigh, paramValue]}, my thinking being that when I dump things out to add a fluid.dataset~ point for future regression, I can just dict.iter through that and scale accordingly.
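In Python terms, that scaling step amounts to something like this (a sketch using a few of the entries above):

# hypothetical subset of the param dict above
params = {
    "resonance": [0.0, 100.0, 9],
    "decay":     [0.0, 1000.0, 54],
    "node1":     [0.0, 600.0, 61.42857],
}

# scale each paramValue into 0-1 via its rangeLow/rangeHigh
point = [(val - lo) / (hi - lo) for (lo, hi, val) in params.values()]
print(point)  # [0.09, 0.054, 0.10238...]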

This actually works fine, but it gets a bit messier when wanting to unpack/scale the regressor output on the back side, particularly since I want it to be fast (and I don’t think dicts are fast).

Hence being reminded of this fluid.standardize~ “hack”.

Poking at the dump output of fluid.normalize~ it looks like it has a low/high value for each column, then a min/max value for the scaled output. Like this:

{
	"cols" : 3,
	"data_max" : [ 10329.1865234375, 21145.490234375, -2.56557559967041 ],
	"data_min" : [ 463.205169677734375, 877.32891845703125, -49.015274047851562 ],
	"max" : 1.0,
	"min" : 0.0
}

Would it just be a matter of plugging my dict data above into this format, then transformpoint-ing the buffer~ contents on the way into the regressor, and inversetransformpoint-ing with the same fit on the way out?

That would be much simpler, and presumably faster, operating on the buffer~s directly.
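If the dump format is as above, building that fit from the param dict would be something like this (a sketch; the column order just follows the dict):

# same hypothetical param dict as before
params = {
    "resonance": [0.0, 100.0, 9],
    "decay":     [0.0, 1000.0, 54],
    "node1":     [0.0, 600.0, 61.42857],
}

fit = {
    "cols":     len(params),
    "data_min": [lo for (lo, hi, val) in params.values()],
    "data_max": [hi for (lo, hi, val) in params.values()],
    "min":      0.0,
    "max":      1.0,
}
print(fit)  # load-able into fluid.normalize~, per the reply below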

//////////////////////////////////////////////////////////////////

Another thing I was thinking is that it would be ideal to (manually) linearize frequency-based parameters (with some ftom-ing on the way into the regressor and mtof-ing on the way out), but I don’t know if that matters when using a regressor for synth preset morphing.

That would make things a bit stickier with using vanilla fluid.normalize~, but I’m wondering if the new array stuff in Max 8.6 lets you operate on buffers in this way (in a high-priority thread).

Something like this (Max pseudo code):
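A rough stand-in for that round trip (sketched in Python rather than as a patch, using the standard MIDI/Hz conversions and a made-up parameter):

import math

def ftom(f):
    # standard Hz -> MIDI conversion (440 Hz = MIDI 69)
    return 69 + 12 * math.log2(f / 440.0)

def mtof(m):
    # standard MIDI -> Hz conversion
    return 440.0 * math.pow(2, (m - 69) / 12.0)

# on the way INTO the regressor: linearize the frequency param...
cutoff_hz = 1000.0
cutoff_midi = ftom(cutoff_hz)   # ~83.2, roughly perceptually linear

# ...and on the way OUT: convert the regressor output back to Hz
print(mtof(cutoff_midi))        # ~1000.0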

the main difference between std and nrm is that we store the extremes of the range in nrm (data_max and data_min, and (target) max and (target) min) and recompute them upon loading. so yes, you can enter your min and max in there as a dict and it’ll work.

I’m not sure, but most probably.

our freq output descriptors do have a @unit (or similar) attribute to do exactly that, for exactly that reason :slight_smile:

So I did get this working in the end, but ran into what seems like a bug (documented here).

Luckily in my case the @min and @max are always fixed, so I can just set them as @attributes in the object, but it does feel like those should load dynamically with what you write in the dict.

Indeed, though in this case I’m dealing with arbitrary (UI) parameters, so it’s a bit easier visually to read a filter cutoff in Hz rather than MIDI, etc…