Is it possible to create @antiranks? (fluid.nmfmatch~)

Is it possible to create dictionaries of @ranks that you specifically don’t want to match with fluid.nmfmatch~?

The specific use case I'm thinking of is training up a bunch of different percussion classifiers, but wanting to make sure they don't accidentally get triggered by non-intended sounds.

So say I've created a set of training data comprising a bunch of snare and prepared-snare sounds. Would it then be possible to give it some examples of, say, hihat sounds, to act as a kind of @antirank, so that if the input matches those it returns nothing?

edit:
(ok, in thinking about this further, this could just be another regular @rank that is filtered out after the matching, right? I'll still create this thread, as it may be a useful idea for some other use case.)

I think so: if the algorithm is ‘aware’ of them, then it’s matching them. Just ignore those matches when they happen.
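Something like this, as a minimal Python sketch of that post-processing (the real thing would be Max objects after fluid.nmfmatch~; the component layout and names here are just made up):

```python
import numpy as np

# Hypothetical layout: components 0-3 are the drums we care about,
# component 4 is a deliberately-trained "antirank" (e.g. hihats).
ANTIRANKS = {4}

def classify(activations):
    """Return the best-matching rank, or None if the winner is an
    antirank, i.e. a strong match for something we want to ignore."""
    best = int(np.argmax(activations))
    return None if best in ANTIRANKS else best
```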


Furthermore, you could make it so that a match only counts when the activation is over a certain threshold. I know this has been talked to death over in the other thread (how do you do those speccy links easily?), but that method was always taking the maximum rank at each onset. If you had a threshold, then bad matches wouldn't meet it.

Unfortunately it means more parameters and more tweaking, so it's more supervised. :frowning:
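For reference, the thresholded version might look like this (Python sketch again; the threshold value is invented, and is exactly the extra parameter that needs tweaking):

```python
import numpy as np

THRESHOLD = 0.5  # invented value; needs tuning per dictionary/material

def classify_with_threshold(activations):
    """Take the maximum rank, but only report it if its activation
    actually clears the threshold; weak/bad matches report nothing."""
    best = int(np.argmax(activations))
    return best if activations[best] >= THRESHOLD else None
```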

Yeah, that's another possibility, although I guess getting a strong match and then ignoring it (route 5 or whatever) would mean it's definitely the thing you don't want, whereas gating it with a threshold would just mean the @antirank hadn't matched very closely to begin with.

Do you mean the hyperlinks to threads or the quoting of a reply?

(answering both, just in case, so pardon if you know one or both and/or meant something else)

For the hyperlinks I just do ⌘K and then paste the link (or there’s a little icon for it). For the quoting thing, I just select the bit of text I want to quote, and then press the “Quote” button that pops up.


I think @antirank wins concept-of-the-year!

As @weefuzzy says, antiranks are ranks; you just need to know which is which and ignore the bad ones.
One idea could be to train on the good stuff, then concatenate into a larger dictionary so that the new entries will catch the bad stuff. Then play for some time while retraining the new dict.
If you feel the good ones are degrading, replace them again. Rinse and repeat.

You lost me here a bit. Do you mean doing a @filterupdate 1 thing?
How/what would “catch” the bad stuff? Do you mean the algorithm would refine the existing dictionaries while building up an “immunity” to things that aren’t already present? Or maybe a @filterupdate 2 thing where the dicts stay intact, but the algorithm catches “the other stuff”?

It should be 1, because the other options will either change everything or nothing. So let me try to rephrase: you tell it there are 3 things and play those three (or train each one as a 1-rank separately). Then construct a dictionary with 5 things, 3 of which are the previous ones. There are two more, and you let it figure out what goes there. It will put something in those 2 while re-updating the original ones, but since it already has an idea of what should go in them, ideally it keeps the good ones. If it doesn't, replace them.

The other two components should pick up the "other things". It all depends on how varied your mistakes are and how similar they are to each other, or to the correct gestures. You'd have to think about classes of mistakes, the same way you classify the sounds you make.

Actually, for best results, seed the mistakes as well, because if they start at zero they will never pick anything up. So they should be either random or something similar to what you want them to catch.
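To make the seeding point concrete, here's a stripped-down NMF in Python/numpy (not the FluCoMa implementation, just the textbook multiplicative updates), showing why zero-seeded components never pick anything up and how a seeded dictionary behaves roughly in the spirit of @filterupdate 1:

```python
import numpy as np

rng = np.random.default_rng(0)

def seeded_nmf(V, W_seed, n_iter=200, eps=1e-9):
    """Bare-bones NMF with multiplicative updates. W_seed
    (n_bins x n_components) is a starting point that keeps updating,
    roughly analogous to @filterupdate 1."""
    W = W_seed.copy()
    H = rng.random((W.shape[1], V.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Hypothetical usage: 3 trained 1-rank bases plus 2 small random
# columns for the components meant to soak up the mistakes.
n_bins = 513
trained = rng.random((n_bins, 3))        # stand-in for real dicts
noise = 1e-2 * rng.random((n_bins, 2))   # random, NOT zero, seeds
W_seed = np.hstack([trained, noise])
# W, H = seeded_nmf(V, W_seed)           # V: magnitude spectrogram
```

Because both updates multiply the current values elementwise, any column of W that starts at exactly zero stays zero forever, no matter what's in the training audio; hence the advice to seed the mistake components with noise rather than silence.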


Ok, I'm getting a bit more of it, but I'm still unclear on what exactly I should retrain it on.

Say I have 4 drum sounds (drumA…drumD) and 1 sound I want to avoid (drumX).

So create a bunch of 1-rank dicts for each individual drum (I’m still unclear on the specifics of this, but that’s more for the other thread), including drumX.

Then seed fluid.bufnmf~ with a dictionary that contains 5 things (all five drums). (In this step, is it necessary to feed it "extra" dicts as seeds too, like a dict with 7 things in it (5 drums + 2 noise)?)

THEN

Run fluid.bufnmf~ on an audio recording that contains examples of all the sounds? (including drumX)

Is that correct?

If so, how would that work if my original 1-rank dicts were built from 30ms chunks of audio, but it's being re-trained on audio that contains those transients along with decays/silence/etc.?

Or should the re-training step only be run on the audio that was originally used to train each example? (say a buffer concatenated from every single transient that was analyzed from every single drum)

What you say is correct: always give it examples corresponding to the rank. The second part I don't totally get, but re-training does not need to be with the same audio; it's better to give it more examples of the same thing. Always try to make the training as similar as possible to reality, i.e. to what you'll do with NMFMatch. The transients thing should work, but it could also be tricky; another idea (if training with the whole sound) would be to compute the derivative of the activations and detect when there is a jump in an activation in NMFMatch.

Yeah in this specific case the “reality” has been bent a bit to get accurate transient-based dicts.

The original version I built analyzed everything, but based on @weefuzzy's intuition that was changed, since I don't actually care whether I'm matching the decay, envelope, etc. of a given sound. In context I only want it to match on the transient, because I want it to match quickly (and with a tiny FFT (64)), hence training it up on just that part of the sound.

In reality nmfmatch~ will be getting the whole sounds, but there's a bit of post-processing that tries to grab the frame that is most synced up with a parallel onset detection algorithm (or, hopefully in the future, just being able to trigger an analysis frame manually).

So even though nmfmatch~ will hear the “whole sound”, the post-processing stops caring after the transient.

What I'll do for now is try both versions (training it on whole audio with mixed hits, as well as on just the transients) and see how it responds.
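A rough Python sketch of that "grab the right frame" post-processing, for what it's worth (names and history depth are invented; in the patch this is just gating nmfmatch~'s output with the onset detector):

```python
from collections import deque

import numpy as np

HISTORY = deque(maxlen=8)  # keep the last few frames; depth invented

def store_frame(frame, frame_time):
    """Call for every activation frame coming out of nmfmatch~."""
    HISTORY.append((frame_time, np.asarray(frame, dtype=float)))

def frame_at_onset(onset_time):
    """Return the stored frame closest in time to the parallel onset
    detector's trigger, i.e. the one most likely to contain the
    transient rather than the decay or silence after it."""
    return min(HISTORY, key=lambda tf: abs(tf[0] - onset_time))[1]
```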

That sounds interesting, but it's above my head in terms of maths/dsp!

derivative = difference between frames (approximately):


----------begin_max5_patcher----------
437.3ocsT0taCBBE825SAgreZaDzpM6UYYYwOXcznnAwN6Z569jKZWcwNsY1
zTH2KG4bOG3xIaKbbQCqBidF8Bxx5jskEjRmvpK1BmG0jjEUAvvE0pLlB6XV
RTmaR.K51ksRcLiAnuBGWzCizkrLRk7AWr6MIKQYpApm2ZWGDcCQO441Ohds
6a3olpHd+Jx19cOoHOmITCXjKRYfNb0wmss0CNyTjPsNtFI2TiFPpikLiVv3
KU8.46No725MC0G7vTuf8YKC8akh0X1aEJq8W731h2cXKNl+wQhc2vhl9FBg
5CNUHbcgNtGEhWN0ef0TJQO8NAspcj9ntbPmt2vMXZkSWPk+UFRx1Mtfo24w
9cKZOW3k.RvV8ju6eIZx0hF.fy3he+7FPhN+PmnpnVlzWqFKzA8CMorJEWDo
3EhqvneBRCZTydtDo6jmjovEfnvYvCcg3gLkyM.SNOsrfKTcGUzvf0aZWeCL
44QtDs.UFcFU1+9HclNMY.OlqrQkkGXxpNv.Esso6Kj5vPGHjKLgvqtXI6.u
GuOjIR11GoZahpkltxl.er4SKRYRQMGdsxVy7Y6uAPpVMm.
-----------end_max5_patcher-----------

Ah interesting. That could be very useful in the post-processing step after nmfmatch~. At the moment we're just taking the maximum activation and trying to "catch" it at the right time with the onset detection algorithm.

Yes, what I suggested is to look at the current value minus the previous one. If it is too shaky, you may want to smooth it first. This way you don't depend so much on absolute values.
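For completeness, a small Python sketch of that smoothed-difference idea (the one-pole coefficient and jump threshold are invented and would need tuning):

```python
import numpy as np

SMOOTH = 0.7  # one-pole coefficient, invented; higher = smoother
JUMP = 0.2    # difference threshold, also invented

_prev = None  # previous smoothed frame

def rising_activations(frame):
    """Smooth each activation with a one-pole filter, then take the
    current smoothed value minus the previous one. A large positive
    difference means that component just 'switched on', regardless
    of its absolute level."""
    global _prev
    frame = np.asarray(frame, dtype=float)
    if _prev is None:
        _prev = frame
        return np.zeros(frame.shape, dtype=bool)
    smoothed = SMOOTH * _prev + (1.0 - SMOOTH) * frame
    diff = smoothed - _prev
    _prev = smoothed
    return diff > JUMP
```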