Training for real-time NMF (fluid.nmfmatch~)

It’s not derailing; it is actually important for us to know where problems occur in your creative coding.

With change, your code now works (we only get a difference because you put >= for the on-thresh) - but where your code will fail is in using change on a list: you need to check the change for each component separately…

Ok, after building a special use-case for when it’s a single float vs a list, it turns out that the mighty zl change doesn’t give a shit what it gets.

This appears to be working for me, in all use cases:


----------begin_max5_patcher----------
2115.3oc0aszraZCEdsu+JXX8sdzCDRzYZm9XQ1zNcQVzoSlNdv15xUIx.CH
mbS6z+6UO.LXCNXCNWmrvgnGny224nuidP92GV3uN6Edou2268NuEK92GVrv
VjofEU+6E96heYiLtz1L+T9mxV+d+GcUo3unrEmWHRU0klGq17rHMYUAeix8
xwvvkfG8vrnkjv.FCfoPPDlRzkgL0fz+582Uufz86DoRtxNjvpBEasijdz+N
n+gFlsWU2Rfov+6gGL+73HwyNdYYbB+D.AWhfdeGXI.6AWFv7.0svMdpOmyc
Hy2uwr6A2AHlEcPGHwryiUzoXEi6Grvq.rOIyzujgQxid9qiSSNKhBcdRlEI
DvE66vvCgIEw63JdwJdZ7Zo0J.U08TVwtX6HF1K5QuRnmfIFDGguVzC9VF8A
jfIgdTz2xnGSCmF5YuVne.Q62mIR8B79IUgHIgWT5cPX8xT4pXFHgZIE2+ZX
tInGtgNepbCJo6A7fWuPNdIo+DXPzkqpif2Wp5AP6LaTD3JCtgeSO0lDDMQ3
S+lVWmglH7IuVvWkkjH4CBeypRGgxUeSqQtYDAW.MLpffoOgen0f6IEkp2t4
4cBkZRB4X85TOQkiBOeXAtmU6DMfJWaKfWTA0JrpiSDR9G04iDYosZ8B+377
VEunUWLDz6yruH1iMEIRcEAZJpf+QQc+QMkFWn4AklD1WXYI+WBC7O7Zx1xK
R2KZ7VVWUkIYcJoZecYd7FWmM9t5pOvrLazDDY0YHtDkAQvCjo1omHy17A91
1lreVNOUjlWvK4opXUks2T8V9Sw6kpUOkkpJE+i0Br60nm5epxB6sRCFrl+O
WHhkM.HoPrMK0XDc7DlhqGt2oy66x92FL1VjFm2Sm0QFZZYfJK0fbe453Bii
pZlSsmROYOS1spl9I4OoppNWjldDKpxxGtR8ped9L8ccltxcm6caqob09TWs
qzwDpUkwerKaqhkxpYtce8uDmJz5hbkv4BPflJcpGOWtoHSJ6fWWMermZ1pi
w2v+jXq5Y6.0NXP2bQdcPjeiWdqHgWp5VlJNoraIkpO6H8VEsec0b3UJ9tbo
FEcaPmyWn8D11ZbcJ+bZcc069Go2lm0o13sqs2DfMZdCo6Airha3P6eEdRBv
AWS2PY.FJ61hZ09JYj4mUJzCIuYUtikSFVBXHFKvsnWMWEp+ChDQCI5+Rqpw
tZ1qY+gM1TOZZuxLK7VyrUK+RmhfNM1DRumYye7GzRSmkKau1soEjp26lIxz
kC9p3RL3p3R3WIt71Sk0aJvwjTlYyvWGSBumYRkmzSNa4Sf.KqwfeY1BNfhH
5UOghNMahV5Ke9xxhHikUFJOA7dfUj5ktNYNAQrmqDDa2X.ld84NA2CbhdaT
EWbJxyLAxQKAWOs.i95PKVC4739KgYbjCrVMCbyuCCYv.JrjggbaYlMY61wS
6tre6qJcK+k6P9oJlX57S3LwOv6K9o5T7lN+PmI9AOi7iyF7GWDh6vrGCELP
ZWLaXJ.L0PjCGRhTjNztisnyTe+bSY19hM0QFUa7zqKN061WIRaNxn2c.YlF
NJWykZDl7xixJn2PivrsGO3HLhZq81XEQi0g.NogYEacGJJ3U1zfCaZvaS.D
Xjll02AuUgwnQZE0L4syJFUbL3VZEQ2CylviNt3V5QviUdCdK02LKua7SeuU
yQLKhZbdDzsjKnWhJa+bQUg02oiu4f62txcIEqhUpBw58JWZ51WR0EcX5Ixr
0wxpiJu4bW54r1e3fwUcofVv0ckBt6H7Xtp5lBOki5H2OTDam1znx2+MxNqi
bsR9wCMXRCMjNhg1ZeHm++5GIxXGI3TGonwNRfINRixwQ6OjANoQFQu.LNyC
MaLQqzYfdQQicjlZHS27lmajl5zft4FO2HgmJ6MpPD7WVUwordzGUfYTO5iI
3nOjfS+HBF9CH33Od.aRF2EodjZ9gCie+VQ1as2B9pemmt2kjn9t56xMqSdR
HkaxjYm7MTTevA9tZa9dEpa667.KQQAPHSSlKwPLEEZeR+.gzdIet9.q6T.I
HBfLMMHDQCH1mXHLAadBbT2PGFK.LxMBfHFHv8jtHndrZ2s3zjpa8m15pqyK
xxyJZ97GVhiZZ+dUVRQ7VQ09vAcRg9XUnUgt1iu+Z+cZBUT2m1ghMNiea+l3
y4.LG8oEmqDolnHdCbwQHnEtnF9BQYA3tvsy8vnCc+ibdp2aiSK8dKemXclb
6gOWmw3raHhA72izGz0iyZ7yLJfvXtmNAKG6wCI5AwzVy8CGRcO4dS2DGdEM
EuYithN3Fa7AVvFTaSlnbSYs8EbIe2wckxzltC6.Hi4dR+VPnt8ccRmtEABf
T6PxhHATpqanH2qhD1zuN8JfRfXaKPQXcGcgOPP3QFZo1R2XnltlJR2PK3fg
Tja1HC.fAUcdVlP7gRoPqn8mF4tyMwnkk4NH+1+LS1xt3WBa9Xqtv4mfGa8S
atsmvGW.eyDZZ0bmAc+GAXs+dFAbqyI9Kx75HJstLJpc.uQFJf.ma+Pmu454
xWb8Jdsztp0dN8oyncQYlLZO16Sm1s6xbbcmcLNuX6CYeDQWVeVq0CnwLNZN
irb203uH2yq+elP+VUeZ9PZXfKoSiAFYJBdCLvhrOkdwV3A6hzL2Deirve8y
wWtABP5bWNIOJFibycPPHDO+F3aJ37qvBsrG17GWdOyRUlei6u3RY1mtXqio
ycScJGPFzpuQzLJ4zU.15SoEYZD.vlK.re2ZdwazKj7hMeSnHv59I5TeDKRf
sUylECrvsfBiEd9c5bY1VeKuJLHLDGUEESqVTl1eDc8wytsPNvGA9C+2C+Of
oLXZt
-----------end_max5_patcher-----------

edit: actually, in the context of the patch, the zl change isn’t wanted. Here is a fleshed-out vexpr $f1 * $f2 context:


----------begin_max5_patcher----------
2166.3oc0a00iiZqF95L+JPndUUZj+.iwUpUsmyE8ldz4h8hppUUQjDOLr0A
Pfyry1p9eu9Cf.IPFmDX2Y2UKCi+.+9779I1r+8CK72j+Bux2668du2hE+8C
KVXZR2vh5eeg+93W1JhqLCyOi+w7MeveosKI+EooYomvSzzZ9AofKkepfaex
99KU+y6Op6tHVt8ozrj0k7sR6HXjUjkdHLcEXoGD.M+FZEncNYG1mlodnFg.
V2X5NyZqjmuC6ebf1k2LRjtw+4gGzWVdeH7Y9KEkdeyiPuuUcEMNVuHNC0.D
ivM3DbIbhNGmngwIb5vYQYZlzebHfsPfBVQBChh.XJDvvThFVWFNCn1fCCGv
M.m87pp3D9Y3AtBG4AVAYdvUdfaSsACnZKRH1pvvQWsdCimNE2ihb0C4hNaa
hyRtHhvDjwGyfDB3pUbX3QSjx38bIubMOKdivHEf59dLuberYECmL2yo.8nv
vUQXLN.S.jHh07814BvWybALBtBFFFp7lYAv.PH5d3BD6qYtnNMzMC9nuTfe
jX4eHOMyKv6mjkoII7xJuiAbux7VFhgZB8ip+wnLSv.LCc5B+MVfdfm4u2ZD
drFgCkVCht9v8H3aqv8PaJKHCeil1vup8qwXvcBe5W0o6HA2I7IeofuLOIQv
GE95RUcnd6A7pQFJQe0YVvIaf4qtbOQZk7caeZepTdiA4n1HYpebVPN5q7tH
3Ap6gMRPttR.urFp0XUYljJ3OqRFklm0YzK7iKJ5z7hNSQSPeH27fhV11TZl
sIPaSk7mSalOps03REOHUjvgRCK4+RXf+wGS9NdY1gzVskQUUKRFkRlRWWUD
u0NYstqo6tLq8M5vFqJhJco5GAjNroRqmHx29m7cckY+7BdVZVQIuhmIik0B
ea263OFePHW+XdlrJ8uLh.T+ZVCz+i0h3fcpAgQ9+4xzXQKBRJS2kmoEhdpB
cyMKmBbFznudDLlQjEWLvjUlFJdYjNqTf7P0l3Rslp10oQUob1yE86pcdB9i
x5tKRyxNgEk4Ei2op1mmtvb2jq5b+kd1ldpVeHy16ZkQgbcU7y8YaYrPT651
+w+RbVpJtHWlZUAHPam1vGOUssLWH5gWaOOOPO6TF4a4eLcm7IyB00XPM7zh
FiH+Vs7tzDdkreax3jp9sTI+jkz6zzgM0Nwqk78EBEJ5OfdaeTWO1tA4509k
B10OfWoJjBusdNG1Xo5jPiYrOZke1fhgfUp2BR81OLp5cgzuDTvY08MZsei9
ZPsxz.duijgbQSJi5XQyEyBmalstPCBcE89Xy1pudSxl+3OnbBuHW1sJk6yH
EgWosLiBtYtDCtItD9YhKmepro7WKSpKE5VYR3aYlr6Fx6pG9PFdLncqpMrV
D70YK3HQDQ9eocT+KgWhJzWwjwJXDwUVYr7Dv2BrhPUj1cyIHB1tG4L6qbc6
4NAuE3D0aLTd0oHufCjkVBtcZAx97PKFA4x390vLlYAqIlAt853PFLRDVx3P
taXls462yy5Wfq4Qksi+xaP9o1l394mvIhefus3mfHzzvOzIhevSH+XkAe2R
vX2pcWnfQR6hiFmB.2qIxwsCPjlM16AZPmt+g4lp7CkaarLZRI50GnpWrUll
0t6Hu+HzzCzIcy0JEZSG2jBU1JO3bIEgNJE5pqlOtf3JW.mStvU6BHcN4Bfq
ZD1LJElGNzEo.NmRAxUo.L2RgS1EypFwY6h4zGoAht4o1ef4k6r6UO7KrnAF
Wz.yinQueuo5FaN5.e81CuascqvWGKkkoaNHsoH6dVHW0V1lHx2DKp2P1187
Xfcz8giBW8YOY.W+rz1ih5Ttp9.oN29tqR4ULsaNapdZsyNDLWWYLykk9nS0
MuPt.wdUEbhKyrtvvgWXz8QsNxrf6jYw8rKFwsmMAKDh5.hvygNDxbek+L.Q
7r3IBcwdQKe360fA35JgtW5j45Jc2gWhbcktWSDHwESD1DfIH00UpudxlN5j
C7WuHmbP+mbH+me.+ie39mdv9lLy1y37jTfGO8fC6Ryem4.pW++3YGrYVaNF
89TwljGSEhs4h7y99FZ1oCeauseKAMi88dfUHV.DFo3tUXHlhBM2otgP5VIh
cNvlIEPBX.jdnAgHZ.wbWDBSv56.mLMzw0B.Y1U.vh.A16TMAUqU2oEmkTef
7zNmpbQYdQdY6WlvJLqc7Gj4Ikw6Rq23.Pu5NVVaIUp58ziV1euhPSalSWKu
VkwudXa7kT.58p0fy0oYZqHdKbwLDz.WTKegnQA39vs2AGorT++E7Lu2EmU4
8N99zM4hcG+TZbQY2RDinucTGzWiG0pmin.RTj8tyvxoZ7PhZQziUef1gT6c
1mzrnvqoo3saUczC2XsNv.1fFYRakqaqqtfK36OcpzHknawN.FEYuS8TPn9y
cSRuow.APpYIiXj.J0NMDy9nHgsyq2rBnDH1LBDCqln07ABBOQPqTR5VM0zW
TQpAZ.GLjhrdiQ..LndxShCweVIRUQz9Mc3tK4XzQxrm7P2KSjrrO9kv1ODp
qz+Drryktb6.lOVC9VGZZsuynp+S.rReOg.tyFa+pLuxhREWFw5ZvqCCEPfS
sdn2mC8ToKt8Hdchc0D6476tPrKZjNi1xAu67o8lLGWeuC2zhcOU.GrtL5rN
0CnvLlMkVV1CG8+HNva9uLvvR0Pw7gzv.aRmVAjoaBNCBXY9GytZI7nbQZ8M
wyjD9e+T70Kf.jJ2kMjGEiQVeGDDBwSu.9Kkb9MHgF1Cq+iMumtTkoW39ctP
j+wqV5hT4to1HGvHnI9FQwnjyq.ryW4JROH.HZp.vg8a3k+hpPxqV70lh.i5
mnR8QLHA1MZ1jHfk1BJzR3keSmqS1Fp7pvfvPLq1JlVWTlRevtc6Y6qPNxGn
8C+yC+KP1Yi.q
-----------end_max5_patcher-----------

I’m sad to tell you it is not completely working as a Schmitt trigger - during the DMZ you are lacking values, hence not outputting! So if one of your list components is in the DMZ state, all the others are paralysed. Try it: make all inputs high, then put input 2 at a value between the two thresholds, then try to make input 1 low - it won’t budge. If you print the output of the p listSchmidt you will see you get no update, because inside you do not receive 4 values…


I see. It is possible to break it.

It seems like this should be possible without having to go to js at the current data output rate (which I worry would jam the scheduler, since it’s always in the low priority thread).

I guess it’s trivial with a finite/known list length as you can just unpack -> schmitt -> pack all the bits, but having something that doesn’t care about list length would be good.

(what is DMZ?)

actually it doesn’t work. I mean, the only reason to have a Schmitt trigger is the hysteresis. Otherwise just do a simple comparison and you’re sorted!

I meant that there is a special status between the 2 thresholds…
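For what it’s worth, the per-component fix can be sketched outside Max. Here’s a minimal Python sketch (function name and threshold values are mine, not from the patch) of a length-agnostic, per-element Schmitt trigger: each list component keeps its own state, so one component sitting in the DMZ between the thresholds can’t paralyse the others:

```python
# Per-element Schmitt trigger with hysteresis. Each component has its own
# state, so a value in the "DMZ" (between off_thresh and on_thresh) simply
# holds that component's last state without blocking the rest of the list.

def make_list_schmitt(on_thresh=0.7, off_thresh=0.3):
    state = []  # one boolean per component, grown lazily to the list length

    def step(values):
        while len(state) < len(values):
            state.append(False)
        out = []
        for i, v in enumerate(values):
            if state[i]:
                if v < off_thresh:   # only fall once below the low threshold
                    state[i] = False
            else:
                if v >= on_thresh:   # only rise once at/above the high threshold
                    state[i] = True
            out.append(1 if state[i] else 0)
        return out

    return step
```

Trying the failure case from above: drive all inputs high, park input 2 in the DMZ, then pull input 1 low - input 1 still updates, because its state is independent.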


Back to nmf-y questions (for @weefuzzy and/or @groma (or @tremblap too)).

So I’m building a thing that will train off an arbitrary amount of hits (per classifier), and I’m wondering if there’s a (computational) difference between these two approaches:

1a. Run fluid.buftransientslice~ on a buffer and get the transients.
1b. Take 30ms chunks of audio from each transient point and fluid.bufcompose~ them all into a single buffer.
1c. Once this process is done, run fluid.bufnmf~ on each slice of the buffer separately.
1d. Sum all of those dicts together into another buffer.
1e. Add the summed total to a single seed dict for fluid.nmfmatch~.

2a. Run fluid.buftransientslice~ on a buffer and get the transients.
2b. Take 30ms chunks of audio from each transient point and fluid.bufcompose~ them all into a single buffer.
2c. Once this process is done, run fluid.bufnmf~ on the entire contents of that concatenated buffer (which only contains 30ms “transients”)
2d. Add the summed total to a single seed dict for fluid.nmfmatch~.

Basically, does it matter if I analyze each transient separately and then sum their dicts, or if I just analyze all the transients at the same time to make a dict? (the latter is just a smaller/tidier patch)

Well, the short answer is that I’m not sure. I had wondered whether the longer decomposition in (2) would be much slower, but I think NMF scales linearly with the FFT size & number of frames (but quadratically with the rank). I’d be inclined to just try a small example and see (the scaling will definitely be different though).
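In that “try a small example” spirit, here’s a toy numpy sketch (plain Lee & Seung multiplicative updates standing in for fluid.bufnmf~; all names and sizes are mine). It decomposes two toy segments separately at rank 1, then together at rank 2 - both routes can recover the templates, but since NMF starts from random they are not mathematically equivalent:

```python
import numpy as np

def nmf(V, rank, iters=500, seed=0):
    """Plain multiplicative-update NMF (Euclidean cost), a rough
    stand-in for what fluid.bufnmf~ does internally."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + 1e-9  # dictionary (templates)
    H = rng.random((rank, V.shape[1])) + 1e-9  # activations
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

# Toy "spectrograms": each segment is one spectral template times activations.
rng = np.random.default_rng(1)
t1, t2 = rng.random(16), rng.random(16)
seg1 = np.outer(t1, rng.random(10) + 0.1)
seg2 = np.outer(t2, rng.random(10) + 0.1)

# Approach 1: rank-1 decomposition of each segment separately.
W1, _ = nmf(seg1, 1)
W2, _ = nmf(seg2, 1)

# Approach 2: rank-2 decomposition of the concatenated segments.
Wb, Hb = nmf(np.hstack([seg1, seg2]), 2)
```

On this clean toy data both approaches land close to the true templates, but the random initialisation means the results differ run to run and route to route.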

That said: I’m not sure what useful work 1b is doing. Can you not BufNMF the segments directly out of the original buffer?


Yeah I wrote that down wrong, duh.

The correct “1”:

1a. Run fluid.buftransientslice~ on a buffer and get the transients.
1b. Process 30ms chunks using fluid.bufnmf~ to get a @filterbuf output.
1c. Concatenate each @filterbuf buffer into a long buffer.
1d. Sum each @filterbuf back into a single buffer.

In reading/writing it out now (I was in the middle of patching, and my brain was pointing in different directions when I wrote this post) I can see that I can just keep summing the output of fluid.bufnmf~ “in place” on the combined buffer.

So there’s no need for an actual/extra “summing” step. I can just stack the outputs as I go.

A further thing to try is not summing the dicts into a new buffer, but to leave each dict in place and run (after the first time) with @filterupdate 2: this will use the dict as a seed but also update it as it adapts to the new input.
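A rough numpy analogy of that seed-and-update idea (multiplicative-update NMF standing in for fluid.bufnmf~; names, shapes, and iteration counts are mine): the dictionary from each training pass becomes the seed for the next hit, so it keeps adapting rather than being summed afterwards:

```python
import numpy as np

def nmf_update(V, W, iters=150, seed=0):
    """One training pass: W arrives pre-seeded (loosely analogous to
    seeding the dict and letting it update) and keeps adapting;
    the activations H are re-initialised per segment."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[1], V.shape[1])) + 1e-9
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W

rng = np.random.default_rng(0)
template = rng.random(16) + 0.05   # the "true" spectrum of one drum hit
W = rng.random((16, 1)) + 1e-9     # start from a random rank-1 dict

# Each new candidate hit re-uses the adapted dict as its seed.
for _ in range(5):
    hit = np.outer(template, rng.random(8) + 0.1)  # one 8-frame toy "hit"
    W = nmf_update(hit, W)
```

After a handful of hits the seeded dictionary has converged onto the shared template, without any explicit summing step.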


In this case I’ll have an arbitrary amount of dicts, as I’m trying to feed it longer training data (i.e. a recording with 10-20 hits at different dynamics), so I’ll have a whole bunch of dicts. Or do you mean having a dict-per-channel, and then feeding that one as a seed?

Wouldn’t it be @filterupdate 1 that allows it to update? (I thought 2 was the fixed seed one).

So if I follow you correctly, you’re suggesting:

  1. Doing the whole fluid.transientslice~ -> fluid.bufnmf~ to create a dict out of each individual transient.
  2. Use fluid.bufcompose~ to stack all those together into a multi-channel buffer (one channel per dict).
  3. Running fluid.bufnmf~ with @filterupdate 1 on ALL the transients (should I window them? should it just be the transients? or should it be on the “raw” audio).
  4. Taking the output of that fluid.bufnmf~ instead of a “summed” version of all the dicts from step #2.

Is that right?
A lot of subquestions in #3 there.

Uh, yeah. Probably safe to say you know the interface better than I by now :wink:

I’d perhaps got confused by what you’re trying to do. I thought you were trying to generate a single dictionary entry from an ensemble of different candidate hits. If each transient in your original is meant to map to a distinct dictionary entry, then the filterupdate thing is a red herring.

Hehe, I only really just learned what it was with @tremblap referring to it in the CV splitter thread.

I’m trying to step up the process from earlier in the thread.

So I have 4 distinct sounds I’m trying to train it on. Earlier in the thread each one was being trained on individual example hits (from your suggestion to not wash out too much detail). Now I want to try stepping it back up so each classifier/dict is being trained off multiple candidate hits.

So 10-20 hits on drumA, turning into the best representative dict for drumA. (then rinse/repeating for drumB-D)

So with that in mind, should I not do the filterupdate 1 thing, and instead just sum them in place?

Ok, in that case the @filterupdate 1 thing is worth trying, yes.

Meanwhile, Discourse just told me off for replying to you too much:

Have you considered replying to other people in the discussion, too? A great discussion involves many voices and perspectives.

Uh, thanks.


lol, it complains at me for everything I make too, since they are all “similar”.

So with that being said, back to these questions from step #3:

  1. Running fluid.bufnmf~ with @filterupdate 1 on ALL the transients (should I window them? should it just be the transients? or should it be on the “raw” audio).

I would try running it on each identified segment from transientslice~ + 30ms, one by one with @filterupdate 1. Not sure what windowing would entail - a fade? I wouldn’t, given that they won’t have fades IRL.

Do you mean taking each single dict (say 15 of them), and then running fluid.bufnmf~ with @filterupdate 1 on a buffer of concatenated transients 15 times? (and THEN summing all that together)

Yeah, I meant a fade, since an audio file of 30ms chunks of audio will have discontinuities not present in the originals (hence the question of whether it should happen on the unsliced audio or on a buffer of concatenated transients).

Hi guys, sorry I haven’t been following this conversation and it’s gotten a bit long. So I’m not sure I get the questions right, but here are some random bits:

  • NMF is a learning process, it starts from random, so there is no notion of mathematical equivalence, but it is possible to arrive at similar results in different ways.
  • The dictionary elements will be normalized (each one with respect to itself) for each iteration of the training. NMFMatch also trains and does that. The activations will have to compensate for that, so they will represent the actual scales of each detected component corresponding to the input spectra. The dictionary elements only care about the relative scale of each frequency bin.
  • Given this behavior, I expect the easiest thing would be to process mixtures and sequences. Rank-1 NMF has also been used, but I don’t think it should be thought of as equivalent. While the result of NMF looks linear, the learning process is not, and as I said it starts from random. I would not normalize or average the dictionaries outside the algorithm.
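The normalization point above can be sketched in a few numpy lines (the W/H names and sizes are mine): rescaling each dictionary column to unit norm while the activations absorb the scale leaves the reconstruction untouched, which is why the dictionary only cares about the relative scale of each bin and the activations carry the actual levels:

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.random((16, 4))   # dictionary: 4 spectral templates (columns)
H = rng.random((4, 32))   # activations: one row per template

# Normalize each template with respect to itself...
norms = np.linalg.norm(W, axis=0)
Wn = W / norms
# ...and let the activations compensate for the removed scale:
Hn = H * norms[:, None]

# The model is unchanged: same reconstruction, rescaled parts.
assert np.allclose(W @ H, Wn @ Hn)
```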

Hope it helps!


I can see how using rank 1 can be practical for training it in a supervised setting. However, always keep in mind that the number of training examples also makes a huge difference. Every time you do the “same” gesture, there will be small differences. If you tell the algorithm there is only one, it will have to “compress” all the examples into one. Same when you tell it there are 4, but then it will have to work harder because it may have to compress also mixtures of those…
On the other hand, skipping silence or unwanted sounds is generally good.


Hmm, some monkey wrenches in there.

So initially I was running bufnmf~ on a chunk of audio that included multiples of the “same” hit. After some discussion on here this got reduced to using buftransientslice~ to find the peak in an audio clip with one transient, and only running bufnmf~ on the transient + 30ms (also testing +50ms), to get a more accurate representation of the transient part of the sound, to match against.

This led to some somewhat working matching. Not perfect, but at least in the ballpark.

So now I’m trying to broaden the training for the classifiers by giving it multiple examples of each individual hit (still by finding the transients via buftransientslice~).

Where this has diverged is that I wasn’t sure whether to take the @rank 1 dicts created from each hit and sum them (which you’re saying I shouldn’t do outside of the algorithm, because the scaling will be off?). With @weefuzzy (and @tremblap a while back) suggesting that I then retrain it with @filterupdate 1 enabled.

So with all of that being said, in order to best train robust classifiers, should I run it on “raw” audio (lots of different attacks, with decays/space between) or just on the transient + 30ms?
If so, how should I go about summing these if I have multiple versions of each classifier? (or did you mean something different by average)
Do you think that running @filterupdate 1 on the source audio again will be useful?
If so, should it be on just the transients or the “raw” audio again?

edit: so when using fluid.bufcompose~ to “sum” a bunch of dicts, I end up with dicts that have values > 1 in them, yes? (as in, the waveform~ display will appear to be super clipped)
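That’s what I’d expect. A quick numpy sketch (toy stand-ins for the dicts, not actual fluid.bufcompose~ output): summing several dicts whose peaks sit at 1 pushes bins over 1, which is why the waveform~ display looks clipped - though since only the relative bin scales matter, a rescale tidies the display without changing the template’s shape:

```python
import numpy as np

rng = np.random.default_rng(2)
# Three toy rank-1 dicts for the "same" hit, each scaled so its peak is 1.
dicts = [d / d.max() for d in rng.random((3, 16))]

summed = np.sum(dicts, axis=0)   # bufcompose-style summing of the dicts
# Peaks stack, so the summed dict exceeds 1 and displays as "clipped".

# Only the display suffers: rescaling keeps the relative bin scales intact.
rescaled = summed / summed.max()
```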

I’ll try to find the patronise flag in the pref and lower it to ‘kind daddy’ instead of ‘self-righteous uncle’ :wink:

update: sorted to new values:
Number of posts a user has to make in a row in a topic before being reminded about too many sequential replies = 20 instead of 2
Number of posts a user has to make to the same person in the same topic before being warned. = 20 instead of 3

Is that libertarian enough?
