Training for real-time NMF (fluid.nmfmatch~)

The magnitude spectrum isn’t the only alternative to the time domain for representing a signal, but because it’s versatile and often good enough, it tends to get reached for first, and so gets used a lot, and so is well understood, and so gets used even more. Etc. etc.

As you probably know, with STFTs there’s always a trade-off between temporal acuity and frequency resolution. In the specific case of the STFT, this trade-off is applied uniformly, so that there’s an even division of frequencies and the same granularity of time at every frequency. But there are representations that don’t do this: for instance, we can have a finer frequency grid with coarser time at LF, and vice versa at the top. The upside is that it’s easier to match these to aspects of our hearing; the downside is that these representations are harder to turn back into audio that doesn’t suck, and harder to understand and process.
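
To make that uniform trade-off concrete, here’s a minimal sketch (mine, not from anyone’s patch) of how FFT size trades frequency-bin width against frame duration at 44.1 kHz:

```python
# Time/frequency trade-off of a uniform STFT at 44.1 kHz: halving the bin
# width doubles the frame duration, and vice versa.
SR = 44100

for n_fft in (64, 512, 4096):
    bin_width_hz = SR / n_fft     # spacing of the uniform frequency grid
    frame_ms = 1000 * n_fft / SR  # duration of one analysis frame
    print(f"N={n_fft:5d}: {bin_width_hz:7.1f} Hz/bin, {frame_ms:6.2f} ms/frame")
```

A 64-sample frame localises events to within about 1.5 ms but smears everything below ~689 Hz into one bin; a 4096-sample frame does the opposite.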

That doesn’t matter so much if what you’re making is a classifier: indeed, for things whose results you don’t need to hear directly, you can get pretty baroque with the model you use to classify things. For instance, by using quite abstract statistical models on top of something else, or some other aspect of the signal, like trying to find patterns in how the spectrum changes over time without being interested in particular frequencies per se (e.g. the ‘modulation spectrum’).
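
For the curious, here’s a toy illustration (my own sketch, not a rigorous definition) of that modulation-spectrum idea: FFT each band of a magnitude spectrogram along *time*, so you see how fast each band’s energy fluctuates rather than where the energy sits:

```python
import numpy as np

def modulation_spectrum(mag_spec):
    """mag_spec: (n_bins, n_frames) magnitude spectrogram.
    Returns (n_bins, n_frames//2 + 1): per-band spectrum of energy fluctuation."""
    # Remove each band's mean so the DC term doesn't swamp the modulations
    fluct = mag_spec - mag_spec.mean(axis=1, keepdims=True)
    return np.abs(np.fft.rfft(fluct, axis=1))
```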

It could be that you’re washing out too much detail in that case. To keep it simple, you could try just making a single dictionary entry from a single hit, and combining a bunch of these into a dictionary. It would be interesting to see if the performance changes drastically, one way or the other – and which things it struggles to discriminate on: those might give us a clue for a better approach.
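
As a sketch of that “one dictionary entry per hit” idea (my own toy version, not how fluid.nmfmatch~ trains): fit a rank-1 NMF on each hit’s magnitude spectrogram and stack the resulting bases into one dictionary:

```python
import numpy as np
from sklearn.decomposition import NMF

def basis_for_hit(mag_spec):
    """mag_spec: (n_bins, n_frames), nonnegative. Returns one spectral template."""
    model = NMF(n_components=1, init='random', max_iter=500, random_state=0)
    model.fit(mag_spec.T)            # sklearn factorises (samples, features)
    return model.components_[0]      # shape (n_bins,)

# Stand-in spectrograms for ten separate hits of the same sound
hits = [np.random.rand(513, 20) for _ in range(10)]
dictionary = np.stack([basis_for_hit(h) for h in hits])   # (10, 513)
```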

Turns out I have the binaries already. However, opening the patch in PD vanilla, it seems like there’s a whole bunch of old pd-extended goodies the example wants. I’ll try and get it working. It also turns out he was training on 32-point FFTs, rather than raw audio (so it may be no better).

Rebecca Fiebrink’s Kadenze course is great, I think. It’s relatively gentle, but gives a useful general glimpse of what the basics of ML can do. There’s also Parag Mital’s creative applications of deep learning course, which is closer to the sorts of technique that get @groma out of bed in the morning.


Hi, it is certainly possible to get a similar workflow to the one in the diagram using NMF, and we also have an onset detection object in the pipeline that could help.
In the end, the main trade-off is in the training. We expect musicians will want to train with their own stuff, and you can do it with the NMF object. The Sensory Percussion stuff likely works on assumptions about the types of sounds you will feed it. If we wanted to focus on traditional instruments like piano or drum kit, this would allow a comprehensive training stage that delivers accurate results and saves time for the musician, but at the expense of limiting the possibilities to those sounds. For the moment I am more curious about how far you can go by training the NMF with your own sounds, which can be much more personal than general drum sounds / gestures…


@weefuzzy, awesome, thanks for the detailed response.

So if I understand you correctly, something like @a.harker’s framelib could facilitate things here, with the multi-resolution FFT stuff?

By this do you mean combining multiple hits from a single sound into an aggregate dict of those sounds? (So in my example, creating 10ish dicts of a “center snare” hit, then combining those into a dictionary of “center snare”, which is then processed into multiple classifiers for the real-time NMF matching?)

Interesting…
Is that a toolbox 1 thing? (Rather, would I be able to have access to this for my part of the project?)
Will there be a real-time variant or is it primarily for off-line segmentation?

I would think so. And that’s another one of the limitations that would be worth pushing beyond, in terms of @tremblap’s “why open the black box” question.

I’ve not tried ‘mistraining’ it to see what kind of limits it has, but it does seem to be built around 10 specific sounds per drum. That being said, it is designed to work on multiple drum types (snares, toms, kick) as well as all sorts of drum heads (including silent “mesh” heads). So that covers a pretty wide range of timbres, but almost all have a similar, if not identical, envelope (near-instant attack with pretty fast decay).

It also has a nifty feature which lets you train a “rejection” sound that it will purposefully not match. In the drumset context it would be something like the kick sound for the snare or vice versa, though they show using a djembe or conga in one of the vids (i.e. something not from a drumset).

When I created the recordings for this, I also did some more of my ‘prepared snare’ type sounds, which have a wider timbral and dynamic range, to test with. I haven’t run those files through the process yet. From what @tremblap is saying, the (currently available) approach with real-time NMF isn’t suited to fast (onset-based) matching.

Also, in my case there are more “traditional” sounds in the (analytical) path, with the more unique/personal sounds coming as an additional layer on top of the acoustic/traditional sounds. So for (most of) my personal use case(s), I’m less interested in finding onsets(/whatever) in noisy synth sounds, as I would be triggering/generating those sounds based on the acoustic sounds being fed into the system.

Indeed. Now I can get it.

This is promising for me: you want low latency, so exploring short STFTs and classes (types of attacks) is exciting. You also have the tools to do it automagically: you can load your sources, split them with transientslice, take the first few ms to get ranks… how flexible is that toolbox? :wink:

That is very generous of you, @weefuzzy, although I reckon it could be a good road test for a staple object in the refactor?

that will definitely be part of the second toolbox - and that training is good for explaining ML, as my teacher told me :wink:

that is part of the goodies I spoke of in the other post :wink:

the problem is that it is difficult to be both versatile and optimised. To use my tractor vs Ferrari metaphor: both have 4 wheels, but one is optimised for speed-on-very-flat-surface and the other for all-terrain strength. A fast, all-terrain, strong vehicle would never match either of the optimised scenarios… hence proposing a toolbox that allows you to tailor to the task – and misuse and abuse and all. Fun stuff. And cool discussions on forums!

What I say is that you should try to do things with it – not just try to replace an optimised black box. Or you can try that, but then you have to spend time understanding the limits of the algorithm and the underlying signal representation. @weefuzzy’s ideas above are good avenues. Why not a 64-sample FFT, on attacks only, around each transient? The piano example I built was from such bizarre mistaken assumptions, and it was fun to do…
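
A sketch of that “short FFT around each transient” idea (names and workflow are my own illustration, not a FluCoMa call): grab just 64 samples after each detected onset, window them, and keep the magnitude spectrum as the training material:

```python
import numpy as np

def attack_spectra(x, onset_samples, n_fft=64):
    """x: mono signal; onset_samples: detected transient positions (samples)."""
    win = np.hanning(n_fft)
    frames = []
    for o in onset_samples:
        seg = x[o:o + n_fft]                      # first ~1.5 ms at 44.1 kHz
        if len(seg) == n_fft:
            frames.append(np.abs(np.fft.rfft(seg * win)))
    return np.stack(frames)                       # (n_onsets, n_fft // 2 + 1)
```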

And keep sending questions - it is good to see what you get and what you don’t (same with me) so we can try to build a good KE platform.

I’ll try some more classification experiments, but it’s a dead end (for now) as the matching is what would make this musically useful for me with this specific idea, and that’s not part of the toolbox yet.

As I said, I’ll keep playing with things as I still don’t know what approach I will take. This was just one potential avenue of exploration.

Definitely, though everything is built from a perspective, so it will be more optimised for some workflows than others. A wise person (it was you) once told me there’s no such thing as a general purpose tool.

Actually, it is not dead yet – not until you try with short FFT sizes and short training (around transients only, which is what you want). Try that and see what happens…

not really - that is the problem here - to keep all perspectives open, it is still too technical and neutral, so we have to explain a lot. But keep up the good fight and you might be able to steer the direction of higher-level objects, who knows :wink:

I just mean that the matching part will take >100ms, even if the classification is perfect.

Heh, that’s part of what I’m aiming for. And as far as perspectives, look at the discussion of the buff.compose object to see how “not neutral” that one is! :wink:

Just mainly pointing out that real-time/fast isn’t part of what is being considered neutral at the moment.

What I meant is that it might not be! Try a short FFT and see :wink:

these are my obsessions :wink: Let’s see how it goes!

There was a multi-resolution tutorial made for the most recent release of FrameLib, but it didn’t quite fit into the flow of the other tutorials so it was removed. I can send you some examples of it working in various scenarios if it would be useful in helping you tease out this topic and do technical tests.

In regard to everything else, it seems like you’re trying to get bonk~-like behaviour with classified onsets; however, using something like that doesn’t lead you anywhere new or give you the experience of building it from a set of general tools. Perhaps what is going on with bonk~ can inform your further investigation :slight_smile:

I never did get good results out of bonk~ (this was years ago though), but yeah, the idea is in that ballpark, with it returning the nearest match “immediately”.

Ideally there would be some richer detail than just the nearest match, but it’s a start.

Will look through Miller’s original paper in detail.

Also, it would be great to see the equivalent of what sigmund~ would be, but for bonk~.

Do you mean what the spectral analysis is doing?

I mean fiddle~ got updated/upgraded into sigmund~. As far as I know bonk~ was left in the dust.

Not sure if you’ve already covered this ground, @rodrigo.constanzo, but I did some experiments with detecting templates.

The fastest I can get it to is about 80ms… (too slow) but perhaps it will inform you on where to push next. The patch is messy, but hopefully you can find your way around it with the basic comment instructions.

https://cp.sync.com/dl/121b50c80/txyitvam-ektcznfx-3p4i8bse-8gj49s96

EDIT:

I found that my crude benchmarking technique was flawed and reporting really late times. It seems you can get a match within 4ms, and often faster. The patch at that link should update automatically.

Also, to get anything really useful out of this, you would also need to store all the templates that you don’t want to match, as putting in a click~, for example, might match and give a false onset for that thing. Additional gating and tweaking would also help to filter out ‘nonsets’.
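
A sketch of that rejection-template idea (my own toy, using a crude projection score rather than real NMF activations): match against wanted and rejection templates together, and only fire when the winner is a wanted class and clears a threshold:

```python
import numpy as np

def classify(frame, wanted, rejected, threshold=0.5):
    """frame: (n_bins,) magnitude spectrum; wanted/rejected: (n, n_bins) templates."""
    templates = np.vstack([wanted, rejected])
    # Crude per-template score: projection normalised by template energy
    scores = templates @ frame / (np.linalg.norm(templates, axis=1) + 1e-9)
    best = int(np.argmax(scores))
    if best < len(wanted) and scores[best] > threshold:
        return best          # index of the matched wanted class
    return None              # best match was a rejection template, or too weak
```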

MORE THOUGHTS:

Perhaps taking some sort of ratio between the various outputs of nmfmatch~ would give you a better way of discerning whether there was a definitive onset from one of the stored filters. A confidence value would also help – although I wouldn’t know where to start in calculating one.
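
One plausible starting point for such a confidence value (an assumption on my part, not an established recipe): the ratio of the best activation to the runner-up.

```python
import numpy as np

def confidence(activations):
    a = np.sort(np.asarray(activations, dtype=float))[::-1]
    return a[0] / (a[1] + 1e-9)   # ~1.0 = ambiguous, >> 1.0 = decisive
```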


Interesting. I’ll cut up some audio files and give this a spin.

What would be useful is to use fluid.buftransientslice~ to detect where the onset is and then automatically analyze from there. (did you arrive at 30ms from testing or did you just pick it arbitrarily?)

Will also try training it off a few of each and then seeing how robust it is.

I guess a critical thing would be how it treats other types of sounds (or does it report anything that matches that filter, whether or not it is onset-y?).

I guess I’ll see when I cut up some audio to test, but is it not just spamming out what it thinks it’s found? (As in, unless your two sounds are very different, from a quick look at the patch it seems it would always report matching whichever rank is numerically higher, via maximum 0.)

That’s specifically where I struggled with my last version, in that I couldn’t get it to tell me the ‘now’ of an onset being detected, rather than a perpetual stream of what the input is most like.

Ok, I will go back and make some fresh recordings for this, as all the training I did before was done in the manner they use for the Sensory Percussion stuff (i.e. lots of hits back to back with no silence between them).

Will report back.


Ok, recorded some new audio files, updated my batch analysis patch (below) and tested.

It looks like there was significantly more variety in the dicts before (when training off multiple hits, and with multiple dynamics), but that could have more to do with how different the sustain is on different attacks, whereas with short windows of analysis (30ms to 50ms), snare hits more or less “sound the same”.

Performance-wise, I still can’t get it to respond “quickly”, as the stream of stuff coming out of fluid.nmfmatch~ is kind of meaningless without some kind of onset detection function attached (and the number stream would have to be delayed to compensate for that).
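
One way that gating-plus-delay could look (a hypothetical sketch, not anything in the toolbox): ring-buffer the activation frames and, when the onset detector fires, read back the frame that lines up with the onset time plus the detector’s latency:

```python
import collections
import numpy as np

class GatedMatcher:
    def __init__(self, latency_frames=3, history=32):
        self.buf = collections.deque(maxlen=history)   # recent activation frames
        self.latency = latency_frames                  # onset-detector lag, in frames

    def push(self, activations):
        """Call once per analysis frame with the matcher's activation vector."""
        self.buf.append(np.asarray(activations))

    def on_onset(self):
        """Call when the onset detector fires; returns the winning template index."""
        if len(self.buf) > self.latency:
            frame = self.buf[-1 - self.latency]        # step back past the lag
            return int(np.argmax(frame))
        return None
```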

Here’s the updated batch analysis and matching patch:


----------begin_max5_patcher----------
6992.3oc6cs9jiZjj+yy7WAgh6h3dzil5IE3ubi2crW+g025vyd2FWXuwDzR
nV3AA5.zzyrN14u8qd.HdT.EHPR2F1NltUSAhr9UYkYVYkYV+5KewpGi+je5
Jqux5mrdwK90W9hWHuj3BuH+uewpCdeZSnWp71VE4+b7i+xpGTMk4+oL4k2E
dJX65nC6N3ksY+WrdC+gR7h9fEDX8l8wGSC9a9VXj0adNHR9Yah0a1sKq7yA
Y9IdYAwQoxGYWPH+BOdZm01jSGRe01fMYoEu03SYg9YYe9nuhxWEFjxIi+Zd
yGEjPPzSuOweSl5NbXr0fGrHT20.Fj.gPLxAhnOXQcfhVPn0fxufnSGBh3uA
YOFlewfsxNJuy+JHZ046TQLpaUbw+9Keo3GOXHdtKLl+kT788QujHuC9Jj9z
gG8S5rOu5AqUO5E8Tu8aDwQz6rIqcnLHiwvDLSzqAs5z7tbvANkveNQikjZw
EofyWVK.sKNgO3KthcaHqDwN5kv6f7w126G48Xne0ug5nIJ+podeze668xxR
Bd7Tl+4OklCq43p.8BO4Guq3xEWu5Kki1OoGgqcabrHKvKrMMV6tNvgmFPit
uE0PAtDpqcOo6iSxLfhJF4ApljbYE+ZjrbbrKKNpSFqAYp3SbDbOTjhIhT7S
ym+f6ia.LaysliYPLaaYmkhk+BLdgEvw2YQSoypWvbbjunOZA6e71HnPMtSb
nq4hPro8CEn1PAa0R2a25GxGilNqMiqff22HDk5B3n6hNymlgN5hYVOx++.y
FN4+NHpeUiLjZTEvD8bWx3YvAcnNDOec5t6tC18brkinpegjCuiZD0cwGQ+a
gVJyc3Jd6UXUucTaaIOKlKIVnfmMZVWH.r3yOysmnuNJ2fHtUDFv3VzgYNhN
LiL5NL0Y95uahObvmSvM6vuhaqxAKuSaChECx9oVYwVdVo7tRnu09fLq3cV9
da1Ksy8mid0oTeKtYuYbimSC3eiogAa7EOztfnsVbyi8yrxBN3yuUuHuvOyM
eNauOu0jzLtMFGR+4HdSbiixr1FramE2FaK9yjwegoV+K1j+0BRLLHxeS7oH
IcR6yz4bMeRKIQXInyFi5dhdXFLivLbs0uSP5V+4DOtUWBIeEFkFIWigbbkV
Z.q3xkFc8c9gezOKXi2ptQArctUNB6nwtTGFAfvDB+ZPoMAXXEDQ70uyaSSi
a6Eknf4Cl5X12QqGE8reHIdieZ52FGtc.qi1D56kDF+buyCUPCWKvZhMDh.1
.Jw1QrJCDd.EnXMLKcHuEUk.3jcCS+Eys9neRJeAjUsVek2wiUtb8UEbv6Wh
keQNk1ayoM0kNaNeh+GCJddT4U8R3vPFGCNknXh9jM4rY6Gh4HazofREExAu
bRRNLIX9ROlyhHGMKZtxh2bjy7fN4.r5OcrAmgSNivSgwa9f+1pD8p3i97IA
GS7S4SVjqptVya824cJL68UmbHssSS6ErwZarbJzWmHVmSws7TRv13HAQTar
Pb4hW2OYAk1OJ944Ni7Nh7Np4g47FbfoiFS4cxSoO5kHFpxsrtXrZUVbbX8l
JetP+cY4MeLHJpAJlEer6FSBdZeOO6iw7FOz22srkz2eJR0564bEYuWrV252
mWXX9r45e8exiubcuLegt.Y2ET1nZ0E6S2jDGFVq+pZ4iZZYKmKei+yAay1W
yC.hV32dvwBlnUkixaCdxOMq90x7dJs9URy9rBzqboSOlOK98Y9GNFx6E0ug
ZNkp5T1px8pc8dUSnySUbUrQG18kI3aJf0aTN1ppSpD8ipusZJWQUZPmcLpk
VluPSjqqTcBtx7htTezxBN1p5OQS0HU7bPg5jo.u0TuzPEi2lOXEXc17Eqcb
XXqUPjUZ0Gns5lZ8Vc1gnDChnNqQtDHxg.YtJ8LPZK8LswLpdLyA1MlAuRXV
B2TWquIIINYkQX.lqq0kJPAGWBEK8ioMZXHnC1F1smsIkutcbu7GEKmcP9Dw
zIgsGP7ZaLGj37HHaHC53vANxvfDpCPxtaPBcc.om7yD99zJ06vwAlL8PiEL
N.XAscV6xrILfKggro1BNJgEcSlkBzMZguNnEWr7N+juXIW.E2.gjr9wrx0c
VyIQcBbtJ2eoiKCBnSG3H2b1LgCl1YzbQEjMDTwWujfGC6hViX.pqK1lhx2G
h19iwXjx18liTRkcGUqlRsOUFysY5LSl6ZapiCBxYpboL.Fx4uTqHueTC2Ap
At4p6jhu3FGm9kdQnzfmhDqovT1rb.CSXq4KCkQoXFD3J4xXtSd9n8se93Qe
+OLVwXlBVZLj.B.Sm6hdy4txrf8hOU8p3PlPnwTSzz0IZaeyAmx09T28h7kA
s0+Qtc5a7sHDHXXg+lZWAhrFC4FVf3BvbgDfMR.dNSGCYWGLruEPdP4CWkyW
GvfcWrixG0RO2nDaClP+l5dcLQWRHWlhKWk0OXjZkJE+r6tLnitbeC0UZoXn
p2glfns9e5JZ74AtBNodt4UGWN1pcZEiM4YUP70QIWs.O3xktPH5i+.i62j5
yd6b24ugBa13Et4jvCYbF6LuPqsmTdhZpNaBpbuDSfcDNyiCCCx8VuCXxNdh
RuNhm5CoTdQYyojD9cH2qsAjMSrwqkdNgIkQ6BmtzYxsu6q7QoXy.ShO8zd0
lMN.BfULCTFbMWxhiCl43BTq8GdAnA31iFaR7Efg2Q9py3LFxOePLEQrcqa1
6EE4GlJbJoxKASc1DeFjjEB6vTalAdxSgHraOp4scqkHPKEJQObJLKHGotPT
hwrUSzfn011DaFxgBy8wjJjhlFhcGLqqXK3edebneNLYcRr69VwQge1ZWR7A
q+nUbh0OVv1MUTzgJ88KACWSZ6.b1zgw6foqOu2OxZabjuE2LnsBzSiMbiRM
Geoap09STqXKeUsSBgv2ALZBbQJSmKzhO67xlP5nrPWIzxd5xrv2ArN6ie15
fWzmUJ79OLBRDO7ubJMKXWvlxcGeHNJTAGEhhVyZLAzd5y+P2C1N4IB6HNe0
lvX9G1F3EF+zBBkJOLAbVCa56b1zARnysGHS8C4cSKOqcUCtlE.BI1qkB9gN
qcZtRvKvJD3cfntBiNTXYi05ncqXfRrP4jEHX5VtVlWG2vduVg6Z50TnJP7x
MgGM4N8cvrFcKpU2xUftySW9NPhqLl74i04BLZKtPWx7fQpUanrnQw.LI..e
y2veYTrazlLZjGgv.hZIF1f0HDlBALBBpTMCmrewv12AaFaf3+GbKMdn1uFD
t3V.1EbMcmyiQ27M8eqOWxoHfUu38dsC3gRltWVu8QZz7.OES1HNv0h8llZ6
foJaiu.3AwtC16vQLUyXTBau1s01HRASGnruCB7gOXIhNxfH4lbXAsFXSCGD
sHXPW7TPDYxaHMB+OHy4x8au13M5Blzwv2IQ+3Oupd9a7yqFx68X0zKFeZFm
egqQCQwEduFM8YWLzM21nfYI.FjI2tHfGIL.0AdNoLLHjq5JFHuChUVYxOkb
YSlPNHUJiwghlaBFZMapvisiY6XciTXoNRzYprzU5rz3w0kVKcjZKcmdKCmh
KcklK0R0ECR2kFo7B0UsFdoDMrRvF1tgmLzmzKFj3Klj7KCl.LFlDL8jHLlk
LLCjPLClTLClXLCjbLCmfLCljLFjnLljrLiIgY5IoYFLwY5O4Y5OAZ5OIZ5M
QZ5JYZzmPMcjTMFkXM5StllBPZJktU6CIsto9ceNm4VQRn17d5WxcmlBwMQD
y+OHxkq3WUEWDYZHkAIHarKzFH1cQt7cswpWuVBzxSQ3UseRcZ8po4qt1uEF
e2De3wfHeqC7+H0Zs0gTq2bzaqbWzPV.4+Z0K5xYzsxY1FRFGN0zq8szsnwQ
L1JtlXnk4BYDGghZQL1qjXhoCM3R6dvkLvfK51O3dJ5WhChrHqLZL3gNx7iN
QaFXsSSzl.T43Lzk5..L9U6xGZiZlDhzOXSu8fszxRq2npiQV62+UGN7UooC
B8kINhQXtZg+4aGs.rMBXQcCrPxsmKtYXiZXr80OFQ.cGAo8FEo8lyC8aadq
HJs4SVFHov++GrBYv4EWgK.tdMXWUz3DUGgX1RSMpTC4LEK6Q7nMperDL27n
0WmgvdfALJTBOh6qefNM9TxlBVyBtDq1.C2Z3rfnxkn8SmElYAs9qShIXxTH
XrTH3JSgByPMlBErRWcJjLFJrn6bUoPTNqkQTH4lQgnwPgnaAEhGCERtxTHb
LixngmKWowpt0JunTp7+hlJRYSe5Mo09+TX7idg4qnuboTc3dfW1tCbcxuZU
duxMnELioWMjhVqxPXsoMLW06jcvO8NX6yxcIB+WgBuKdY6LzYnRSpu5N8TQ
mhuC1P1G4yOezRVGprJqFUlD1H46TaweJeVskzpgxTFMIfM18BX+5Y+knWW1
O9u2E7oYYeI0V.IbldFogcu47dB+oqBDVYzYMX81vL4Z4vktD3y8BBI.zcPf
JI8mnUXu.kxmiOLhM7VSNXSmdDcgf2GQz0bGjITj8ZDe1mP9jwIweWy8X2bL
RTNf.yS4.BwHqc5zPBhyZ7z2u26AbZdPox3mDRWSsI1tUpBPfIWyjf2dDRJv
Yf3EHe9Vw+F07NFnYpSx.StVjg6Q5D45fWI7WouUtX5gpvTlO2SWQYANY9Jz
Upnrb5fezIiYcp0MD2i5K4u70+2eSsXYRY0knou2aiHatR2a8cu8qd8+UpeR
5qSh2lD7T7qe2mi175eHIVtA+u9aCO86i+duW+CxECl95ez2KTrcEVQG185h
Xlx5sdYdu9sh0i852oJvveW.+gGJr5AcUS4P130NZEOFj4eH2egq.PqWY868
ESyV+r2GEXwChe.P7q+Maexu9Uw7q9iAGR2Gmo8oHUZu8SSUsZ8mCNVuA67F
d293ShXoudqLAIlDmlZ8trfMendiN7Feq2gi9a07Bc4MJeFAAUqIHPzzd+vP
4kGe7.6N8xOP0vcz6TV7w3imJ1t+qVjX8nu+wUlImTWT5QtfpjVeYvwuEoUl
DoUn7BzBJuHpqVqFE8agZ0uEpU+VnVcmFpU7o1odO42YDL71+z+42LSAYkqy
4RVEAewQo.4tONpLHNzG.rfbykr4+GeATbynH8FK5iZygGXe1A2dvaq2luXA
sPF.V40fWUH9RoWLmECe2iN+aeYvIk4UBLSlZp.PsLatTcw1+3fS38eDO94M
g9ewxwAL+vpRfWdoX+hvQvsGGGRcgfMxhZArnfYRqADHX.yAwKetcsZ768Iy
nXOpMd58CiIZsTXIDgkgfDASATm7ThSecLebPK6e3BavbDCgmo3aiQtxwM3U
KlrXjwDOQtZu43jspUVCupA3wnHcR+TN35R4iIP3X36JPezjN7tA0GSv8UzO
utgen6XlLxtETHarX30NDNGiPA3uEzWSeCGkqRdW4gdfv2Gbv7UP9OfVhe9O
us5I1slSu6IFnIkozLydMjaIjK000Neqjv3oWU1gz6mpxN2.hiwo9eYdpPxp
8y0YMrQI2DbAU3kaOXo4L2Px+ELPc+2XHSa85.xWYsyvAJfsdXy89gEq9gdl
3vMq3POCQsOepmI+iY7XO6x3hILm0hDv.APPhSdvNBv5R.PyhBnqy3wvESeS
B0.lKQUd6tzBpOAM9Bpeu0hu14ryUAzFSf1ku6VW.ngGOnMXscq8hAmAfqUR
MoMy5TGcPBdJD..f7oTEgLlIfTWkPXC2N1pfj15b3hvQYDvTTRPlYfwY7.it
hgXuLLmO9Ta4wfyPkNCqq.XsyzilFTa3Jl5c7ZrDgqgDAZAoAGnoHAaAoBwo
OX80T2AU3.WZpvDrv1dIoBrorEKJVXJegXAFKGU.Lk6zdooBS3NYjEjJrMUb
g6RSDPCIBzRQDNFhDDmkDJXlxaBVRpvXYVK47zhu7gohEcFBwPlS6EcJhoxr
rWRMplpPEsjBNolNEAhWRpvTrXI4MolpUmtnCHnlofcW7lsEwdHX6wX9J5SK
NbkjAiAVUV5Qp5pb8+hgUKR0EU9WKY+hXxPLcoQWSXzDIBQigglna9gSsipV
vmGvF09KpppEg4vpzwwh+bI6XFYRJZogWrIRSbGh4knPRwwyjiH8VfT..QHP
hL6WWS01zB0sHFKYXIAWBZDVWRVJhvTMmKoEUXSsylsjJsvlp5DCWZpvDIZP
656J57RElZiKaIktiM0LBotyECKL0Ra3hxWXpcDLzRSEPS8hGrWEBPkVTFg1
7HQPT5OFrMpHprZTIpsY3NZaIwCvrnfDmuqL1hrJtdeFgVK8abOMgbgsR1Nk
Z01srTXgw9uZIs2FYrT7kbsfHSkehVbpvnkjhWPeGgL1FikDJL0jOBcPKY4y
qTEMB607IsPLgddSjopc8QSaKUGyTIP3kzCLH3rIGTbH9kK1Bxk0gKhxjyRy
pb4kr6.mmtCAby6NP2wrWO0WTe6fr71PZZ7lS6PWc9IMSr3BsjdwBZplDr8R
SEF4mkkz5SnoVhiWxUk.wiXCYvKEQ.G4hU6VBksMJ2vSUjbnJ2HM9K.sheEq
+WEdEChUpE6zqXmCq.YJv1QbETFgpG79jcsrMuHws0CQhHSaSbXbx6ChD4Ye
NZII+y+.1HlKd7I4yjGhO42Q9O3Btqcuda13GkU49AqcD2kC1EAsEehAbc.j
72xKaGvuhZx.+qPWF.uRd7nWUPqN1jZfSi.ape3oFYSPhS7IAEiYHJQ9IDew
VvqEo2JOxVlw1cAggkcbcATcQPfs5oDusA0OZMywLX4XsH4HYOn8Svl49u7Q
QEOJSToJnOn8S5ezhmD4BftRdKbNuk3S7KA07jdhJzhLXmXMRe8iIwGiSJKo
BbaSq8bmxhKAfZ4SQ0n.uaVg1yTGOGQyXpZDbyxweBmMFfjehiS792bwIm+c
T7ELTH6mSjcFp95BS+ABQ+lgmuZPQ9SErJemUhcMEMz2QGaqyW5ZgTn1ntUa
3zlqNXMlxfLF6bMPqwwEs1XaqVbsUjun5x05Q1UqKboRknpVwQsm.5Va37o5
TPBdMg4B.k0blVw9t1DlT64qftHEdl5q4G4DxZjKlOsPTXZDcUqm4ruOX8bR
P1EBDN7QcGhiCy1M2eSPWz3wBm4CKpEs5cTVgqk3HlVSg6AFPPlPFMiU5A.H
D1OJng62AoGEPy3Dc.3qf1qQk0HJsYrtsVdb6VGL1C1iJyy5YX97QunxpRH+
VKW02bzOR3R41ppXMj95ZpWaa0NZr8PaN6qOTckUgop+sNM30sQErVjkRLNy
F7rsalbopTXSyAnSXTgq0KqLlhGHl56dLxlHOafNq6H+nYkLBdML9LKcmk7r
4PVqv3fzvfJJJE0Eq.QGsr.pMPgMTGR33nJxFheA3bq.JjQY4GDHnb0NUEmj
5mwQDNsUgUb3Yma7h19Y9OTEJpNXtpxynLmZ3q.qSZkFVU9hiOv60AOFDFj8
4pMTVAu.sGToieLEMe5LTY2TzgcGDCWew5M7GRkrRfIkqSUSwIM5gzZuUKFo
vfzLiXjvv01LGpiiiMK+PFG2J2lFbZEjMepl6RoDds02m2OJZqZocCQKEBVq
Xs8c9gezOKXiWO5xffhy7S0oOlK0gQ.HLQr9bhsxkEUfipkKNnoPDfMep4pV
aPMqtf1CO.D3x00S3bhH9ZTQ7+jnsx6Vqta91e3qE9+EXCJpHku6Tz93Sh0a
aLWCvd7ybwy2L2+8xr1r+jiqOrSbZgwWQCfRssyqCRsOiiGzzVBYwMsUs3sX
twsnd51MJYQ80ycUGUeDFBZWdxxSb62fVrNtfYryyW01mEx9Jxh4UaBCNt2W
TZBUZO3DMmWm3hXtLYVIO.R7f9OI9mPrLWlsWxm6EujQgfpzL2ZVFVErfP.r
IYMhIQX8vWdYdd0VuLuldbPfJc3SUuGSiCOk4y6I6W55D7O58bQsB9a815mp
pvskjhXgnkxwG399PPjBS7NsMHVbkJ2PXb7w5IJGWKSTFWKy6EENyNbjtfzS
yR74clbSdp6Z7b6wkFvlFrYUiMbJ3o.NyBuG7T19BwlMSiO4RuE9INIIOCW0
7hxuoR+N2y6ohQdqjkM35D0+6IuB6pzS0hoO5HgzvSIE0Qy18hBZPjP1w4jI
p4MkdzWs1o18AIgjcJJ253exhPpeCGC3iA8CRxaIcevtNvocwggwOqbdVE5r
qceX5NOadTMKMhWyRdtFpkIrqjV4SGd77xiF9XMQ6JCcJ7iDFaCcfE5fofQ6
XEzUZYDc4FNgjtZUuj+I3z75VNlPIqI.FkqRoj4AAGsS2Pf4S4rvehBgYkJm
UtaqTD+n75VOePt3mZyvJqf.08NhHR2fxCNHwgrzC5uDlV6ZOTU7mrDEDuam
33Em+k5N3fhicc2LgUyyYtcMjP0L+DpeHwtxPh7KqdJYqFgZt4w4iSs2zXPs
PwpqXZoRHo0hYv3WEyj2DnVha0HfNV12MU+aFbQuYWS5z3a1aFtDnsiAuY14
vydxuGSFTcmANWpS0v6nSNWzL7pHtF7pbls2zPnm8b.eDSFmJ53W1axHgY.x
LzovXS6TW5aBQM3Mg.ywaBXReBNGuISAOzkpzwDYQyx7IH4p8l.1l.ejY4Ug
MdFEbVdWPidWWLaAnVtyzitC3r7lFTh9rLVgLTu6k+lLQPAscmRYtZiS8Fwa
owobSiS2l1mpMceZ1z7TrQFjNMhtPUu8747hvURuSdxp79uubU8Mh1ne8kcE
.YENaqLvwTsVDoNMhcK9RLjqsPtaYpn3RrsYzpwOZinLqRbLYKBIO4mDQJFt
1hTZFgYlFmX5iQrtiOrNhMrBun7vK6NfppGLUUY8JGL9imJ1BG8C.0CZQFEp
BcNjK1wU8IJW9XiM2lurvCMiSSlisMS9DN.nii5SDJ.0Z6K0slRFh+RjCjPa
FRMP3..PR8GtynTDUFfnm2DUjbSTq+ETcOt3yq9SG8irdmWTp067OD7n3His
7VMg0rYLMNsHKrUTPVvU5TtzYm18kl7m1TfBAsKicR67uoEg8riPm.HJ.mRR
1kRXpP3jfbUrFUbPulP8EWDYEXRQeoVnULKSH9fJ7B9K6KCOqAmXzHnkmOZo
VTcpmLFcb4ZZ7VOkXsdV5vUhoZSDIYPbTOKjUM+KNeiESUFxniI5IFOzSTh0
0PGW8YGlMJVMFmMf6Zn3Z9h4rT6p7uK7jewABkdpRmzPouSkhwKIP4w0NbAH
vj3miFMEdltnkyMwKDE96+r23IP.hqCRIJigwH0bGDDBwyOA9GR78m.EJQOw
YECVY7iP4+7Sb+O9hswqepqZTA8Sxi7QpK.3zq9BtwgLkjEnCTJ+ixQb5r1A
ja20efaZ1nAW4dOHG9obUZTIkBqJMaVHvDkAEBJzX4N1DaaraN6HK2JINv4h
6QW9hzgTKgriSgzW92e4+GfSzrDE
-----------end_max5_patcher-----------

So is the problem really the latency of onset calculation + matching calculation time?

Well, both really.

There’s room to improve the training and matching parameters, but without knowing how well it’s matching, it’s hard to tell what to do.

At the moment I can see the matching multislider jump around, and it looks like it’s doing stuff. Training it on different-length attacks and/or different numbers of hits has an impact on this, but it’s hard to tell because of the onset component.

In @tremblap’s piano example, you can have a ‘sloppy’ onset (with lots of false triggers) because it eventually settles on the sustained part and works ok. In what I’m trying to do, the only part that matters is the ‘sloppy’ part, so having to apply some crunching post-fluid.nmfmatch~ makes the whole thing a bit loosey-goosey in terms of getting it to return something useful.


I think you won’t be able to avoid the processing chain of onset -> template matching, but there are ways to optimise it. In your patch it says you want to try 64-sample FFTs – how fast is this? For reference, I had a look at other toolboxes like MuBu, and there isn’t anything they’ve developed that gets around the ‘perpetual stream’ problem you were talking about.

I guess it comes down to how well you can train and how fast you can make the matching. Ideally, what would be a reasonable goal for you in terms of match speed? How does the closed-source system you mentioned above perform?

@jamesbradbury scooped me there. Training on the first 64 samples following a detection with my superfast amplitude-based detector (which is so far the fastest and subtlest, but I’m open to suggestions in Max or SC - black boxes are not helpful - we are hardcoding it right now, so now’s the time) could be interesting. Useless for piano, since notes are quite similar per octave, but for your case it might be very useful? Then you take the highest confidence/number/activation.
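
For what it’s worth, here’s a minimal sketch of one such amplitude-based detector (my own guess at the shape of it, not the hardcoded one mentioned): a fast envelope follower compared against a slow one, firing on the rising edge when the fast one pulls ahead by some margin. All parameter names are illustrative.

```python
import numpy as np

def onsets(x, sr, fast_ms=1.0, slow_ms=50.0, margin_db=6.0):
    """x: mono signal; returns sample indices where onsets are detected."""
    def one_pole(sig, ms):
        a = np.exp(-1.0 / (sr * ms / 1000.0))   # smoothing coefficient
        out = np.empty_like(sig)
        acc = 0.0
        for i, v in enumerate(sig):
            acc = a * acc + (1.0 - a) * v
            out[i] = acc
        return out

    env = np.abs(x)
    fast, slow = one_pole(env, fast_ms), one_pole(env, slow_ms)
    above = 20 * np.log10((fast + 1e-12) / (slow + 1e-12)) > margin_db
    return np.flatnonzero(above[1:] & ~above[:-1]) + 1   # rising edges only
```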