Basically it’s the problem of being able to grab the correct frame from the output of fluid.nmfmatch~ to compare against. At the moment none of what I have actually works. I’m just building good (theoretical) dicts so that I can improve the likelihood of getting accurate matching from fluid.nmfmatch~, but since I have no way of assessing the improvements, I’m just shooting in the dark.
You did tease that you had some ideas for this, if I understood you right.
my last hypothesis on this was to not clean them - you should process what you get!
That’s a good question for @a.harker and would be a good 3rd example for the video you are doing now.
So my idea is super simple: take the (amazing) attack detector I made in the time domain, and use it to grab the next output from nmfmatch~. That should give you an array of RANK values. Check the index of the largest number, and that should tell you which one it most likely is. It might be that you have to wait for the 2nd nmfmatch~ output after the attack detector, since the detector is so quick.
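A minimal sketch of that logic (in Python, not Max), assuming `activation_frames` stands in for the stream of RANK-length lists coming out of fluid.nmfmatch~ and `onsets` for the frame indices where the attack detector fired. All names are illustrative, not FluCoMa API:

```python
def classify_onsets(activation_frames, onsets, frames_to_wait=1):
    """For each onset, wait a frame or two, then take the argmax over the RANK values."""
    labels = []
    for onset_frame in onsets:
        target = onset_frame + frames_to_wait  # maybe the 2nd frame after the attack
        if target < len(activation_frames):
            frame = activation_frames[target]
            # index of the largest activation = most likely component
            labels.append(max(range(len(frame)), key=lambda k: frame[k]))
    return labels
```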
I can give that a spin too. It’s just really, really noisy, so I imagine(d) that it would unfavorably skew the dict creation and matching (using up some of the precious 33 bins on garbage information).
I’m doing that in my patch now. I added a variable delay in there (10ms seems to work ok), but it doesn’t really work. The best working version was something @jamesbradbury got going using fluid.transientslice~, which is slower than the fast onset detection. But with what I have now, it’s correct only through random chance (some dicts do match more consistently than others).
What I was hoping for with the “onset flag” feature request was to remove the slop of trying to line up a parallel onset detection algorithm with the deluge coming from nmfmatch.
The idea of an onset flag is potentially a good one, but it’s not on the dev radar.
Another option I had in mind: take the fast attack detector I had, use it to trigger a sample-and-hold on the write address of a circular buffer, then send that to bufnmf~. That would be a sample-accurate trigger of the process, and the ‘fake’ real time would give you activations (from trained, non-updating dicts), which I think would be great. And fast (although still subject to message slop)
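A rough sketch of the circular-buffer grab, under the assumption that the trigger latches the current write address (the sample-and-hold step) and you then hand the last `window` samples to an offline NMF standing in for fluid.bufnmf~. Python for readability only; names are illustrative:

```python
import numpy as np

def grab_window(circular_buf, write_pos, window):
    """Return the `window` samples ending at the latched write address."""
    n = len(circular_buf)
    idx = np.arange(write_pos - window, write_pos) % n  # wrap around the ring
    return circular_buf[idx]

# e.g. grab the last 50 ms ending at the latched address:
# segment = grab_window(ring, latched_addr, int(0.05 * 44100))
```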
Another audio-domain option is to use your dicts as filters in pfft~ (like in the bufnmf~ helpfile) and run fast (and mutually exclusive) attack detectors on the RANK (very well defined) filtered signals. You would then get only the (short) FFT delay + envelope time…
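A hedged sketch of what ‘dicts as filters’ could look like offline: each learned basis W[:, k] becomes a fixed spectral gain curve on the incoming STFT, giving one filtered signal per rank, each of which can feed its own fast attack detector. This mimics the pfft~ idea with scipy, not FluCoMa’s actual implementation; a 64-point FFT gives the 33 bins mentioned above:

```python
import numpy as np
from scipy.signal import stft, istft

def filter_by_bases(x, W, fs=44100, nfft=64):
    """W: (bins, rank) array of learned bases; 64-point FFT -> 33 bins."""
    f, t, X = stft(x, fs=fs, nperseg=nfft)
    gains = W / (W.max(axis=0, keepdims=True) + 1e-12)  # normalise each basis to peak 1
    outs = []
    for k in range(W.shape[1]):
        _, y = istft(X * gains[:, k][:, None], fs=fs, nperseg=nfft)
        outs.append(y)  # one band-filtered signal per rank
    return outs
```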
so you have 2 other options here you can try. let us know how far you get
I understand the idea (it’s basically the same as your “fake” real-time descriptor analysis, which works really well in context). However, I don’t really understand how to implement that with bufnmf~, which, as far as I can tell, just spits out a bang. So I understand the onset detection, and using that to feed a fixed amount of time (now minus n milliseconds), but having done so, how do I determine the activations?
This one I understand less, so I can start with the other one.
Hmm. So having two bufnmf~s: one for the actual analysis of the “real time” audio, and then a second one just to report activations? Or rather, write the activations to a buffer, which would then get uzi/peek'd to determine which rank had the most activation?
Does one of the examples in the helpfiles do something like this? Having a quick look (traveling at the moment) but not spotting anything.
Not exactly: the nmfmatch~ example with listfunnel -> unpack -> gate is in the right direction, but it’s the RT version and not really what you want.
The JIT_NMF example (formerly known as BecauseIcan on this forum and at the plenary) in the new shiny ‘examples’ folder is more in line. You would not trigger at fixed values, nor with the count~, but with a sah~ and change~ to get the address of what you want to analyse. Then the bang after bufnmf~ would trigger a list-ification of the destination activations buffer, to find who’s the boss… But before doing this second bit, I would just look at the activations, to see if you find a consistent lag, for instance. Once that first part works, then it could be fun…
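For the second bit, a minimal sketch of the list-ification and who’s-the-boss step, assuming `acts` holds the peek'd contents of the destination activations buffer as a (rank x frames) array. Illustrative Python, not a Max abstraction:

```python
import numpy as np

def whos_the_boss(acts):
    """acts: 2-D array, shape (rank, frames), from the activations buffer."""
    per_rank = acts.sum(axis=1)            # total activation energy per component
    winner = int(np.argmax(per_rank))      # the boss
    lag = int(np.argmax(acts[winner]))     # frame of peak activation: check for a consistent lag
    return winner, lag
```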
I suddenly see a sexy example coming together, like
my dicts updating from 3 steep bandpasses on drums to train (bd, sn, hat)
then attack detection sending to the thing I said above
then post processing the activations
Would you be excited by that? It looks like a fun addition to me…
I adapted the amplitude-based attack detector to spit out its sample-accurate click with a time hysteresis in gen~. Next is the NMF process, but I thought I’d share this recyclable code snippet.
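For reference, the logic of that detector transcribed to Python (a reading aid only, not the gen~ snippet itself; threshold and hold values are illustrative):

```python
import numpy as np

def attack_clicks(x, thresh=0.1, hold_ms=50, fs=44100):
    """Emit a single-sample click per attack, then hold off for hold_ms."""
    hold = int(hold_ms * fs / 1000)
    clicks = np.zeros_like(x)
    countdown = 0
    for i, s in enumerate(np.abs(x)):
        if s > thresh and countdown == 0:
            clicks[i] = 1.0       # sample-accurate click
            countdown = hold      # refractory period (the time hysteresis)
        elif countdown > 0:
            countdown -= 1
    return clicks
```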
I’d like to train on a basic bass drum, snare and hat, and for the purpose of this, I’d like each of those to be basic classic synths so we can get parametric variations on them. What would be your go-to cliché synth source code if you did not want to lose ages reinventing the wheel?
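Not an answer to the go-to-source question, but the textbook recipes are short enough to sketch: kick = pitch-swept sine, snare = tone + noise burst, hat = high-passed noise. A hedged Python sketch with illustrative parameters:

```python
import numpy as np

def env(n, decay):                       # simple exponential decay envelope
    return np.exp(-np.arange(n) / decay)

def kick(fs=44100, dur=0.3):
    n = int(fs * dur)
    freq = 50 + 150 * env(n, fs * 0.01)  # fast downward pitch sweep
    return np.sin(2 * np.pi * np.cumsum(freq) / fs) * env(n, fs * 0.05)

def snare(fs=44100, dur=0.2):
    n = int(fs * dur)
    tone = np.sin(2 * np.pi * 180 * np.arange(n) / fs) * env(n, fs * 0.03)
    noise = np.random.uniform(-1, 1, n) * env(n, fs * 0.04)
    return 0.5 * tone + 0.5 * noise

def hat(fs=44100, dur=0.05):
    n = int(fs * dur)
    noise = np.random.uniform(-1, 1, n)
    return np.diff(noise, prepend=0) * env(n, fs * 0.01)  # crude high-pass
```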
Would also love to see a detailed account of your approaches (even when/where they fail) à la the piano example.
If you want a robust and varied training set, wouldn’t it be better to use a bunch of different attacks from generic sample libraries (à la Logic, Komplete, etc.)?
I would think that recorded acoustic drums would have more complex and varied spectra than (even sophisticated-ly) synthesized ones.
Would also be good to see it trained/tested on something a bit more homogeneous (toms, or ride/hats), as kick/snare/hat are so different in spectra that there are probably all sorts of (far simpler) techniques for differentiating between them.
ok, I have one failing perfectly right now (untrained and non-updating) - I will share this afternoon when I get back to this.
that does not exist. What we want is a robust classifier, and the spectrum covered by the ‘bass drum’ class is way too wide and culturally driven, crossing over to snare midway (i.e. a funk fat snare will be lower than a metal bass drum!)
so I’m trying to make a ‘training’ mechanism that is quick, and reliable on a specific set. This is fun.
Yeah that’d be great to see, though I meant for the “finished” version once it ends up in a help file.
Sorry, I didn’t mean having a “generic training set”, I meant having a whole bunch of examples that you can test/compare with. Individually. So having a set of 10+ (acoustic) bass drums, 10+ (acoustic) snares, and 10+ (acoustic) hats, that were all different enough to push the algorithm, but not as straightforward in spectra and envelope as simply making an 808 kick a lower/higher pitch (to use a stupidly simple example).
Soon @weefuzzy will take ownership of the ‘help files’ and this type of example will most probably live elsewhere (teaching, examples, web, example folder, who knows)
this is actually hard for an algorithm to detect, hence me pushing this bit - easy to modulate, hard to identify