@onset flag for fluid.nmfmatch~

I was going to post this in the Training for real-time NMF thread, but decided to make it a new “feature request” thread, to not clog up the discussion over there.

What I’m suggesting is an optional flag that makes fluid.nmfmatch~ not output a constant stream of data, but instead respond to a bang by reporting the current frame.

One of the big problems being explored in that thread at the moment (including @jamesbradbury’s very helpful additions!) is getting a parallel onset detection algorithm to “line up” correctly with the desired frame of analysis, post fluid.nmfmatch~. That problem would be massively improved if one could just request a frame when needed, and then easily process that list to get the highest rank, or whatever else is desired.

Perhaps even having an optional second argument to @onset that sets how many frames get returned. So something like @onset 1 would return a single frame for each bang. Whereas @onset 1 5 would return the 5 most recent frames, so it’s still possible to do some statistics across a group of frames, while still knowing that you have requested the “right” frames.
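To make the idea concrete, here is a rough Python sketch of the buffering logic being proposed (the actual object would be a Max external; the class and method names here are hypothetical, invented purely for illustration). Each "push" is one analysis hop delivering one activation value per rank; "bang" is the on-demand request:

```python
from collections import deque
from statistics import mean

class OnsetMatcher:
    """Sketch of the proposed @onset behaviour: buffer the most
    recent n frames, and only report when asked (a 'bang')."""

    def __init__(self, n_frames=5):
        # keep only the n most recent activation frames,
        # as in the suggested '@onset 1 5'
        self.frames = deque(maxlen=n_frames)

    def push(self, activations):
        # called once per analysis hop, one value per rank
        self.frames.append(list(activations))

    def bang(self):
        # on demand: average each rank across the buffered
        # frames and return the index of the strongest rank
        per_rank = [mean(col) for col in zip(*self.frames)]
        return per_rank.index(max(per_rank))
```

So with three buffered frames where rank 1 is consistently strongest, `bang()` would report rank 1, and you'd only ever do that work when an onset actually asks for it.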

This would also allow for having ‘downsampled’ data where you just chuck a metro 100 above fluid.nmfmatch~ and you can get a (very) rough idea of what’s going on.

This would also save a lot on CPU (I presume(?)) since it’s spitting out way more data than I personally need/want for my intended usage. And using the buffer-based version (fluid.bufnmf~) isn’t really a feasible alternative here since, besides the pinwheeling, the process takes as long as it takes (variable), so it re-introduces the problem of synchronicity.

Yes, I can see the value in this, and it’s technically quite feasible. You wouldn’t get rid of synchronicity problems altogether (in Max) due to scheduler bangs and audio frames living in different universes. I guess one could, in principle, use a non-zero signal trigger to say ‘grab a frame now’ (i.e. process the last winsize samples and output a match), but the list output is still scheduler-rate, of course.

I probably need to look at the other thread in more detail, but I’m surprised that reasonable sync is hard to come by (you’re right about CPU usage though). The process for NMFMatch is no more or less deterministic than for BufNMF: it’s a function of the size of the window / fft, the rank and the number of iterations. BufNMF on a very small chunk with relaxed settings doesn’t pinwheel (but does, of course, still run in the main thread).

Anyway, consider the feature requested! Seems like a useful option to have.


Yeah obviously it’s still in the scheduler domain, but it feels like there’s less room for slop in being able to request frames as needed, rather than trying to catch the “right” frame out of a near-constant stream of them.

Audio-rate triggering would be good too, but I guess that would open up the possibility of over-polling (as in literally requesting 44100 matches a second), which could lead to…a crash?

Ah right (re: buf vs nonbuf versions). I just presumed that since in bufnmf you could request something that takes longer than real-time to compute, that that was a big question mark, whereas nmf spits shit out (fairly) regularly.

And saving a ton of CPU doesn’t hurt too!

(hah, I feel like I should make more feature requests late at night (while @tremblap is sleeping) as it feels like he’d just stroll in and tell me how there’s no reason for this to be added as you could just do the same with fluid.bufcompose~ and four or five index~s!)

Actually while pontificating on this, could there be an audio-audio rate version of fluid.nmfmatch~? (fluid.nmfmatch~~?!)

So it takes a non-zero signal trigger to process a frame, but instead of outputting a list, it outputs the single highest matched rank in that frame (e.g. 4).

Or for more robustness, you could request what statistical measure you would want returned (maximum, minimum, median, etc…).
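For illustration, the reduction being asked for might look something like this Python sketch (hypothetical names, not any actual FluCoMa API): take one frame's activations (one value per rank), apply the requested measure, and report the rank it came from. Note that `median_low` is used so the result is always an actual value in the frame, which keeps the rank-index lookup valid:

```python
import statistics

# hypothetical mapping from a requested measure to a reduction
# over one frame's activations (one value per rank)
MEASURES = {
    "maximum": max,
    "minimum": min,
    "median": statistics.median_low,
}

def match_frame(activations, measure="maximum"):
    """Reduce one frame of per-rank activations to a single rank index."""
    value = MEASURES[measure](activations)
    return activations.index(value)
```

With the default "maximum" this is exactly the "single highest matched rank" idea from the previous post; the other measures are the "more robustness" variants.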

Well, I’d hope not :wink: . It’d certainly peg out your CPU. It’s a mistake I regularly make with Framelib, and have made with the C++ for this stuff a few times in development (i.e. getting my internal buffering wrong, and doing an STFT every sample).

For the other idea, it’d still only ever be fft-rate (so quantized to your hop size). I don’t know that the added timing precision would help much, except if everything else in the relevant network were also in the signal domain. This is something SC makes much easier, because you really know when a kr thing is happening: one reason for being so readily acquiescent to your first suggestion is that @spluta suggested that something like this would be an idiomatic approach in SC (i.e. a kr output in response to a trigger, unless I completely misread what he meant).

Meanwhile, however, I’m quite likely to come back with my own bufcompose and 4 or 5 [index~]s solution to tide you over, as it’d be a while before I can even properly confirm how feasible it is!


Well with @hopsize 32 (as in the current “fast” version of the patch), that’d be pretty tight.

If precision is crucial (and in some cases, it may be), it would be worth building around that. Especially with how precise the transient objects can get. Plus it would pair nicely with fluid.transientslice~ triggering away on its topside.

I like that idea very much. On (sample-accurate) demand.

Funny, but maybe there is something there in terms of interface (to be faster than scheduler slop, more like demand-rate UGens in SC, as @spluta suggested to @weefuzzy).


That would indeed be cool.

Maybe an instantiation-only @attribute where, if you set it to @trigger 1, it has an extra inlet that responds to a bang and outputs the last n frames (bang = 1 frame, bang 5 = 5 frames), and if you connect a signal to that inlet, it outputs the highest matched rank, quantized to the @hopsize?

(Is it possible to send different outputs depending on what’s plugged in? I think I’ve seen that before.)

Or maybe there’s just a completely separate version of the object (fluid.nmfmatch~~), which deals with signal input/output.

To a limited extent: you can’t flip between the same outlet being either signal or non-signal. Or, at least, I don’t think you should, as it would be weird. I’ll have a think about whether this ends up needing a whole different object or not. As far as outputting n-frames goes, I think this might give pronounced CPU spikes?

Hmm, maybe if you do want n-frames, you declare this as part of the initial @trigger flag and it computes every frame (as it presently does) but only outputs on demand?