Displaying the Mask from NMF

Here’s a Max patch that allows you to view the masks for each component in an NMF. It’s not particularly efficient (although the NMF takes a while anyway - building the component view is maybe comparable). Might be of interest to some other people too.

NMF_Mask.maxpat.zip (4.8 KB)


Is it slow because of uzi > peek?

I’m not 100% sure - I think it’s probably mostly the sum of the overheads for getting between max objects. The loop for each item is (N/2+1)*L where L is the length in frames and N is the FFT size. Then to get the sum you have to loop over all components which adds another loop inside that. Each time a buffer is accessed it will have to be locked - there’s a function call (or several) every time a message goes down a cord…
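To put rough numbers on that, here is a small Python sketch of the iteration count. It is only an estimate under stated assumptions: the function name and the example sizes (FFT size 1024, 500 frames, 3 components) are made up for illustration, and it assumes the normalising sum is recomputed for every cell of every component, which may not match the patch exactly.

```python
def mask_loop_iterations(fft_size: int, frames: int, components: int) -> dict:
    """Estimate how many inner-loop steps the per-cell approach needs.

    Hypothetical helper for illustration; not taken from the patch.
    """
    bins = fft_size // 2 + 1           # N/2+1 frequency bins
    cells = bins * frames              # (N/2+1)*L cells per component view
    per_component = cells * components # summing over all components per cell
    total = per_component * components # assumption: repeated for every component
    return {
        "bins": bins,
        "cells": cells,
        "per_component": per_component,
        "total": total,
    }


if __name__ == "__main__":
    # Example sizes are arbitrary, chosen only to show the scale.
    print(mask_loop_iterations(fft_size=1024, frames=500, components=3))
```

Each of those steps involves several messages between objects and at least one buffer access in the patch, so the per-step overhead is multiplied by a large number even for modest sizes.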

My suspicion is that much more time gets spent on the overhead rather than the actual calculation (plus putting it item by item into the matrix) and added up over the matrix size that takes a lot of time. The whole thing (including putting it in a matrix and accessing) could be done in javascript and my guess is that this would be a win, but it depends a little bit on the buffer~ access. gen or an external might also be possible.

Here’s a much faster version that uses gen for the calculation - doesn’t look identical - not sure if it’s more or less accurate - I’d assume gen is 64 bit, but it sometimes looks less detailed - however, that might mean the max version has “fake detail” from calculation accuracy issues.

NMF_Mask_v2.maxpat.zip (4.9 KB)

This is fun! It is a sort of offline version of the vocoder example in the helpfile and I use that a lot, so this is opening some fun stuff for nrt… and also gave me a few ideas on smoothing the activations!

I was trying to be all clever since most of the computation is in the outer product (https://en.wikipedia.org/wiki/Outer_product) of the 2 vectors (the sexy blocking0 will show that. I recommend it highly) but then the documentation of the dot and cross product in jit.gen is so bad that I cannot get them to work even with the internet…

I cannot test your new version that uses gen without tilde, which must be Max8, but if you manage to make a jit.gen version with cross spit out what it should with 2 vectors (test attached) then that should be much faster…


It’s related to the vocoder in that it gives you a display of the activations in relation to the bases, but the display is strictly the mask (0-1 multiplier for the raw FFT data) - so it is like the amplitude multiplier of the vocoder normalised by the sum of all the component multiplications (which isn’t happening in your calculation patch). This mask exists internally to the process to do resynthesis but you can’t retrieve it directly, so this is step one in this process. There is a possible next step of 2D smoothing that I will explore (not possible when working on bases and activations alone), but it depends whether that path turns out to be creatively relevant.

The gen version works in Max8 so you can run it in that version of Max. I can’t make a jit.gen version, because there’s (stupidly) no access to a buffer in that context. However, there’s zero difference between the maths in the two patches except possibly precision - the gen version is just faster. The gen version is fast enough for my current display purposes (and now quite a bit faster than the nmf itself) so I won’t look at speeding it up further unless I go further down this route. However, I’d currently have to find a way to reapply my smoothing process, so I’m not sure if I will definitely do that (I guess probably framelib is the easiest way for me personally to do that for offline usage).

OK, that should not be the case, as this is not what is happening in the process under the hood, as @groma explained in the other thread. They are discrete values, a soft-mask, the outer product of a given base with its given activation, normalised to null-sum.

this is the beauty of matrix maths. if their cross in jit.gen was working properly, you should send in the 2 vectors (a simple jit-peak should do to generate both vectors) and you would get the matrix of (non-binary) masks (as shown in the wiki article)
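For reference, the outer product itself is small enough to write out by hand. This is a minimal Python sketch, not jit.gen code; the function name and the convention that bins are rows and frames are columns are assumptions made for the example. On its own it gives the unnormalised magnitudes of one component, so whether the result counts as the mask depends on the normalisation step discussed in this thread.

```python
from typing import List


def outer_product(base: List[float], activation: List[float]) -> List[List[float]]:
    """Outer product of one base (per bin) with one activation (per frame).

    Returns a bins x frames matrix where entry [b][t] = base[b] * activation[t].
    Hypothetical helper for illustration only.
    """
    return [[b * a for a in activation] for b in base]


if __name__ == "__main__":
    # Two bins, three frames: tiny made-up vectors.
    for row in outer_product([1.0, 2.0], [3.0, 4.0, 5.0]):
        print(row)
```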

anyway, I’m sure @weefuzzy will have managed to do an outer product in jitter when he is back online he’ll tell me off :slight_smile:

I think you aren’t quite following me - I said 0-1 and NOT 0/1, so yes this is a soft mask, but it is a mask, not a magnitude (which is what you calculate in the vocoder and also in your patch). You require a normalisation step after the magnitude, which you talk about and which I’ve included in my code, but you haven’t included it in either the vocoder or the patch you posted - the outer product on its own doesn’t do that step. BTW - you can also save a step in your patch by using jit.buffer~ to read the buffers directly into matrices.

I’ve just actually checked the source (as I’ve just been working conceptually) and basically what I am doing is the work of RatioMask, except that I generate the output magnitudes by summing activations * bases for all components, which I assume is equivalent to the way they are generated internally, but that’s a bit more complex.
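As a rough illustration of that calculation, here is a Python sketch of a soft ratio mask built from bases and activations. It is a sketch under assumptions, not the RatioMask source: the function name, the nested-list layout (bases[k][bin], activations[k][frame]) and the eps guard against dividing by zero are all choices made for the example.

```python
from typing import List

Matrix = List[List[float]]


def component_masks(bases: Matrix, activations: Matrix, eps: float = 1e-12) -> List[Matrix]:
    """Soft (0-1) mask per component: its magnitude over the sum of all magnitudes.

    bases[k][b] is the base of component k at bin b,
    activations[k][t] is its activation at frame t.
    eps is an assumed guard; cells whose summed magnitude is at or below it get 0.
    """
    n_bins = len(bases[0])
    n_frames = len(activations[0])
    # Per-component magnitudes: outer product of base and activation.
    magnitudes = [
        [[w * h for h in act] for w in base]
        for base, act in zip(bases, activations)
    ]
    masks = [[[0.0] * n_frames for _ in range(n_bins)] for _ in magnitudes]
    for b in range(n_bins):
        for t in range(n_frames):
            # Normalising term: sum over all components at this cell.
            total = sum(m[b][t] for m in magnitudes)
            if total > eps:
                for k, m in enumerate(magnitudes):
                    masks[k][b][t] = m[b][t] / total
    return masks


if __name__ == "__main__":
    # Two components, two bins, two frames: made-up numbers.
    masks = component_masks(
        bases=[[1.0, 0.0], [1.0, 2.0]],
        activations=[[1.0, 3.0], [1.0, 1.0]],
    )
    for k, mask in enumerate(masks):
        print(k, mask)
```

Wherever the summed magnitude is non-zero, the masks of all components add up to 1 at that cell, which is the difference between a mask and the raw outer-product magnitude.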

I understand what you are saying about the cross product, but I’m simply doing it manually - it’s not complicated enough to need a library function, and unless they’ve optimised that in some way (which is unknown) it won’t be faster algorithmically than manual loops.