This is sort of a usage question and sort of not.
My understanding is that the nmf process takes a single matrix and decomposes it into some number of matrices without any negative elements. What place does the FFT have in this process, and what does the configuration of window and hop size have to do with perceived ‘accuracy’?
Do larger window sizes produce ‘better’ results, or does the process just keep working until it reaches that non-negative set of matrices at the end?
I’m curious what kind of decisions I could/should be making around these settings depending on…
a) type of sound
b) what kind of output I am looking for
This is a very, very interesting (and geek) question. @weefuzzy and I have a sort of KE part on the FFT in general in the pipeline. Now I reckon you have read this page on the KE website
DISCLAIMER: creative coder understanding follows
The key thing missing in your explanation is that nmf processes the magnitude part of the spectrogram, which is by definition non-negative. Then it “use(s) the discovered components to generate masks for the original, complex-valued, spectrum.” So you get masks defining the components from the process, then apply them (multiplying) to each complex frame (on both the real and imaginary parts), changing the amplitude (masking) without changing the phase.
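To make that concrete, here is a minimal sketch of that magnitude-then-mask pipeline in Python, assuming scipy and scikit-learn as stand-ins (this is not the FluCoMa implementation, and the signal, FFT settings and component count are all just illustrative):

```python
import numpy as np
from scipy.signal import stft, istft
from sklearn.decomposition import NMF

fs = 44100
x = np.random.randn(fs * 2)            # stand-in for a real mono signal

# the STFT gives complex-valued frames; NMF only ever sees the magnitudes
f, t, Z = stft(x, fs=fs, nperseg=1024, noverlap=768)
V = np.abs(Z)

model = NMF(n_components=5, init='nndsvd', max_iter=500)
W = model.fit_transform(V)             # spectral bases (bins x components)
H = model.components_                  # activations (components x frames)

# each component becomes a soft mask over the *complex* spectrogram:
# the mask scales the amplitude of each bin, the phase passes through
recon = W @ H + 1e-12
for k in range(W.shape[1]):
    mask = np.outer(W[:, k], H[k]) / recon
    _, xk = istft(Z * mask, fs=fs, nperseg=1024, noverlap=768)
    # xk is the k-th resynthesised component
```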
Does it help?
So all the usual Gabor uncertainties apply, with the great addition of decoupling the FFT size from the window size, which allows you to oversample the FFT, although I’m still not super clear on what that does for spectral precision. @a.harker, @weefuzzy and @groma have explained it to me many times, but I can’t yet say I understand it for real. Hence the need for an FFT primer for geeks.
Practically: I usually play with the settings for each algo. For nmf~ I found the algorithm quite sturdy against imprecision, so I had fun going low even on mixed signals and hearing it break magnificently. Otherwise my rule of thumb is to go to a large FFT when significant low end is present, and to use an overlap of at least 4, if not 8, when there are also transients. But then I get pre-ring, obviously, because my mask opens for too long. Does that make sense?
The thing about zero padding is very simple. It does not increase resolution (which in this context has the specific meaning of the ability to resolve sinusoidal components in close proximity). What it does do is produce ideal spectral interpolation, and hence finer, smoother output.
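A quick numpy illustration of that distinction (all values arbitrary): zero padding samples the same underlying spectrum on a denser grid, but only a longer window actually separates two close tones:

```python
import numpy as np

fs = 1000
n = 256                                   # short analysis window
t = np.arange(n) / fs
x = np.sin(2 * np.pi * 100 * t) + np.sin(2 * np.pi * 103 * t)

plain  = np.abs(np.fft.rfft(x))           # grid spacing fs/n, about 3.9 Hz
padded = np.abs(np.fft.rfft(x, n=8192))   # same 256 samples, zero padded

# `padded` traces a smooth, finely interpolated curve, but the two tones
# still merge into one peak: resolution comes from window length, not FFT size
t4 = np.arange(4 * n) / fs
x4 = np.sin(2 * np.pi * 100 * t4) + np.sin(2 * np.pi * 103 * t4)
longer = np.abs(np.fft.rfft(x4))          # 4x the window: two distinct peaks
```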
The density of frequency bins has a more profound effect on how the algorithm decomposes than the density of temporal frames. This stands to reason – for the ‘plain’ NMF algorithm here – because it gives the process more detail on which to discriminate between different things. That is, plain NMF only works on a frame-by-frame basis (no dependencies over time), and only uses STFT magnitudes.
It follows that there will be a sweet spot for particular material: too coarse a frequency grid, and you’ll find things that should be perceptually separate get grouped. Too fine, and (again) the decomposition will make less perceptual sense.
The main effect of increasing temporal density (besides increasing processing time) is on the way that the decomposed components are amplitude modulated.
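To see the frame-by-frame point in code, here is a hedged numpy sketch of ‘plain’ NMF (multiplicative updates in the Euclidean flavour, after Lee and Seung). Nothing in the updates couples neighbouring frames, so shuffling the input frames (along with the matching initialisation) just shuffles the activations; and since H holds one value per component per hop, each component’s amplitude envelope is effectively sampled at the hop rate:

```python
import numpy as np

def plain_nmf(V, k, iters=100, W0=None, H0=None, eps=1e-12):
    # Euclidean multiplicative updates (Lee & Seung): no time dependencies
    rng = np.random.default_rng(0)
    W = rng.random((V.shape[0], k)) if W0 is None else W0.copy()
    H = rng.random((k, V.shape[1])) if H0 is None else H0.copy()
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # column j of H only sees V[:, j]
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

rng = np.random.default_rng(1)
V = rng.random((513, 400))                     # stand-in magnitude spectrogram
W0, H0 = rng.random((513, 5)), rng.random((5, 400))

W, H = plain_nmf(V, 5, W0=W0, H0=H0)
perm = rng.permutation(400)                    # shuffle the frames in time
Wp, Hp = plain_nmf(V[:, perm], 5, W0=W0, H0=H0[:, perm])

# differences sit at floating-point noise level: time order never entered
print(np.abs(W - Wp).max(), np.abs(H[:, perm] - Hp).max())
```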
The role of the STFT in NMF is just to provide a representation of the signal that is non-negative (i.e. the magnitudes), can be inverted (we make masks over the original spectrogram) and has some bearing on our perception (although we know that the STFT is pretty rough in that regard). Other things can be used, like constant-Q transforms, and you’d expect these to have more radical effects on the outcome. Invertibility and/or perceptual correlation are more poorly understood for many other representations.
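For instance, a hedged sketch of swapping in a constant-Q magnitude as the non-negative representation, assuming librosa (the example asset is just a bundled demo file; the invertibility caveat shows up in that the inverse CQT, unlike the STFT round trip, is only approximate):

```python
import numpy as np
import librosa

y, sr = librosa.load(librosa.example('trumpet'))   # any mono file works

C = librosa.cqt(y, sr=sr, hop_length=512, n_bins=84, bins_per_octave=12)
V = np.abs(C)        # non-negative, log-spaced in frequency: feed this to NMF

# masks built from the NMF factors can be applied to C exactly as with the
# STFT, but the inverse constant-Q transform is only approximate
y_hat = librosa.icqt(C, sr=sr, hop_length=512, bins_per_octave=12)
```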
There are extensions to NMF for modelling temporal dependencies as well. With these, you’d expect things like hop-size settings to have greater consequences.
Some of the bleeding-edge research into NMF-like techniques tries to learn a signal representation/transform directly from the training data, which is quite cool.
The help patch certainly puts into perspective how the decomposition works under different circumstances. Definitely something to consider when using the tools in the future.
Indeed a super useful patch! Once we start sorting the ranks it is even easier to make sound comparisons (although they don’t strictly align, it is even more didactic to hear similar-ish components compared).
Thanks Owen for the help patch. Though, for me only the first component makes sound.
Both source buffers are loaded and the Max window reports that the calculation finished, but components 2-10 are silent for all examples.
Super helpful comparison patch @weefuzzy (changed polybuffer~ to buffer~ for 2a).
Actually, sorry, just getting back into looking at this. Why did you use polybuffer~ in the first place in this specific example?
Because @weefuzzy loves them: they can grow programmatically. I agree with him, but we had to drop support for the time being because of code efficiency: the benefits were not worth the code mess. One can still use them by naming instances (mybuf.1, etc.)…