NMF question (bases mode)

Hey all,

I am playing around with NMF. I have a question about bases.

I have an 8 minute file that I am breaking up into “grains”, each about 1 second long. I am then doing an NMF on each of these grains in order.

Tell me if I am understanding this correctly: If I set the bases mode to 0, then each of the sound files will get its own independent NMF analysis, and the order of the channels in the resynthesis will be random. If I set the bases mode to 1, and pass in the bases from the previous file, each file will “morph” the NMF bases and the resulting resynthesis, giving me a gradually changing spectrum in the bases/resynthesis over time.

Correct or no? Code is below. I guess another question is when to set the bases mode to 1. Does it need to have a “good thing going” already?

Sam

(
var files;
e = Buffer.new(s);
f = Buffer.new(s);
g = Buffer.new(s);

files = PathName("/Volumes/Rugged/Mahler/NMF/MahlerNorm/Left/").files;

{
    files.do { |item, i|
        a = Buffer.read(s, item.fullPath);
        s.sync;

        FluidBufNMF.process(s, a, resynth: e, bases: f, activations: g, components: 20, action: { |resyn, bases, act|
            ("done" + i).postln;
            resyn.write("/Volumes/Rugged/Mahler/NMF/MahlerNorm/Lnmf/Lresyn_20_" ++ i ++ ".wav", "wav", "float");
            bases.write("/Volumes/Rugged/Mahler/NMF/MahlerNorm/Lnmf/Lbases_20_" ++ i ++ ".wav", "wav", "float");
            act.write("/Volumes/Rugged/Mahler/NMF/MahlerNorm/Lnmf/Lact_20_" ++ i ++ ".wav", "wav", "float");
        });
        s.sync;
        a.free;
        s.sync;
    }
}.fork;
)

Yeah, that's right. In mode 1 the iterative process is seeded with the bases you pass in via the bases buffer. If you seed slice 2 with the bases buffer from slice 1, you'll get a morphing set of bases that might amount to something like an average over the slices.

Morphing is probably what it looks like indeed, with the caveat that you keep all of the past: grain 100 will carry 99 pasts… That said, there are 100 iterations between each grain, so the past gets diluted. An experiment to see how much this impacts your results is to run, on the same file, grain 100 seeded by the whole chain of 99 previous grains, versus grain 100 seeded only by grain 99, with that grain 99 analysis done in mode 0 (i.e. starting from noise, with no memory of grains 0 to 98).

I am not certain what you want to do with that weighted past, so understanding its impact this way could be fun. It would also be interesting to see what happens if you use fewer iterations in mode 1 after the first grain: grain 0 in mode 0 with 100 iterations, then subsequent grains in mode 1 with 50 or 20 or 10 iterations…
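To make that concrete, here is a rough, untested sketch of how the seeded loop might look, reusing the folder from the patch above. The basesMode and iterations argument names are my reading of the FluidBufNMF help file (check them in your version), and the output path is a placeholder.

(
// Speculative variant of the patch above: the first grain is analysed from
// noise (basesMode: 0), every later grain is seeded with the bases buffer left
// over from the previous grain (basesMode: 1), with fewer iterations once seeded.
var files = PathName("/Volumes/Rugged/Mahler/NMF/MahlerNorm/Left/").files;
var bases = Buffer.new(s), act = Buffer.new(s), resyn = Buffer.new(s);
{
    files.do { |item, i|
        var src = Buffer.read(s, item.fullPath);
        s.sync;
        FluidBufNMF.processBlocking(s, src,
            resynth: resyn,
            bases: bases,
            basesMode: if(i == 0, 0, 1),      // 0 = learn from noise, 1 = seed with previous bases
            activations: act,
            components: 20,
            iterations: if(i == 0, 100, 20)   // speculative: fewer iterations once seeded
        );
        s.sync;
        resyn.write("/some/output/dir/resyn_seeded_" ++ i ++ ".wav", "wav", "float"); // placeholder path
        src.free;
        s.sync;
    };
}.fork;
)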

All of this is speculative: I have no idea how each option would sound or behave, or even whether the behaviour would be consistent. But it's all fun to play with, I think.

Passing the NMF bases forward works well. There seems to be some consistency across the files.

The idea is to use NMF for time stretching and diffusion. I see you all have a paper on this for short files. I am trying to do long files, stretched very long.

I tried this:

  1. broke the file into 1 second chunks
  2. ran NMF on the chunks, using the previous chunk's bases to seed the analysis
  3. broke out the NMF into separate files per channel (see the sketch below)
  4. ran granular time stretch on each channel
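For step 3, here is a rough sketch (not tested) using FluidBufCompose to copy each channel of a chunk's resynthesis buffer into its own mono file; ~resyn and the output path are placeholders.

(
// Copy each channel of a multichannel NMF resynthesis buffer (~resyn) into a
// fresh mono buffer and write it to disk.
{
    ~resyn.numChannels.do { |c|
        var mono = Buffer.new(s);
        FluidBufCompose.processBlocking(s, ~resyn, startChan: c, numChans: 1, destination: mono);
        s.sync;
        mono.write("/some/output/dir/chunk_chan_" ++ c ++ ".wav", "wav", "float"); // placeholder path
        s.sync;
        mono.free;
    };
}.fork;
)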

This sounds OK, but the phase stuff with the grains kind of kills the vibe.

I am now going to try to rebuild the channels with BufCompose, but I think there will be an audible change when I move between the 1 second buffers.
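In case it helps, here is a speculative sketch of that BufCompose rebuild, concatenating the per-chunk buffers for a single channel end to end; ~chunks is a placeholder, and this is an assumption about how I'd try it, not a tested recipe.

(
// Append consecutive per-chunk buffers (~chunks, an Array of Buffers for one
// NMF channel) into one long buffer. Butt-joining like this is exactly where
// the audible seams would show up; overlapping the writes slightly and letting
// destGain sum the overlap is one thing to experiment with for a crossfade.
{
    var out = Buffer.new(s);
    var offset = 0;
    ~chunks.do { |chunk|
        FluidBufCompose.processBlocking(s, chunk,
            destination: out,
            destStartFrame: offset,
            destGain: 1.0    // mix into, rather than overwrite, what is already there
        );
        offset = offset + chunk.numFrames;
        s.sync;
    };
}.fork;
)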

I would know already, but I have a dying hard drive that is causing all kinds of problems.

Sam

@laurens' piece at Electric Spring did something along those lines. I think his trick was over-dividing (many components) and very short FFTs (to avoid too much smearing), but hopefully he will chime in here.

The fun part of the story is that I did not even notice it was NMF, despite having tried similar processes and being put off by some of the artefacts. So his work on FFT settings, number of components and initial material made a very convincing non-idiomatic sound… and it was a good piece too!

Thanks!
I indeed used NMF on a number of shortish mono recordings/improvs that I wanted to explode over the HISS (32 channels in this case). @jamesbradbury hinted that it may be better to use short FFTs to minimise the smearing effect. I was definitely aiming at as few artefacts as possible, really just using the algorithm to get ‘into’ the sound. I was a bit afraid short FFTs would lead to a lack of detail, but that wasn't the case. I tried out a number of @fftsettings and landed on 512 256 512: much smaller and with less overlap than I initially thought I would ‘need’.
Bear in mind that the source material is quite harsh and glitchy, being noisy improvs on some sort of no-input mixer.
Hope this helps!
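(As far as I understand the @fftsettings convention, those three numbers correspond to windowSize, hopSize and fftSize in the SC interface, so roughly:)

// Untested sketch: the "512 256 512" settings expressed as FluidBufNMF keyword
// arguments. ~src and ~out are placeholder buffers.
FluidBufNMF.process(s, ~src, resynth: ~out, components: 20,
    windowSize: 512, hopSize: 256, fftSize: 512);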


I have a different NMF question, but I think it can fit into this thread.
I experimented with splitting quite complex sounds into just 5 components and did not get very well-defined ‘classes’. I then split them into 100 channels, where obviously many seem to carry ‘similar’ material. This led to the following thought, which I have not yet implemented. I'm just floating the idea to see if it makes sense, or if it is actually duplicating what NMF is doing under the hood anyhow.

I'm thinking of running MFCC or melbands on each of the 100 channels and then clustering them. The closest ones would be mixed together in order to obtain, again, a small number of new components. I hope for a slightly less ‘black-box’ result, with some parameters to play with in the clustering etc.
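To sketch what that might look like (untested, and the argument names should be checked against the FluidBufMFCC / FluidBufStats / FluidKMeans help files): describe each channel by summary statistics of its MFCCs, collect those into a FluidDataSet, and let FluidKMeans group them. ~resyn is a placeholder for the 100-channel resynthesis buffer.

(
~ds = FluidDataSet(s);
~labels = FluidLabelSet(s);
{
    ~resyn.numChannels.do { |c|
        var mfcc = Buffer.new(s), stats = Buffer.new(s), flat = Buffer.new(s);
        // MFCC time series for one NMF channel
        FluidBufMFCC.processBlocking(s, ~resyn, startChan: c, numChans: 1, features: mfcc);
        // summarise the time series (mean, stddev, etc.) per coefficient
        FluidBufStats.processBlocking(s, mfcc, stats: stats);
        // flatten the stats into a single vector so it can become a dataset point
        FluidBufFlatten.processBlocking(s, stats, destination: flat);
        ~ds.addPoint("chan-" ++ c, flat);
        s.sync;
        [mfcc, stats].do(_.free);
    };
    // group the 100 channels into a handful of clusters to mix back down
    FluidKMeans(s, numClusters: 5).fitPredict(~ds, ~labels, {
        ~labels.dump { |d| d.postln };
    });
}.fork;
)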

any thoughts on that?

This is good, and it certainly makes sense from the standpoint of perception. I was similarly having a thought of using something as simple as spectral centroid to isolate noise and pitch, but MFCC would probably work better.

The thing I am trying to do is take a 10 minute track and stretch it to 2 hours. NMF provides a funny paradox:

I could analyze the entire 10 minute track with 1 NMF pass, get my 20 tracks, time stretch each and then be done. The problem there is that the software will try to make 20 bases that fit the whole piece, and I don’t think that will work.

The other solution is to change bases in the middle of the piece and crossfade between them. But I found that each NMF analysis is very different spatially, and the crossfades are a bit phasey. The solution may be as easy as finding better edit points. I am exploring that now.

Also, maybe Hans’s idea of doing 100 bases and then changing the clumping of the bases over time is a good solution.

BTW - I was able to get into my office today, so I brought home our 2013 Mac Pro. It is pretty awesome to run 4 simultaneous NMF analyses and have this 12 core beast running at 1000% cpu.

There's a really simple/basic version of that in a few of the example patches, like @tremblap's pick finder one, which I think merges everything else into two components. @jamesbradbury also did something similar in the CV processing thread a while back.

There’s probably a point of diminishing returns if you plan on breaking things down to then build them back up, though the decomposition and reconstitution operate on different paradigms/algorithms.


We did indeed talk about this in the plenary. As @rodrigo.constanzo says, this is in effect what the guitar pick example in the help files does, but by centroid. I called it oversampled NMF, but I'm not sure if there is another official name somewhere. Over-splitting?

One thing you could also try is to analyse the time series (the activations of each basis) and cluster those, but that is harder. An LPT approach to that could definitely be experimented with, since it tries to capture some indications about the time series…
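As a purely speculative sketch, the same dataset/k-means pipeline from the MFCC idea above could be pointed at the activations instead of the audio; ~act and ~ds are placeholder names.

(
// Hypothetical variant: describe each component by summary statistics of its
// activation curve. ~act is the activations buffer written by FluidBufNMF
// (one channel per component); ~ds is the FluidDataSet from the earlier sketch.
{
    ~act.numChannels.do { |c|
        var stats = Buffer.new(s), flat = Buffer.new(s);
        FluidBufStats.processBlocking(s, ~act, startChan: c, numChans: 1, stats: stats);
        FluidBufFlatten.processBlocking(s, stats, destination: flat);
        ~ds.addPoint("act-" ++ c, flat);
        s.sync;
    };
}.fork;
)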

It is also quasi-exponential growth, so you would have to wait a lot. A. Lot.

Try novelty: large kernel, high threshold, and adapt the patch that finds x number of slices. That should give interesting feedback to bounce against, more than true edit points, but I love this kind of back and forth with ‘artificial intelligence’ (or ‘imperfect algorithms with a pretence of agency’).
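Something like this, as an untested starting point (parameter values are guesses; see the FluidBufNoveltySlice help file):

(
// Coarse slice points from novelty: a large kernel smooths the novelty curve,
// a high threshold keeps only strong changes. ~src is a placeholder source buffer;
// ~idx will hold the slice positions in samples.
~idx = Buffer.new(s);
FluidBufNoveltySlice.process(s, ~src, indices: ~idx,
    kernelSize: 31,
    threshold: 0.8,
    action: { ~idx.loadToFloatArray(action: { |a| ("slice points:" + a).postln }) }
);
)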