Here’s the clustering idea. The basic idea is to take many NMF components, and then try to group them via K-means using some criteria or other. Here I’ve played with
- using the windowed maxima of the activations to try and group components with similar temporal profiles
- using the spectral profile from the NMF bases
- combining the two
You can compare and play with the number of clusters below:
//NMF Clustering Experiments for separation
(
// Load the source recording we want to decompose
~reef = "/Users/owen/Downloads/190712-naturalReef_crackle.wav";
~source = Buffer.read(s, ~reef, action: { "Audio Loaded".postln });
)
// 1 Use FluidBufNNDSVD to seed NMF with lots of components, which we'll try and cluster later
(
// Allocate destination buffers, then seed with NNDSVD: ask for up to 300
// components, retaining enough to cover 95% of the variance
~bases = Buffer.new;
~activations = Buffer.new;
FluidBufNNDSVD.process(s, ~source, ~bases, ~activations,
	maxComponents: 300,
	coverage: 0.95,
	action: { "Seeding Done".postln }
);
)
// 2 Develop the seeds with NMF
//We don't resynthesise yet, because it'll take ages and we don't need or want ~290 channels of audio
//How many components?
(
// Run NMF warm-started from the NNDSVD seeds (basesMode / actMode = 1),
// using however many components the seeding step actually produced
~nComponents = ~bases.numChannels;
FluidBufNMF.process(s, ~source,
	bases: ~bases, basesMode: 1,
	activations: ~activations, actMode: 1,
	components: ~nComponents,
	iterations: 10,
	action: { "NMF Done".postln }
);
)
//3 Make some datasets to cluster on
//3a Temporal: We'll try and group components together by their activations. At ~20k points, a brute force pair-wise comparison would be slow, but quite possibly useless because for these sorts of over-decomposed NMFs the activations may well be correlated on a longer time-scale but have lots of 'holes' from frame to frame
//Let's make a 'feature' by using the normalised peak amplitude across windows of 100 frames: hopefully this will give us a basis to let K-means make sensible distinctions
(
~activationFeature = Buffer.new;
~tmpPoint = Buffer.new;
~temporal = FluidDataSet(s, \temporal);
// Windowed loudness + peak of every activation channel: 100-frame windows
// with no overlap, K-weighting and true-peak both disabled
FluidBufLoudness.process(s, ~activations,
	features: ~activationFeature,
	kWeighting: 0, truePeak: 0,
	windowSize: 100, hopSize: 100,
	action: { "Peaks Found".postln }
);
)
//This gives us 2 * nComponents channels, because FluidBufLoudness returns both a mean(ish) and peak. We'll throw each peak channel into a dataset
//Doing like this, with no syncs and lots of buffers is much (much) faster than syncing each call, but it does make the
//order of insertion into the dataset non deterministic (which doesn't matter here)
(
// Copy each component's peak channel (the odd-numbered ones) into a scratch
// buffer, normalise it, and add it as a point. A shared countdown tells us
// when the last async compose has landed so we can standardize exactly once.
~standardizer = FluidStandardize(s);
~temporal.clear;
~tmpPoints = (~activationFeature.numChannels / 2).asInteger.collect{ Buffer.new };
"Making points...".postln;
~counter = ~tmpPoints.size;
~tmpPoints.do{ |buf, idx|
	FluidBufCompose.process(
		server: s,
		source: ~activationFeature,
		startChan: (2 * idx) + 1,
		numChans: 1,
		destination: buf,
		action: {
			buf.normalize;
			~temporal.addPoint(idx, buf);
			buf.free;
			~counter = ~counter - 1;
			if(~counter == 0){
				~standardizer.fitTransform(~temporal, ~temporal, {
					"Done making temporal dataset".postln;
					~temporal.print;
					~standardizer.free;
				});
			};
		}
	);
};
)
// 3b: Spectral
(
// Dataset for the spectral profiles: one point per NMF basis
~spectral = FluidDataSet(s,\spectral);
)
(
// One point per basis: copy each basis channel into its own scratch buffer
// and add it to the dataset; standardize once the countdown reaches zero.
~standardizer = FluidStandardize(s);
~spectral.clear;
~tmpPoints = ~bases.numChannels.collect{ Buffer.new };
~counter = ~tmpPoints.size;
~tmpPoints.do{ |buf, idx|
	FluidBufCompose.process(
		server: s,
		source: ~bases,
		startChan: idx,
		numChans: 1,
		destination: buf,
		action: {
			~spectral.addPoint(idx, buf);
			buf.free;
			~counter = ~counter - 1;
			if(~counter == 0){
				~standardizer.fitTransform(~spectral, ~spectral, {
					"Done making spectral dataset".postln;
					~spectral.print;
					~standardizer.free;
				});
			};
		}
	);
}
)
//3c: Spectro-temporal
(
// Dataset to hold the concatenated spectral + temporal features, plus a
// query object to perform the join
~spectrotemporal = FluidDataSet(s,\spectrotemporal);
~joiner = FluidDataSetQuery(s);
)
(
~spectrotemporal.clear;
// Select every spectral dimension (~bases.numFrames of them, i.e. one per
// FFT bin) as the columns to carry through the join...
~joiner.addRange(
start: 0,
count: ~bases.numFrames,
action: {
// ...then append each point's temporal features to its spectral features,
// writing the concatenated points into ~spectrotemporal
~joiner.transformJoin(
source1DataSet: ~spectral,
source2DataSet:~temporal,
destDataSet:~spectrotemporal,
action:
{
"Done making spectrotemporal dataset".postln;
~spectrotemporal.print;
}
);
}
);
)
//4 clustering
// Run k Means to find a given number of clusters, and then group together the bases and activations from NMF based
// on the cluster assignments. Finally, run NMF again for a single iteration on the grouped buffers to resynthesise a k-channel
// decomposition of the sound
//We want to do this for temporal, spectral and spectrotemporal things, to compare, and to experiment a bit with k, so here's a
//function to swallow the boilerplate:
(
// Cluster the points in `data` into k groups with k-means, sum the NMF bases
// and activations within each cluster, then run a single NMF iteration with
// those grouped (fixed) bases/activations to resynthesise a k-channel
// decomposition of ~source into `rendered`.
//   k        - number of clusters (and so, output channels)
//   data     - FluidDataSet to cluster on (temporal / spectral / spectrotemporal)
//   rendered - Buffer that receives the k-channel resynthesis
~renderClusters = {|k,data,rendered|
	Routine({
		// BUG FIX: numClusters was hard-coded to 4, silently ignoring k
		var clustering = FluidKMeans(s, numClusters: k);
		var labels = FluidLabelSet(s, \labels);
		var activationsGrouped = Buffer.new;
		var basesGrouped = Buffer.new;
		s.sync;
		"Clustering...".postln;
		clustering.fitPredict(data, labels, {|x| x.postln;});
		s.sync;
		labels.dump({|d|
			// renamed from `data`, which shadowed the function argument
			var assignments = d["data"];
			assignments.postln;
			// Mix each component's basis and activation into the channel
			// belonging to its assigned cluster
			assignments.keysValuesDo{|row, cluster|
				FluidBufCompose.process(s, ~bases, startChan: row.asInteger, numChans: 1, destination: basesGrouped, destStartChan: cluster[0].asInteger, destGain: 1);
				FluidBufCompose.process(s, ~activations, startChan: row.asInteger, numChans: 1, destination: activationsGrouped, destStartChan: cluster[0].asInteger, destGain: 1);
			};
		});
		s.sync;
		"Resynthesising".postln;
		// One iteration with fixed bases/activations, just to render audio
		FluidBufNMF.process(s, ~source, resynth: rendered, bases: basesGrouped, basesMode: 1, activations: activationsGrouped, actMode: 1, components: k, iterations: 1, action: {
			clustering.free;
			labels.free;
			activationsGrouped.free;
			basesGrouped.free;
			"Rendered audio".postln;
		});
	}).play;
};
)
//Render temporal clusters (4-way)
~resynthesis_temporal = Buffer.new;
~renderClusters.value(4,~temporal,~resynthesis_temporal);
//Render spectral clusters (4-way)
~resynthesis_spectral = Buffer.new;
~renderClusters.value(4,~spectral,~resynthesis_spectral);
//Render spectrotemporal clusters (4-way)
~resynthesis_spectrotemporal = Buffer.new;
~renderClusters.value(4,~spectrotemporal,~resynthesis_spectrotemporal);
//We'll reuse this to listen to results: returns a synth function that plays
//one selectable channel of a multichannel buffer, duplicated to stereo
~synthfn = {|buf| buf.postln; {|chan = 0| Select.ar(chan,PlayBuf.ar(buf.numChannels,buf)).dup}};
//pick a decomposition (evaluate one line at a time)
~synth = ~synthfn.value(~resynthesis_temporal).play;
~synth = ~synthfn.value(~resynthesis_spectral).play;
~synth = ~synthfn.value(~resynthesis_spectrotemporal).play;
//channel surf: audition each cluster's channel in turn
~synth.set(\chan,0);
~synth.set(\chan,1);
~synth.set(\chan,2);
~synth.set(\chan,3);
~synth.free;