FluidDataSet .fromBuffer ... append?

Working in SuperCollider, I have folders of audio files. I generate MFCCs for the files in each folder into buffers, so I have lots of buffers representing descriptors. That works fine. However, when I want to place all of these buffers into a single dataset, things get tricky.

It seems FluidDataSet.fromBuffer clears the dataset before adding the points from the buffer. Is there a way to append additional data to the dataset (using .fromBuffer) instead? It seems the only way to append is with addPoint, but that requires a really convoluted setup to make it work … and it is slow.

Hi John!

You should check out the .merge method for FluidDataSet. You could just .fromBuffer on all the different buffers and then merge them into one FluidDataSet.
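Something like this minimal sketch (names like ~target, ~scratch, ~featureBuf, and ~uniqueLabelSet are placeholders; I'm also assuming the identifiers in each label set are unique across buffers, otherwise merge will overwrite colliding points):

(
// hypothetical sketch: append one feature buffer's points into an existing dataset
~target = FluidDataSet(s);
~scratch = FluidDataSet(s);
Routine({
	// load one buffer's frames into the scratch dataset...
	~scratch.fromBuffer(~featureBuf, 0, ~uniqueLabelSet);
	// ...then append its points to the running target dataset
	~target.merge(~scratch);
}).play;
)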

You could also consider “merging” the data at the buffer stage using something like FluidBufCompose and then once it’s all in one buffer use .fromBuffer.
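A rough sketch of that buffer-stage approach (assuming ~featureBufs is an array of per-file feature buffers and ~bigBuf is a destination buffer; the names are placeholders):

(
// hypothetical sketch: copy each feature buffer end-to-end into one big buffer
Routine({
	var offset = 0;
	~featureBufs.do({ arg buf;
		FluidBufCompose.process(s, buf, destination: ~bigBuf, destStartFrame: offset).wait;
		offset = offset + buf.numFrames;
	});
	// now a single .fromBuffer call gets everything into the dataset
	~dataset.fromBuffer(~bigBuf);
}).play;
)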

Let me know if either of those ideas help!

Cheers,

Ted

Ah! Thank you Ted!

I tried merge earlier, but due to my FluidLabelSet having labels that are not unique for each point, the resulting dataset was somehow smaller than it should be. If I make each label in the FluidLabelSet unique, then I get the correct number of points in the set.

I must be misunderstanding something here. I think for my current purpose, the labels are not very important, but I would still like to understand what is happening.

Here is some code that might shed light on what I am talking about.

Routine({

	~allFeatureBuffers.keysValuesDo({
		arg label, arrayOfBuffers; 

		arrayOfBuffers.do({
			arg buffer, bufIdx;
			var localLabelSet;

			buffer.postln;

			// make one label for each datapoint
			localLabelSet = FluidLabelSet(s);
			buffer.numFrames.do({
				arg count;
				// a unique identifier for this datapoint
				var identifier = label + bufIdx + count;
				// identifier.postln;
				localLabelSet.addLabel(identifier, label); // IF LABEL IS UNIQUE THIS WORKS.
			});

			localLabelSet.print;

			~dataset2.fromBuffer(buffer,0,  localLabelSet);
			~dataset.merge(~dataset2, 0);
			// localLabelSet.free;
		});
	});
	~dataset.deletePoint(\dummy);
}).play;

Hi @jthompson ,

Can you give me a sense of what you’re up to with this code? How many buffers? What features are you analysing? What are the keys in the dictionary (the ‘label’)? Generally, what is the task you’re setting out to do?

Thanks,

T

If I'm reading this right, your goal is to collect all of your per-file MFCC analyses and put them into a single dataset? In the end, a single identifier (a key, if you will) would be attached to each file’s MFCC analysis. If that sounds right, let me know so I can help further along with Ted :ear:

The idea here is to take a set of violin sounds, divided into folders (col legno bounces, col legno arco, pizz behind bridge, etc.) and use them as the source of granular clouds. I would like to use live violin input to determine which sources are chosen for the grain playback. My thought was that I could create a timbre space of the analyzed sounds and then measure the distance of descriptors of the live input sound to the pre-analyzed points in the timbre space.

Rather than always remaining close, I would like to determine if sounds should be near, somewhat near, or far from the source sound.

The method I embarked upon was to:

  1. Record the sounds of the violin
  2. Separate the recorded sounds based on transients (I did this in a DAW)
  3. Put those separated sounds into folders appropriately named
  4. Create a Dictionary with keys representing the folder name (ColLegnoBounces).
  5. Load each audio file as a Buffer into an array associated with the key (ColLegnoBounces). There are around 800 audio files.
  6. Create a second Dictionary with keys representing the folder name (ColLegnoBounces).
  7. Analyze each audio buffer (using MFCC with a window size of 2048) and store the feature buffer in the feature dictionary, which has a similar structure to the audio buffer dictionary. Each buffer is 13 channels (corresponding to 13 coefs … starting at 1). Each buffer has a different number of frames (the files are not the same length).
  8. I then want to put each of these buffers into a dataset. Each should have a label indicating what it is (Col Legno Bounces, for example). Each identifier is unique and contains the Name of the Folder + Buffer Index + counter. The number of points in the dataset should be around 10,800.
  9. I want to fit a KDTree against that data set and then be able to find nearest neighbors to a live sound input. When that nearest neighbor is found, it determines the audio buffer that is used for playback.

Here is the SC code that does the above (minus the KDTree). A link to the audio I am using is in the SC file.

Edit: the right file is here:

https://drive.google.com/file/d/1DTMy3oIrnC-tEhcD3KE0Yy5oLLap0MJP/view?usp=sharing

Lots to chew on there. I’m a little bogged down with a few other things right now but I should be able to help out in tandem with @tedmoore soonish.

:slight_smile:


Is this the right file? I see the file path/buffer collecting, but I don’t see any Fluid* objects in here. Beautiful code though!

Cheers,

T

What in the world, lol. Sorry, let me try that again.
https://drive.google.com/file/d/1DTMy3oIrnC-tEhcD3KE0Yy5oLLap0MJP/view?usp=sharing


Let me know if this is what you’re going for. See the code below (sorry, I Ted-ified a few spots!). I’m guessing that you want each of the frames in all of these buffers to end up as a datapoint in your FluidDataSet?

Notice that when you add a label to the localLabelSet ( localLabelSet.addLabel(count, identifier); ) the identifier in the label set is actually the frame index and the “label” is the identifier that you want to end up in the FluidDataSet.

When FluidDataSet.fromBuffer uses the FluidLabelSet, it treats the FluidLabelSet identifiers as frame indices in the buffer and assigns the corresponding label from the FluidLabelSet as the identifier of that frame's datapoint when it gets added.

Let me know if you have any questions!

Cheers,

T

(
~labels = [
	"BodyHits",
	"BowingOnBody",
	"ColLegnoArco",
	"ColLegnoBounces",
	"ColLegnoOnG",
	"FrictionSeparated",
	"HarmOnBridgeBb",
	"PizzBehindBridge",
	"PizzGliss",
	"PizzOnA",
	"PizzOnD",
	"PizzOnG",
	"PizzOnE",
	"ScratchySeparated",
	"TailPieceLowerSeparated",
	"TailPieceOnStringsSeparated",
	"TrillD-Eb-Separated", // there's a typo in the folder, the sub folder only has one "l"
	"TrillA-Bb-Separated",
	"TrillG-Ab-Separated",
	"TrillE-F-Separated"];

// store buffers in a dictionary
~allAudioBuffers = Dictionary.new();

// we may need a few more buffers
s.options.numBuffers_(16384);
s.boot;
)

// 1. Run this code and choose "ViolinSoundsSegmented" folder
(
~loadAudioToBuffers = {
	FileDialog.new( okFunc: { arg pathArray;
		var path;
		path = pathArray.at(0);
		path.postln;
		Buffer.freeAll;
		~labels.do({
			arg label;

			// read audio file into audio buffer
			~allAudioBuffers.put(
				label, (path +/+ label +/+ "*").pathMatch.collect({
					|file| Buffer.read(s, file);
			}));
		});
	},fileMode: 0);
};

~loadAudioToBuffers.value;
)


// 2. Analyze sounds and store the descriptors in a dataset
(
// create empty buffers for each audio file
~allFeatureBuffers = Dictionary.new();
~allAudioBuffers.keysValuesDo({
	arg label, arrayOfBuffers;
	// create an empty array for this label
	~allFeatureBuffers.put(label, []);
	// append empty buffers
	arrayOfBuffers.do({
		arg buf, idx;
		~allFeatureBuffers[label] = ~allFeatureBuffers[label].add(Buffer.new(s));
	});
});
)

// peek at feature buffers for PizzOnD
~allAudioBuffers["PizzOnD"];
~allFeatureBuffers["PizzOnD"];

(
// fill feature buffers with MFCCs!
Routine({
	~allAudioBuffers.keysValuesDo({
		arg label, arrayOfBuffers;

		arrayOfBuffers.do({
			arg audioBuf, bufIdx;
			var featureBuf;
			t = Main.elapsedTime;
			featureBuf = ~allFeatureBuffers[label].at(bufIdx);
			FluidBufMFCC.process(
				s,
				audioBuf,
				windowSize: 2048,
				hopSize: 2048,
				numCoeffs: 13,
				startCoeff: 1,
				features: featureBuf,
			).wait;
			(Main.elapsedTime - t).postln;
		});
	})
}).play;
)
~allAudioBuffers.size
~allFeatureBuffers["PizzOnD"].size


// look at the feature buffers now that they have been filled
~allFeatureBuffers["PizzOnD"][0].numFrames // these are the timesteps
~allFeatureBuffers["PizzOnD"][0].numChannels // these are the features (each channel is one MFCC), e.g. numCoeffs

// what should we have here:
(
~allFeatureBuffers.keysValuesDo{
	arg k, v;
	var nframes = v.collect{arg buf; buf.numFrames}.sum;
	// k.postln;
	// v.postln;
	"%\t\t\t\t\ttotal n frames: %".format(k,nframes).postln;
};

"\ntotal frames that should end up in the dataset: ".post;
~allFeatureBuffers.collect{
	arg bufarr;
	bufarr.collect{arg buf; buf.numFrames}.sum;
}.sum.postln;
)

// fill up the dataset!
(
~dummyBuffer = Buffer.alloc(s, 13, 1);
~dummyBuffer.loadCollection(Array.fill(13, { rrand(0,1) }));
~dataset = FluidDataSet.new(s);
~dataset.addPoint(\dummy, ~dummyBuffer);
~dataset2 = FluidDataSet.new(s);

Routine({

	~allFeatureBuffers.keysValuesDo({
		arg label, arrayOfBuffers;

		arrayOfBuffers.do({
			arg buffer, bufIdx;
			var localLabelSet;

			// buffer.postln;

			// make one label for each datapoint
			localLabelSet = FluidLabelSet(s);
			buffer.numFrames.do({
				arg count;
				// a unique identifier for this datapoint
				var identifier = "%_buf-%_frame-%".format(label,bufIdx,count);
				// "identifier: %".format(identifier).postln;
				localLabelSet.addLabel(count, identifier); // if label is unique the dataset ends up being complete
			});

			// localLabelSet.print;

			~dataset2.fromBuffer(buffer,0,localLabelSet);
			~dataset.merge(~dataset2, 0);
			// localLabelSet.free;
		});
	});
	~dataset.deletePoint(\dummy);
	"data set made".postln;
	~dataset.print;
}).play;
)


Alright, this works! I mostly understand now. There is still confusion in my mind about labels, but I think it will become clearer to me as the work progresses.

Now I have a FluidKDTree fit to the dataset. Is there a computationally optimal way to search for the nearestNeighbor of a FluidMFCC of an incoming sound? Here is how I am doing it currently, but I wonder if I should do more inside a synth (~tree.kr somehow).

// Fit the tree;
(
Routine({
	~tree = FluidKDTree.new(s, 20, 0);
	s.sync;
	~tree.fit(~dataset);
}).play;
);

// Test it out
(
Routine({

	SynthDef("mfcc", {arg bufnum, bus;
		var mfcc, fft, input;
		input = PlayBuf.ar(1, bufnum, doneAction: 2);
		mfcc = FluidMFCC.kr(input, 13, startCoeff: 1, windowSize: 2048);
		Out.kr(bus, mfcc);
		Out.ar(0, input);
	}).add;

	~bus = Bus.control(s, 13);
	s.sync;

	100.do({
		Synth("mfcc", [ \bufnum, ~allAudioBuffers["TailPieceLowerSeparated"].at(2), \bus, ~bus ]);
		5.0.wait;
	});

}).play;

Routine({
	{
		var arrayOfMFCCValues, tmpBuffer, key, index;
		arrayOfMFCCValues = ~bus.getnSynchronous(13);

		tmpBuffer = Buffer.loadCollection(s, arrayOfMFCCValues, 1, {
			~tree.kNearest(tmpBuffer, { arg a; if(a.sum !== 0, {~nearest = a}) });
		});

		key = ~nearest.at(0).asString.split($ ).at(0).asString;
		index = ~nearest.at(0).asString.split($ ).at(0).asInteger;
		// ~nearest.postln;
		key.postln;
		0.1.wait;
	}.loop
}).play;
)

Hi @jthompson,

This is the general shape of what you’re probably looking for. I commented out the .kNearest version and put in the .kr version. This keeps everything on the server.

Note that when it is on the server, the FluidKDTree can’t return the identifier (because it’s a Symbol). Instead, the KDTree is given a lookupDataset that will most likely have different data than the original dataset but all the same identifiers. This could be something like info on how to play back the slice of audio associated with the MFCC analysis. That way the info can be retrieved and used all on the server.

For example, in the synth below, ~tree reads real-time MFCC analysis out of ibuf and finds the nearest neighbour to the MFCC analyses in ~dataset. Let’s pretend that the identifier of that nearest neighbour is “slice-7” (which is what you’d get returned by using .kNearest in the language). ~tree.kr then goes to the lookupDataset (which is passed as an argument to the .kr) and gets whatever data in there has the identifier “slice-7”, then writes that into obuf. This can be a different number of dimensions than the MFCC analyses. Below they’re both 13 because there isn’t a different FluidDataSet to use for lookup yet.

I hope that’s all helpful! Let me know how it goes!

// Fit the tree;
(
Routine({
	~tree = FluidKDTree.new(s, 20, 0);
	s.sync;
	~tree.fit(~dataset);
}).play;
);

// Test it out
(
Routine({

	SynthDef("mfcc", {arg bufnum, bus;
		var mfcc, fft, input;
		var ibuf = LocalBuf(13);
		var obuf = LocalBuf(13);
		input = PlayBuf.ar(1, bufnum, doneAction: 2);

		// maxNumCoeffs determines how many channels the krStream output of FluidMFCC will be
		// if numCoeffs is < maxNumCoeffs, it will fill the rest of the channels with zeros
		mfcc = FluidMFCC.kr(input, 13, startCoeff: 1, windowSize: 2048,maxNumCoeffs:13 );

		// write these 13 mfccs into "ibuf" (for input buffer) so that it can be read by the ~tree
		FluidKrToBuf.kr(mfcc,ibuf);

		// each time the ~tree receives an impulse it will find the nearest neighbor to the point that
		// is in "ibuf". it will then go to the "lookupDataSet", find a point with the same identifier and write
		// the datapoint into "obuf" (for output buffer). here it's just looking at dataset, so what we're getting
		// in obuf is the mfccs of the nearest neighbor (not too useful maybe), but if the "lookupDataSet" were
		// instead something like ~dataset_playback_info that had information like which buffer to play out of,
		// where in the buffer to start playback, and how long to play for, then this synth could playback the
		// nearest neighbour, all on the server.
		~tree.kr(Impulse.kr(10),ibuf,obuf,1,~dataset);

		// read out of obuf to a krStream so it could be used to drive playback or anything else.
		FluidBufToKr.kr(obuf).poll;

		// Out.kr(bus, mfcc);
		Out.ar(0, input);
	}).add;

	// ~bus = Bus.control(s, 13);
	s.sync;
	//
	// 100.do({
	Synth("mfcc", [ \bufnum, ~allAudioBuffers["TailPieceLowerSeparated"].at(2), \bus, ~bus ]);
	// 5.0.wait;
	// });
	//
	// }).play;

	// Routine ({
	// 	{
	// 		var arrayOfMFCCValues, tmpBuffer, key, index;
	// 		arrayOfMFCCValues = ~bus.getnSynchronous(13);
	//
	// 		tmpBuffer = Buffer.loadCollection(s, arrayOfMFCCValues, 1, {
	// 			~tree.kNearest(tmpBuffer, { arg a; if(a.sum !== 0, {~nearest = a}) });
	// 		});
	//
	// 		key = ~nearest.at(0).asString.split($ ).at(0).asString;
	// 		index = ~nearest.at(0).asString.split($ ).at(0).asInteger;
	// 		// ~nearest.postln;
	// 		key.postln;
	// 		0.1.wait;
	// 	}.loop
}).play;
)