Linking spectral descriptors to slices

Hello all,

I’d like to be able to play back slices according to a spectral descriptor (e.g. rising centroid) in SC.
How could FluidBufSpectralShape compute spectral descriptors for onsets/slices, coupling the respective spectral info to each slice?
Would the startFrame and numFrames arguments of FluidBufSpectralShape be the most straightforward way to access the onsets?
Or is there maybe a wholly different approach with fewer frame calculations?

Thanks,

Jan


You might find this example I wrote useful. The idea is to show how different batch-processing workflows can be achieved in Pd/Max/SC. Just select the SuperCollider tab and download the script. Let me know if I can explain anything further, but that should be a decent starting point for seeing how to iterate over and analyse slices.

https://learn.flucoma.org/overviews/batch-processing#analysis-of-segments

It is a very useful example, thank you!
So the way to do this coupling of slices/spectral info is to store it in a DataSet. If I understand correctly, in this case we have the two identifiers mean and centroid and the respective data in Hz.
So for playing back all the slices in some special spectral order (e.g. ascending centroid) I’ll need a synth to somehow access this data (ideally as rearrangeable arrays?).
Also, it’s not entirely clear to me where the respective slices’ frame info remains in the DataSet in the example?

Once you have the slices and the spectral information you are free to store it however you want, really :slight_smile: In this case I chose a DataSet to show off as much FluCoMa as possible, but as you are likely aware, you have access to all sorts of goodies in SuperCollider like Dict and Array where you can keep this information for later (such as for handing over to a Synth).

Something like this. The way I do it in Max is I sort the list and grab the ordinal. I’ll try to whip up an example when I’m back at my computer.

The dataset identifiers are numbers which tell you the index of the slice that each point corresponds to. For example, the first slice’s centroid is stored against 1 in the dataset. Does that make sense? I can re-explain or draw it out, which is sometimes easier for getting our heads around this sort of abstract data-storage thinking.
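To make that mapping concrete, here is a minimal hypothetical sketch (the identifier string and the one-value point buffer are just illustrative, not taken from the linked example) of storing one slice's mean centroid against its slice index in a FluidDataSet:

```supercollider
// hypothetical sketch: one analysis value stored against a slice index.
// FluidDataSet identifiers are strings, so the slice index is written as "1".
~ds = FluidDataSet(s);
~point = Buffer.loadCollection(s, [1234.5]); // a 1-dimensional point: this slice's mean centroid (made-up value)
~ds.addPoint("1", ~point); // the first slice's centroid, stored against identifier 1
~ds.print; // inspect the dataset in the post window
```

Looking a point up later is then just a matter of asking the dataset for the identifier that matches the slice you want.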

Ah, I see! :slight_smile: It’s surely good to learn about DataSet, as long as I also understand how the info then becomes available for use within synths. Otherwise arrays are surely more familiar!
An example would be really helpful; I’m still only beginning to get familiar with the specifics of these abstractions. Thanks so much!


Indeed! In this case, the DataSet sounds like it’s not really what you want. If you were, say, going to do some machine learning with FluCoMa objects, then DataSet would be your friend :slight_smile:

I’m really no SuperCollider expert (that’s @tedmoore) but you can very simply store the descriptor values for each slice in an array, and then get the order of those values to use as a lookup table for scrubbing through them from lowest to highest spectral centroid value.

(
// Return an array of indices that would sort the collection into order. You should get [0, 2, 1, 3, 4]. You can use these values to look up the slices and get an ordered set of times that correspond to the centroid values.
[5, 200, 100, 300, 500].order;

// If you want to reverse sort you can pass a custom function :)
[5, 200, 100, 300, 500].order({ |a,b| a > b });
)

(
// Imagine these are spectral centroid values
~ordinal = [3000, 500, 200, 9000, 100].order;
~ordinal.postln;
// You could get the slice index (4) with the lowest spectral centroid (100)
~ordinal[0];
)

Thank you James! For me, “very simply store the descriptor values for each slice in an array” is still not so obvious though ;)
I will need two arrays for descriptor data + slice data, and an array of indices (on a “higher level”) that links and orders those two “subarrays”. The example you linked to does this with a DataSet. Would you suggest, e.g., trying to rewrite that example, substituting the DataSet with arrays (or first buffers), to get those arrays to be used by the synth?
Sorry for the lack of fluency in grasping this!

Sorry, the “higher level array” for indexing is surely the ordinal!

Don’t apologise - this is part of the fun learning and sharing process :slight_smile: . I just spent this morning with my head in my hands trying to get my head around some Indian classical music theory so we’re all learning something somewhere.

If you have your slices like this in an array or buffer:

0 100 200 300 400

and we also have some spectral analysis for each consecutive pair of slices:

4000 353 30 9000

We can, in our minds, attach each pair to those values.

0 100 = 4000 <-- slice 0
100 200 = 353 <-- slice 1
200 300 = 30 <-- slice 2
300 400 = 9000 <-- slice 3

and we know that the first value (4000) belongs to the first slice (slice 0, if we count from 0), and we know the start and end boundaries (because all the info is stored in the buffers somewhere).

If we sorted this collection of data by the descriptor values we would get something like this:

200 300 = 30 <-- slice 2
100 200 = 353 <-- slice 1
0 100 = 4000 <-- slice 0
300 400 = 9000 <-- slice 3

Now our collection is sorted by the descriptor values: they go from 30 to 9000, and we remember the slice that each descriptor value belongs to. But hold on, remembering which slice belongs to each descriptor value happens in our heads, not the computer, right? To keep that relationship in the code we can infer the slices from the position of elements in an array, as well as from the ordinal (which in the above example is derived from the .order method, callable on any SequenceableCollection).

So in this case our ordinal of the above collection would be [2, 1, 0, 3] because if we had to sort this:

4000 353 30 9000

we would rearrange those elements into an array by taking the 2nd, 1st, 0th and 3rd element in that order. You can think of it kind of like a lookup table.

So our ordinal ([2, 1, 0, 3]) actually tells us what order to grab the slices, based on the descriptor values.

That is the order we would have to put things in for those descriptor values. If we reference back to our slices:

[0, 100, 200, 300, 400]

The slice with the lowest spectral centroid would be slice 2 (the first element in our ordinal list), i.e. 200 and 300 for the start and end times of that slice.
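Putting the pieces above together, a minimal sketch of that lookup in SuperCollider might look like this (the variable names are just for illustration):

```supercollider
(
~slices = [0, 100, 200, 300, 400];    // slice boundaries, as above
~descriptors = [4000, 353, 30, 9000]; // one centroid value per slice
~ordinal = ~descriptors.order;        // [2, 1, 0, 3]: slice indices from lowest to highest centroid

// walk the slices in ascending-centroid order
~ordinal.do{ |sliceIndex|
	var start = ~slices[sliceIndex];
	var end = ~slices[sliceIndex + 1];
	"slice %: % to % (centroid %)".format(sliceIndex, start, end, ~descriptors[sliceIndex]).postln;
};
)
```

Each pass through the loop gives you the start and end frames you would hand to a playback synth.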

Does that make more sense? Maybe I can ping the wise @tedmoore for some SuperCollider magics in which there is some playback based on descriptor values?


Hi @jan,

Let me know if this gets at what you’re looking for. And let me know what questions you have about the code. I’ll be happy to answer them!

Cheers,

Ted

// load some audio (replace with a path to the audio you're interested in)
~audio = Buffer.readChannel(s,"/Users/macprocomputer/Desktop/_flucoma/code/flucoma-sc/release-packaging/AudioFiles/Nicol-LoopE-M.wav",channels:[0]);

// create a buffer for holding slice points
~slicepoints_buffer = Buffer(s);

(
// slice according to spectral onsets; tweak the 'threshold' until you're getting about how many slice points you think you should be seeing (a lower threshold will allow more slice points)
FluidBufOnsetSlice.processBlocking(s,~audio,indices:~slicepoints_buffer,metric:9,threshold:0.4,action:{
	"done slicing!".postln;
	"found % slice points".format(~slicepoints_buffer.numFrames).postln;
});
)

(
// get the slice points out of the buffer and put them in an array
// notice that i'm putting "0" on the beginning of the array and the number of frames in the buffer at the end so that *all* of the audio in my buffer will be accounted for: the audio before the first slice point, and the audio after the last slice point;
~slicepoints_buffer.loadToFloatArray(action:{
	arg array;
	~slicepoints_array = [0] ++ array ++ [~audio.numFrames];
});
)

// inspect the slicepoints:
~slicepoints_array.dopostln;

// analysis:

(
// first we'll make a buffer for storing the spectral analyses in:
~features_buf = Buffer(s);
// and a buffer for storing the statistical analyses in:
~stats_buf = Buffer(s);
// and a list for collecting the average centroids:
~centroids = List.new;
)

(
fork{
	~slicepoints_array.doAdjacentPairs{
		arg start_frame, end_frame, i;
		var num_frames = end_frame - start_frame;
		FluidBufSpectralShape.processBlocking(s,~audio,start_frame,num_frames,features:~features_buf);

		// notice that we're only analysing the first channel; this is because that's where the
		// spectral centroid is and that's all we're interested in!
		FluidBufStats.processBlocking(s,~features_buf,startChan:0,numChans:1,stats:~stats_buf);

		// now we'll get index 0 out of the stats buf because that's where the average spectral centroid will be
		// if we wanted to get the median instead, we would look in index 5 (see the FluidBufStats help file for more
		// options)
		~stats_buf.loadToFloatArray(0,1,{
			arg mean_centroid;
			~centroids.add([i,mean_centroid[0]]); // here we'll save the mean centroid *along with* its index so
			// that we can use it later to look up where its start and end points are.
		});
	};
	s.sync;
	"done with analysis".postln;
}
)

// inspect centroids list:
~centroids.dopostln;

// sorting:
~centroids.sort({arg a, b; a[1] < b[1]}); // sorting by the value in index 1 because that's the centroid

// inspect again now that it is sorted:
~centroids.dopostln;

(
// playback in a sorted order
fork{
	~centroids.do{
		arg array;
		var index = array[0];
		var centroid = array[1];
		var start_frame = ~slicepoints_array[index];
		var end_frame = ~slicepoints_array[index + 1];
		var num_frames = end_frame - start_frame;
		var dur_seconds = num_frames / ~audio.sampleRate;

		"playing index: %\tcentroid: %".format(index,centroid).postln;

		{
			var sig = PlayBuf.ar(1,~audio,BufRateScale.ir(~audio),startPos:start_frame);
			var env = EnvGen.kr(Env([0,1,1,0],[0.03,dur_seconds-0.06,0.03]),doneAction:2);
			sig.dup * env;
		}.play;

		dur_seconds.wait;
		1.wait;
	};
};
)

Thank you both @jamesbradbury & @tedmoore for your concise explanations and examples! This really seems to be it, and I now better understand the logic of this procedure. I will take my time to delve into the example and adapt it to my own code to experiment with! This is gold for learning, and it’s really exciting to find a thread through the (sometimes overwhelming) maze of possibilities FluCoMa offers! I’ll get back if any questions come up that I can’t answer myself!

Do let us know if you end up feeling stuck, we really value that kind of interaction so that we can learn how to help others :slight_smile: Good luck!


Hi @tedmoore,
your example has been really helpful so far and I’ve been able to explore quite a bit with my own variations! :slight_smile:
Now I’m trying to go a step further and get the means of all the spectral shapes (not just centroid) computed statistically and sorted, but I’m unsure how to retrieve those from BufStats into arrays. One would increase numChans to 7 (all shapes), but I’m stuck regarding the indexing/frame count needed to retrieve the right statistics.
It would be this portion of the code:

(
fork{
	~slicepoints_array.doAdjacentPairs{
		arg start_frame, end_frame, i;
		var num_frames = end_frame - start_frame;
		FluidBufSpectralShape.processBlocking(s,~audio,start_frame,num_frames,features:~features_buf);

		// change to all shapes
		FluidBufStats.processBlocking(s,~features_buf,startChan:0,numChans:7,stats:~stats_buf);

		~stats_buf.loadToFloatArray(0,1,{
			arg mean_centroid;
			~centroids.add([i,mean_centroid[0]]); 
		});
	};
	s.sync;
	"done with analysis".postln;
}
)

// inspect centroids list:
~centroids.dopostln;

// sorting:
~centroids.sort({arg a, b; a[1] < b[1]});

Thanks for your help!


Hi @jan,

This is a common challenge to sort out.

I think this will do what you’re looking for. I’ve copied and pasted your code but just changed a few sections. You’re absolutely right that we want FluidBufStats to analyse all the channels of our ~features_buf, since those channels contain all the different analyses that FluidBufSpectralShape offers.

When we go to get those values out of the ~stats_buf with .loadToFloatArray, we can tell .loadToFloatArray to just give us everything from the buffer using the arguments 0 and -1 (these are also the defaults). 0 is where to start (the beginning) and -1 specifies to load the values all the way through to the end of the buffer.

Next, you’ll see in the code how I “clump” the big float array into sub-arrays that now contain the different statistical measures from FluidBufStats. To get a specific statistical analysis for a specific spectral analysis, one should index into this buffer first by statistical analysis, then by spectral analysis, as seen in the post window in my examples below the “table” of values.

Let me know if this is what you’re looking for and whatever questions you have!

(
fork{
	~slicepoints_array.doAdjacentPairs{
		arg start_frame, end_frame, i;
		var num_frames = end_frame - start_frame;
		FluidBufSpectralShape.processBlocking(s,~audio,start_frame,num_frames,features:~features_buf);

		// change to all shapes
		FluidBufStats.processBlocking(s,~features_buf,startChan:0,numChans:7,stats:~stats_buf);

		~stats_buf.loadToFloatArray(0,-1,{// 0 = start at frame 0, -1 = load all the values all the way to the end of the buffer;
			arg stats_values;
			/*
			loadToFloatArray just spills out all of the values from the buffer into one big one-dimensional array, so we need to "clump" it back together into its frames. we can do this by calling .clump(numChannels), which creates sub-arrays inside this big array that are numChannels long, essentially "clumping" together the values of each frame across all the channels
			*/
			stats_values = stats_values.clump(~stats_buf.numChannels);// we know it is 7 but this makes it a little more flexible in case something changes

			"\n\n--- slice: % ---\nthe 'columns' here are spectral descriptors:".format(i).postln;
			"\t\t\tcentroid\t\tspread\t\t\t\tskewness\t\tkurtosis\t\trolloff\t\t\tflatness\t\t\tcrest".postln;
			"mean:     %".format(stats_values[0]).postln;
			"std dev:  %".format(stats_values[1]).postln;
			"skewness: %".format(stats_values[2]).postln;
			"kurtosis: %".format(stats_values[3]).postln;
			"min:      %".format(stats_values[4]).postln;
			"median:   %".format(stats_values[5]).postln;
			"max:      %".format(stats_values[6]).postln;
			// stats_values.dopostln;

			// for example to get the mean spectral flatness do:
			"mean spectral flatness:   %".format(stats_values[0][5]).postln;
			// 0 gets you the mean values and 5 gets you to index 5 which is where spectral flatness is

			// for example to get the minimum spectral rolloff do:
			"min spectral rolloff:     %".format(stats_values[4][4]).postln;
			// 4 gets you the min values and 4 gets you to index 4 which is where spectral rolloff is

			// for example to get the median spectral centroid do:
			"median spectral centroid: %".format(stats_values[5][0]).postln;
			// 5 gets you the median values and 0 gets you to index 0 which is where spectral centroid is

			// from here you can pull out whatever values you might want for this particular slice and then sort them below!

			"-----------------".postln;
		});
	};
	s.sync;
	"\ndone with analysis".postln;
}
)

Thank you @tedmoore for this really wonderful explanation, it made it all much more accessible!
I do have a minor follow-up question. I added the following line

~allstats.add([i,stats_values[0,1,2,3,4,5,6]]);

to get a List as in the initial example, in which there is an index + the 7 spectral analyses, so as to allow for their sorting. I can’t really wrap my head around how to sort all of these features though?

~allstats.sort({arg a, b; a[?] < b[?]});

I think this should be the last step to reach the goal :)

Hi @jan,

Cool. I’m glad it’s working for you. Can you be more specific about what you want to sort? Which statistical analysis are you interested in using, and which spectral descriptor are you interested in sorting by? Even if you just suggest a hypothetical for now, that could help me make a demo!

Cheers,

T

Of course! My idea is to extend the initial centroid example to all spectral shapes, so that the list(s) order the spectral features from low to high, with the respective slice indices alongside. Ideally this would mean one list with all spectral features, which I can then translate into 7 individual arrays. (I assume the mean might be the most useful in most cases, but I guess this is flexible?) This way it would be possible to read the slices in the synth along an array ordered in a perceptually relevant fashion (theoretically at least). I hope this is clearer?

What I still cannot envision, though (only now am I in a position to even consider it), is how to combine different parameters: e.g. play the slice which has the lowest centroid and is above a specific threshold of spectral flatness, etc…


Hi @jan,

I think this is what you’re looking for. I’ve picked up from where the code above leaves off.

Let me know what questions you have and if there’s any more I can do to help!

Cheers,

Ted

(
fork{
	
	~analyses = {List.new} ! 7; // an array of 7 lists (one per spectral descriptor), so that we can add each slice's analysis values to these lists.
	
	~slicepoints_array.doAdjacentPairs{
		arg start_frame, end_frame, i;
		var num_frames = end_frame - start_frame;
		var stat_analysis = 0; // statistical descriptor to use, here 0 indicates the mean.
		FluidBufSpectralShape.processBlocking(s,~audio,start_frame,num_frames,features:~features_buf);
		
		// change to all shapes
		FluidBufStats.processBlocking(s,~features_buf,startChan:0,numChans:7,stats:~stats_buf);
		
		~stats_buf.loadToFloatArray(0,-1,{// 0 = start at frame 0, -1 = load all the values all the way to the end of the buffer;
			arg stats_values;
			/*
			loadToFloatArray just spills out all of the values from the buffer into one big one-dimensional array, so we need to "clump" it back together into its frames. we can do this by calling .clump(numChannels), which creates sub-arrays inside this big array that are numChannels long, essentially "clumping" together the values of each frame across all the channels
			*/
			
			stats_values = stats_values.clump(~stats_buf.numChannels);// we know it is 7 but this makes it a little more flexible in case something changes
			
			"\n\n--- slice: % ---\nthe 'columns' here are spectral descriptors:".format(i).postln;
			"\t\t\tcentroid\t\tspread\t\t\t\tskewness\t\tkurtosis\t\trolloff\t\t\tflatness\t\t\tcrest".postln;
			"mean:     %".format(stats_values[0]).postln;
			"std dev:  %".format(stats_values[1]).postln;
			"skewness: %".format(stats_values[2]).postln;
			"kurtosis: %".format(stats_values[3]).postln;
			"min:      %".format(stats_values[4]).postln;
			"median:   %".format(stats_values[5]).postln;
			"max:      %".format(stats_values[6]).postln;
			// stats_values.dopostln;
			
			"-----------------".postln;
			
			stats_values = stats_values.flop; // transpose the table of values so the rows become the columns and the columns become the rows.
			// now we have a new table:
			
			"the 'columns' here are statistical descriptors:".postln;
			"\t\t\tmean\t\t\tstd dev\t\t\t\tskewness\t\tkurtosis\t\tmin\t\t\t\tmedian\t\t\t\tmax".postln;
			"centroid: %".format(stats_values[0]).postln;
			"spread:   %".format(stats_values[1]).postln;
			"skewness: %".format(stats_values[2]).postln;
			"kurtosis: %".format(stats_values[3]).postln;
			"rolloff:  %".format(stats_values[4]).postln;
			"flatness: %".format(stats_values[5]).postln;
			"crest:    %".format(stats_values[6]).postln;
			
			"-----------------".postln;
			
			// now we can take just the mean of each of these spectral descriptors
			// if you want to sort by a different statistical analysis, change the param 'stat_analysis' above
			stats_values = stats_values.collect{
				arg spec_desc_stats;
				spec_desc_stats[stat_analysis];
			};
			
			"for each spectral descriptor, just the selected statistical value (%):".format(stat_analysis).postln;
			"centroid: %".format(stats_values[0]).postln;
			"spread:   %".format(stats_values[1]).postln;
			"skewness: %".format(stats_values[2]).postln;
			"kurtosis: %".format(stats_values[3]).postln;
			"rolloff:  %".format(stats_values[4]).postln;
			"flatness: %".format(stats_values[5]).postln;
			"crest:    %".format(stats_values[6]).postln;
			
			// add this value to the appropriate list in our analyses array (with the slice index! so we know which it is later):
			stats_values.do{
				arg val, stat_i;
				~analyses[stat_i].add([i,val]);
			};
			
		});
	};
	s.sync;
	"\ndone with analysis".postln;
	
	"\nfor each spectral descriptor, we have a list of each slice's statistical analysis value".postln;
	"\tthe arrays in this list are: [ slice_index , descriptor_value ]".postln;
	"centroid: %".format(~analyses[0]).postln;
	"spread:   %".format(~analyses[1]).postln;
	"skewness: %".format(~analyses[2]).postln;
	"kurtosis: %".format(~analyses[3]).postln;
	"rolloff:  %".format(~analyses[4]).postln;
	"flatness: %".format(~analyses[5]).postln;
	"crest:    %".format(~analyses[6]).postln;
}
)

(
// do the sorting:
~analyses.do{
	arg spec_desc_list;
	spec_desc_list.sort({arg a, b; a[1] < b[1]});
};

"\nSORTED:".postln;
"centroid: %".format(~analyses[0]).postln;
"spread:   %".format(~analyses[1]).postln;
"skewness: %".format(~analyses[2]).postln;
"kurtosis: %".format(~analyses[3]).postln;
"rolloff:  %".format(~analyses[4]).postln;
"flatness: %".format(~analyses[5]).postln;
"crest:    %".format(~analyses[6]).postln;
)

(
// do something with them in order:
"\nslice indices ordered by spectral centroid:".postln;
~analyses[0].do{arg arr; arr[0].postln;};

"\nslice indices ordered by spectral rolloff:".postln;
~analyses[4].do{arg arr; arr[0].postln;};

"\nslice indices ordered by spectral flatness:".postln;
~analyses[5].do{arg arr; arr[0].postln;};
)

Hi @tedmoore,
thanks again for this valuable help. I need to find a moment to experiment with it in my own context, but it looks very promising! :slight_smile:
A question that remains for me, though, is whether this approach using arrays hits its limitations when trying to combine multiple spectral criteria, e.g. find the slice with the lowest centroid and with spectral flatness above x, etc.? As the arrays are sorted “in parallel” according to single spectral shapes, wouldn’t multiple criteria presuppose another ordering entirely?
greetings,
jan

Hi @jan,

This could be modified to achieve what you describe. For example, if you’re looking to only consider slices with spectral flatness above a certain value, the first step would be to remove anything that doesn’t fit the criteria (in our case, you could use SuperCollider’s Collection method .select). Then sort according to the spectral centroid and take the lowest (see code below).

In both of these cases (using .select and .sort) you’ll first need to decide which statistical summary you’ll use. Remember that the “mean centroid” is not the average across all slices, but the average centroid within a single slice; in this way it is a summary of that slice (since each slice comprises many centroid values across time). Similarly, the min that comes out of BufStats is not the minimum across all the slices, but the minimum within a single slice.

Also, you should check out the object FluidDataSetQuery, which will allow you to do some similar queries with a FluidDataSet.
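For reference, a FluidDataSetQuery filter along those lines might look something like this sketch (assuming a hypothetical FluidDataSet ~ds whose 7 columns are the spectral shapes in the usual order, so that column 5 is flatness; untested, just to show the shape of the idea):

```supercollider
(
// hypothetical sketch: filter a dataset by flatness, keeping all 7 columns
~query = FluidDataSetQuery(s);
~filtered = FluidDataSet(s); // destination for the matching points
~query.filter(5, ">", -12);  // keep only points whose column 5 (flatness) is above -12
~query.addRange(0, 7);       // copy all 7 columns into the result
~query.transform(~ds, ~filtered);
~filtered.print;
)
```

You could then sort the surviving points by centroid on the language side, much as in the array-based code below.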

Let me know how this works for you!

Cheers,

T

// load some audio (replace with a path to the audio you're interested in)
~audio = Buffer.readChannel(s,"/Users/macprocomputer/dev/flucoma/flucoma-core-src/Resources/AudioFiles/Nicol-LoopE-M.wav",channels:[0]);

// create a buffer for holding slice points
~slicepoints_buffer = Buffer(s);

(
// slice according to spectral onsets; tweak the 'threshold' until you're getting about how many slice points you think you should be seeing (a lower threshold will allow more slice points)
FluidBufOnsetSlice.processBlocking(s,~audio,indices:~slicepoints_buffer,metric:9,threshold:0.4,action:{
	"done slicing!".postln;
	"found % slice points".format(~slicepoints_buffer.numFrames).postln;
});
)

(
// get the slice points out of the buffer and put them in an array
// notice that i'm putting "0" on the beginning of the array and the number of frames in the buffer at the end so that *all* of the audio in my buffer will be accounted for: the audio before the first slice point, and the audio after the last slice point;
~slicepoints_buffer.loadToFloatArray(action:{
	arg array;
	~slicepoints_array = [0] ++ array ++ [~audio.numFrames];
});
)

// inspect the slicepoints:
~slicepoints_array.dopostln;

// analysis:

(
// first we'll make a buffer for storing the spectral analyses in:
~features_buf = Buffer(s);
// and a buffer for storing the statistical analyses in:
~stats_buf = Buffer(s);

~slice_analyses = List.new;
)

(
fork{
	~slicepoints_array.doAdjacentPairs{
		arg start_frame, end_frame, i;
		var num_frames = end_frame - start_frame;
		FluidBufSpectralShape.processBlocking(s,~audio,start_frame,num_frames,features:~features_buf);

		FluidBufStats.processBlocking(s,~features_buf,stats:~stats_buf);

		~stats_buf.loadToFloatArray(0,7,{ // take just the first 7 values, as those are the means
			arg mean_spectral_analyses;
			~slice_analyses.add((
				slice_index:i,
				mean_centroid:mean_spectral_analyses[0],
				mean_spread:mean_spectral_analyses[1],
				mean_skewness:mean_spectral_analyses[2],
				mean_kurtosis:mean_spectral_analyses[3],
				mean_rolloff:mean_spectral_analyses[4],
				mean_flatness:mean_spectral_analyses[5],
				mean_crest:mean_spectral_analyses[6],
			));
		});
	};
	s.sync;
	"done with analysis".postln;
}
)

// inspect analyses list:
~slice_analyses.dopostln;
~slice_analyses.size;

// first select (keep) only slices where the spectral flatness is above -12:
~slice_analyses = ~slice_analyses.select({arg slice; slice.mean_flatness > -12});

// inspect analyses list:
~slice_analyses.dopostln;
~slice_analyses.size;

// then sort by spectral centroid:
~slice_analyses = ~slice_analyses.sort({arg a, b; a.mean_centroid < b.mean_centroid});

(
// post and see what the order is
~slice_analyses.do{
	arg slice;
	"slice: %\tmean centroid: %".format(slice.slice_index,slice.mean_centroid).postln;
}
)