I am trying to segment some pretty onset-less audio. Below is a lovely mel-spectrogram of the stuff. The audio in question is this file here. Beware, it's very loud, so turn your volume down if you want to listen. libLLVMAMDGPUDesc.a.wav.zip (229.2 KB)
To me, there is some quite interesting visual differentiation in the mel-spectrogram that correlates with the different motifs I hear in the audio file as a whole. The question now is: how can I get the computer to look for such changes and return a bunch of slice points?
My first inclination was spectral flux (fluid.bufonsetslice~ @function 2) to look for differences between successive frames, but it seems to be very hit and miss: it detects some of the minor changes but not the major ones.
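For reference, here is a minimal Python sketch of what I understand spectral flux to be: the summed positive change in magnitude between successive frames. This is only an assumption about the general technique, not necessarily what fluid.bufonsetslice~ computes internally, and the function name and input layout are my own.

```python
import numpy as np

def spectral_flux(magnitudes):
    """Half-wave rectified spectral flux.

    magnitudes: array of shape (n_frames, n_bins) with non-negative values,
    e.g. the magnitude of an STFT or a mel-spectrogram (assumed layout).
    Returns one value per frame; the first frame has no predecessor, so it is 0.
    """
    magnitudes = np.asarray(magnitudes, dtype=float)
    # Difference between each frame and the one before it
    diff = np.diff(magnitudes, axis=0)
    # Keep only increases in energy, discard decreases
    rectified = np.maximum(diff, 0.0)
    return np.concatenate([[0.0], rectified.sum(axis=1)])
```

Because it only compares neighbouring frames, a slow change spread over many frames produces small values at every step, which may be part of why the larger-scale changes get missed.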
I’ve been looking at things like self-similarity between frames, and constructing things like recurrence matrices in Python, which return data that looks something like this:
It's fairly clear from this that the data captures, to some degree, what we can see visually, and it might be interesting to segment this data to get some times back. At this point I am stabbing in the dark a little and playing with things that are beyond my technical understanding.
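One way I could imagine getting times back from such a matrix is a checkerboard-kernel novelty curve (the approach usually attributed to Foote), which slides a kernel along the diagonal and peaks where one self-similar block gives way to another. The sketch below is only that, a sketch: it uses NumPy alone, assumes a feature array of shape (n_frames, n_dims), and the function names, kernel width, and threshold are all hypothetical choices of mine.

```python
import numpy as np

def checkerboard_kernel(half_width):
    """Gaussian-tapered checkerboard kernel of size (2*half_width, 2*half_width)."""
    offsets = np.arange(-half_width, half_width) + 0.5
    taper = np.exp(-0.5 * (offsets / (0.5 * half_width)) ** 2)
    signs = np.sign(offsets)
    # +1 in the "same side" quadrants, -1 in the "cross" quadrants
    return np.outer(signs, signs) * np.outer(taper, taper)

def novelty_curve(features, half_width=8):
    """Slide the kernel along the diagonal of a cosine self-similarity matrix."""
    features = np.asarray(features, dtype=float)
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.maximum(norms, 1e-12)
    ssm = unit @ unit.T  # cosine similarity between every pair of frames
    kernel = checkerboard_kernel(half_width)
    # Repeat the edge frames so the kernel fits at the start and end
    padded = np.pad(ssm, half_width, mode="edge")
    n_frames = ssm.shape[0]
    novelty = np.empty(n_frames)
    for i in range(n_frames):
        window = padded[i:i + 2 * half_width, i:i + 2 * half_width]
        novelty[i] = np.sum(window * kernel)
    return novelty

def pick_boundaries(novelty, threshold_ratio=0.5):
    """Return frame indices of local maxima above a fraction of the global maximum."""
    threshold = threshold_ratio * novelty.max()
    peaks = []
    for i in range(1, len(novelty) - 1):
        is_peak = novelty[i] > novelty[i - 1] and novelty[i] >= novelty[i + 1]
        if is_peak and novelty[i] >= threshold:
            peaks.append(i)
    return peaks
```

A boundary at frame index i means the change sits between frame i - 1 and frame i; multiplying the index by the hop size and dividing by the sample rate would give a time in seconds. A larger half_width should favour the bigger structural changes over the small ones, at the cost of less precise timing.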
As such, this brings me back to the fluid tools, which I think have more flexibility than the Python analysis and are more transparent to me in what they are doing. Any ideas @weefuzzy @groma @tremblap on what tools I can investigate further for pre/post-processing such wonky audio to achieve some sort of segmentation of motifs?