Immediately going to apologise here as this isn’t toolbox related but perhaps interesting to those who are getting their hands especially dirty with the objects now. I might also be covering ground that the team have previously discussed, so apologies if that is the case.
Obviously part of this project is finding new ways to segment your sounds which has resulted in me following the infinite youtube/google hole (why does that sound dirty?) toward a process called Matrix Profiling. It seems relatively fresh and there is some good noob friendly papers to make it clear. In so far, I’ve been testing it with Python and R as there aren’t many simple C++ solutions out there. The basics of it are, as far as I can tell, that you take any time series and apply a sliding window across it, kinda like an FFT. It then does magical math and stats to calculate a Matrix Profile and a Matrix Index. Crudely put, these two outputs give you an indication of the normalised euclidian distance between a sequence in that window and its distance to a similar sequence elsewhere in the time series. (Note I have no formal training in maths/stats and you’re better off reading the article for a more correct and comprehensive understanding).
https://www.cs.ucr.edu/~eamonn/MatrixProfile.html <-- Good info hub but lots of info so quite overwhelming
https://www.cs.ucr.edu/~eamonn/Matrix_Profile_Tutorial_Part1.pdf <-- really good to get the gist of it.
https://www.cs.ucr.edu/~eamonn/Matrix_Profile_Tutorial_Part2.pdf <-- more
https://github.com/target/matrixprofile-ts <-- Python Code
https://franzbischoff.github.io/tsmp/ <-- R Code
I ran some experiments on
jongly.aif to see what kind of segmentation it produced. Disclaimer: these calculations took f!@#$%^ ages to do, and in some cases I only did the first 44100 or 88200 samples.
This first image is a graph depicting
jongly.aif with its Matrix Profile on top. What is interesting about this is where the lowest points in the matrix profile appear, which is quite close to reasonably expected amplitude onsets. Conversely, high points indicate ‘anomalous’ patterns that are harder to find in the time series, or don’t exist at all and these seem to correspond with the stuff in between onsets. This is of course, quite relative, for example at around 16000 we have a global low point which matches up with a big transient, but later on around 30000 we have relative low points.
To me, this provides a mechanism allowing segmentation of the matrix profile itself, returning a number of regions where motifs and discords can be separated from each other.
Once you have your matrix profile there are other things you can do to make meaning of the data. This next image shows a process of finding motifs, although this process is a bit more black boxed and I wasn’t able to pick apart the code to figure out how it was working. To me, it seems like the sliding window I described earlier was limiting how large the motifs could be which is somewhat anti-musical if you were trying to create chunks of useful sound. I think that the start point of each motif is also not compensated for in the display, or maybe it is as some seem bang on - I don’t know.
- 1 parameter
- ‘fast’ <- but maybe not as fast as what DSP needs.
- The Matrix Profile that you get out of the process can be massaged into a number of different analyses (Motifs, Discords, Sub-sequences)
- Works on vectors so potent for SIMD/Multi-threaded stuff. This is demonstrated in the R implementation.
- Its an anytime algorithm (depending on which algo you run) so you can interrupt it quite early for fun subversive results. On the flip-side the longer you run it the more it will converge on a result.
- To me it seems too slow to be useful on-line, but possibly usable off-line.
- The window size drastically changes your data and so you have to be quite fluent with the algorithm.
Anyway, this is just some side-fun I’ve been having and thought I would share it here as it seems aligned with the interests of others here as well as the existing toolkit.