Sorry, this is not really a code share since the code is a bit noodly right now, but here’s a project I did last month using UMAP and a Traveling Salesperson Problem (TSP) solver to organize audio slices into 1D paths and then reconstruct them through time.
- I started with a bunch of audio and video recordings of some music that I composed.
- I sliced all that audio into 100 ms slices.
- I analyzed those slices and extracted audio descriptors (714 columns by the time the 1st derivative and all statistics were done!).
- I reduced that to 11 principal components (and was able to retain 99% of the variance!).
- I then reduced this further to 1D in two ways: the first was a UMAP down to 1D, and the second was a TSP solver (in Python, using python-tsp from PyPI, FWIW).
- Then with each of these 1D sequences of slices, I reconstructed the audio-video file.
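For anyone wanting to try something similar, here is a rough sketch of the reduce-then-order idea. It is not the code used for the piece. It assumes NumPy, standardizes the descriptors before PCA (an assumption, the post doesn't say whether that was done), and swaps in a simple greedy nearest-neighbor walk as a stand-in for both the UMAP and python-tsp steps. The function names and the synthetic data are made up for illustration.

```python
import numpy as np


def pca_reduce(features, variance_to_keep=0.99):
    """Project rows onto the fewest principal components whose
    cumulative explained variance reaches `variance_to_keep`."""
    # Standardize columns so descriptors with large units do not dominate
    # (assumption: the original workflow may have scaled differently).
    centered = features - features.mean(axis=0)
    std = centered.std(axis=0)
    std[std == 0] = 1.0
    scaled = centered / std
    # The rows of vt are the principal axes of the standardized data.
    _, s, vt = np.linalg.svd(scaled, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    n_components = int(np.searchsorted(np.cumsum(explained), variance_to_keep)) + 1
    n_components = min(n_components, len(s))
    return scaled @ vt[:n_components].T


def greedy_path(points, start=0):
    """Order points into a 1D path by always stepping to the nearest
    unvisited point. A crude heuristic, not an optimal TSP solution."""
    n = len(points)
    unvisited = np.ones(n, dtype=bool)
    unvisited[start] = False
    order = [start]
    for _ in range(n - 1):
        dists = np.linalg.norm(points - points[order[-1]], axis=1)
        dists[~unvisited] = np.inf  # never revisit a slice
        nxt = int(np.argmin(dists))
        order.append(nxt)
        unvisited[nxt] = False
    return order


# Synthetic stand-in for the descriptor table: 200 slices, 40 columns,
# generated from 3 hidden factors plus a little noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 40))
descriptors = latent @ mixing + 0.01 * rng.normal(size=(200, 40))

reduced = pca_reduce(descriptors, variance_to_keep=0.99)
order = greedy_path(reduced)
print(reduced.shape, order[:10])
```

The `order` list is the playback sequence: slice `order[0]` first, then `order[1]`, and so on. A real TSP solver or a 1D UMAP embedding (sorted by its single coordinate) would replace `greedy_path` and would generally give different, better-behaved paths.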
The tsp-solver solution is quite jittery, frequently jumping between different audio files and not staying in one place for too long:
One can kind of see that in this plot. Each dot is a 100 ms slice of audio. The X position shows the slice’s time position in the reconstructed sequence (the YouTube link). The Y position shows which source file it came from and where in that file (the bottom of the “file’s bar” is the beginning of the file, the top is the end).
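The mapping from slices to dots described here can be sketched in a few lines. This is only an illustration under assumptions: each slice is represented as a hypothetical `(file_index, slice_index)` pair, and each file's bar is given a height of 0.8 so there is a gap between bars (the real plot may use different spacing).

```python
def plot_coordinates(sequence, file_lengths, bar_height=0.8):
    """Turn a reconstructed sequence into scatter-plot coordinates.

    sequence:     list of (file_index, slice_index) in playback order
    file_lengths: number of slices in each source file
    Returns (xs, ys): x is the position in the new sequence, y places
    the dot inside its file's bar (bottom = start of file, top = end).
    """
    xs, ys = [], []
    for position, (file_index, slice_index) in enumerate(sequence):
        last = max(file_lengths[file_index] - 1, 1)  # avoid dividing by zero
        fraction = slice_index / last
        xs.append(position)
        ys.append(file_index + bar_height * fraction)
    return xs, ys


# Three slices drawn from two files of 10 slices each.
xs, ys = plot_coordinates([(0, 0), (1, 9), (0, 5)], file_lengths=[10, 10])
print(xs, ys)
```

Passing `xs` and `ys` to any scatter plot function gives the kind of picture described above.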
The UMAP 1D solution is smoother, showing crossfades of density as it transitions from one source file to the next:
That can also be seen here:
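Beyond eyeballing the plots, one rough way to put a number on "jittery" versus "smooth" is to count how often consecutive slices come from different source files. This metric was not part of the project; it is just a sketch, and the two example sequences are invented.

```python
def file_switch_rate(file_ids):
    """Fraction of consecutive slice pairs that come from different files.
    1.0 means every step jumps to another file, 0.0 means it never does."""
    if len(file_ids) < 2:
        return 0.0
    switches = sum(1 for a, b in zip(file_ids, file_ids[1:]) if a != b)
    return switches / (len(file_ids) - 1)


# Invented examples: source file index of each slice, in playback order.
jittery = [0, 1, 0, 2, 1, 0, 2, 1]
smooth = [0, 0, 0, 1, 0, 1, 1, 1]

jittery_rate = file_switch_rate(jittery)
smooth_rate = file_switch_rate(smooth)
print(jittery_rate, smooth_rate)
```

A path that stays in one file for long stretches and only gradually crossfades into the next should score much lower than one that hops around on nearly every slice.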
While creating the final piece, I selected some passages from each of these but didn’t end up using all the material, just the passages I liked! You can watch the whole thing here (password: framerate). The piece isn’t supposed to be public yet… so it’s behind a password…
I hope it’s interesting, maybe creates some ideas or inspiration!