Hybrid ("vertical") layered resynthesis

I was thinking about this again yesterday and I want to revisit this idea with my batch analysis stuff.

I’m thinking of keeping the final component count low due to memory limits and the sheer number of combinations/permutations possible, but I wanted to bump this to get some thoughts on how best to do this.

Even though getting an estimate of the number of NMF components via fluid.bufnndsvd~ would be great, having a low component count that is consistent across all the files would make the plumbing and such more workable in the patch.

//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

So, to reiterate the initial idea: take each individual sample and split it into four layers. At the moment I’ve made two via HPSS and two via NMF. This could be more or fewer layers and/or different algorithms, but I’ve had good results with these.

For now I’m combining all of those into a single 4-channel buffer with the same name and duration as the original, since that makes other things easier as well (playback/etc…).
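For what it’s worth, here’s the same pipeline sketched outside of Max, as a sanity check. This is a hedged Python sketch (librosa for the HPSS and a 2-component NMF, soundfile for the 4-channel write); the filenames are placeholders, and in the patch this is of course the FluCoMa buffer objects instead:

```python
import numpy as np
import librosa
import soundfile as sf

# hypothetical input file, mono for simplicity
y, sr = librosa.load("sample1.wav", sr=None, mono=True)

# layers A/B: harmonic and percussive parts via HPSS
y_harm, y_perc = librosa.effects.hpss(y)

# layers C/D: resynthesis of a 2-component NMF on the magnitude STFT
S = librosa.stft(y)
mag, phase = np.abs(S), np.exp(1j * np.angle(S))
comps, acts = librosa.decompose.decompose(mag, n_components=2, sort=True)
y_nmf = [librosa.istft(np.outer(comps[:, k], acts[k]) * phase, length=len(y))
         for k in range(2)]

# one 4-channel file, same name/duration as the source
layers = np.stack([y_harm, y_perc, y_nmf[0], y_nmf[1]], axis=1)
sf.write("sample1_layers.wav", layers, sr)
```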

This much is working so far.

Before I got sidetracked by how quickly I’ll run out of RAM (thanks for the example patches @tutschku!), I was wondering how best to go about analyzing how layers might combine so the results are queryable.

Thinking about it this morning, I realized I could just fluid.bufcompose~ every possible combination of layers to create accurate analyses of how they would sound together.

So if I have three samples (sample1, sample2, sample3) and I run them through the decomposition process, I’d then have twelve samples (sample1A, sample1B, sample1C, sample1D, sample2A, sample2B, etc…).

The brute-force thing to do would be to analyze every combination of those (sample1A + sample2B, sample1A + sample3D, etc…). Since the layers all start together, order doesn’t matter, so it’s really combinations rather than permutations. I could do that for every combination except the ones that just reconstruct an original sample (so no sample1A + sample1B, since that would be the HPSS sum, and likewise no sample1C + sample1D for the NMF sum).
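Enumerating those is easy enough with itertools. A small sketch with the exclusion rule, using the hypothetical sampleN + A–D naming from above:

```python
from itertools import combinations

samples = ["sample1", "sample2", "sample3"]
layers = [s + l for s in samples for l in "ABCD"]

# pairs that just reconstruct an original: A+B (HPSS sum), C+D (NMF sum)
forbidden = {frozenset((s + "A", s + "B")) for s in samples} | \
            {frozenset((s + "C", s + "D")) for s in samples}

def valid(combo):
    members = set(combo)
    return not any(pair <= members for pair in forbidden)

mixes = [c for k in (2, 3, 4) for c in combinations(layers, k) if valid(c)]
print(len(mixes))  # 460 of the 781 possible 2-4 layer mixes survive
```

Note that valid() here throws out any mix that contains a reconstruction pair anywhere inside it, not just the exact two-layer sums; loosen it if that reading is stricter than intended.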

Off the top of my head I don’t know how to calculate the combinatorics here, but I know enough to know that that will become a really really really big analysis set. (ca. 1700 samples, 4 layers each, every possible combination up to 4 simultaneous sounds).
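For the record, the counting itself is a one-liner with math.comb, and it confirms the fear: with 1700 × 4 = 6800 layers, every combination of up to 4 simultaneous layers comes to roughly 9 × 10^13 (the per-sample reconstruction exclusions shave off well under 1% of that):

```python
from math import comb

n_layers = 1700 * 4                            # 6800 decomposed layers
total = sum(comb(n_layers, k) for k in range(1, 5))
print(f"{total:,}")                            # 89,062,885,197,300 mixes of 1-4 layers
```

At even one millisecond of analysis per mix, that’s on the order of 2,800 years of compute, so true brute force is completely off the table.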

AND

That only takes into consideration starting all the files at the same time, so anything where sample1A starts at 0 and sample3C starts at 120ms is out of the question. (And offsets make the space explode even faster: quantizing onsets to just a 10ms grid over a one-second window would multiply every pair by roughly another hundred variants.)

//////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

So, other than spitballing verbosely about the complexity of all this, I wanted to see if there’s some better way of pre-analyzing which combinations of sounds might work here, and how.

Or whether there’s a way of predicting how multiple layers of sound might fuse together… without having to actually fuse and (re)analyze them (i.e. working it out from the loads of descriptors/statistics I already have).
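One possibly useful shortcut on that last point: if you’re willing to assume the layers are roughly uncorrelated (so their powers add and phase interference is ignored), some descriptors of a mix can be estimated straight from the per-layer analyses without ever rendering it. Energy just sums, and the spectral centroid of the mix comes out as the energy-weighted mean of the layer centroids. A toy numpy sketch with made-up frame data (not FluCoMa output) to show the idea:

```python
import numpy as np

rng = np.random.default_rng(0)

# made-up per-layer magnitude spectrograms: (n_layers, n_bins, n_frames)
mags = rng.random((2, 513, 100))
freqs = np.linspace(0, 22050, 513)          # bin centre frequencies

# assumption: uncorrelated layers, so power adds across layers
mix_pow = (mags ** 2).sum(axis=0)

# mix centroid per frame, estimated without rendering the mix
cent_est = (freqs[:, None] * mix_pow).sum(axis=0) / mix_pow.sum(axis=0)

# same estimate from per-layer stats only: energy-weighted mean of layer centroids
layer_pow = mags ** 2
layer_energy = layer_pow.sum(axis=1)                               # (n_layers, n_frames)
layer_cent = (freqs[:, None] * layer_pow).sum(axis=1) / layer_energy
cent_from_stats = (layer_cent * layer_energy).sum(axis=0) / layer_energy.sum(axis=0)

print(np.allclose(cent_est, cent_from_stats))   # True: identical estimates
```

Anything phase-sensitive (correlated layers, transients lining up) will drift from this estimate, but it might be good enough for a first-pass pruning of the combination space before doing real fluid.bufcompose~ + analysis on the survivors.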