FluidSpectralShape resynthesis

jan · April 17, 2022, 12:13pm

Hello all,

I’m aiming to resynthesize sounds on the basis of FluidSpectralShape. I was wondering if there are examples of how the spectral descriptors can be formulated as (re)synthesis parameters?

Thanks for any suggestions!

Jan

jamesbradbury · April 18, 2022, 6:16am

I think you will be hard pressed to re-synthesise something that accurately portrays the input sound from just those seven descriptor values, as there is quite a severe loss of the “resolution” that describes the spectrum. That said, there are many techniques where you “rebuild” a sound from another (concatenative synthesis). You might also construct something more straightforward, such as using the values to drive a filter (see the help file for SpectralShape).

jan · April 18, 2022, 7:55am

hi @jamesbradbury,

Yes, I’m aware that there’s surely a lot lost in translation; the aim is not an accurate portrayal but a controllable synth based on these descriptors. I was specifically wondering whether/how skewness, kurtosis and flatness could be reconstructed, and whether these necessarily imply an FFT or similar.
greetings,

jan

tedmoore · April 20, 2022, 3:02pm

Hi @jan,

Spectral flatness often correlates with how “noisy” a spectrum is: the flatter the spectrum, the noisier the sound.
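To make the flatness/noisiness link concrete, here is a minimal sketch of the usual definition (geometric mean of the magnitudes divided by their arithmetic mean, expressed in dB). It is written in Python only to show the arithmetic. The function name, the `floor` guard and the 20·log10 dB convention are assumptions for illustration and may not match FluidSpectralShape's internals exactly.

```python
import math

def spectral_flatness_db(magnitudes, floor=1e-12):
    """Geometric mean over arithmetic mean of the magnitudes, in dB (assumed convention)."""
    mags = [max(m, floor) for m in magnitudes]            # guard against log(0)
    geometric_mean = math.exp(sum(math.log(m) for m in mags) / len(mags))
    arithmetic_mean = sum(mags) / len(mags)
    return 20.0 * math.log10(geometric_mean / arithmetic_mean)

flat_spectrum = [1.0] * 64        # equal energy in every bin (noise-like)
peaky_spectrum = [1e-6] * 64      # nearly empty spectrum ...
peaky_spectrum[10] = 1.0          # ... with one strong partial (tone-like)

print(spectral_flatness_db(flat_spectrum))    # 0 dB: as flat as it gets
print(spectral_flatness_db(peaky_spectrum))   # strongly negative
```

Read the other way round, a value near 0 dB suggests mixing in more noise, and a very negative value suggests a more pitched source.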

As for skewness and kurtosis, these get more abstract in terms of how they might be perceived and “resynthesized”. Also, they might start to compete with other parameters that you’re using to drive filters or whatever (as suggested by @jamesbradbury), such as the spectral centroid and spread, producing strange results. Maybe just choosing a few of the analysis parameters to drive a filter and/or some other analyses to drive some other synth params will be more straightforward.
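For orientation, centroid, spread, skewness and kurtosis can all be read as weighted moments of the magnitude spectrum, which is one reason they overlap when driven independently. The sketch below uses the textbook statistical definitions in Python; the function name is hypothetical, and the exact normalisation and units used by FluidSpectralShape are not confirmed here.

```python
import math

def spectral_moments(freqs, mags):
    """Return (centroid, spread, skewness, kurtosis) of a magnitude spectrum."""
    total = sum(mags)
    weights = [m / total for m in mags]               # treat the spectrum as a distribution
    centroid = sum(f * w for f, w in zip(freqs, weights))
    variance = sum((f - centroid) ** 2 * w for f, w in zip(freqs, weights))
    spread = math.sqrt(variance)
    skewness = sum((f - centroid) ** 3 * w for f, w in zip(freqs, weights)) / spread ** 3
    kurtosis = sum((f - centroid) ** 4 * w for f, w in zip(freqs, weights)) / spread ** 4
    return centroid, spread, skewness, kurtosis

freqs = [100.0, 200.0, 300.0]
print(spectral_moments(freqs, [1.0, 2.0, 1.0]))   # symmetric around 200 Hz: skewness is 0
print(spectral_moments(freqs, [4.0, 2.0, 1.0]))   # energy piled up low: positive skewness
```

Skewness then says which side of the centroid the long tail is on, and kurtosis how peaked the distribution is around it. Both are defined relative to centroid and spread, so changing those two shifts the meaning of the others.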

Another thought is you could use these analyses as the input to a FluidMLPRegressor and train the neural network to predict synthesis parameters for whatever synthesis algorithm you want to use. In order to do the training you would need to pair a bunch of synthesis parameters (probably ones that are chosen by and important to you) with the SpectralShape analysis that those synthesis parameters produce. Then train the neural network using the analyses as the input and the params as the output.
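The pairing step can be sketched without any FluCoMa objects. Everything below is a hypothetical stand-in: `toy_synth_spectrum` replaces a real synth, `analyse` replaces the SpectralShape analysis with just two descriptors, and a nearest-neighbour lookup takes the place of the trained neural network. It is Python for illustration only; the point is the direction of the mapping, with analyses as input and synth parameters as output.

```python
import math

N_BINS = 32

def toy_synth_spectrum(brightness, noisiness):
    """Hypothetical synth stand-in: two parameters (0..1) -> magnitude spectrum."""
    spectrum = []
    for k in range(N_BINS):
        # Partials on every 4th bin, rolling off faster when brightness is low.
        partial = math.exp(-6.0 * (1.0 - brightness) * k / N_BINS) if k % 4 == 0 else 0.0
        spectrum.append((1.0 - noisiness) * partial + noisiness * 0.2)
    return spectrum

def analyse(spectrum):
    """Two descriptors: centroid (in bins) and flatness (geometric / arithmetic mean)."""
    total = sum(spectrum)
    centroid = sum(k * m for k, m in enumerate(spectrum)) / total
    geometric = math.exp(sum(math.log(m) for m in spectrum) / len(spectrum))
    return [centroid, geometric / (total / len(spectrum))]

# Pair every parameter setting with the analysis it produces.
grid = [0.1, 0.3, 0.5, 0.7, 0.9]
params = [[b, n] for b in grid for n in grid]                  # network outputs
analyses = [analyse(toy_synth_spectrum(b, n)) for b, n in params]  # network inputs

def predict_params(target_analysis):
    """Nearest-neighbour lookup, only a stand-in for the trained regressor."""
    def distance(a):
        # Descriptors are left unscaled here for brevity.
        return sum((x - y) ** 2 for x, y in zip(a, target_analysis))
    best = min(range(len(analyses)), key=lambda i: distance(analyses[i]))
    return params[best]

print(predict_params(analyse(toy_synth_spectrum(0.7, 0.3))))
```

A regressor would interpolate between the training pairs rather than snapping to the closest one, which is what makes it useful for analyses of sounds the synth never produced.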

Example file attached. This one uses MFCCs instead, but I’d be curious to see how it works with SpectralShape! It might be better! Also this is using just FM Synthesis, so putting a more interesting synth algorithm in there would be cool as well!

Let me know if you give this a shot, I’d be happy to help!

regressor_descriptors_step_by_step.scd (3.8 KB)

jan · April 20, 2022, 8:00pm

Hi @tedmoore,

I see that the spectral descriptor resynthesis entails some challenges. I thought I’d ask around, as centroid was also the only obvious parameter to me, e.g. in the setting of an additive model.
But what you suggest is very enticing. I hadn’t looked into the regressors at all, but am really looking forward to doing so!
One question regarding the 4th block of code: does the model only get trained continuously in realtime, or also in non-realtime? And would the advantage of realtime training be the ability to tweak the parameters to basically point the algorithm in a specific direction (a kind of seeding, so to speak)?

thanks a lot!

jan

tedmoore · April 20, 2022, 9:02pm

Good question. Basically, you should train the neural network until you observe the “error” decrease a bit and then flatten out, or stop decreasing very much. Then run the line of code: ~continuous_train = false; to stop training.

The reason this recursive,

if(~continuous_train, {
	~train.();
});

thing is used is so that you can watch in the post window how the error is changing. You’re right that during the training you can change things like the learning rate and/or other params to see how it affects the error. For a bit more info on training and the error and such you can see this video using the regressor in Max, or the classifier in SC.
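The "train a bit, look at the error, train again" pattern can be shown with a toy model. This is a Python sketch, not the code from the attached file. It fits a single weight by plain gradient descent instead of using FluidMLPRegressor, and it clears the flag automatically once the error stops improving, whereas in the patch you would set ~continuous_train to false by hand. All names and default values are assumptions.

```python
def train_until_flat(xs, ys, learn_rate=0.05, epochs_per_call=20,
                     tolerance=1e-6, max_calls=200):
    """Fit y = w * x, re-running short training bursts while the error keeps dropping."""
    w = 0.0
    previous_error = float("inf")
    continuous_train = True              # plays the role of ~continuous_train
    history = []
    while continuous_train and len(history) < max_calls:
        for _ in range(epochs_per_call):                 # one training burst
            gradient = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
            w -= learn_rate * gradient
        error = sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)
        history.append(error)
        print(f"burst {len(history)}: error {error:.10f}")   # watch the error change
        if previous_error - error < tolerance:           # error has flattened out
            continuous_train = False
        previous_error = error
    return w, history

xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [3.0 * x for x in xs]               # the "right answer" is w = 3
weight, errors = train_until_flat(xs, ys)
```

The printed errors drop quickly and then barely move, which is the signal to stop. Changing `learn_rate` between runs shows how it affects that curve.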

To start using the trained neural network, run the block below (sorry, I notice they’re both called “4.”).

I hope that’s helpful. Let me know what other questions arise!

spluta · April 21, 2022, 9:32pm

I made this piece pre-FluCoMa, at about 2 P.F. as we say around here:

http://sampluta.com/compositionMatrixForGeorgeLewis.html

The score has all the code. If you just search the score for “centroid” you can see how I’m using the centroid to control parameters and the video lays out some of the control parameters. Maybe not what you are looking for, but…it works kind of.

Sam

tremblap · April 22, 2022, 9:11am

indeed… but in all cases, don’t forget to use log scaling (@unit and @power both to 1) to have something much more perceptually relevant!
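A small illustration of why log scaling matters for control mappings: equal steps in Hz are not equal steps in pitch. The conversions below are the standard Hz-to-MIDI and amplitude-to-dB formulas, written in Python. They only illustrate the idea and are not a description of what the @unit and @power attributes do internally. The function names and the `floor` guard are assumptions.

```python
import math

def hz_to_midi(freq_hz, reference_hz=440.0):
    """Standard conversion: 440 Hz maps to MIDI note 69, 12 steps per octave."""
    return 69.0 + 12.0 * math.log2(freq_hz / reference_hz)

def amplitude_to_db(amplitude, floor=1e-10):
    """Linear amplitude to decibels, with a floor to avoid log(0)."""
    return 20.0 * math.log10(max(amplitude, floor))

# Two centroid jumps of the same size in Hz ...
low_jump = hz_to_midi(440.0) - hz_to_midi(220.0)      # a full octave
high_jump = hz_to_midi(4620.0) - hz_to_midi(4400.0)   # under one semitone
print(low_jump, high_jump)
```

Mapping the MIDI-scaled centroid (rather than the Hz value) onto a filter cutoff or oscillator pitch keeps the control response even across the range.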

jan · April 22, 2022, 3:26pm

Thank you @tedmoore @spluta @tremblap for the input and helpful advice!