Audio rate MLPRegressor

Amazing!

So what kind of input/output will your Max thing do? Buffers?

I was wondering, after posting this, if audio writing/reading can happen at audio rate, with the right threading.

Working on a project with a buddy for the upcoming NIME about using a genetic algorithm to crawl a synth space, then have a “timbre analogy” created from realtime input (basically a slightly more lofi but more generalizable version of this approach using DDSP).

As part of that he’s built a thing where he can train up a regressor in Python and then dump out a .json that can be loaded into the FluCoMa stuff, with the structure/activations all matching.

Once that’s up and running will share that as well.
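
A minimal sketch of what that training side can look like, assuming scikit-learn's MLPRegressor (the data, layer sizes, and activation here are placeholders, and they have to mirror the settings of the fluid.mlpregressor that will load the result):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Toy data standing in for descriptor -> synth-parameter pairs.
X = np.random.rand(500, 13)   # e.g. 13 MFCCs in
y = np.random.rand(500, 10)   # e.g. 10 synth parameters out

# hidden_layer_sizes and activation must match the FluCoMa object
# (e.g. @hiddenlayers 6 4 and a matching activation).
mlp = MLPRegressor(hidden_layer_sizes=(6, 4), activation="relu", max_iter=2000)
mlp.fit(X, y)

# mlp.coefs_ and mlp.intercepts_ hold the per-layer weights and biases
# that get written out to the JSON (see the dump sketch further down).
print([w.shape for w in mlp.coefs_])
```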


Audio or a list. It is designed for DSP processing so the MSP object will output multichannel audio. The Max object will output a list. In pd you snake the audio together then unsnake the output. Hopefully that will be as easy in Max.

I like a triggerable audio object though. I hadn’t thought of that. That would still send audio as its output, but it will only run inference when triggered. So you could get a changing output faster than the control rate, but not have to do thousands of calculations every sample. Seems reasonable. Is that what you are imagining?
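
Conceptually something like this (a Python sketch just to illustrate the behaviour, not the plugin code): the output stays at audio rate, inference only runs on the sample where an impulse arrives, and the last result is held in between.

```python
import numpy as np

def process_block(in_block, trig_block, model, state):
    """in_block: (n_inputs, block_size), trig_block: (block_size,),
    model: any callable mapping an input vector to an output vector,
    state: dict holding the last inference result between blocks."""
    out = np.empty((state["last_out"].shape[0], in_block.shape[1]))
    for i in range(in_block.shape[1]):
        if trig_block[i] > 0.0:                         # impulse this sample
            state["last_out"] = np.asarray(model(in_block[:, i]))
        out[:, i] = state["last_out"]                   # hold the value otherwise
    return out

# state = {"last_out": np.zeros(n_outputs)} before the first block
```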

Sam


Yeah exactly.

The use case I’m working on just needs to happen once per onset, so it would be ideal to have the whole signal path stay in the signal domain for speed/timing.

Though full-blown audio rate would be great.

MC audio is a decent way to handle that I think (in Max).

Independent of the technical side of things here, I'm curious what kind of musical results and/or funky algorithms you've been able to cook up with this, if you have a version working in your setup already.

I can picture some crazy feedback processes where there are NNs in the signal path.

The shape isn't unusual, but you'd need to do some wrangling to write/read the weights and biases into a JSON file of the expected format, which shouldn't be too hard to do in Python.

It's an sklearn-to-flucoma-mlp script, but it might be a useful reference for formatting the data for the FluCoMa JSON. Also, if a pytorch-to-flucoma script gets made, it might be nice to fold this into some kind of "Python Scripts for FluCoMa" package. I have some Python NMF scripts I use when teaching that are a good extension of the FluCoMa NMF stuff.
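
For reference, the dump itself is only a few lines. The key names in this sketch ("layers", "weights", "biases", "activation") are illustrative placeholders; the real structure should be copied from a JSON that fluid.mlpregressor has written itself:

```python
import json

def dump_mlp(mlp, path, activation="relu"):
    # NOTE: placeholder schema -- match the keys to a JSON written by
    # fluid.mlpregressor before trying to load this on the FluCoMa side.
    layers = []
    for weights, biases in zip(mlp.coefs_, mlp.intercepts_):
        layers.append({
            "weights": weights.tolist(),   # rows = inputs, cols = outputs
            "biases": biases.tolist(),
            "activation": activation,
        })
    with open(path, "w") as f:
        json.dump({"layers": layers}, f, indent=2)

# dump_mlp(mlp, "mlp_for_flucoma.json")
```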


@spluta Your project looks very interesting! Do you already have any resources on how to train models for RTNeural for timbre transfer (similar to RAVE)? I dug through the GitHub repo, but could only find examples for control-rate training.

Thanks for your interest in my project. I just got most builds made for SC, Pd, and Max and uploaded to the repo. The only one that I haven't cracked yet is the Max Windows build, which gets a strange error deep in the RTNeural source code. Anyhow, the builds are here under the Releases. (These include Rod's requested feature of being able to run at audio rate, but only run inference when receiving an audio-rate impulse.)

RTNeural_Plugin is more of a general purpose inference engine, so it really comes down to whether you can get your NN into a state that can be loaded by Jatin's library. The list of available layers is here.
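
In PyTorch terms that mostly means sticking to the common layer types. A small sketch of the kind of model that should translate (dense, GRU/LSTM, 1-D conv, and the standard activations appear on that list, but check it for the authoritative set):

```python
import torch.nn as nn

class SmallNet(nn.Module):
    """Built only from layer types RTNeural is likely to load:
    a GRU, a dense (Linear) layer, and a tanh activation."""
    def __init__(self, n_in=8, hidden=16, n_out=4):
        super().__init__()
        self.gru = nn.GRU(n_in, hidden, batch_first=True)
        self.dense = nn.Linear(hidden, n_out)
        self.act = nn.Tanh()

    def forward(self, x):            # x: (batch, time, n_in)
        h, _ = self.gru(x)
        return self.act(self.dense(h))
```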

I don't know the exact shape of RAVE models. I know they are autoencoders, but the devil is certainly in the details.

The future goal for this plugin would be to load any pytorch training directly. The problem is that LibTorch is not real-time safe (which is why the RAVE stuff crackles), and ONNX is not particularly efficient. Maybe a new library will come along that is both real-time safe and efficient; that would be relatively easy to implement from where I am now.

Sam


This is interesting stuff. So you are training your models presumably with keras and tensorflow to use with this? Or have you cracked the pytorch problem by this point?

The Windows build for Max now works. I had to build with MinGW, not MSVC.

No, actually, I need to document this better, but to see what is going on you need to download the RTNeural_python directory from the release builds. This is where all the trainings happen. There is a README in the root directory on how to set up the virtual environment and then READMEs in the subdirectories on how to do specific trainings. This is by no means exhaustive. The plugin can do way more than what I am showing how to train for.

In most cases I am training in pytorch and converting the trainings to the Keras format that RTNeural wants. Trainings should save a .pt file as well as the _RTNeural.json in case there is a future where .pt files can be loaded directly.

So, most of the training is in pytorch.
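
The save step is roughly the sketch below: torch.save handles the .pt checkpoint, while the JSON shown here is only a raw weights dump, a stand-in for the Keras-style _RTNeural.json that the conversion scripts in RTNeural_python actually produce:

```python
import json
import torch

def save_training(model, basename):
    # Keep the raw PyTorch checkpoint around for a possible future
    # where .pt files can be loaded directly.
    torch.save(model.state_dict(), basename + ".pt")

    # Raw weights as nested lists -- only the starting material; the real
    # _RTNeural.json needs the Keras-style layout the repo's scripts write.
    weights = {name: p.cpu().tolist() for name, p in model.state_dict().items()}
    with open(basename + "_RTNeural.json", "w") as f:
        json.dump(weights, f)

# save_training(model, "my_model")
```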

Sam

Yeah okay! That sounds pretty manageable. I did see in the original source mention of these JSON files, so that all makes sense.

Might also be interesting to you. People from TU Berlin.

Hmm. Might be easy to integrate their code. The interface with the RTNeural inference engine is barely 200 lines of code in my plugin!! That was the easy part. I'll give it a shot at some point. There are good reasons to use LibTorch and ONNX and reasons not to. The reasons not to are that LibTorch is not real-time safe and neither is anywhere near efficient enough to use in a real-time setting. But having them available would be good because they would just load any .pt or .onnx file without having to convert to RTNeural format.

Sam