This past weekend I sat down and learned about building an autoencoder with fluid.mlpregressor~. The pedagogue page offered a nice place to start with regular autoencoders, as did the video in this topic: Autoencoder topic/video. I soon became curious about VAEs, so I went about learning more here: VAEs vs regular autoencoders.

Has anyone experimented with VAEs in FluCoMa? Given that it is possible to create an autoencoder with the tools, I feel like a VAE would be the next logical step; however, I am a little confused about the so-called ‘code’ portion of the VAE.

[…] making its encoder not output an encoding vector of size *n*, but rather two vectors of size *n*: a vector of means, **μ**, and another vector of standard deviations, **σ**.

They form the parameters of a vector of random variables of length *n*, with the *i*th element of **μ** and **σ** being the mean and standard deviation of the *i*th random variable, **X**ᵢ, from which we sample to obtain the sampled encoding that we pass onward to the decoder.

Is this possible with FluCoMa? When they say a vector of size *n* of means **μ** and standard deviations **σ**, what exactly does that mean? I am familiar with the mathematics behind neural networks, so some of this is familiar to me already. But in this case, how is *n* defined? Why are there multiple μ values, one for each of the *n* dimensions, and likewise multiple standard deviations? And how is the sampled result passed on to the decoder?

The explanation seems sound but remains a bit too vague for me to fully grasp. Ultimately I am wondering: can we derive or compute these numbers in Max/FluCoMa in order to achieve a working VAE network?

Hi,

The short version is that the C++ code itself would need to be altered to implement a VAE.

This is the key bit behind ‘variational’: instead of learning scalars for the weights, which is what happens in an MLP, a VAE is instead trying to learn probability distributions that model the data. In this article it’s a normal distribution (which is standard, AFAIK), which means you need two quantities (a mean and a standard deviation) to represent it. The idea is that you get something that is both generative (because you can sample from the learnt distributions to generate new data) and more robust to noise.
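To make the ‘two quantities per dimension’ part concrete, here is a minimal sketch in plain Python (illustrative names and values, not anything FluCoMa provides): the encoder emits one mean and one standard deviation for each of the *n* latent dimensions, and the code handed to the decoder is a sample drawn from each of those normal distributions, written in the ‘reparameterisation trick’ form so that gradients could flow through μ and σ during training:

```python
import random

def sample_latent(mu, sigma, rng=random):
    """Draw one latent code z from per-dimension means and std devs.

    Reparameterisation trick: z_i = mu_i + sigma_i * eps_i,
    with eps_i ~ N(0, 1) noise that is independent of the network.
    """
    assert len(mu) == len(sigma)
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]

# Hypothetical encoder output for one input frame, with n = 2
# latent dimensions (n is just the size you choose for the bottleneck):
mu = [0.5, -1.2]    # vector of means, one per latent dimension
sigma = [0.1, 0.3]  # vector of standard deviations, same length

z = sample_latent(mu, sigma)  # this sampled vector is what the decoder gets
print(z)                      # a random draw scattered around mu
```

So *n* is simply the chosen size of the latent layer, and “multiple μ’s and σ’s” means one pair per latent dimension; the decoder never sees μ and σ directly, only the sampled `z`.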

Making it happen in code involves some changes to the neural network architecture (to store the additional numbers, and do the sampling), and to the training procedure (because the loss function – the measure of ‘wrongness’ – needs to change).
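To make the loss-function change concrete, here is a sketch of the standard VAE objective in plain Python (assuming a mean-squared-error reconstruction term and a standard-normal prior; the function names are illustrative, not any existing API): the usual reconstruction error gains a KL-divergence term that pulls each learnt N(μᵢ, σᵢ²) towards N(0, 1).

```python
import math

def vae_loss(x, x_hat, mu, sigma):
    """VAE objective for one example: reconstruction error + KL regulariser.

    Per latent dimension, the closed-form KL term against a N(0, 1) prior is
        KL(N(mu_i, sigma_i^2) || N(0, 1))
          = 0.5 * (sigma_i^2 + mu_i^2 - 1 - ln(sigma_i^2))
    """
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)  # MSE
    kl = 0.5 * sum(s * s + m * m - 1.0 - math.log(s * s)
                   for m, s in zip(mu, sigma))
    return recon + kl

# Perfect reconstruction with a latent code already matching the prior
# (mu = 0, sigma = 1) gives zero loss:
print(vae_loss([0.2, 0.4], [0.2, 0.4], mu=[0.0], sigma=[1.0]))  # 0.0
```

This is the “measure of wrongness” change mentioned above: the network is penalised both for reconstructing badly and for letting its latent distributions drift away from the prior it will later be sampled from.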

That makes sense. I’ve noticed a few similar limitations with the MLP, such as not being able to override the loss function it uses to backpropagate, as you would need to in order to train a generator from a discriminator in a GAN scenario (correct me if I am wrong on this).

Do you have any idea whether these modifications are on the agenda for future FluCoMa versions? It feels like the reward would be worthwhile.

We don’t have any real plans for this. However, because the project is ending its funded period, we’re very receptive to any contributions of code and ideas from the community, to help things keep chugging along.

This particular thing presents quite a design challenge, in terms of finding something that could work across Max, Pure Data and SuperCollider. It may well be that the only real way to get the desired flexibility whilst experimenting is to make bindings for one of the Python ML frameworks’ C APIs so that people can design in Python and then do inference in Max / PD / SC. Don’t know how simple that is, though.

Well, it’s funny that you mention that, because I actually wrote and tested a Max external in both C and C++ using TensorFlow’s C/C++ API recently. All you have to do is link the header and library paths within your IDE. Once the external builds without error, you copy the tensorflow.lib file into the same path as the max.exe file (Windows GPU build only).

It only prints the version of TensorFlow embedded within the external to the Max console, but it proves nonetheless that it can be done.

I would post it here, but the .mxe64 extension is not allowed as an upload on this forum, and neither are .c or .cpp files. If anyone wants to give it a shot, I will gladly share my work if it helps light the way for more development.

It can be done with the Windows CPU-only version as well as on Mac (CPU only; there is no Mac GPU support). I’m assuming this means PyTorch could work as well, since it has its own C/C++ API.

I would love to write my own MLP algorithm to contribute, but I don’t feel very confident undertaking that right now, at least not without some proper help or guidance. I am still learning the basics of building externals and writing TensorFlow code. Maybe I’ll take it on if I get sick of waiting and no one else has given it a shot by the time I become more adept at this.