Stefano Fasciani

This person’s work is interesting, and perhaps the tools would be of interest to someone here.

Yup yup, interesting stuff.

Although it’s obviously a ton more complicated, it would be amazing to have dimensionality reduction of DSP processes (I guess fed with finite but representative audio inputs).

I emailed @tremblap about some machine learning stuff last year, and after a couple of emails with @weefuzzy, my inquiry kind of stalled.

What I eventually want to do is create 1-, 2-, and maybe 3-dimensionally reduced versions of the control schemes for each of my DSP effects. That way, if I want to use any effect in a complex/mixed setup and don’t want to faff around with hand-coding many-to-many mappings that only relate to that one setup, I could tap into, say, a “2 parameter” version of a given effect, which would give me reasonably musical control over most/all of the sonic output of the effect with however many control parameters I have available.

I basically created a hard-coded version of this for my gamepad patch where I can control each loaded effect with a single analog control pad (X/Y control). Works and sounds great, but is not really robust, and requires a lot of tweaking and fucking around each time you make a new effect.

Like many things, the actual answer is machine learning, but in a general sense I’ve been apprehensive of taking on any ML stuff, as you can’t really go “in for a penny” with it; ideally you’d do some offline heavy lifting to compute stuff, and then take your magical machine-learned nugget with you back to “normal” Max land. Why oh why…

The pain is measuring results for all the combinations of input parameters, and in the case of audio processing, disentangling the difference the parameters make from the audio input (that one is a very hard problem). There are lots of techniques for dimensionality reduction, some of which are pretty easy (like the self-organising map).
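To make the self-organising map idea concrete, here’s a minimal Python/NumPy sketch; the grid size, descriptor count, and random toy data are all placeholders (in practice you’d feed in audio descriptors measured per parameter setting):

```python
import numpy as np

def train_som(data, grid_w=8, grid_h=8, epochs=100, seed=0):
    """Fit a 2D self-organising map: each grid node holds a weight vector
    that is gradually pulled toward nearby input samples."""
    rng = np.random.default_rng(seed)
    n_feat = data.shape[1]
    weights = rng.random((grid_h, grid_w, n_feat))
    # grid coordinates, used to compute neighbourhood distances on the map
    ys, xs = np.mgrid[0:grid_h, 0:grid_w]
    coords = np.stack([ys, xs], axis=-1).astype(float)
    for t in range(epochs):
        lr = 0.5 * (1 - t / epochs)                               # decaying learning rate
        sigma = max(grid_w, grid_h) / 2 * (1 - t / epochs) + 0.5  # shrinking neighbourhood
        for x in rng.permutation(data):
            # best-matching unit: node whose weights are closest to the sample
            d = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(d), d.shape)
            # pull the BMU and its grid neighbours toward the sample
            g = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=-1) / (2 * sigma ** 2))
            weights += lr * g[..., None] * (x - weights)
    return weights

def locate(weights, x):
    """Map a feature vector to a 2D grid position: the reduced coordinates."""
    d = np.linalg.norm(weights - x, axis=-1)
    return np.unravel_index(np.argmin(d), d.shape)

# toy usage: 200 samples of 5 made-up "descriptor" values
data = np.random.default_rng(1).random((200, 5))
som = train_som(data)
print(locate(som, data[0]))  # a cell on the 8x8 grid
```

The `locate` step is the payoff: any high-dimensional measurement collapses to a 2D map position, which is exactly the kind of thing you could then drive from an X/Y pad.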

Reading the Cut Glove stuff properly is really helpful for getting a handle on what it is you’re trying to do. As ever, impressive documentation!

‘Machine Learning’ covers an extremely broad range of topics, and the vocabulary is moving very fast within its various disciplines, which makes life hard for outsiders like us: part of our mission over the next 24 months is to try and make sense of this. [Aside: I was watching sessions from an industry machine learning conference last week: some practitioners were using ‘machine learning’ to denote only a specific group of techniques, others were using it more widely: this would have been impossible to pick up on without the brief immersion I’ve already had.]

ISTM there are a number of different things you’re wanting to do. An incomplete list:

  • Dimensionality reduction. As Alex says, there’s a bunch of ways of doing this for arbitrary data; the tricky bit is finding things that make perceptual sense (especially from audio). Some of the techniques for this are really heavy and require lots of training, others not so much (but with trade-offs in idiosyncrasy etc.)

  • Gesture learning. Mapping particular patterns to outcomes, as with your special moves with Cut Glove. There’s some Max stuff already out there for this (e.g. mubu.hhmm and mubu.xmm, but they’re not exactly well advertised or intuitive). I’m quite interested at the moment in an emerging idea called ‘conceptors’, which uses a reservoir of feedback-y, non-linear things as a medium for distinguishing between different temporal patterns. I’ll keep you updated with where that goes.

  • Non-linear regression. In an ideal world, for doing a complex mapping we’d just supply the idiot-box with a few representative examples and trust it to fill in the gaps sensibly. Even better, if the results aren’t quite right, we can add more examples, and have some different options for explaining to the idiot-box what counts as sensible. Again, there are lots of ways of approaching this; the trade-offs seem to be between intuitiveness and computational tractability.
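The “few representative examples, fill in the gaps” idea can be sketched with Gaussian radial basis functions. This is just one of many possible approaches, shown here in Python/NumPy; the 2-in/4-out shape and the example values are made up for illustration:

```python
import numpy as np

def fit_rbf(X, Y, sigma=0.7, reg=1e-6):
    """Fit a Gaussian RBF regressor: given a few (input, output) example
    pairs, solve for weights that reproduce them (up to a tiny regulariser)."""
    d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)  # pairwise distances
    K = np.exp(-(d / sigma) ** 2)                         # Gaussian kernel matrix
    W = np.linalg.solve(K + reg * np.eye(len(X)), Y)
    return X, W

def predict(model, x, sigma=0.7):
    """Interpolate between the training examples for a new input point."""
    centres, W = model
    k = np.exp(-(np.linalg.norm(centres - x, axis=-1) / sigma) ** 2)
    return k @ W

# four hand-placed examples: 2D controller position -> 4 effect parameters
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
Y = np.array([[0.1, 0.9, 0.0, 0.5],
              [0.8, 0.2, 1.0, 0.5],
              [0.3, 0.3, 0.2, 0.0],
              [0.9, 0.7, 0.6, 1.0]])
model = fit_rbf(X, Y)
print(predict(model, np.array([0.5, 0.5])))  # a blend of the four examples
```

Adding another example just means appending a row to `X` and `Y` and refitting, which is cheap at these sizes; the `sigma` knob is where the “what counts as sensible” judgement lives (wider means smoother blending, narrower means more literal recall of the examples).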

I’m quite interested in thinking about why previous attempts to bring some of this stuff to Max/SC/etc. have failed. One striking thing is how low-level some of them have been (e.g. just barebones objects that map to things in comp-sci textbooks, but not necessarily to things musicians might want to do), or how under-documented.


The idea stemmed from that, but after seeing how useful it is, it’s something I want to apply to future modules, the idea being that if I want to repatch/play, it doesn’t require a ton of coding to get up and running.

Dimensionality reduction is of particular interest, with training times of many hours not being an issue for me (it would only have to happen once per effect, really). And even the idiosyncrasies would become part of the “effect” in terms of control.

The 2nd one is also interesting, but I’m wary of black-boxing that side of things, since having a clear link between action and outcome is important (ala learning complex gestures and being able to execute them “correctly”). I haven’t spent much time with the mubu stuff after being very disappointed that a .01 update broke a bunch of the stuff I had built with it.

Aside from these specific use cases I’m really interested to see where you guys fall with all the ML implementation as I do agree that a bunch of the tools I’ve come across have not been very useful or friendly (or musician-y at all). That being said, I know that I fall on the ‘blacker box’ end of the spectrum…