The very short version of this answer is that it’s non-trivial, especially for something whose range of sounds is as broad as your pulsar generator’s. The current public FluCoMa release will certainly have some helpful stuff, insofar as it offers a range of decompositions and descriptors, but the second set of tools we’re currently developing (for building and exploring databases of sounds etc.) will provide a fuller framework once it’s available.
The general question of mapping the space of possibilities of a synthesiser or processor is still open (and, IIRC, @rodrigo.constanzo has raised it before on the forum). There are some papers about it, but not many. Stefano Fasciani made a tool using an approach called Extreme Learning Machines (https://ro.uow.edu.au/dubaipapers/752/), and there was also a paper at NIME last year (https://www.nime.org/proceedings/2019/nime2019_paper085.pdf).
There are a number of moving parts to getting something like this to work. Broadly, you’re trying to learn a usable mapping between two multidimensional spaces – the controls of the synth on the one hand, and some set of descriptors that usefully captures the range of possible sounds on the other. The really hard bit, I reckon, is finding that set of descriptors.
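To make that concrete, here’s a toy Python sketch of the basic idea (everything here – the data, the descriptor names, the lookup – is invented for illustration): sample the synth’s controls, analyse the resulting audio into descriptors, store both sides, then look up parameter settings by nearest neighbour in descriptor space.

```python
import numpy as np

# Hypothetical lookup table: each row pairs a synth parameter setting with
# the descriptors measured from the sound it produced. In practice you'd
# sample the synth's controls and analyse the audio to fill these in.
rng = np.random.default_rng(0)
params = rng.uniform(0, 1, size=(100, 4))        # 100 presets, 4 synth controls
descriptors = rng.uniform(0, 1, size=(100, 3))   # e.g. centroid, flatness, pitch

def params_for_sound(target, descriptors, params):
    """Return the parameter set whose descriptors are nearest the target."""
    distances = np.linalg.norm(descriptors - target, axis=1)
    return params[np.argmin(distances)]

target = np.array([0.5, 0.2, 0.8])   # "I want a sound that looks like this"
best = params_for_sound(target, descriptors, params)
```

The hard part the sketch glosses over is exactly the point above: whether those three descriptor axes actually capture what you hear.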
IIRC, Fasciani takes a sort of kitchen sink approach: he collects loads of features and then does some dimension reduction to try and mitigate the redundancy that this tactic involves. Whilst not aimed specifically at this problem, we explored a different approach for NIME last year, comparing trying to learn the features themselves using a small neural network against using more generic MFCCs (https://pure.hud.ac.uk/en/publications/adaptive-mapping-of-sound-collections-for-data-driven-musical-int, https://github.com/flucoma/FluidCorpusMap). The code for this is a mixture of SC and Python.
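As a rough illustration of that kitchen-sink-then-reduce tactic (not Fasciani’s actual code – this fakes a redundant feature matrix and uses plain PCA via SVD):

```python
import numpy as np

# Toy illustration: fake a "kitchen sink" feature matrix that really only
# has 2 underlying dimensions, then use PCA (plain SVD here) to squeeze
# the redundancy back out.
rng = np.random.default_rng(1)
latent = rng.normal(size=(500, 2))        # 2 "real" underlying dimensions
mixing = rng.normal(size=(2, 40))
features = latent @ mixing                # 40 correlated features per frame

centred = features - features.mean(axis=0)
U, S, Vt = np.linalg.svd(centred, full_matrices=False)

# Keep enough principal components to explain ~99% of the variance
explained = (S ** 2) / (S ** 2).sum()
n_keep = int(np.searchsorted(np.cumsum(explained), 0.99)) + 1
reduced = centred @ Vt[:n_keep].T         # low-dimensional descriptor space
```

With real features the cut-off is a judgement call, of course – there’s no guarantee the variance PCA keeps is the variance your ears care about.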
What’s still a very open question is how to adequately account for the morphological character of sounds. With something like the Pulsar Generator, some of its states are characterised by emergent rhythms and so forth, yes? In which case, per-frame features on their own can’t describe the temporal relationships we might hear. One thing I’ve been playing with semi-privately (with some input from @jamesbradbury and @d.murray-rust) is using the autocorrelation of features to try and capture rhythmic differences, albeit inconclusively so far.
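For what it’s worth, here’s the rough shape of that autocorrelation idea in Python, on a synthetic feature track (the signal and numbers are invented for the example): a repeating rhythm in a per-frame feature shows up as a peak in its autocorrelation at the rhythmic period.

```python
import numpy as np

# Synthetic per-frame feature track: a pulse every 0.4 s at 100 frames/s,
# standing in for, say, a loudness envelope with an emergent rhythm.
rate = 100                                   # analysis frames per second
t = np.arange(1000) / rate
envelope = (np.sin(2 * np.pi * 2.5 * t) > 0.9).astype(float)

# Autocorrelate the mean-removed track; keep non-negative lags only
x = envelope - envelope.mean()
ac = np.correlate(x, x, mode="full")[len(x) - 1:]
ac /= ac[0]

# Skip the zero-lag lobe; the next maximum estimates the rhythmic period
period_frames = int(np.argmax(ac[10:])) + 10
period_seconds = period_frames / rate        # recovers the 0.4 s pulse period
```

Real feature tracks are far messier than this, which is partly why my results so far have been inconclusive.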
Concretely, given what’s currently available, one way to get started would be to explore different features and see whether you and the computer can come to an agreement about how well they describe different types of sound from Pulsar Generator, perhaps using something like the existing KMeans quark.
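If Python is easier for prototyping than SC, the same clustering idea looks roughly like this – a hand-rolled k-means on made-up descriptor data, standing in for what the quark does:

```python
import numpy as np

# Made-up data: two "sound types" as blobs in a 2-D descriptor space.
rng = np.random.default_rng(2)
frames = np.vstack([rng.normal(0.2, 0.05, (50, 2)),
                    rng.normal(0.8, 0.05, (50, 2))])

def kmeans(data, k, iters=50, seed=0):
    """Minimal Lloyd's-algorithm k-means, just for illustration."""
    rng = np.random.default_rng(seed)
    centroids = data[rng.choice(len(data), k, replace=False)]
    for _ in range(iters):
        # Assign each frame to its nearest centroid
        labels = np.argmin(np.linalg.norm(data[:, None] - centroids, axis=2),
                           axis=1)
        # Move each centroid to the mean of its members (keep it if empty)
        centroids = np.array([data[labels == i].mean(axis=0)
                              if np.any(labels == i) else centroids[i]
                              for i in range(k)])
    return labels, centroids

labels, centroids = kmeans(frames, k=2)
```

The interesting part is then listening back per cluster: if frames you’d group together by ear keep landing in different clusters, the features probably aren’t capturing what matters.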