Hey there, newish FluCoMa user here.
First of all, I would like to start by saying how impressed I am with this package. I had been working with MuBu for some time for this project and was finding it very difficult to wrap my head around it and actually implement what I wanted to do. Finding out about FluCoMa made everything easier, so thanks a lot to everyone involved! Also, the documentation provided is impressive and useful, and the integration of the package into its main host is really well done (with MuBu I often felt like I needed to re-learn how to patch, while with FluCoMa it feels like I'm just using Max/MSP).
Now, I would like to take this opportunity to describe the project I'm currently working on, both because I have quite a few (actually, too many) ideas for it and I'm struggling with the implementation, and because I'd like to get feedback on it and (hopefully) spur an interesting discussion!
So, the main idea is to have a patch that can analyze raw field-recording files, slicing and labeling different parts of them in terms of "compositional" elements (so I'm not really thinking about "dog barking" or "car engine", but more like "hits", "gestures", "textures", and so on). I would then use these data to train a neural network that, once trained, can automatically classify any new file I add. Then I would have some sort of algorithmic compositional strategy dictating how to treat each sound category, so that a final piece emerges from the different field recordings. So far so good, and it seems like FluCoMa has all the tools I need to implement this. In fact, it seems there's basically already a tutorial on how to do it!
I actually started out from the "Classifying sounds with a neural network" help file. It works pretty well with simple, easily discernible sounds (such as the oboe and trombone used in the example), but once I feed it a raw 20-minute field recording things get messy. The first issue I've run into is that it's hard to get slice points that make sense. I've tried a few objects and parameters, but the main idea I've come up with is to use two (or more) slicing techniques simultaneously and then compare their results (possibly with some Javascript code), so that I only accept slices which are present (well, reasonably present…) in more than one buffer. I guess this would be pretty straightforward to implement but I have yet to try, so I'd appreciate any tips about this! Not to mention that coming up with good descriptors for "musically interesting" real-world sounds is another daunting task (I'm currently thinking of using mainly MFCCs as analysis data but, again, I might combine more than one analysis). Then I would reduce the analysis data to a more manageable size (again, plenty of tutorials about that!) and use this new data to train the neural network, right?
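To give a concrete idea of what I mean by comparing slice points, here's a rough Javascript sketch of the logic (e.g. for a [js] object in Max). The slicer names in the comments and the tolerance value are just placeholders, and I haven't tested this in a patch yet:

```javascript
// Consensus slicing sketch: keep only slice points that appear (within a
// tolerance) in the output of more than one slicer. Assumes the slice points
// from, say, fluid.bufonsetslice~ and fluid.bufnoveltyslice~ have already
// been dumped into plain arrays of sample positions.

function consensusSlices(slicesA, slicesB, toleranceSamples) {
    var agreed = [];
    for (var i = 0; i < slicesA.length; i++) {
        for (var j = 0; j < slicesB.length; j++) {
            if (Math.abs(slicesA[i] - slicesB[j]) <= toleranceSamples) {
                // Keep the midpoint of the two candidates as the agreed slice point
                agreed.push(Math.round((slicesA[i] + slicesB[j]) / 2));
                break;
            }
        }
    }
    return agreed;
}

// Example: the two slicers roughly agree on three onsets, and the fourth
// point (present in only one buffer) is discarded.
// consensusSlices([1000, 45000, 90210], [980, 44800, 90300, 120000], 2205)
// -> [990, 44900, 90255]   (2205 samples ≈ 50 ms at 44.1 kHz)
```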
And, finally, I would have to come up with the algorithmic part for the composition rules, which is something I haven't even started to think about yet. I just know I'm not really interested in a mosaicing approach (if I understood that correctly), i.e. "find and play back the slice closest to the one that's currently playing"; I'd rather have something like "these are the sounds available for this category, apply these processes to them (or to some of them)".
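Just to make that a bit more concrete, I imagine something like a simple lookup from category to processes. The category and process names below are completely made up, just to show the shape of the idea:

```javascript
// Sketch of the "category -> processes" idea, not a working composition engine.
// Categories would come from the classifier's labels; the processes are
// hypothetical names for whatever treatments end up in the patch.

var rules = {
    "hits":     { processes: ["transpose", "delay"],  useAll: false },
    "gestures": { processes: ["stretch"],             useAll: true  },
    "textures": { processes: ["layer", "filter"],     useAll: true  }
};

// Given the label a classifier assigned to a slice, look up what to do with it.
function processesFor(label) {
    var rule = rules[label];
    if (!rule) {
        return []; // unknown category: leave the slice untouched
    }
    return rule.processes;
}
```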
Does this make any sense? Is there any obvious flaw in the process I've outlined that I'm overlooking? Am I aiming too high? I feel like FluCoMa is just the right tool for all of this, but it's such a vast field that I periodically get lost in it.
Again, thanks a lot! I’m eager to see what ideas you all will come up with.