I began using FluCoMa in a project a few months ago, first replacing what I was doing manually before discovering the library (slicing & classification by hand). My next step is to do some instrument design, where I compare arbitrary live input (contact mic with a prepared guitar, with different types of excitation; voice through an SM57) to different corpuses.
I read a lot on this forum, but as a Linux user I don't have the possibility to open Max patches on my computer in order to fully understand all the information, so I'm left with many questions.
One is: how does one prepare/calibrate an arbitrary live input before trying to match it with different corpuses, so that more of the live signal can match efficiently?
@tremblap @rodrigo.constanzo, searching the forum I found out you are both quite expert at this; if I could learn from your experience that would be great.
I attached an example patch for slicing & matching a corpus, aimed at any Pd users who might stumble upon this post.
I guess it depends on what you are trying to do. Are you trying to analyze and trigger samples based on onsets (e.g. an "event" is detected, then you find the nearest match in a corpus) or trying to do continuous matching ala concatenation?
In either case, I'm not a pd expert by any means, but I did port a bunch of the SP-Tools abstractions over to pd:
You can find the pd abstractions at the bottom here:
These are pretty well optimized (their Max counterparts at least) to do robust analysis with low latency, and they output lists and buffers depending on what you need.
There are also examples of classification in there.
What the SP-Tools examples don't do is play any audio (something I'm not knowledgeable about in pd, since it's so different from Max).
I could have sworn there were more native pd FluCoMa examples, but it looks like the /examples folder only has a couple.
There are, however, some pd FluCoMa video tutorials as of a few months ago:
(not exactly the same subject, but there are a couple of these pd vids which may cover some things)
Thanks for the reply. I went through most of FluCoMa's documentation, half of the conferences & podcasts, all the tutorials, and also all your videos about SP-Tools.
I also opened your SP-Tools abstractions for Pure Data, as they contain nice information and ways of doing things.
What could be useful to me right now is more precise: some explanation of your way of matching a corpus and a live audio input "efficiently" when they don't have much in common. In a video about SP-Tools 0.2 you call this "scaling", for when the sound analyses don't overlap much.
Once you're in that territory things get a little fuzzier and material-dependent.
Scaling is one way to do this, but doesn't always work well and can be problematic for some parameters (e.g. pitch).
The workflow I use in Max to set this up is to analyze a corpus, just as you have (more or less), then do a similar process for what I expect to be my input. I just went looking, as I drew myself a diagram of all the data steps when it got a bit tricky to follow: it was basically robust-scaling everything, then doing the matching on that (and in my case I throw in a bit of regression/prediction to match longer timeframes than is possible in realtime).
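Purely as a conceptual sketch of that regression/prediction idea (Python with scikit-learn standing in for the actual SP-Tools/FluCoMa patching; all the data here is placeholder):

```python
# Conceptual sketch only, not the SP-Tools implementation:
# learn a mapping from descriptors of a short onset window to descriptors
# of a longer window of the same sound, so matching can "see" more of the
# event than realtime latency would normally allow.
import numpy as np
from sklearn.neural_network import MLPRegressor

# Hypothetical training data: one row per recorded hit.
# short_window_descs analysed over e.g. the first 64 ms,
# long_window_descs over e.g. 256 ms of the same hits.
short_window_descs = np.random.rand(200, 13)   # placeholder values
long_window_descs = np.random.rand(200, 13)    # placeholder values

predictor = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000)
predictor.fit(short_window_descs, long_window_descs)

# At performance time: analyse only the short window, then predict what
# the longer-window descriptors would likely be and match on those.
incoming_short = np.random.rand(1, 13)          # placeholder live analysis
predicted_long = predictor.predict(incoming_short)
```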
Then, obviously, the descriptor choice matters a lot. I see you're using melbands as your main descriptor, which works ok, but if timbre is what you're after, I think MFCCs are better all around.
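Just to make that descriptor difference concrete, here is a rough illustration using librosa in Python rather than fluid.bufmelbands~/fluid.bufmfcc~ (the file name is hypothetical, so treat this as a sketch of the idea, not the FluCoMa analysis itself):

```python
# Illustration of melbands vs MFCCs as timbre descriptors (librosa, not FluCoMa).
import librosa
import numpy as np

y, sr = librosa.load("some_sound.wav", sr=None)  # hypothetical file

# Mel band energies: a coarse spectral envelope, band by band.
melbands = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=40)

# MFCCs: a decorrelated summary of that envelope, usually a more compact
# and robust "timbre" descriptor for matching/classification.
mfccs = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)

# Summarise each over time (mean per coefficient) to get one point per sound.
mel_point = np.mean(melbands, axis=1)
mfcc_point = np.mean(mfccs, axis=1)
```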
Lastly, I would suggest trying fluid.bufampslice~ over onsetslice, as it is much more temporally accurate; if you have sounds with sharp transients and/or short analysis windows (as in SP-Tools), the kind of temporal slop you get with onsetslice can throw things off.
Cheers. I guess the key object here is "robust scale". As an example, if I'm to use my voice (live audio input) to control a corpus of glitch noises, I do a robust scale of my analysis, then compare it to the descriptors I get from my voice? I understand that the corpus will be scaled, but what about the live input (voice)? How would you go about informing the patch about what kind of sounds I will make, so I can scale it in advance?
(The regression/prediction for realtime matching of longer frames is fascinating, but my brain is already a bit too hot to process this!)
fluid.bufampslice~ seems great; I haven't used it yet (but I read its documentation). I remember reading in a post that using fluid.bufampslice~ in conjunction with noveltyslice could help get good temporality plus the benefits of the noveltyslice method.
This is also something I want to try, this time I just have difficulties with the way I would patch it…
Firstly, I mean "robust scale" as in fluid.robustscale~ (another flavor of standardization), as opposed to scaling that is robust.
The approach I use is to give the computer some idea of the world in which I will be operating. I call this a "setup" in the context of SP-Tools, but that's not really a standard term. So with my approach you give it a couple of minutes of examples of the kind of thing you may do, which is then used to create some robust scaling for what the input would be, exactly the same way you are doing for your corpus. In my case I save this as a separate .json file that I load later along with the corpus stuff.
Then when you get new input, you run it through robust scaling to turn it into the same kind of standardized space that your corpus is analyzed in. Ultimately you compare two scaled spaces to each other.
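As a rough Python analogue of that "setup" step (scikit-learn's RobustScaler standing in for fluid.robustscale~, with placeholder data; this is a sketch of the idea, not how SP-Tools actually stores things):

```python
# Sketch of the "setup" idea: fit robust scaling on a couple of minutes of
# example input, save the scaling parameters, and reuse them at performance time.
import json
import numpy as np
from sklearn.preprocessing import RobustScaler

# Hypothetical descriptor analysis of the example input: one row per frame/slice.
setup_descriptors = np.random.rand(500, 13)     # placeholder values

input_scaler = RobustScaler().fit(setup_descriptors)

# Save the learned medians and IQRs, analogous to writing a .json "setup" file.
with open("input_scaling.json", "w") as f:
    json.dump({"center": input_scaler.center_.tolist(),
               "scale": input_scaler.scale_.tolist()}, f)

# Later, each new live analysis frame gets transformed into the same
# standardized space before being compared with the (also scaled) corpus.
live_frame = np.random.rand(1, 13)              # placeholder live analysis
scaled_frame = (live_frame - input_scaler.center_) / input_scaler.scale_
```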
So in the end you have:
corpus → robustscale → robustscaled corpus
input → robustscale → robustscaled input "fit"
input → robustscale'd input → kdtree'd robustscaled corpus → nearest match
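Just to sketch that chain end to end (again in Python, with scikit-learn's RobustScaler and KDTree standing in for fluid.robustscale~ and fluid.kdtree~; every name and value here is a placeholder, not SP-Tools internals):

```python
# End-to-end sketch of the pipeline above: scale the corpus, scale the input
# with its own "fit", then do nearest-neighbour matching in the scaled space.
import numpy as np
from sklearn.preprocessing import RobustScaler
from sklearn.neighbors import KDTree

corpus_descriptors = np.random.rand(1000, 13)   # placeholder corpus analysis
setup_descriptors = np.random.rand(500, 13)     # placeholder expected-input analysis

# corpus -> robustscale -> robustscaled corpus
corpus_scaler = RobustScaler().fit(corpus_descriptors)
scaled_corpus = corpus_scaler.transform(corpus_descriptors)

# input -> robustscale -> robustscaled input "fit"
input_scaler = RobustScaler().fit(setup_descriptors)

# kdtree'd robustscaled corpus
tree = KDTree(scaled_corpus)

# input -> robustscale'd input -> nearest match in the corpus
live_frame = np.random.rand(1, 13)              # placeholder live analysis
scaled_live = input_scaler.transform(live_frame)
distance, index = tree.query(scaled_live, k=1)
print("nearest corpus entry:", int(index[0][0]))
```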
I did experiment with this quite a bit, and in the end just went with fluid.bufampslice~ since it gave me the best results overall. The basic idea was to run fluid.bufonsetslice~ on everything, then go back and run fluid.bufampslice~ over the "hop" in which the onset was found to narrow it down further. If fluid.bufampslice~ didn't find a more refined one, then default to the fluid.bufonsetslice~ version.
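To show roughly what that hybrid refinement could look like in principle (a Python/librosa sketch of the idea, not the fluid.bufonsetslice~/fluid.bufampslice~ patch; the file name and threshold are hypothetical):

```python
# Rough sketch of the hybrid idea: get coarse onsets from a spectral method,
# then refine each one within its analysis hop by looking for where the
# amplitude actually jumps; fall back to the coarse onset otherwise.
import librosa
import numpy as np

y, sr = librosa.load("some_hits.wav", sr=None)  # hypothetical file
hop = 512

# Coarse, spectrally derived onsets (stand-in for fluid.bufonsetslice~).
coarse_frames = librosa.onset.onset_detect(y=y, sr=sr, hop_length=hop, units="frames")

refined_onsets = []
for frame in coarse_frames:
    start = frame * hop
    window = np.abs(y[start:start + hop])
    if window.size < 2:
        refined_onsets.append(start)
        continue
    # Amplitude-based refinement (stand-in for fluid.bufampslice~): pick the
    # biggest sample-to-sample jump inside the hop.
    diffs = np.diff(window)
    if diffs.max() > 0.01:                      # placeholder threshold
        refined_onsets.append(start + int(np.argmax(diffs)) + 1)
    else:
        refined_onsets.append(start)
```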
In principle this works, but I feel like there's a lot that fluid.bufonsetslice~ doesn't catch at all, and the tradeoff for "non amplitude-based transients" wasn't worth it.