Ah, good point about doing an approximate scaling that's applied the same way to both items.
I am taking a slow, step-by-step approach with your advice in mind.
My first hurdle is that fluid.bufmfcc (which I set to work with 1x channel/mono) is returning 3x channels, which from what I understand are 3x MFCC coefficients.
But the fluid.mfcc object outputs 1x channel, even though it has the same input parameters.
I think I get it:
- fluid.mfcc is outputting all the coefficients as one list in a single "channel", and each list output represents a segment in real time (a block?). I'm not sure what settings control this time segment, and I assume it's doing some sort of averaging internally?
- fluid.bufmfcc is storing each coefficient in its own channel, with the values for that coefficient stored sample by sample across the specified slice.
So going forward, this is why bufstats~ is needed for the buffer workflow: to reduce the entire slice of samples down to one sample representing their average. For the real-time input I can use fluid.stats with a matching sample/coefficient count as the argument, and @history to smooth it.
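To check my understanding of the shapes, here is a rough Python sketch of the two reductions described above. It is not FluCoMa code: `mean_per_channel`, `RunningMean`, and the numbers are made-up stand-ins, and I'm only assuming that the buffer side averages each coefficient over the slice while the live side averages the last few incoming lists.

```python
from collections import deque
from statistics import fmean


def mean_per_channel(channels):
    """Buffer side: reduce each channel (one coefficient over time) to its mean."""
    return [fmean(values) for values in channels]


class RunningMean:
    """Live side: mean of the last `history` incoming coefficient lists."""

    def __init__(self, history):
        self.frames = deque(maxlen=history)  # oldest list drops out automatically

    def push(self, frame):
        self.frames.append(frame)
        # zip(*...) regroups the stored lists coefficient by coefficient
        return [fmean(column) for column in zip(*self.frames)]


# Made-up analysis of one slice: 3 coefficients (channels) x 4 values each.
slice_features = [
    [1.0, 2.0, 3.0, 4.0],   # coefficient 0 over time
    [0.5, 0.5, 0.5, 0.5],   # coefficient 1 over time
    [-2.0, 0.0, 2.0, 0.0],  # coefficient 2 over time
]
summary = mean_per_channel(slice_features)  # 3 channels, 1 value each

smoother = RunningMean(history=2)
first = smoother.push([1.0, 3.0])
second = smoother.push([3.0, 5.0])
third = smoother.push([5.0, 7.0])  # only the last two lists are averaged
```

Either way the end result is one number per coefficient, which is the point of the statistics step in both pipelines.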
At this stage, in theory, I have the same information from the buffer and the real-time audio; the only difference is how it is laid out. Say I'm going with 6x coefficients for the MFCC:
- the buffer, after bufstats, should result in 6x channels with 1x sample each (I set it to output the mean only)
- the live input from fluid.mfcc will be 1x channel with 6x samples, and in theory should not need any statistics done to it, as it's returning the 6x coefficients for whatever time block it analyses in real time.
So the next node in the buffer pipeline is fluid.bufflatten, which at this stage should put my buffer in the same format as the live input. Since bufflatten attaches the channels one after the other, and there are 6x channels with 1x sample each, the result should be 1x channel with 6x samples, the same format as the live input?
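As a sanity check on that layout change, here is a small Python sketch of "attach the channels one after the other". The function name and values are hypothetical and this is not how the FluCoMa object is implemented; it only illustrates that with one value per channel, the flattened result has the same shape and order as a single live list.

```python
def flatten_channels(channels):
    """Concatenate channels one after the other into a single flat list."""
    return [value for channel in channels for value in channel]


# Hypothetical output of the statistics step: 6 channels x 1 value (the mean).
buffer_means = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6]]

# Hypothetical live list: 1 "channel" holding 6 coefficients.
live_frame = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

flat = flatten_channels(buffer_means)  # now 1 channel x 6 values
```

With only one value per channel, the order is the same whether the values are grouped by channel or by time position, so the coefficient order should line up with the live list.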
For the umap stage, as you said, I can use it for both pipelines. It will map from x amount of "samples" per channel down to 2x samples, which I can then scale for the 2D table in the same way for the live and buffer values, so the find-nearest coordinates I send to kdtree should be accurate.
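Here is how I picture the "same scaling for both" part, as a Python sketch. Everything in it is an assumption for illustration: the 2D points are invented, min-max scaling is just one possible choice, and the brute-force search is a simple stand-in for what a k-d tree would return. The key idea is that the ranges are measured once on the buffer points and then reused unchanged for the live point.

```python
import math


def fit_min_max(points):
    """Measure the (min, max) range of each of the two dimensions."""
    xs, ys = zip(*points)
    return (min(xs), max(xs)), (min(ys), max(ys))


def scale_point(point, ranges):
    """Map a point into 0..1 using previously measured ranges."""
    return tuple(
        (value - lo) / (hi - lo) if hi != lo else 0.0
        for value, (lo, hi) in zip(point, ranges)
    )


def nearest_index(query, points):
    """Brute-force nearest neighbour by Euclidean distance."""
    return min(range(len(points)), key=lambda i: math.dist(query, points[i]))


# Invented 2D positions for three slices after dimensionality reduction.
corpus_2d = [(-2.0, 1.0), (0.0, 3.0), (2.0, 5.0)]
ranges = fit_min_max(corpus_2d)  # measured once, on the buffer data only
scaled_corpus = [scale_point(p, ranges) for p in corpus_2d]

# Invented live point, scaled with the SAME ranges as the corpus.
live_2d = (1.8, 4.6)
scaled_live = scale_point(live_2d, ranges)

match = nearest_index(scaled_live, scaled_corpus)
```

If the live point were scaled with its own separate ranges, it would land in a different coordinate space and the nearest match would no longer mean anything.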
At first I was really confused, but I think writing out my question has helped a lot already. I'm going to try this out now. Let me know if my approach is correct!
Thank you for your time!