I decided to revisit this, as I’ve learned a bunch from the melbands range and smoothing/reduction in this thread, as well as some problematic fft settings in this thread.
After some quick initial testing (took me a bit to figure out the new syntax/messaging) I still get better results from the MFCCs vs the melbands. I did, however, now get good results from the MFCCs when using a smaller analysis window of 256 (using @weefuzzy’s suggestion of @fftsettings 256 64 512). I think I get slightly better matching using a larger analysis window of 512 but the tradeoff in latency isn’t worth it.
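For a rough sense of that tradeoff, here is a quick back-of-the-envelope sketch. It assumes a 44.1 kHz sample rate (not something I stated above, so swap in your own) and reads the three fftsettings numbers as window size, hop size and FFT size. The function names are just made up for illustration.

```python
# Rough latency/resolution numbers for the two analysis windows.
# ASSUMPTION: 44.1 kHz sample rate; change SAMPLE_RATE to match your setup.
SAMPLE_RATE = 44100


def window_latency_ms(window_size, sample_rate=SAMPLE_RATE):
    """Time needed to fill one analysis window, in milliseconds."""
    return 1000.0 * window_size / sample_rate


def fft_bins(fft_size):
    """Number of non-redundant bins for a real-input FFT of this size."""
    return fft_size // 2 + 1


for window_size in (256, 512):
    print(f"{window_size} samples: {window_latency_ms(window_size):.2f} ms")

# With an FFT size of 512, the 256-sample window is zero-padded,
# which gives more (interpolated) bins without adding any latency.
print(f"bins at FFT size 512: {fft_bins(512)}")
```

So under that assumption, going from 256 to 512 roughly doubles the wait before there is anything to analyse, which is the bit that isn't worth it for me.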
So at this point it’s looking like I’m going to be extracting every bit of juice out of those 256 samples (“normal” descriptor stuff/stats, 40 melbands, 12 MFCCs + stats), each for a separate function later on.
Also wanted to give a bump to this section. I don’t remember any specific mention of the semi-supervised tweaking/updating in the last release(s), but the objects have been refactored, so it’s possible it happened and I didn’t notice.