Fluid.bufnoveltyslice~ pitch algorithm jumps between generating 2 slicepoints and ALL the slicepoints

scottmclaughlin · April 3, 2024, 4:40pm

[this is partially answered in another thread with Rodrigo but not fully, or maybe things have changed]

I’m using fluid.bufnoveltyslice~ to find pitch changes within files. With default settings, the pitch algorithm gives a radically different response when I cross a threshold tipping point of about 0.33. See these pics for comparison.

I’ve tried cranking up the kernel and filter size (and raised FFT to 2048) to no avail; though it would be useful if the helpfiles could indicate some viable ranges for non-programmers to try out, I have literally no idea what size kernel/filters to try.

Any suggestions on strategies or approaches that might work better for me? In my ultimate use case I’m going to be throwing a lot of very different field recording files at it, so tuning this per file in realtime isn’t going to be an option.

FWIW, the spectrum and MFCC algorithms are more pliable, but not quite what I’m after.

cheers,
Scott

tremblap · April 3, 2024, 8:25pm

Hello Scott

I think the best intuitive way to understand novelty and pitch and how they work together would be to read this:

https://learn.flucoma.org/reference/noveltyslice/

and to play with the helpfile of fluid.(buf)noveltyfeature~ to see how it reacts/peaks on a small subset. I find that seeing the descriptors helps me tweak and get a sense of what makes a difference.
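To illustrate why looking at the novelty curve helps, here is a minimal Python sketch of the kind of tipping point described in the opening post. It is not FluCoMa's implementation: the `count_peaks` helper and the curve values are hypothetical, made up for illustration. The assumption is simply that a slicer picks local maxima of a novelty curve that sit above a threshold.

```python
def count_peaks(curve, threshold):
    """Count local maxima in `curve` that are strictly above `threshold`."""
    count = 0
    for i in range(1, len(curve) - 1):
        is_local_max = curve[i] > curve[i - 1] and curve[i] >= curve[i + 1]
        if is_local_max and curve[i] > threshold:
            count += 1
    return count


# Hypothetical novelty curve: two strong peaks and several weak peaks
# of nearly identical height (values invented for this example).
curve = [0.0, 0.9, 0.0, 0.3, 0.0, 0.31, 0.0, 0.32,
         0.0, 0.3, 0.0, 0.95, 0.0, 0.31, 0.0]

# Sweep the threshold and watch how the slice count changes.
for threshold in (0.5, 0.35, 0.33, 0.3, 0.25):
    print(threshold, count_peaks(curve, threshold))
```

When many weak peaks share almost the same height, a small change in threshold moves all of them in or out at once, so the slice count jumps rather than changing gradually. Plotting the real curve from fluid.(buf)noveltyfeature~ shows whether that is what is happening in a given file.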

For more technical explanations, there are references at the bottom of the Learn page. Pitch is complicated in terms of changes.

Can I ask why pitch, and not chroma? Field recordings are not usually super monophonic and pitch is good with monophonic…
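For anyone unfamiliar with the distinction: chroma folds pitch into 12 pitch classes regardless of octave. A minimal Python sketch of that folding, assuming 12-tone equal temperament and A4 = 440 Hz (the `pitch_class` helper is hypothetical and is not how FluCoMa computes chroma, which works on the whole spectrum rather than a single frequency):

```python
import math


def pitch_class(freq_hz, a4_hz=440.0):
    """Map a frequency to a pitch class 0-11 (0 = C), ignoring octave.

    Assumes 12-tone equal temperament with A4 = a4_hz.
    """
    midi_note = 69 + 12 * math.log2(freq_hz / a4_hz)
    return round(midi_note) % 12


# The same note name in different octaves lands on the same class.
print(pitch_class(440.0), pitch_class(220.0), pitch_class(880.0))
```

Because octave is discarded, a descriptor of this kind does not need a single confident fundamental, which is part of why it may suit dense material better than a monophonic pitch tracker.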

If you post a link to a sample file, I’m happy to help brainstorm how to segment and/or find areas of change.

rodrigo.constanzo · April 4, 2024, 12:22am

That’s some spicy behavior!

From re-reading the thread, it seems like Pitch, in specific, is handicapped by the (internal) normalizing of pitch and confidence in a way that is unavoidable, which is a bummer.

The thresholding being all over the place for different algorithms doesn’t help, but it seems like in the case of some algorithms in specific, it’s near impossible to get novelty slicing to work “properly”.

tremblap · April 4, 2024, 7:15am

It works on monophonic pitch vs no pitch - it is quite experimental.

scottmclaughlin · April 4, 2024, 9:41am

Thanks, I did read the reference but not the linked papers: TBH I thought they’d be too technical for me, but actually now that I look at them they are useful (if still a little over my head in some ways).

@tremblap you’re right, I should try chroma instead. My test file is a very clean clarinet melody (I wanted to get the basic patch working before trying field recordings), but I may as well dive into what the real world situation will be.

tremblap · April 4, 2024, 10:19am

Hello @scottmclaughlin nice to see you here!

The thing is, segmentation is always signal dependent. The more you do it, with various tools, the more you get a hunch of which tool will be best / least bad for a given task… hence me offering to talk at a higher level with audio examples: what is it you are trying to isolate, and in what kind of environment/context? That way, we can start to unpick what tool might be best.

I hope this helps