Onset detection that focuses on pitch?

rodrigo.constanzo · July 4, 2025, 1:35pm

So at the moment I’m working on some stuff with a melodica, and I’m having a really hard time with all of the onset detection algorithms (ampslice works, but is erratic, onset isn’t much better, and novelty has a weird issue (more below)).

Ideally I’d want to have an onset detected when a new pitch is detected. As far as I can tell, this is only explicitly available in fluid.noveltyslice~ with the pitch algorithm (@algorithm 3).

Sadly there’s some fluid.noveltyfeature~ plumbing that normalizes pitch/confidence in a way that is inseparable (as outlined in this thread).

Specifically this bit:

I initially thought about bumping that thread to see if @weefuzzy had a chance to:

But as a broader concept, I figured it was worth making a ‘best practice’ inquiry.

For context, here’s a bit of the kind of audio I’m working with:
shortexample.wav.zip (1.5 MB)

There are a few moments with large amplitude changes corresponding with notes, but there’s a kind of weird offset in the audio at those moments, which often makes the amplitude differential go down when an onset happens, rather than up.

But short of an updated fluid.noveltyfeature~, what would be a good way to determine onsets with changes in pitch?

And/or is there a reason that there isn’t a pitch-based algorithm available on fluid.onsetslice~/fluid.onsetfeature~?

Lastly, while wrapping up this post, I was reminded of the Hz algorithm, which I posted about here. I’ve not tested that out yet, nor messed with any pre-processing of the melodica signal either, which would probably help.

tremblap · July 6, 2025, 10:38am

quickly because I’m running between things, but I’ve had good results using chroma in novelty on the bass.

more soon

rodrigo.constanzo · July 6, 2025, 12:07pm

That does look better overall as chroma is intrinsically “normalized”.

re:latency, is there a difference between fluid.noveltyslice~ and fluid.onsetslice~ with comparable fft sizes? In my mind novelty is always “slow”, but that could just be because of how I’ve seen it used.

And/or is it possible to take the output of fluid.chroma~ and manually compute a difference in consecutive frames (ala what fluid.onsetslice~ is doing for the spectral features).
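As a rough illustration of the "manual difference" idea, here is a minimal Python sketch that works on a list of chroma frames (for example, values logged from fluid.chroma~). It is not FluCoMa's implementation: the function names, the unit-sum normalisation, the half-wave rectification, and the threshold and minimum-gap values are all assumptions made for the example.

```python
def _normalise(frame):
    """Scale a chroma frame to unit sum so loudness changes are ignored."""
    total = sum(frame)
    return [v / total for v in frame] if total > 0 else list(frame)


def chroma_flux(frames):
    """Return one difference value per frame (the first frame gets 0.0)."""
    flux = [0.0]
    for prev, curr in zip(frames, frames[1:]):
        p = _normalise(prev)
        c = _normalise(curr)
        # Half-wave rectified difference: only count energy that
        # arrives in a pitch class, not energy that leaves it.
        flux.append(sum(max(cv - pv, 0.0) for pv, cv in zip(p, c)))
    return flux


def pick_onsets(flux, threshold=0.3, min_gap=2):
    """Return frame indices where flux exceeds threshold (hypothetical values)."""
    onsets = []
    last = -min_gap
    for i, value in enumerate(flux):
        if value > threshold and i - last >= min_gap:
            onsets.append(i)
            last = i
    return onsets


# Toy input: three frames of one pitch class, then three of another,
# at a different loudness to show that the normalisation ignores level.
c_frame = [1.0] + [0.0] * 11
e_frame = [0.0] * 4 + [0.5] + [0.0] * 7
frames = [c_frame] * 3 + [e_frame] * 3
onsets = pick_onsets(chroma_flux(frames))
```

Because each frame is only compared with the one before it, a detector like this needs no look-ahead beyond the analysis window itself.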

tremblap · July 6, 2025, 12:22pm

there is - there is even a parameter that tells you - in effect you need the kernel size to ‘compare’ a local similarity/novelty.

taking the output of chroma~ and computing a manual difference is a possibility to explore, you can do it with a sort of weighted average (aka centroid) I reckon. but then why not use pitch~…

share along your experiments.

p

rodrigo.constanzo · July 6, 2025, 12:35pm

I know they report latency, but I’m not entirely sure what “novelty” vs “difference” means, in a temporal domain.

Because it’s “broken” (i.e. normalized pitch and confidence means it responds very poorly in context).

For now I’m still using ampslice and just dealing with the weirdness, but it’s not ideal for general use cases.

tremblap · July 6, 2025, 12:40pm

I mean fluid.pitch~ instead of fluid.chroma~ that you proposed to use instead of novelty.

tremblap · July 6, 2025, 12:40pm

I thought that by now, you would have stopped believing in ‘general use cases’ - there is no such thing

tremblap · July 6, 2025, 12:45pm

learn.flucoma has a good reference for both. novelty is this idea that you create local similarity matrices (a context, the kernel size), then you check how different the new frame is from that context. difference is now vs just before, and onset is made for onsets, so it looks for various assumptions in the spectrum of what an attack is: a whiter spectrum than the frame before, in various flavours.
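To make the "local context" idea concrete, here is a minimal Python sketch of a Foote-style novelty curve: a plain checkerboard kernel slid along the diagonal of a self-similarity matrix. It is an illustration under assumptions, not FluCoMa's code: cosine similarity, the unweighted kernel, and all names are choices made for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two feature frames (0.0 if either is silent)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def novelty_curve(frames, kernel_size=4):
    """Score each frame by how much the frames after it differ from the
    frames before it. kernel_size is assumed to be even."""
    half = kernel_size // 2
    n = len(frames)
    curve = [0.0] * n
    for t in range(half, n - half + 1):
        past = frames[t - half:t]
        future = frames[t:t + half]
        # Checkerboard kernel: +1 on the past/past and future/future
        # quadrants, -1 on the two past/future quadrants.
        within = sum(cosine_similarity(a, b) for a in past for b in past)
        within += sum(cosine_similarity(a, b) for a in future for b in future)
        across = 2 * sum(cosine_similarity(a, b) for a in past for b in future)
        curve[t] = (within - across) / (kernel_size * kernel_size)
    return curve


# Toy input: four identical frames followed by four different ones.
frames = [[1.0, 0.0]] * 4 + [[0.0, 1.0]] * 4
curve = novelty_curve(frames, kernel_size=4)
```

In this sketch, frame t cannot be scored until half a kernel of later frames has arrived, which is one way to see why a larger kernel goes with more latency, whereas a frame-to-frame difference only needs the previous frame.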

yogi · July 6, 2025, 1:33pm

Hello, wouldn’t using fluid.onsetslice on the output of fluid.pitch work in this case?

rodrigo.constanzo · July 6, 2025, 1:47pm

Ah right. I guess some difference computation on the confidence would be indicative of something, but I’m not entirely sure that on its own that would be enough (e.g. a single sustained note glissing won’t vary in confidence much, I would think).

Well, one part is I’m working on a video/teaser idea where I’m using melodica, so for the purposes of this video, I’m just going with ampslice, but being able to segment by pitch is something that I think would be good to optimize.

In that case, I suppose novelty is the better choice here since it would be better suited to a broader/varied difference. The thresholding just makes it near impossible to use (pitch/loudness have near 0 values for just about all material (pitch in particular is awful)).

In this case I’m more interested in determining that a “change in pitch” has happened at all, and fluid.onsetslice~ does a really poor job of it anyways. I spent a bit of time tweaking the settings and trying different algorithms, but none of them worked even remotely.

To be fair, I think that the melodica audio is particularly problematic for some reason, so I imagine some aggressive pre-processing is in order, but I would assume that something would be able to work even a little.

tremblap · July 6, 2025, 7:03pm

we cannot yet do non-fft… but doing ampslice (detrending) could be a fun experiment. just remove the filter and it should work (famous last words)

yogi · July 9, 2025, 8:04am

Oops, I meant ampslice, yes. Though I find it a bit tricky to convert pitch to dB.

tremblap · July 9, 2025, 9:04am

dB is log, so you can run it on midicents output (also log) and that might help
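A minimal Python sketch of that idea: convert a pitch track to MIDI cents, then compare a fast and a slow one-pole follower, loosely in the spirit of an ampslice-style detector with the filter left out. This is not how fluid.ampslice~ is implemented; the coefficients, thresholds, and function names are hypothetical, and the absolute difference is used so that downward pitch changes also trigger.

```python
import math


def hz_to_midicents(hz, ref_hz=440.0):
    """Convert a frequency in Hz to MIDI cents (A4 = 440 Hz = 6900)."""
    return 100.0 * (69.0 + 12.0 * math.log2(hz / ref_hz))


def pitch_change_onsets(pitches_hz, fast_coeff=0.9, slow_coeff=0.2,
                        on_threshold=50.0, off_threshold=20.0):
    """Return frame indices where the fast follower pulls away from the
    slow one by more than on_threshold cents (all values are assumptions)."""
    onsets = []
    fast = slow = None
    armed = True
    for i, hz in enumerate(pitches_hz):
        if hz <= 0:
            continue  # skip frames with no usable pitch estimate
        mc = hz_to_midicents(hz)
        if fast is None:
            fast = slow = mc  # initialise both followers on the first pitch
            continue
        fast += fast_coeff * (mc - fast)
        slow += slow_coeff * (mc - slow)
        diff = abs(fast - slow)
        if armed and diff > on_threshold:
            onsets.append(i)
            armed = False  # wait for the followers to converge again
        elif not armed and diff < off_threshold:
            armed = True
    return onsets


# Toy input: ten frames of A4, then ten frames roughly a whole tone higher.
pitches = [440.0] * 10 + [493.88] * 10
onsets = pitch_change_onsets(pitches)
```

Working in cents keeps the thresholds musically meaningful (50 is a quarter tone) regardless of register, which is harder to get from raw Hz.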
