Best buf-based slicer for keeping track of onsets *and* offsets?

So I’m starting to work on a slicer/juggler thing built around the idea of taking incoming audio (in a long rolling buffer), segmenting it based on some criteria, and then playing back a sequence/pattern of the events that are found.

Simple enough, but what I’m not sure about is which of the ...buf...~-based slicers would give me good segmentation, critically including the offsets. Ideally I’d want to isolate “notes”/“events” that may be spaced out in time, so treating the time between slices as the whole segment isn’t ideal here.

I was thinking of using fluid.bufnoveltyslice~ as the core slicer, as I’d like the segmentation to be fairly robust across a range of material, and perhaps chasing that up with fluid.ampslice~ to find more precise boundaries, but neither of those spits out an offset. Presumably fluid.ampslice~ knows what its internal “offset” is due to the hysteresis algorithm, but I can only access that with fluid.bufampfeature~, which I could then re-threshold to find onsets/offsets(?).

There’s fluid.ampgate~, but having absolute thresholding makes that pretty useless for arbitrary input.

Looking at fluid.bufnoveltyfeature~ there’s a curve output, but I guess the algorithm has more of a big spike where novelty is detected, with little taper afterwards.

I guess I could do some parallel analysis and determine what “offsets” are in a different way, but this seems like something that should be a solved problem, or at least one with established ways of approaching it.

Any thoughts/suggestions?


Very much an unsolved problem! I think you’re right that a mixture of approaches would be the way to go. Perhaps a combination of the novelty feature and amp feature (which is the one from ampslice~), and maybe also ampgate as a last resort…


2 passes with the 2nd being ampgate with threshold set relative to the overall loudness of a slice?

so many interesting ways to go about it, depending on the actual sounds and applications!
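For what it's worth, here is a rough offline sketch of that two-pass idea in plain Python rather than Max. It assumes the onsets have already been found by a first pass (novelty or otherwise), and it uses the peak absolute sample value of each slice as a crude stand-in for "overall loudness". The function name `relative_offsets` and the `rel_db` parameter are made up for illustration and are not part of any FluCoMa object.

```python
def relative_offsets(samples, onsets, rel_db=-30.0):
    """Second pass: for each slice, find an offset using a threshold
    set relative to that slice's own peak level.

    samples : list of floats (mono audio)
    onsets  : sorted sample indices from a first-pass slicer (assumed given)
    rel_db  : threshold in dB relative to the slice peak (hypothetical parameter)
    Returns a list of (onset, offset) pairs; offset is exclusive.
    """
    if rel_db > 0:
        raise ValueError("rel_db should be 0 or negative")
    ratio = 10.0 ** (rel_db / 20.0)          # dB to linear amplitude ratio
    bounds = list(onsets) + [len(samples)]   # last slice runs to end of buffer
    events = []
    for start, end in zip(bounds[:-1], bounds[1:]):
        chunk = samples[start:end]
        peak = max((abs(x) for x in chunk), default=0.0)
        if peak == 0.0:
            continue                         # silent slice, nothing to keep
        thresh = peak * ratio
        # Index of the last sample still at or above the relative threshold.
        last = max(i for i, x in enumerate(chunk) if abs(x) >= thresh)
        events.append((start, start + last + 1))
    return events


# Toy buffer: a loud event and a quiet one, both followed by near-silence.
samples = [0.0, 1.0, 0.5, 0.01, 0.0, 0.0, 0.2, 0.1, 0.001, 0.0]
events = relative_offsets(samples, onsets=[1, 6], rel_db=-20.0)
```

Because the threshold follows each slice's own level, the quiet event gets trimmed in proportion rather than being swallowed by an absolute gate. In practice you would probably want a smoothed envelope instead of raw samples.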


Hmm, I would have thought that “segmentation” took this kind of stuff into account, but I guess the expectation is more to have things be done butt-to-butt.

Having to make multiple passes isn’t the end of the world, it’s just a matter of trying to assess the same criteria on both ends. I guess what’s good about novelty is that it looks at stuff other than just loudness, so to use that for the “onset” and then loudness for the “offset” could lead to odd edge cases.

I’ve not really played with novelty enough to know what tweaking all the parameters meaningfully does, but is it always so “spike”-y like in the help file? I would have imagined/pictured something more like an envelope follower-esque curve where it rides around a bit, where I can then set some schmitt trigger thresholding to get the start and end points of novelty (with perhaps a separate fluid.ampslice~ step afterwards anyway to get the timing (potentially) more precise).
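To make the schmitt trigger idea concrete, this is roughly what I have in mind, sketched in Python on a list of curve values (it could be a novelty curve or an amplitude envelope, dumped from a buffer). The function name and the two threshold parameters are just illustrative, and this is not how any of the fluid. objects work internally.

```python
def schmitt_segments(curve, on_thresh, off_thresh):
    """Two-threshold (hysteresis) segmentation of a feature curve.

    A segment starts when the curve rises to on_thresh or above, and ends
    when it next falls below off_thresh. Returns (onset, offset) index pairs.
    """
    if off_thresh > on_thresh:
        raise ValueError("off_thresh should not exceed on_thresh")
    segments = []
    start = None                      # None means the gate is currently closed
    for i, v in enumerate(curve):
        if start is None and v >= on_thresh:
            start = i                 # gate opens: onset
        elif start is not None and v < off_thresh:
            segments.append((start, i))   # gate closes: offset
            start = None
    if start is not None:
        # Curve ended while the gate was open; close it at the end.
        segments.append((start, len(curve)))
    return segments


curve = [0.0, 0.1, 0.6, 0.9, 0.5, 0.35, 0.1, 0.0, 0.7, 0.4, 0.2]
segments = schmitt_segments(curve, on_thresh=0.5, off_thresh=0.3)
```

The gap between the two thresholds is what stops the gate chattering when the curve hovers around a single value. Whether this is usable on a novelty curve depends on the curve riding around rather than just spiking.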

Audio DSP has solved fewer problems than you might think. But this is also one reason I prefer ‘slicing’ to ‘segmentation’. Basically, the whole notion of detecting the end of a sound is just much trickier. For one thing, it’s not especially clear that there’s an obvious perceptual mechanism to try and mimic: I don’t think we’re often terribly alert to the moment when a sound stops, unless we’re concentrating on that particular thing.

IIRC the help file uses quite a small kernel size so that the time scales are pretty short, hence spiky. The bigger you make the kernel, the smoother the curve (but the greater the latency). Novelty slice uses a pretty simple peak picker, just looking for local maxima where a sample is greater than both of its neighbours, which is basically [zl stream 3] -> [expr .... Offhand I don’t know what a minimum in the novelty curve might denote, offset-wise, but it might be useful as a switch type of thing to start then looking for when the signal amplitude goes below some point.
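As a rough illustration of that kind of three-sample peak picker, here it is in Python rather than as a patch. The `threshold` argument is my own addition for the sketch, so treat it as an assumption and not a description of the actual implementation.

```python
def pick_peaks(curve, threshold=0.0):
    """Return indices where a value is strictly greater than both neighbours.

    Slides a three-sample window over the curve. The threshold is an
    assumed extra condition to ignore very small bumps.
    """
    peaks = []
    for i in range(1, len(curve) - 1):
        prev, cur, nxt = curve[i - 1], curve[i], curve[i + 1]
        if cur > prev and cur > nxt and cur >= threshold:
            peaks.append(i)
    return peaks


novelty = [0.0, 0.2, 0.1, 0.05, 0.8, 0.3, 0.3, 0.1]
all_peaks = pick_peaks(novelty)                  # every local maximum
strong_peaks = pick_peaks(novelty, threshold=0.5)  # only the big spike
```

Note that a flat top (two equal neighbouring values) is not reported as a peak with the strict comparison used here.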