Hi everyone,
Does anyone have experience getting the duration of notes played in a trumpet or sax solo?
I've tried some simple things with onset and transient detection, but the results are not optimal. Note onsets are detected quite well, but when long notes are played I can't really get a system to detect when they end.
I know this is a quite generally stated question, but any help or direction is welcome.
In the project I'm working on, I'm trying to get 'interesting' data out of live-played jazz to drive real-time visuals in Touch Designer.
I'm using the FluCoMa lib for Max/MSP. I'm quite handy in Max, so there's no real problem there.
Thanks a lot for any answer.
Best,
Mark
hello!
Did you try the noveltyslice on chroma? I am using both (onset for attacks and chroma for note changes) and it works quite well for the changes in a legato move. Gates are good for note ending, obviously, otherwise the attack/note-change is the end of one slice and start of another slice.
Do I understand you well?
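If it helps to see the idea outside of Max, here is a very rough Python/numpy sketch of what "novelty on chroma" is getting at: a legato note change barely moves the amplitude, but the chroma vector jumps, so frame-to-frame distance on chroma gives you a change point. The synthetic frames and the threshold are made-up illustration values, not what fluid.noveltyslice~ actually computes internally.

```python
import numpy as np

# Synthetic "chroma" frames: 12 pitch classes per frame, a legato line
# where the energy moves between pitch classes with no new attack.
frames = np.zeros((200, 12))
frames[:80, 0] = 1.0      # first note (pitch class C)
frames[80:140, 4] = 1.0   # second note (pitch class E), legato change
frames[140:, 7] = 1.0     # third note (pitch class G)
frames += 0.02 * np.random.rand(*frames.shape)  # a little noise

# Novelty curve: cosine distance between consecutive normalised frames.
norm = frames / (np.linalg.norm(frames, axis=1, keepdims=True) + 1e-9)
novelty = 1.0 - np.sum(norm[1:] * norm[:-1], axis=1)

# Simple peak picking: rising edge where the novelty crosses a threshold.
threshold = 0.5
changes = np.flatnonzero((novelty[1:] > threshold) & (novelty[:-1] <= threshold)) + 1
print("note-change frames:", changes)  # should land near the changes at frames 80 and 140
```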
Hi Pierre, thanks for your quick reply.
I just tried that and it seems to be somewhat more accurate.
Do you mean with noveltyslice on chroma… using the chroma algorithm setting in the noveltyslice object?
I also tried fluid.loudness~ to build some gate logic myself, but fluid.ampgate~ is probably way more suitable.
I think you understood me well.
If I understand you correctly… your advice is a combination of noveltyslice and ampgate, right?
I would also love to look into (and have done a bit of) training / making a corpus of trumpet (or sax, etc.) notes and then feeding a live-played trumpet into it, to match / map notes to a visual output based on their position in the 2D corpus space.
But that’s another approach all together I suppose.
Thx again.
yes, you can choose which descriptor noveltyslice works on with the "algorithm" parameter.
https://learn.flucoma.org/reference/noveltyslice/
ampgate is only that with some other clever features (high-pass filter, smoothing, Schmitt trigger, and recovery), but used basically it is a gate.
https://learn.flucoma.org/reference/ampgate/
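the hysteresis part (two thresholds, so the gate doesn't chatter around a single level) is the bit that helps most with note endings. a rough Python sketch of just that idea, with made-up dB values - this is not the actual ampgate code:

```python
import numpy as np

def hysteresis_gate(env_db, on_thresh=-30.0, off_thresh=-40.0):
    """Schmitt-trigger style gate over a loudness envelope in dB.
    Opens when the envelope rises above on_thresh, and only closes
    again once it falls below the lower off_thresh."""
    gate = np.zeros(len(env_db), dtype=bool)
    open_ = False
    for i, v in enumerate(env_db):
        if not open_ and v > on_thresh:
            open_ = True
        elif open_ and v < off_thresh:
            open_ = False
        gate[i] = open_
    return gate

# Toy envelope: silence, a held note that wobbles around -35 dB, silence.
env = np.array([-60, -50, -25, -20, -32, -36, -33, -38, -45, -60], dtype=float)
print(hysteresis_gate(env).astype(int))  # gate stays open through the wobble
```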
depending on how you want to train it, you can use an audio query to a kdtree for instance. you describe a corpus with classes, and the ‘most similar’ will come out. you can also do that with a neural net like in the online tutorial (mlpclassifier). all of this depends on how well you describe your elements/points/objects/slices…
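as a toy illustration of that audio-query idea (scipy here instead of fluid.kdtree~, and the descriptor numbers are invented): describe each corpus slice as a small feature vector with a class label, then query with the same descriptors from the live input and take the nearest neighbour's label.

```python
import numpy as np
from scipy.spatial import cKDTree

# Made-up corpus: each slice described by [spectral centroid (Hz), flatness]
# with a class label. Real descriptors and scaling would come from your analysis.
corpus_features = np.array([
    [1200.0, 0.05],   # clean trumpet note
    [1100.0, 0.04],   # clean trumpet note
    [2500.0, 0.35],   # half-valve / noisy note
    [2700.0, 0.40],   # half-valve / noisy note
])
corpus_labels = ["clean", "clean", "noisy", "noisy"]

# Scale features so one dimension doesn't dominate the distance.
mean, std = corpus_features.mean(axis=0), corpus_features.std(axis=0)
tree = cKDTree((corpus_features - mean) / std)

# Live query: one analysed slice from the incoming trumpet.
live = np.array([2400.0, 0.30])
_, idx = tree.query((live - mean) / std, k=1)
print(corpus_labels[idx])  # -> "noisy"
```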
Thx for your clear answers.
I looked at the online tutorial (mlpclassifier) and also some of the lectures and courses.
I am also looking into the sp.tools by Rodrigo Constanzo. Amazing work.
Although it mainly seems to focus on drums and musical output, it looks like a lot of the data handling is already optimised.
I wonder if that will work for trumpet, sax and piano in more 'traditional' jazz standards.
@rodrigo.constanzo might have insights. in the meantime, starting with the sptools and bending them (you can open them all and they are incredibly well documented) will bring you forward while staying in the music… which is the way I started FluCoMa's funding - I played and modded the old (ftm-based) CataRT with its author (the fantastic Diemo Schwarz), then I got stuck, and that snowballed into interface research with the tremendous team we had here.
Note offsets can be a tricky thing.
What I do for all the "gate" outputs in SP-Tools is use amplitude thresholding to get the point where it crosses that threshold going up (the onset) and then again going down (the offset). For short sounds this tends to give you a decent indication of how "long" the note is, but for more sustained instruments that doesn't really work.
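In Python-ish terms (just to show the logic, not SP-Tools code), that amounts to timestamping the gate's up and down transitions and taking the difference:

```python
import numpy as np

def note_durations(gate, hop_ms):
    """Given a boolean gate signal (one value per analysis frame) and the
    frame hop in ms, return (onset_ms, offset_ms, duration_ms) per note."""
    g = gate.astype(int)
    onsets = np.flatnonzero(np.diff(g) == 1) + 1    # 0 -> 1 transitions
    offsets = np.flatnonzero(np.diff(g) == -1) + 1  # 1 -> 0 transitions
    notes = []
    for on in onsets:
        later = offsets[offsets > on]
        if len(later):
            off = later[0]
            notes.append((on * hop_ms, off * hop_ms, (off - on) * hop_ms))
    return notes

gate = np.array([0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0])
print(note_durations(gate, hop_ms=64))  # [(128, 384, 256), (512, 640, 128)]
```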
I personally haven't found novelty or onset slicing to work too well outside of very specialized circumstances, though I imagine chroma could be useful.
In any case, I think the best approach would be to narrow it down to the descriptor(s) that best capture the changes you are after (between amplitude, onsetslice, chroma, novelty), then look at the feature version of them where possible (e.g. fluid.ampfeature~, fluid.onsetfeature~, etc…) and apply your own thresholding after that.
For the note offsets, it may also be beneficial to have parallel loudness tracking going on and/or an envelope follower, which may help massage things for deciding when notes "end".
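For the envelope follower, something like a one-pole follower with a faster attack and a slower release works well; here's a rough Python sketch of the idea (the coefficients are arbitrary):

```python
import numpy as np

def envelope_follower(x, attack=0.2, release=0.02):
    """One-pole follower on a rectified signal: faster attack, slower
    release, so the tail of a long note decays smoothly and an offset
    threshold isn't triggered by every tiny dip."""
    env = np.zeros_like(x)
    level = 0.0
    for i, v in enumerate(np.abs(x)):
        coeff = attack if v > level else release
        level += coeff * (v - level)
        env[i] = level
    return env

# Toy signal: a held note with amplitude wobble, then a real ending.
sig = np.concatenate([np.full(50, 0.8) + 0.1 * np.sin(np.arange(50)),
                      np.zeros(30)])
env = envelope_follower(sig)
print(env[45], env[60], env[79])  # stays high during the note, decays slowly after it ends
```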
Thanks to you both for your replies and advice.
I've been reading up on and following both your work for quite some time now, and I really appreciate it and the way you're driving and building these open communities.
I will first make a rough system for the note on / offset detection… I'm especially interested in longer notes played by trumpet and/or sax, and maybe trombone; I'm focusing on trumpet first. Would it be possible (or fruitful) to create classes of note lengths, i.e. long, medium and short notes? I guess it could be done by combining the on/offset detection with an adaptation of sp.classcreate?
Besides the note length detection, there are a couple of other things we want to try to 'get' from the live music. I started out creating a little system that detected 'nice' / 'clean' notes versus 'noisy' / half-valve notes played on trumpet. It kinda worked when playing that way deliberately, but it did not really work in a live improv within a standard composition (I think also because of the miking situation, etc.).
We also want to get some data from a vocalist.
I'll keep you posted on the project and will gladly explain more about it.
Thx again.
IMO, anything longer than an FFT window size (~256–2048 samples) will "look" the same to the algorithm, in that a medium or long note will be the same, just take longer. In order to differentiate them you can just take the time between onset and offset, really. As in, a short note may have a time between the onset/offset of ~500–1000 ms, and a long note may have a time difference of >2000 ms. In which case you don't even need to use classification (though you could still do something similar, as sp.classcreate is pretty agnostic as to what it receives); you could just have [> 2000.] on the output of the note duration part of your patch and use that to determine the "class".
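Or, in plain Python terms (using the example numbers above, tweak to taste):

```python
def length_class(duration_ms):
    """Bucket a note duration into a 'class' with plain thresholds,
    no classifier needed (the values are only example numbers)."""
    if duration_ms > 2000:
        return "long"
    elif duration_ms > 1000:
        return "medium"
    return "short"

print([length_class(d) for d in (450, 1500, 2600)])  # ['short', 'medium', 'long']
```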
With this you can probably go quite far with just dk.descriptors~ and looking at centroid and flatness. Maybe combine it with sp.filter to send only some sounds down one path and the others down another.
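And in case it helps to see what those two descriptors are measuring, here's a bare-bones numpy version of centroid and flatness (the real objects do more; this is just the core maths):

```python
import numpy as np

def centroid_and_flatness(frame, sr=44100):
    """Spectral centroid (brightness, in Hz) and flatness (noisiness, 0-1)
    of one audio frame, computed from its magnitude spectrum."""
    mag = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) + 1e-12
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
    centroid = np.sum(freqs * mag) / np.sum(mag)
    flatness = np.exp(np.mean(np.log(mag))) / np.mean(mag)
    return centroid, flatness

sr, n = 44100, 1024
t = np.arange(n) / sr
tone = np.sin(2 * np.pi * 440 * t)          # "clean" note
noise = np.random.randn(n) * 0.1            # "noisy" / breathy content
print(centroid_and_flatness(tone, sr))      # low centroid, flatness near 0
print(centroid_and_flatness(noise, sr))     # much higher centroid and flatness
```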
But yeah, report back how you get on!