PESTO - pitch detection algorithm

Much like the Hz thread from a while back, I came across a newer algorithm from Sony today called PESTO: Pitch Estimation with Self-supervised Transposition-equivariant Objective.

Basically a neural pitch detection algorithm.

Someone’s made a Max external, which is handy: pesto~.

I had a quick play with it, and although it’s quite shit at smaller block sizes, at 1024 (the default, it seems) the tracking is really nice. Quite a bit better than Yin.

Obviously there are so many other things that would be interesting to have (if/when dev time comes up), but I just wanted to post this here in case it’s of interest as an example of more cutting-edge pitch detection (whereas I don’t think Hz is so much cutting edge as use-case-optimized).

That’s cool. Can you give us a bit of quantification on this, maybe some plots and stats comparing the two?

Not gone that deep with it, just did some A/B-ing with it in patches and listening.

Don’t really know what kind of material would be suitable for quantifying the difference.
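
For what it’s worth, one way to put numbers on it would be to run both trackers over material with a known pitch (a synthesized tone, say) and measure how far each frame lands from the truth in cents. A rough Python sketch, not anything from PESTO or the Max external: the frame values are made up, `compare_tracker` is a hypothetical helper, and the 50-cent tolerance is an assumption borrowed from common pitch-evaluation practice.

```python
import math
from statistics import median


def cents_error(est_hz, ref_hz):
    """Signed error in cents between an estimate and a reference (both in Hz)."""
    return 1200.0 * math.log2(est_hz / ref_hz)


def compare_tracker(estimates_hz, reference_hz, tolerance_cents=50.0):
    """Summarise how closely frame-wise estimates follow a known pitch.

    Frames where the tracker reported no pitch (None or <= 0) count as misses
    for the accuracy figure and are left out of the median error.
    """
    errors = []
    for est, ref in zip(estimates_hz, reference_hz):
        if est is None or est <= 0:
            continue
        errors.append(abs(cents_error(est, ref)))
    total = min(len(estimates_hz), len(reference_hz))
    hits = sum(1 for e in errors if e <= tolerance_cents)
    return {
        "accuracy": hits / total if total else 0.0,
        "median_abs_cents": median(errors) if errors else float("nan"),
    }


# Made-up frame-wise output from two trackers on a steady 440 Hz tone.
reference = [440.0] * 5
tracker_a = [440.0, 441.0, 880.0, 439.5, None]  # one octave error, one dropout
tracker_b = [440.2, 439.9, 440.1, 440.0, 440.3]

stats_a = compare_tracker(tracker_a, reference)
stats_b = compare_tracker(tracker_b, reference)
print("tracker A:", stats_a)
print("tracker B:", stats_b)
```

The accuracy figure punishes octave errors and dropouts, while the median shows how tight the tracking is on the frames that did land, so the two numbers together say more than either alone.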

Actually here’s a quick screen capture:

nice. thanks for the demo!

can you compare with pesto @conf 0.70 or fftyin with threshold at 0.98? What I hear is yin spitting out pitches with low confidence (0.7 is low)…

it does look fast though
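
To illustrate the point about thresholds, here is a minimal Python sketch of gating pitch frames by confidence. It is only an illustration: the `(pitch, confidence)` frames are invented, `gate_by_confidence` is a hypothetical helper, and it does not claim to reproduce what `@conf` or the fftyin threshold do internally. Confidence scales may also not be directly comparable between algorithms.

```python
def gate_by_confidence(frames, threshold):
    """Keep pitches whose confidence meets the threshold; others become None."""
    return [pitch if conf >= threshold else None for pitch, conf in frames]


# Invented (pitch in Hz, confidence) frames: two solid readings, one shaky, one junk.
frames = [(220.0, 0.99), (447.3, 0.72), (221.1, 0.985), (90.0, 0.40)]

loose = gate_by_confidence(frames, 0.70)   # the shaky 447.3 Hz frame gets through
strict = gate_by_confidence(frames, 0.98)  # only the two solid frames survive
print(loose)
print(strict)
```

With a loose gate the doubtful frame is reported as if it were a real pitch, which is the kind of "spitting" that makes an A/B comparison unfair unless both trackers are gated at comparable strictness.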