Much like the Hz thread from a while back, I came across a newer algorithm from Sony today called PESTO: Pitch Estimation with Self-supervised Transposition-equivariant Objective.
Basically a neural pitch detection algorithm.
Someone’s made a Max external, which is handy: pesto~.
I had a quick play with it, and although it’s quite shit at smaller block sizes, at 1024 (the default, it seems) the tracking is really nice. Quite a bit better than Yin.
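For a rough sense of what those block sizes mean in time, here’s a quick sketch. Note the assumptions: I’m treating the block size as a count of samples and using a 44.1 kHz sample rate, and I haven’t checked what pesto~ actually does internally.

```python
def block_duration_ms(block_size, sample_rate=44100.0):
    """Length of one analysis block in milliseconds.

    Assumes block_size is in samples; sample_rate is an assumed default.
    """
    return 1000.0 * block_size / sample_rate


# Compare a few block sizes (1024 is the one mentioned above)
for n in (128, 256, 512, 1024):
    print(n, "samples ->", round(block_duration_ms(n), 2), "ms")
```

So a smaller block gives the tracker a much shorter slice of audio to work with, which might be part of why it struggles there.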
Obviously there are so many other things that would be interesting to have (if/when dev time comes up), but I just wanted to post this here in case it’s of interest as an example of a more cutting-edge pitch detection algorithm (whereas I don’t think Hz is so much cutting-edge as use-case-optimized).
It seems like the confidence in pesto is way stickier/better (even with this funky vocal sample), and it also seems to track the microtonal stuff even better (by comparison, Yin sounds a bit sharp most of the time).
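If anyone wants to put a number on “a bit sharp”, the gap between two trackers’ readings can be expressed in cents. A minimal sketch: the frequencies below are made-up values for illustration, not measurements from pesto~ or Yin, and `cents_between` is just a helper name I picked.

```python
import math


def cents_between(f_ref_hz, f_hz):
    """Signed distance from f_ref_hz to f_hz in cents (positive = sharp)."""
    return 1200.0 * math.log2(f_hz / f_ref_hz)


# Hypothetical readings of the same note from two trackers
reference_hz = 440.0
other_hz = 443.0
print(round(cents_between(reference_hz, other_hz), 1), "cents")
```

Logging both trackers’ frequency outputs on the same sample and running them through something like this would show whether the offset is consistent or only happens on certain notes.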