WaveNetEQ for conference calling

jamesbradbury · April 22, 2020, 12:12pm

A fun and interesting read on audio conferencing tech at Google.

Using neural networks to fill in the gaps of stuttering audio. Skip to the examples - it’s very cool!

tremblap · April 22, 2020, 12:31pm

It is impressive, but it does sound like compressor pumping during the silences with the lo-fi ones… maybe they should use ampgate~. Seriously though, it is quite impressive, and the WaveNet does sound much more ‘natural’.

tremblap · April 22, 2020, 12:34pm

Also I wonder how long the training on the 100 speakers in 48 different languages took… that is our challenge here: small, divergent datasets, and small user patience

jamesbradbury · April 22, 2020, 1:23pm

Humongous. I tried running the RNN code on my GTX 1080. I think it managed to complete a couple of buffers of 64 samples overnight. The speed comes from them providing the training in this circumstance, which almost all the time we can’t do… Perhaps it’ll be the kind of thing that is taken for granted if computation and price catch up. I want to see what happens when it starts going in another direction rather than going for the realistic solution.