… dream about how we can do this on sounds?
… are slightly afraid?
Have you seen the research with WaveNet, RNNs and WaveGAN? You need some serious gear to get some pretty crappy results. Why is visual stuff so far ahead!?
I heard them, hence the word ‘dreaming’. I don’t know for sure why the visual world is so far ahead, and @groma is the specialist here, but I think it is both cognitive bandwidth (the eye is more easily tricked) and commercial incentive (the cinema world has A LOT of money)…
Click through a bit and you see mouths and such repeat. So, while the person did not exist, the body part did, I think. More of a Mr Potato Head kind of vibe than completely reconstructing waveforms.
Also, I could listen to the generated talking on this page all day:
a good metaphor for concatenation indeed
Speaking of speaking of concatenation, the WaveNet article has not been evaluated by musicians, I reckon… the Blue Lagoon comparison sounds worse than the concatenative example, yet they rank it superior…