Quite impressive on the surface, yet it feels like we can pinpoint the various sources, so they must be doing the matching at a fairly macroscopic scale. Anyway, in line with the style transfer examples, but this time a Rick Astley mashup…
Not sure what’s happening, but it’s crazy how the AI version has its own chorus too. Like, it knows enough to know that a chorus is a thing and that one should happen.
On a V100, it takes about 3 hrs to fully sample 20 seconds of music.
At $10,000 per card, that is a lot of compute time and money.
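To put the 3 hrs per 20 seconds figure in perspective, here is a quick back-of-the-envelope sketch. It only uses the numbers quoted above; the 180-second track length is a made-up example, and it assumes sampling time scales linearly with audio length, which may not hold in practice.

```python
# Figures quoted in the thread: ~3 hrs on one V100 for 20 seconds of music.
SAMPLE_HOURS = 3.0
AUDIO_SECONDS = 20.0


def slowdown_factor(sample_hours=SAMPLE_HOURS, audio_seconds=AUDIO_SECONDS):
    """Seconds of compute spent per second of audio produced."""
    return sample_hours * 3600.0 / audio_seconds


def hours_to_sample(track_seconds, sample_hours=SAMPLE_HOURS,
                    audio_seconds=AUDIO_SECONDS):
    """GPU-hours for a track, ASSUMING time scales linearly with length."""
    return track_seconds * sample_hours / audio_seconds


print(slowdown_factor())     # 540.0 -> 540x slower than real time
print(hours_to_sample(180))  # 27.0 -> hypothetical 3-minute track
```

So on a single card that is roughly 540x slower than real time, and a 3-minute song would take a bit over a day under the linear assumption.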