Time stretch in Max

@spluta, thanks for showing your time stretch example today - very inspiring indeed. I looked around the net for a Max implementation, and there is one by Volker Boehm, loosely based on Paulstretch.

https://vboehm.net/downloads/

the object is called vb.stretch~

Hello Hans!

@danieleghisi implemented an experimental Paulstretch.
Thank you Paul… Thank you Daniele…

Otherwise, Zynaptiq’s ZTX time stretching is built into Max via stretch~, sfplay~, groove~, etc. It can sound almost the same.

Ahahah… yes, it’s a bare-bones sketch of an implementation, but I can definitely share it if you need it.

ears.paulstretch~.mxo.zip (543.3 KB)
ears.paulstretch~.maxhelp.zip (3.9 KB)


For FFT time stretch I would use @a.harker’s amazing FrameLib, especially since it has buffer~ access, so you can do the bufnmf~ resynthesis and then do the stretch with random phases (à la Paulstretch), plus a lot more crazy stuff too! @jamesbradbury has made a lot of cool tutorials, so the entry point to that universe is now quite approachable, I think. But I don’t know what its state is in SC, so for @spluta it might not be the ultimate option.
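For anyone who wants the gist of the random-phase trick mentioned here, below is a minimal numpy sketch of the Paulstretch core idea. The window size, hop choices, and function name are my own illustrative assumptions; this is not FrameLib, vb.stretch~, or ears.paulstretch~ code.

```python
# Minimal sketch of the Paulstretch core idea: step through the input
# more slowly than the output hop, and randomize the phase of every
# FFT frame so bins keep their energy but lose their timing.
import numpy as np

def paulstretch(signal, stretch=8.0, window_size=4096):
    hop = window_size // 2            # output hop (50% overlap)
    in_hop = hop / stretch            # input pointer advances more slowly
    window = np.hanning(window_size)
    n_frames = int((len(signal) - window_size) / in_hop)
    out = np.zeros(n_frames * hop + window_size)
    for i in range(n_frames):
        start = int(i * in_hop)
        frame = signal[start:start + window_size] * window
        spectrum = np.fft.rfft(frame)
        # Keep magnitudes, discard phases: this smears the sound into
        # the characteristic Paulstretch texture.
        phases = np.exp(2j * np.pi * np.random.rand(len(spectrum)))
        frame = np.fft.irfft(np.abs(spectrum) * phases) * window
        out[i * hop:i * hop + window_size] += frame
    return out
```

The second multiplication by the window acts as a synthesis window, which keeps the overlap-add free of frame-boundary clicks.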


The main thing I would want to add to mine is some way of dealing with transients, but I think that is just adding some kind of TransientSlice prior to time stretching and keeping the transients somehow. Sounds easy. I know it isn’t.

I imagine you might have tried this already, but I’ve had good results from HPSS-ing stuff and treating the ‘P’ (percussive) component that way. Even though it’s not as temporal an algorithm as the transient-slice one, most of the time it behaves that way, giving you a transient-y thing at the start of the file and then loads of silence for the percussive component.
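For reference, the median-filtering idea behind HPSS (à la Fitzgerald) can be sketched in a few lines. This is an illustrative hard-mask version with my own function name and parameters, not the exact algorithm of any particular toolbox (real implementations such as librosa’s add soft masking and more):

```python
# Median-filtering HPSS sketch: medians along time pick out the
# harmonic part, medians along frequency pick out the percussive part.
import numpy as np
from scipy.ndimage import median_filter
from scipy.signal import stft, istft

def hpss(signal, fs, kernel=17):
    f, t, Z = stft(signal, fs=fs, nperseg=1024)
    mag = np.abs(Z)
    harm = median_filter(mag, size=(1, kernel))   # smooth across time
    perc = median_filter(mag, size=(kernel, 1))   # smooth across frequency
    # Hard complementary masks: each bin goes to whichever dominates.
    mask = harm >= perc
    _, h = istft(Z * mask, fs=fs, nperseg=1024)
    _, p = istft(Z * ~mask, fs=fs, nperseg=1024)
    return h, p
```

Because the masks are complementary, the two components sum back to the original STFT reconstruction.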

Hi Daniele, I hear a pitch change when running it at 48 kHz. Did you hardcode the sample rate or do I need to provide some parameters?

No, the sample rate is not hard-coded; it should be taken from the sample rate of the buffer. (I’ve always had some trouble, though, managing buffer sample rates in Max.)

// Returns the sample rate stored in the buffer~ object itself,
// not the global DSP sample rate.
t_atom_float ears_buffer_get_sr(t_object *ob, t_buffer_obj *buf)
{
    return buffer_getsamplerate(buf);
}

Hey Hans and all,

I’m not sure if you have kept working on this, but I think I have finally optimized the algorithm. A friend of mine was working on this in Python (it is really his algorithm), and we now have SC and Python versions up there, though the Python version is not as developed and can’t deal with big files:

There is even a help file in there now. Anyhow, the main improvement to the stretch (in addition to it now using 4-frame overlap rather than 2) is that it now splits the original sound file into 9 discrete frequency bands, using smaller frame sizes for higher frequencies. Starting with a largest frame of 65536 samples, the algorithm uses the following frequency/frame-size breakdown:

0-86 Hz : 65536
86-172 Hz : 32768
172-344 Hz : 16384
344-689 Hz : 8192
689-1378 Hz : 4096
1378-2756 Hz : 2048
2756-5512 Hz : 1024
5512-11025 Hz : 512
11025-22050 Hz : 256
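Those band edges fall straight out of halving the window per band: the top edge doubles each time the window halves. Assuming a 44100 Hz sample rate (my assumption; variable names are mine too), a quick sketch reproduces the breakdown:

```python
# Derive the 9-band frequency/frame-size breakdown: each band's upper
# edge is (sr/2) / 2^(8-i), and the window halves as frequency doubles.
sr = 44100
largest = 65536
bands = []
low = 0.0
for i in range(9):
    win = largest >> i                  # 65536, 32768, ..., 256
    high = (sr / 2) / 2 ** (8 - i)      # 86.13, 172.27, ..., 22050.0
    bands.append((int(low), int(high), win))
    low = high
for lo_f, hi_f, win in bands:
    print(f"{lo_f}-{hi_f} Hz : {win}")
```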

For pop music, with important high-end information, this is really freaky. For something like classical orchestral music, I have a setting that uses the 4096 buffer for everything above 1378 Hz, and that gets rid of some obnoxious aliasing.

Sam


Hi Sam,

thanks for sharing. I had initially hoped to implement something along those lines in FrameLib-land, but time constraints don’t allow for it right now. I’ll keep your code as a reference for later.

actually, maybe @jamesbradbury will want to do something like this in our still-missing tutorial example of multiresolution FFT processing, no?


I feel like maybe I am an old man who just discovered 90s hip hop (no, really, the Notorious BIG is amazing!), but here is the algorithm on a Katy Perry song:

https://drive.google.com/file/d/1fyF3syfPvfAHcJv4ZejcVN7Y7HkQOeBC/view?usp=sharing

I guess this is the point when my music just becomes drones for the rest of my life. But to me this is pornographic.


A good example is still needed. I will investigate…

Can I bug you over e-mail with some questions about this? I’m terribly incompetent at picking apart the SC code, and I think this could be a great musical example for FrameLib’s multiresolution FFT.


of course.

Cool! I’ll try to dissect your code, then get back to you with anything that I get stuck on.

It is basically this:

which I couldn’t get to work, but at least the code is there in your beloved Python

BUT the big difference is that there are 9 of them running in parallel, each with its own window size, and inside the FFT I am zeroing out the bins that do not correspond to the frequency range of that window size. So for the 65536 window I zero all the bins above 86 Hz, then for 32768 I zero out all the bins below 86 Hz and above 172 Hz, etc. SC has a really nice PV_BrickWall that makes this easy.
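Here is a numpy sketch of that per-band zeroing, a stand-in for what PV_BrickWall does (the function name and parameters are my own, not SC’s):

```python
# Zero the rfft bins outside a band's frequency range, so each parallel
# stretcher contributes only its own slice of the spectrum.
import numpy as np

def brickwall(spectrum, win, sr, lo_hz, hi_hz):
    # Bin k of an rfft taken over `win` samples covers k * sr / win Hz.
    freqs = np.arange(len(spectrum)) * sr / win
    out = spectrum.copy()
    out[(freqs < lo_hz) | (freqs >= hi_hz)] = 0.0
    return out
```

For example, `brickwall(spec, 32768, 44100, 86, 172)` keeps only the bins of the 32768 window that fall between 86 and 172 Hz.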

One of the last things I figured out was that I had to delay all the FFTs by 65536 minus the window size so that they all lined up. I thought the output was going silent for the first frame… but that is just how FFT latency works, haha.
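That alignment step can be sketched in a couple of lines (variable names are mine): each band’s latency is proportional to its window size, so the smaller windows need compensating delays to line up with the 65536 band.

```python
# Delay (in samples) each band needs so all 9 chains line up with the
# largest (65536-sample) window.
largest = 65536
windows = [largest >> i for i in range(9)]   # 65536 down to 256
delays = [largest - w for w in windows]      # 0, 32768, ..., 65280
```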

The delay part makes sense to me, but I think a FrameLib implementation would have you keep all the bins and then filter at the sink rather than inside the FFT with the brick wall. I’ll check this code out.

Oh god no, it’s one of those code bases with no spaces. This is not what Guido wanted.

There will certainly be a sonic difference between a brick wall and a bank of bandpass filters. I don’t know which would be preferable. The brick wall seems cleaner, and I don’t hear any artifacts, but I will leave it up to my more DSP-knowledgeable colleagues to tell me if I am wrong. My biggest concern is whether there is a bin of overlap between adjacent bands. I am fairly certain there isn’t (I had to look at the SC source), but I can’t be certain.