The problem of samples (vs milliseconds)

Now that we’re on the other side of the gig, I’m able to reflect on some of the other things that happened leading up to it.

The one that’s relevant here is that at one point, while I was doing some coding on a plane, I noticed I was getting all kinds of errors in one of my patches that I hadn’t been getting before. It turned out my headphones had set my sampling rate to 48k, which broke a lot of the time-based settings/@attributes/etc… in my patch.

I decided to somewhat sanitize “time” for that patch by putting in a loadmess followed by a mstosamps~, which is easy enough, but it struck me that this will become much more complicated as we move forward into TB2 and/or start dealing with musically-relevant units of time (e.g. @tremblap’s APT (ATP?!) stuff).

It seems that if one of the central ideas of the project is to have interoperability, having patches that break when other sampling rates are used isn’t ideal.

Obviously there are some simple things one can do, like specifying analysis windows in ms (the unit they’re conceived in) and dynamically converting them to samples in the patch based on the sample rate in use. Easy peasy.
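For the record, the conversion itself is trivial: samples = ms × sr / 1000. A minimal sketch (Python here, purely to show the arithmetic that loadmess → mstosamps~ does in the patch):

```python
def ms_to_samps(ms, sr):
    """Convert a duration in milliseconds to a whole sample count at a given rate."""
    return round(ms * sr / 1000.0)

# The same 1000 ms analysis window is a different number of samples at each rate:
for sr in (44100, 48000, 192000):
    print(sr, ms_to_samps(1000, sr))  # 44100, 48000, 192000
```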

Where it gets more complicated is where statistics come into play.

For example, in @tremblap’s APT patch, there is a section that is computing statistics for different ranges of frames:

`startframe 2, numframes 6, bang, startframe 7, numframes 9, bang, startframe 15, numframes 31, bang`

This is only correct at 44.1k. If you go to 48k you’ve “sped things up”, and if you go to 192k you’ve all but lost the meaning and usefulness of the overall idea.

This can be programmed around, but it requires a lot more surrounding code for each individual object (ms->samples->frames/offsets->attributes), and it also adds more surface area for errors; I don’t know about you, but I don’t often test my patches at other sample rates.
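To give a sense of the surrounding code this implies, here’s a hedged sketch (Python, hypothetical helper names) of the ms → samples → frames chain: if the regions were defined in ms and converted per sample rate, the statistical intent would survive a rate change. The 512-sample hop and the ms values are assumptions, chosen so that the 44.1k case lands on the startframe 2 / numframes 6 example above:

```python
def ms_to_frame(ms, sr, hop_size):
    """Convert a time in ms to a spectral frame index, given the analysis hop size."""
    return round(ms * sr / 1000.0 / hop_size)

def region_to_frames(start_ms, end_ms, sr, hop_size):
    """Return (startframe, numframes) for a region defined in milliseconds."""
    start = ms_to_frame(start_ms, sr, hop_size)
    return start, ms_to_frame(end_ms, sr, hop_size) - start

# The same ~23-93 ms region is a different frame range at each sample rate:
for sr in (44100, 48000, 192000):
    print(sr, region_to_frames(23, 93, sr, 512))
# 44100  (2, 6)   <- matches the hardcoded values above
# 48000  (2, 7)
# 192000 (9, 26)
```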

//////////////////////////////////////////////////////////////////////////////////////////////////////

Something I remember mentioning very early on (I think at a plenary, as I was unable to find it on the forum) is that we tend to think (and speak) in units of time. Like the onset of a sound being within the first 50ms or whatever. Or analyzing 1000ms of audio. An onset doesn’t happen within the first 2205 samples of audio, as a sample count is a non-specific amount of time until you know the sample rate. This is obviously critical when trying to decompose and/or reconstruct audio based on temporal units.

I understand the usefulness of having frames/samples as consistent units of processing and audio, since fft stuff happens in that domain, and since the alternative, setting the (presumed) @starttime / @numtimes in fractional times (e.g. a 3.5818ms start time and a 13.45351ms analysis window), would be super awkward.

But now we have a situation where a significant part of the shared/included code, and nearly all the code that is time-series-specific, is presumed to be at 44.1k. In the best case we get something where the process just happens “faster”. Not great, but not the end of the world. But the bigger concern is where things break and/or throw error messages, as happened to me in one of my patches.

So this thread is mainly about establishing a ‘best practice’ to make patches sample rate agnostic (particularly when it comes to pesky @startframe / @numframes in statistical contexts), and/or coming up with a programmatic solution (perhaps something like a @relative flag in buf-based objects, so that @numframes / @startframes are computed relative to 44.1k).

Maybe this is the job for abstractions, tagged onto every fluid.buf... object, but if that’s the case, it would be worthwhile establishing some kind of ‘house style’ for that, to make things easier to deal with in the future.

This is interesting actually, and I hadn’t thought about the implication of changing SR affecting the grains of analysis in that way. I never leave 44.1kHz, so I’ve never come up against this kind of issue, and I’m not sure how big a problem this would be for the health of the objects into the future.

@units could be a solution: every object that has a startframe / numframes actually just becomes start / end, and then respects the @units attribute. This is me just riffing on @rodrigo.constanzo’s ideas, but I actually like that kind of configurability per object, without having to abstract with other Max objects.

Again though, it adds learning overhead for anyone who hasn’t been on the journey with the objects so far, and adds another level of user management which might be better solved by dynamically modifying the things that need it depending on sample rate. My 2c.
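To sketch what I mean (Python pseudocode, all names hypothetical, not any actual implementation):

```python
def to_samples(value, units, sr):
    """Resolve a user-facing time value to samples, per a hypothetical @units attribute."""
    if units == "samples":
        return int(value)
    if units == "ms":
        return round(value * sr / 1000.0)
    if units == "seconds":
        return round(value * sr)
    raise ValueError(f"unknown units: {units}")

# The same user intent ('start 50 ms in') resolves correctly at any rate:
print(to_samples(50, "ms", 44100))   # 2205
print(to_samples(50, "ms", 192000))  # 9600
```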

Yeah it’s a tricky one. Samps-as-units is pretty entrenched in the workflow, and as I mentioned I tend to stay in 44.1k as well, but it’s a huge assumption to make, as even 48k is fairly common.

A @units flag could be a cool solution, and that way you deal with everything in the manner that makes sense for you.

I’m also halfway thinking about raising the sample rate as another approach to minimizing overall latency, so having sr-agnostic code is a useful step towards that.

I think your title is misleading: it should be ‘My problem with samples’ :wink:

There are pros and cons to both, but a lot more pros to staying in samples. For instance, if you do batch processing of many files, each with a different SR, you would need to be aware of that… so a little pain for a lot of potential freedom is what we do, as usual.

If you stay in 44.1, there is no issue.
If you move, it is a larger issue than you say, so you’ll have to be disciplined depending on what you want to do with it.
If you move between CCEs, for batch processing, that is even truer.

We have investigated the SC way (in seconds) and it creates (for me) at least as many issues, so we’ll stick with samples, which is the only true measure. If you make a simple division with a simple query (which, in SC, we have to do the other way round) then it is sorted everywhere for your use case.
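In other words (a minimal Python sketch, with the buffer’s sample rate standing in for whatever query the CCE provides):

```python
def secs_to_samps(seconds, buffer_sr):
    """Staying in samples: one query (the buffer's own rate) and one multiplication."""
    return round(seconds * buffer_sr)

# Works per buffer in a batch job, independent of the host sample rate:
for sr in (44100, 48000, 96000):
    print(sr, secs_to_samps(0.5, sr))  # 22050, 24000, 48000
```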

Obviously there is no perfect solution. I’m just highlighting the fact that when we (including and specifically you) speak, we speak in units of time.

And more importantly, changing sample rates breaks a lot of things.

So at minimum, the helpfiles and examples should be coded to be sr-agnostic.

Ooof!
So how many samples is an onset? Or a second?

My tuppence: I, too, like the ability in FrameLib to specify different units, because we (I) tend to think in different units of time for different species of task. However, it certainly doesn’t stop me making mistakes (most commonly, addressing something in samples that expects ms).

For Max attributes, there is actually a host-provided feature (that came in with transports etc.), which allows you to select the units of temporal attributes. See, e.g., the inspector for groove~'s loop start and end points. Personally, I think this is quite neat, but nobody else seems to agree with me :man_shrugging:
AFAICT, this would allow us to keep the ‘natural’ units in samples, but give (Max) users a mechanism to address in units that were more comfortable.

Certainly, if we are (ever) to address this, I’d rather do it at the host level, than start to muddy up algorithm code with lots of conditionals on time units.


Yeah something like that would be great!

Is that just a wrapper flag you can enable somewhat easily, or is it an overhaul-y thing?

A little of column A and a little of column B. At the back end, we’d need to flag which parameters represent units of time in a way that doesn’t break the existing code, and then there are some mild acrobatics to do in the Max wrapper. Given that I also want to explore ways of being able to reduce some of the noise by grouping related parameters / attributes, I’d probably experiment with both at the same time and see if either has any actual use.


That is neat. Kinda like when I found out you can use constants a lot of the time too… So is this a thing that you could pull over to FluCoMa, or is the API hidden?

This was also a source of confusion in the beginning, but I love the freedom to tweak it.

It’s not hidden, but I don’t understand it yet :smiley: In any case, we’d need a bit more scaffolding from the FluCoMa code to make it useful. All in good time…


Giving this a bump as I was reminded of this by the 3-way splitting stuff in the other thread.

It strikes me that if any other sample rate is chosen for that patch, it would break. Or rather, the initial ‘click’ I’m after IS 2ms. It isn’t 88 samples; 88 samples can mean anything from 2ms to 0.46ms in real-world time, depending on the sample rate. Now, I can work around that by using loadbang->mstosamps~->fluid.bufwhatever~, which gets messy really quick and undermines the massive usefulness of @attributes for all the objects.
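The arithmetic, just to make the mismatch concrete (Python, nothing assumed beyond the numbers above):

```python
def samps_to_ms(samples, sr):
    """How much real-world time a fixed sample count actually covers."""
    return samples / sr * 1000.0

# A hardcoded 88 samples only means '2 ms' at 44.1k:
for sr in (44100, 48000, 192000):
    print(sr, round(samps_to_ms(88, sr), 2))  # 2.0, 1.83, 0.46 (ms)
```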

Plus, some things get really, really messy when you’re analyzing some number of frames, and offsets into those frames, all of which depend on the sample rate.

So yeah, wanted to bump this as there was promising discussion of @units and/or leveraging the Max API to handle this without overly complicating the implementation.

Thanks for the bump. Yes, I can still appreciate the usefulness of this, but it’s not going to happen very soon. Also, I’m not sure it would even help in a robust way when dealing with buffers, whose sample rates can be different from the host sampling rate: I don’t know if the API bits for temporal units are clever enough to catch that.

I appreciate at this point, as far in as things are, that this wouldn’t be a priority as the “paradigm ship has sailed”, as they say…

I would think that having files with different sample rates is a known problem that a user can work around or should be aware of (similarly, really long files come with some overhead and maintenance). The @units kind of thing, on the other hand, addresses something that can break pretty badly in a way that would be invisible to the user, even when only going to 48k.
