Internal/random buffer~ creation à la ears.stuff~

So based on @danieleghisi’s presentation at the Friday geek out session, it struck me that the way that ears.stuff~ handles buffer~-based operations is suuper elegant. Even seeing him prototype an idea on the screen share took no time… and more importantly, no manual creation of buffer~s.

This also relates a bit to what @a.harker brought up as a general interface-y thing.

But I guess the general idea is that, unless otherwise specified, every object creates and manages its own internal buffer (with random names à la Jitter, e.g. u2535233523) and then passes that name out of its outlet. So if you want to do a series of processes, you just create a serial chain of objects, send a starting message/buffer in at the top, and at the bottom you are returned a randomly generated buffer~ reference.
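To make the idea concrete, here is a minimal sketch (in Python, and emphatically not real FluCoMa or ears.stuff~ API; `Stage` and `chain` are hypothetical names) of each pseudo-object allocating its own uniquely named internal buffer and passing the generated name downstream:

```python
import uuid

# Hypothetical sketch: each processing stage owns one internal buffer
# with a Jitter-style random name, and hands that name to the next
# stage, so the user never names a buffer~ by hand.
class Stage:
    def __init__(self, process):
        self.process = process                         # the DSP operation
        self.buffer_name = f"u{uuid.uuid4().hex[:9]}"  # random internal name

    def __call__(self, source_name):
        # read from the incoming buffer name, write into our internal one
        self.process(source_name, self.buffer_name)
        return self.buffer_name                        # passed downstream

def chain(stages, source_name):
    name = source_name
    for stage in stages:
        name = stage(name)
    return name  # a randomly generated buffer reference, as described above
```

Chaining is then just `chain([slice_stage, fade_stage, reorder_stage], "input")`, mirroring the object/cable/object/cable patching described here.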

You can obviously still specify a buffer~ if you want or need it, as well as specifying if you want to do things ‘in place’ which @jamesbradbury brought up in this thread.

For my own silly shenanigans I’d still have to create and manage buffer~s in order to @blocking 2 everything, but for prototyping and/or offline processes, the workflow is so much nicer.

Really, watching @danieleghisi whip up a patch that slices a buffer~, applies a fade, and reorders the segments (3-4 objects, maybe 10 seconds of coding, with not a named buffer~ in sight) looked so effortless! The equivalent coding in the fluid.verse~ (presuming you could do fades) would require figuring out how many buffer~s you needed, naming them all and remembering the names (as well as making sure you haven’t already used each name elsewhere), etc…

So this is more an interface discussion/prompt than a feature request, but there are some feature-request undertones to it!

It might even look as simple as source $1, bang for every step in the chain.

Maybe @rodrigo.constanzo needs to read the forum more thoroughly…

Even simpler: if an object receives a randomly generated name (one that is presently in the memory space, or passes some other kind of check), it executes on it sans bang. So you could literally just have object/cable/object/cable.

I mean, that would be gravy all the way.

I don’t know about you, but the last thing I do when making a new post is check to see if someone has made that same post before…


I mentioned this during the latest FluCoMa geek chat, but I wanted to bump this thread about the buffer~ stuff in light of some of the suggestions coming out of the use-case prompts in the “fluid.datasetfilter~” thread.

As I mentioned to @tremblap, I spent a few weeks away from FluCoMa stuff working on some other things (largely 3d-printing!) but I came back to it last week in order to tidy up some of the code I was working on before.

One of the things that struck me was the amount of friction involved in “data housekeeping”. As in, I knew what descriptors I wanted, what stats I wanted, and where I wanted them all, but it still took me the better part of an hour to get it up and running. And if one thing changed (another stat or descriptor), that often broke everything, particularly in @blocking 2 mode.

Granted, @blocking 2 means that things will always be fiddly (actually, is there any overhead to just making buffer~ foo @samps 500 500 for everything? I imagine the @samps 500 probably doesn’t matter, but I have no idea if having loads of “channels” does weird stuff), but the kind of coding required to manage this stuff is pretty far removed from any creative process. There are things that mitigate some of the unpleasantness (though being js-based they won’t jive with @blocking 2), but even with that, you need to think in “channels” and “indices”, which is not the same as “descriptors” and “statistics”. Figuring out that I want the 3rd sample from the 2nd channel is still required.

I’ve rung this bell long enough, but I only bring it back up because the prospect of doing all of this “at a zoomed out level” with fluid.dataset~s is pretty daunting.

So if building something that lets you do a (simple) binary hierarchical search inside a fluid.dataset~ requires you to remember (or note down elsewhere, since there is no symbolic notation anywhere) which indices in your fluid.dataset~ correspond to what, just to cleave off the bits you need, it becomes the same kind of fiddly “data management” problem that kills the creative coding flow (for me, at least).

So this bump is part-bump for having better ‘low-level’ data management, but it’s also a pre-bump for future ‘high-level’ data management.

There will be a notional performance hit due to cache misses, because buffer data is stored consecutively by channel rather than consecutively in time. However, whether it is noticeable is another matter: almost all (maybe all) of the NRT algorithms copy their data into a time-sequential layout before processing anyway, so the cache misses will only be on the read/write from the [buffer~].
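The layout point above can be illustrated with a tiny sketch (plain Python, not the actual internals; the numbers are made up): channel-major storage keeps all of one channel together, so reading one time frame across channels jumps around in memory, and the NRT copy step rearranges the data into time order first.

```python
channels = 2
frames = 4

# channel-major, as in a multichannel buffer~:
# all of channel 0, then all of channel 1
channel_major = [[10, 11, 12, 13],   # channel 0
                 [20, 21, 22, 23]]   # channel 1

# copy into a time-sequential (interleaved) layout:
# frame 0 of every channel, then frame 1, and so on
time_sequential = [channel_major[ch][fr]
                   for fr in range(frames)
                   for ch in range(channels)]
# → [10, 20, 11, 21, 12, 22, 13, 23]
```

After the copy, an algorithm that walks frame by frame reads memory strictly in order, which is why the cache misses are confined to the one read/write pass.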

Is it not possible to use a dict or a coll in exactly the same way as you would add labels to entrymatcher? If you found a workflow that made sense to you for this, you could make some abstractions that wrap (e.g.) bufselect, datasetquery etc which would, in turn, give us food for thought about the shape of a possible future. At the moment though we’re still at the ‘getting it working at a low level in multiple hosts’ stage.
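A minimal sketch of that dict/coll idea (Python standing in for a Max dict; the descriptor names and column numbers are invented for illustration): keep one small mapping from human-readable labels to dataset column indices, so a patch can ask for “centroid mean” rather than remembering raw numbers.

```python
# Hypothetical label table: (descriptor, statistic) -> column index
columns = {
    ("centroid", "mean"): 0,
    ("centroid", "std"):  1,
    ("flatness", "mean"): 2,
    ("flatness", "std"):  3,
}

def column_of(descriptor, statistic):
    # look up the dataset column for a named descriptor/statistic pair
    return columns[(descriptor, statistic)]
```

An abstraction wrapping datasetquery could consult such a table to translate label-based requests into index-based ones.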

Ok, that’s really good to know. (makes mental note to be fussy about channels, but cavalier about samples) It’s often the samples part that is harder to figure out, with channels usually being multiples of 7 (for stats or spectral moments etc…).

This is an obvious way to go, but I’ve put off this for now hoping there will be an internal solution later on. So far my patches have been exclusively “oldschool” (entrymatcher / coll / with a bit of dict in the mix) or “newschool” (only fluid.dataset~).

I guess for some of the things I’m thinking, fluid.datasetquery~ wouldn’t be able to do the manipulations yet, so it would just be an exercise (worth doing) in tidying.

I wouldn’t internalize that too deeply, lest you fall into the trap of working around a performance problem that isn’t really an issue. The only way to really be sure is to measure, or at least try it and see how things behave. Like I said, the performance impact could well be completely negligible.

Yes for spectralshape (because these are time series), but the stats are frame-wise, no?

Worth doing both from the normal wholesome point of view, but also it will be really helpful to us to start to see the repeated patterns that lead to boilerplate which we may be able to abstract away in future versions.


Whoops, my bad. Yeah you’re right. I guess, either way, I tend to get hung up on the sample count rather than the channel count.

Indeed. I’ll start having a think about this, but this is very much starting to overlap with the stuff being discussed in this thread. Off the top of my head, an abstraction where you define each descriptor along with what stats you want for it (similar to descriptors~) would be quite convenient: centroid 500 10000 mean min max 0.95 std etc…
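A rough sketch of how a spec like that could be parsed (Python, purely hypothetical; `parse_spec` is not any real object, and the rule that bare numbers after a stat keyword are percentile requests is my assumption about the `0.95` in the example):

```python
def parse_spec(spec):
    # first token is the descriptor name, e.g. "centroid"
    tokens = spec.split()
    name, rest = tokens[0], tokens[1:]
    ranges, stats = [], []
    for tok in rest:
        try:
            value = float(tok)
            # numbers before any stat keyword are range arguments;
            # numbers after are read as percentile requests (e.g. 0.95)
            (stats if stats else ranges).append(value)
        except ValueError:
            stats.append(tok)
    return {"descriptor": name, "range": ranges, "stats": stats}
```

So `parse_spec("centroid 500 10000 mean min max 0.95 std")` would yield the centroid descriptor, a 500–10000 range, and the stat list `["mean", "min", "max", 0.95, "std"]`.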

That could get messy quickly if you want the same stats for loads of descriptors, or if, by default, you want “everything”, as that would get really verbose.

Another alternative would be to have content-aware labels, where you can give it attributes or tickboxes for whether it’s dealing with spectral descriptors, or stats, or x amount of derivatives, and then you can be verbose with it: @source source @destination destination @descriptors centroid flatness rolloff @statistics mean min max. Or something like that.