Max: comparison patch update


The example folder contains, since Alpha02, a semi-working example called -WiP-comparison.

Now it works in its first (dirty coding) way - you can compare between 2 types of analysis (what to analyse, with which fft settings and which stats) and you can get the nearest 5 matches of each. You can select a target from the analysed corpus or just load one; the target selection is at the top of the patch.
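
As a rough illustration of the "nearest 5 matches" lookup, here is a minimal Python sketch. It assumes each analysed slice has already been reduced to a flat list of descriptor statistics and that matching uses plain Euclidean distance. The function name, the toy corpus and the distance measure are all assumptions for illustration, not what the patch does internally.

```python
import math

def nearest_matches(target, corpus, k=5):
    """Return the names of the k corpus entries closest to target (hypothetical helper)."""
    scored = []
    for name, features in corpus.items():
        # Euclidean distance between two equal-length feature vectors
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(target, features)))
        scored.append((dist, name))
    scored.sort()  # smallest distance first
    return [name for _, name in scored[:k]]

# Toy corpus: slice name -> made-up descriptor statistics
corpus = {
    "slice0": [0.0, 0.0],
    "slice1": [1.0, 0.0],
    "slice2": [0.0, 2.0],
    "slice3": [3.0, 3.0],
    "slice4": [5.0, 1.0],
    "slice5": [9.0, 9.0],
}

matches = nearest_matches([0.2, 0.1], corpus)
print(matches)
```

Running the same target against two corpora built with different analysis settings is one way to think about what the patch lets you compare by ear.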

My next push in this thread will be to add the time bundling (3 windows potentially overlapping in time), then maybe sanitisation. Most of both will happen in the dirty JavaScript :slight_smile:

This is research code, i.e. it is there to understand how things work and interconnect, for me as well. It is shared with all its dirty limits and lack of documentation. (27.8 KB)

Awesome, finally got to play with this today.

Definitely takes some time to run!

A couple suggestions/requests:

  • the ability to skip the segmentation step (if you’re loading a folder of presliced files already)
  • multiple time windows! (separate start/end points, including “rest of initial file”)
  • the ability to take different features for each window (turns out I only need loads of melbands for my first sample window, and not the rest)

By skipping the segmentation, I mean using the boundaries established by the individual files themselves, not treating the whole thing as a single “segment”.

Thanks. This patch is there for people to plunder, so feel free to do what you want/need with it. Things that will happen:

Skipping the segmentation step: that you can do already. Just bypass it. The format is the same at the output of the loader and the slicer.

Multiple time windows: that is happening right now, but not with rest of file.

Different features for each window: that won’t happen in this patch.

The idea is that I share my dirty code. You do what you want with it. There are so many other ways to deal with a similar workflow (polybuf, dicts all over, etc) that I just share it for fun and for people to learn… I’m definitely not catering a semi-app with support!

So plunder away. Hopefully the 3-time-window version will be posted tomorrow morning…

Yeah totally. I’ll modify stuff. I was just making suggestions as to what may be useful for the bach-style abstractions if the point is to make them useful to many different use cases.

That’s good to know. Having it as a toggle in the descriptors section would be handy, since I imagine a more common use case (if you have files that you are concatenating) would be that you have already sliced/segmented them. So the concat + resegment seems to me to be more of a non-standard use case, which comes at the cost of tons of processing time.

Curious what your implementation will be. Say you have segments (after the first couple steps) that range between 300 and 5000ms (or whatever). Will you just have options to do 0-x, 0-x, and 0-x (with no option to be able to analyze the rest of the segments as one of the time windows?)

3 independent x-y without zero padding (if you put a large number in y it takes the rest of valid frames)

Actually, I’m coding it right now, and I wonder what interface makes more sense. Each window is defined as startframe numframes as usual, with a ms calculator to help decide, but here are my 2 options:

  1. the non-patronising, you-are-free-and-responsible way:
    startframe has to be valid; numframes can be -1 (until the end) or a valid number (it is the user’s responsibility to make sure slices have the right minimum length)
  2. the patronising way:
    there is no real way forward for this, because the user can throw anything in, but let’s pretend it just stops if startframe is outside, and will take the min of numframes and end…
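
To make the two options concrete, here is a small Python sketch of how each might resolve a startframe/numframes pair against a slice of known length. The function names are hypothetical, and the handling of negative values in option 2 is an assumption, since the post only says it "stops" or takes the minimum.

```python
def resolve_strict(start, num, total):
    """Option 1: the caller is responsible, so invalid input raises."""
    if not 0 <= start < total:
        raise ValueError("startframe is outside the slice")
    if num == -1:
        return start, total - start  # -1 means "until the end"
    if num < 1 or start + num > total:
        raise ValueError("numframes does not fit in the slice")
    return start, num

def resolve_clamped(start, num, total):
    """Option 2: stop quietly or clamp instead of raising."""
    if not 0 <= start < total:
        return None  # startframe outside the slice: nothing to analyse
    # Take the min of numframes and what is left; never go below zero (assumption)
    return start, max(0, min(num, total - start))

print(resolve_strict(2, -1, 10))
print(resolve_clamped(2, 100, 10))
print(resolve_clamped(12, 3, 10))
```

The strict version surfaces mistakes early, while the clamped one never fails but can hide a slice that is shorter than expected.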

I think I’ll code 1 for now, and we’ll see

Most everything else is start/numframes, so if they’ve come this far… they should be alright with that. Having an ms calculator is handy though.

I like the -1 idea to specify ‘until the end’.

OK, here goes the variable statistical window number and size. If a statistical numframes is -1 it goes to the end; if it is 0 it disables that time frame. So to get the old single-frame behaviour, just enter 0 -1, 0 0, 0 0, which will take everything in the first frame and nothing in the 2nd and 3rd statistical time frames.
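
The -1 and 0 conventions described above can be sketched in Python as follows. This assumes one descriptor value per analysis frame and uses the mean as the only statistic; the function name and the toy data are hypothetical, and the real patch bundles more stats than this.

```python
from statistics import mean

def bundle(frames, windows):
    """Summarise per-frame descriptor values over up to three time windows.

    frames  : list of per-frame descriptor values (one float per frame)
    windows : list of (startframe, numframes) pairs
    """
    out = []
    for start, num in windows:
        if num == 0:
            out.append(None)  # 0 disables this time frame
            continue
        end = len(frames) if num == -1 else start + num  # -1 runs to the end
        out.append(mean(frames[start:end]))
    return out

frames = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
single = bundle(frames, [(0, -1), (0, 0), (0, 0)])  # old single-frame behaviour
triple = bundle(frames, [(0, 2), (2, 2), (4, -1)])  # three consecutive windows
print(single, triple)
```

Windows are independent here, so they may overlap or leave gaps, which matches the "3 independent x-y" description earlier in the thread.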

With this patch you can evaluate many things subjectively:

  • the impact of statistical window size and number
  • the impact of fft and hop size
  • the impact of stats (which and how many deriv)
  • the impact of descriptors (which)

I think that should give us all a little playtime and thinktime :slight_smile:
then you can tinkertime! (27.9 KB)

Awesome, I left an analysis running last night, but I’ll put this to run again today and take a look/listen.

Pleased to have the different time frames in there now too.

p.s. this is such a delightfully “PA” and oldschool looking bit of patch:

pack s i i i s i f f s i i s i i i i i i i s i i i i i i i i i i 1 0 2 0 3 0

I know. The poly~ voice old-school coding method. But hey, dicts could be used there to make it readable… I have other things to do now (other demo/seeds), but when I come back to it one day, if nobody has done a cleanup/pull request, I will recode it as something more readable…

I tend to use join a bunch these days since it can deal with whole lists, or any other data type without having to specify. So a couple of joins upstream which get merged down as things go.

A question about the interface for this*.

I’m trying to recreate the analysis window and timescales from my main patch, in order to compare the different types of descriptors and stats.

In my original I have 256 samples that I can analyze (from the JIT rolling buffer), and I’m using @fftsettings 256 64 512 (zero padding the end as per @weefuzzy’s recommendation).

In my present patch, I have this:
Screenshot 2020-07-04 at 3.50.07 pm
Which gives me 7 frames of analysis total. I ignore the first and last few and take the middle ones, since I don’t want the pull from the zero padding on spectral descriptors (mirroring would be great! (but different story/post)).

I know I can’t do the selective start/numframes in the comparison patch, which is fine, but I have no idea how I’m supposed to designate that I want to create the same type of analysis.

In the -comparison patch, if I put in the correct @fftsettings (manually adding the 512 at the end), I can’t seem to select an amount of frames that corresponds with what I’m doing in my patch.

This seems the closest, but I don’t think this is right:
Screenshot 2020-07-04 at 3.46.56 pm

If I put in 7 frames, which is what I’m getting in my patch, I get this, which is waaay off:
Screenshot 2020-07-04 at 3.55.46 pm

So either I’m not understanding the maths here, or the maths is wrong (i.e. windowsize vs hopsize being multiplied for the ms conversion).

And to be clear, I’m earnestly confused here*. I would actually like to know what I’m supposed to put into this interface to correspond with @startframe 0 @numframes 256 @fftsettings 256 64 512. (The purpose of this is to use the same settings I’m doing in my actual patch, I just want to be able to compare the different descriptors/stats to see what works best, and 256 samples is all I have, and those fft settings are what works best with those few samples).

*I promise this isn’t me trying to be difficult or bust balls, but I actually don’t know what I’m supposed to enter here.

The error is in the duration indeed, nice catch. I was multiplying by the full window and not by the hop (plus one window). I’ll post the corrected patch.
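
For reference, the difference between the two calculations can be sketched in Python. This is one reading of "by the hop (plus one window)", namely that N overlapping frames span (N - 1) hops plus one window; the exact formula in the corrected patch may differ. The 44100 Hz sample rate is an assumption, chosen because it matches the 5.8 ms figure quoted for 256 samples in this thread.

```python
SAMPLE_RATE = 44100  # assumed sample rate

def span_samples(num_frames, window, hop):
    """Samples covered by num_frames overlapping analysis frames."""
    return (num_frames - 1) * hop + window

def to_ms(samples, sample_rate=SAMPLE_RATE):
    """Convert a sample count to milliseconds."""
    return samples * 1000.0 / sample_rate

# 7 frames with a 256-sample window and a 64-sample hop
wrong_ms = to_ms(7 * 256)                    # every frame counted as a full window
right_ms = to_ms(span_samples(7, 256, 64))   # hops plus one window

print(round(wrong_ms, 2), round(right_ms, 2))
```

With these numbers the first calculation gives roughly 40.6 ms and the second roughly 14.5 ms, which is the kind of gap that makes the ms readout look "waaay off".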

There is still a major difference between this patch and yours though.

The short answer is you can’t, for 2 reasons, without changing the patch radically.

  1. the patch is not meant to oversample so your 512 won’t be possible. (a small change in the patch could allow you to do that)
  2. (as explained via email but added here for completeness) even when you enter the right values (256 64), which in your original patch would give you 7 frames, zero padded, this patch analyses the whole segment (however long it is, let’s say 500ms) and will only bundle the descriptors. This is to allow a single analysis to be bundled statistically in up to 3 frames. That changes the zero padding behaviour. The times are right and the same, but the zero padding will be different from a patch that would run the analysis on the first 256 frames and zero pad them in the description.

So in your original patch: 256 samples analysed is 5.8 ms and each hop is 1.45 ms. Your descriptor starts at the first non-empty frame (as explained in bufpitch), which is negative time (zero padded). That patch will actually help you see what each frame contains window-wise. Let’s do it in samples first:
analysis 0: -192 to 64
analysis 1: -128 to 128
analysis 2: -64 to 192
analysis 3: 0 to 256
analysis 4: 64 to 320 (192 + 64 silence)
analysis 5: 128 to 384 (128 + 128 silence)
analysis 6: 192 to 448 (64 + 192 silence)
All of this is windowed.

In my patch, because it does not zeropad on the right (it analyses the rest of the file and it is just statistically clumped) you won’t get the same values on the last ((window/hop) -1) number of frames.
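
The frame list above, and the count of frames that differ, can be reproduced with a short Python sketch. It assumes window - hop samples of zero padding on each side of the analysed excerpt, which is simply what matches the numbers in the list; the function names are hypothetical.

```python
def frame_bounds(num_samples, window, hop):
    """(start, end) sample positions of each analysis frame.

    Assumes window - hop samples of zero padding on both sides,
    which reproduces the list above (an assumption, not a documented rule).
    """
    pad = window - hop
    count = (num_samples + 2 * pad - window) // hop + 1
    return [(i * hop - pad, i * hop - pad + window) for i in range(count)]

def right_padded(num_samples, window, hop):
    """Indices of frames that reach past the last real sample."""
    bounds = frame_bounds(num_samples, window, hop)
    return [i for i, (_, end) in enumerate(bounds) if end > num_samples]

bounds = frame_bounds(256, 256, 64)
affected = right_padded(256, 256, 64)

for i, (start, end) in enumerate(bounds):
    print(f"analysis {i}: {start} to {end}")
print("frames reaching past the excerpt:", affected)
```

For a 256-sample excerpt with a 256 window and 64 hop, the frames reaching past the end are the last (256 / 64) - 1 = 3. Those are the ones that see silence in the original patch but real audio from the rest of the segment in the comparison patch.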

and here is the version with the correct (28.0 KB)

Phew, I thought I was going crazy looking at the numbers, hehe.

I see. Yeah that’s a bit different, particularly with the short frames.

This is what you’re doing in the LPT one too? (as in, your “extracted” 500ms bit is zero padded, as a whole, but each individual subsection of that carries on into the next analysis window (as dictated by fft math), but takes the actual stuff in those, rather than zero padding)

I’ll still try comparing stuff to see if I can figure out what works best to analyze with.


(I started a long addendum here, but decided to fork it into its own thread here)