Categories/Descriptors for contours, envelopes, and morphology

rodrigo.constanzo · October 31, 2020, 3:43pm

I’ve been thinking about envelopes and morphology a bunch lately. For one, trying to generate more temporal/morphological information from onsets and short attacks, but also how to better capture or analyze that information.

A while back I was looking at linear regression as a better summary statistic, which I quite like, particularly since the formula also spits out an r2 value, which seems to indicate a kind of confidence of the slope. So in that I feel like it gives you a derivative-esque measure of contour, but also how much that is the case.
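As a rough sketch of that slope-plus-r2 summary, in plain Python: the function name is hypothetical, the x axis is assumed to be the frame index, and it is not taken from any existing tool.

```python
def slope_and_r2(values):
    """Fit a least-squares line through (frame index, value) pairs.

    Returns (slope, r2). Needs at least two frames.
    """
    n = len(values)
    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    syy = sum((y - mean_y) ** 2 for y in values)
    slope = sxy / sxx
    # r2 is the share of the contour's variance explained by the line;
    # a completely flat contour has no variance to explain, so report 0.
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, r2
```

A steady ramp gives an r2 near 1, while a zigzag with no overall trend gives a slope and r2 near 0, which is the "how much is that the case" part.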

Either way, these are summary statistics.

What I’m thinking about, and asking here, is if there’s a way to represent the envelope or contour that 1) can be some kind of summary and 2) can be queried in a variety of states.

By 1) I mean, not having the information for every single frame. For a low number of analysis frames, that’s fine, but if it’s a longer sample, having hundreds (or more) of frames, per sample, starts getting unwieldy.

By 2) I mean being able to query based on transformations of that contour.

Here’s some visual reference as to what I mean.

Say I have some kind of contour (be it loudness, centroid, whatever) that looks like this:
Screenshot 2020-10-31 at 3.35.32 pm

So I’m after a way to represent that which isn’t a list of 7 numbers (or, more likely, 7 x/y pairs), both for efficiency and so that it’s robust to transformations.

For example, if I wanted to have this contour match highly things that look like this:
Screenshot 2020-10-31 at 3.36.03 pm

So the individual values would be different, but the overall gesture or contour is “the same”. For the offset stuff, it would be easy to just store delta values, and then it doesn’t matter where in the overall space it falls. I guess it’s more complicated for scaled versions.
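One simple way to handle both offset and scale at once, sketched here as an assumption rather than an established method: map each contour’s range to 0..1 before comparing. The function name is hypothetical, and this does not cover inverted contours.

```python
def normalise_contour(values):
    """Strip offset and scale so only the shape remains (range mapped to 0..1)."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat contour has no shape to scale; return all zeros.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]
```

With this, `[1, 3, 2]`, `[101, 103, 102]` and `[10, 30, 20]` all reduce to the same normalised contour.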

But what I find more interesting here is to have this be applicable to different scales of time:
Screenshot 2020-10-31 at 3.38.25 pm

Here, I’ve literally just dragged the function to make it appear longer, but this would potentially be the same kind of contour made up of a different number of individual points.
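A minimal sketch of one way to compare contours with different numbers of points: linearly resample each onto the same fixed number of positions first. The function name and the choice of linear interpolation are assumptions for illustration.

```python
def resample(values, n_points):
    """Linearly interpolate a contour onto n_points evenly spaced positions.

    n_points must be at least 2.
    """
    if len(values) == 1:
        return [float(values[0])] * n_points
    last = len(values) - 1
    out = []
    for i in range(n_points):
        pos = i * last / (n_points - 1)  # fractional index into the original
        lo = int(pos)
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out
```

A 3-point and a 5-point version of the same rise-and-fall shape end up as identical lists once both are resampled to, say, 5 points. The stretch is uniform, so this only covers the "dragged longer" case, not local speed-ups or slow-downs.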

The overall idea here would be that I can have any given contour/envelope and query via that as an initial way to cut down on the overall corpus/dataset, and then match for specific ranges/values after that. This way I could match via contour or gesture in a manner that is divorced from duration and temporality. A way to side-step my “short analysis window and long samples to playback” problem.

Metaphor-wise, it also makes sense to me in a vector vs raster way: I’d like vector representations of the contours that can be used as queries, divorced from their per-pixel/per-sample counterparts.

Is this a thing?

jamesbradbury · October 31, 2020, 3:50pm

One thing to look at is the dynamic time warping (DTW) distance. DTW is problematic in that it compresses and warps time in weird ways to satisfy the algorithm’s objective, but it sounds like it’s in the ballpark: comparing things less directly, and more in terms of how the overall shapes relate.
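For reference, a minimal sketch of the textbook DTW distance in plain Python, with no window constraint and absolute difference as the local cost. Both choices are assumptions here; real implementations usually add constraints to limit the "weird" warping.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D contours."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of the first i points of a
    # with the first j points of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] also matches b[j-2]
                                 cost[i][j - 1],      # b[j-1] also matches a[i-2]
                                 cost[i - 1][j - 1])  # step both forward
    return cost[n][m]
```

A contour and a time-stretched copy of it, e.g. `[0, 1, 0]` and `[0, 0, 1, 1, 0, 0]`, come out at distance 0. The table has `len(a) * len(b)` cells, so cost grows quickly for long contours.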

rodrigo.constanzo · October 31, 2020, 6:26pm

I remember you mentioning that at some point in the past. I think it’s definitely a useful idea/algorithm. I wonder how flexible it is about the data it can take (e.g. anything from 100ms to 30s).

Also had a(nother) look at @a.harker’s gesture_maker, but that looks like it uses its own internal language/syntax. But I’m imagining something like that, where there’s a general contour or shape that’s laid out, with specifics being a bit fuzzier and/or accommodating.

a.harker · October 31, 2020, 7:39pm

PLA (piecewise linear approximation) is another useful technique that can maintain a good sense of shape with a lower resolution representation. I wanted to make a shape analyser at one point along the lines of gesture maker in reverse, but some of the ways you’d do that are hard to formalise. Might be fun to revisit that project as more of a group thing…
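To illustrate the idea, here is a sketch of one common top-down PLA scheme: keep splitting at the point furthest from the current line until every point is within a tolerance. This is an assumption about what PLA could look like here, not the gesture_maker or HIRT code, and the names are hypothetical.

```python
def pla(values, max_error):
    """Top-down piecewise linear approximation.

    Returns the indices of the breakpoints to keep. Needs at least two values.
    """
    def split(start, end):
        worst_err, worst_idx = 0.0, None
        for i in range(start + 1, end):
            # Value of the straight line from start to end at index i.
            t = (i - start) / (end - start)
            line = values[start] + t * (values[end] - values[start])
            err = abs(values[i] - line)
            if err > worst_err:
                worst_err, worst_idx = err, i
        if worst_idx is None or worst_err <= max_error:
            return [start, end]  # the segment is already close enough
        left = split(start, worst_idx)
        right = split(worst_idx, end)
        return left[:-1] + right  # drop the duplicated shared breakpoint

    return split(0, len(values) - 1)
```

For a triangle like `[0, 1, 2, 3, 2, 1, 0]` with a tolerance of 0.1, this keeps only the first, peak, and last indices, so seven frames reduce to three breakpoints.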

rodrigo.constanzo · October 31, 2020, 7:57pm

Definitely consider me interested.

I think it would be a fantastic thing to have as a more AudioGuide-esque way of matching samples since more morphology could be retained (I guess theoretically), but then for general matching, distortion, analysis, etc… there would be plenty of uses for that.

Some PLA would also be great for downsampling complex filters (à la the spectral compensation stuff, which presently loses lots of resolution going from 40 mel bands down to 8 static cross~ filters).

a.harker · October 31, 2020, 8:02pm

There’s some PLA code in the HIRT - it’s certainly good for going to a low dim representation, but that doesn’t allow you to consider stretches etc. easily - that’s a different difference measurement, unless you want to store a bunch of different transformations and match against them.

rodrigo.constanzo · October 31, 2020, 8:17pm

Not played with it in depth, but it’s handy. Wasn’t useful for the spectral compensation stuff as the buf-based HIRT stuff was too “slow” for that workflow, but might be good as another kind of “summary” stat(s).

Was what you were thinking a while back more for analyzing or for also finding transformations/distances?

a.harker · October 31, 2020, 8:30pm

I was specifically interested in a complex analysis system that would allow for analysis and resynthesis (with transformation) - I got as far as outlining the problem, but never built even a proof of concept. I guess here I’m thinking about something simpler that either classifies or can compare shapes directly.

rodrigo.constanzo · October 31, 2020, 8:37pm

That would definitely be badass.

I’m definitely down to help in whatever way I can, particularly when it comes to identifying and comparing shapes: