Something else that came up during this week’s geek out is the idea of some kind of scriptable, or embeddable, “analysis pipeline”, where you have a sequence of analyses/stats/scalings → dataset processes/reductions which can be applied to any given corpus (files/folders/whatever), and which can be baked into a metadata-esque file such that it can be recalled and reused, with whatever relevant concessions for realtime use being made (threading modes etc…).
That would save a ton of headaches when building patches, moving between corpora, and/or changing your analysis/analyses. It would actually be super handy for exploring very niche analysis settings that suit, or are optimized for, a specific kind of material. It would mean that you could swap out (realtime) analysis/matching “pipelines” along with each (offline) analysis/corpus, and not need to worry about what goes where.
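To make the "baked into a metadata-esque file" idea a bit more concrete, here is a minimal sketch in plain Python. Everything in it is hypothetical and only for illustration: the step names (`scale`, `clip`, `stats`), the `STEPS` registry, and the JSON layout are assumptions, not anything that exists in FluCoMa or any other toolkit.

```python
import json

# Hypothetical registry of named steps. Each step takes a list of floats
# (e.g. one descriptor over time) plus keyword params, and returns a new list.
STEPS = {
    "scale": lambda xs, factor=1.0: [x * factor for x in xs],
    "clip": lambda xs, lo=0.0, hi=1.0: [min(max(x, lo), hi) for x in xs],
    # A reduction: collapses the series to [mean, min, max].
    "stats": lambda xs: [sum(xs) / len(xs), min(xs), max(xs)],
}

def run_pipeline(spec, values):
    """Apply every stage of a pipeline spec, in order, to a list of floats."""
    for stage in spec["stages"]:
        step = STEPS[stage["step"]]
        values = step(values, **stage.get("params", {}))
    return values

def dump_pipeline(spec):
    """Serialize the spec so it can be stored next to a corpus."""
    return json.dumps(spec, indent=2)

def load_pipeline(text):
    """Recall a stored spec; the same spec can then drive the realtime side."""
    return json.loads(text)

# An assumed example spec: scale the descriptor, then reduce it to stats.
fast_spec = {
    "name": "fast",
    "stages": [
        {"step": "scale", "params": {"factor": 2.0}},
        {"step": "stats"},
    ],
}
```

The point of the sketch is just that the pipeline is data, not patching: the same stored spec could be loaded for the offline corpus analysis and for the realtime input, so swapping corpora means swapping one file.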
There are obviously some complex interface issues here (some of which I’ll brainstorm below), but it would be cool to have some discussion about this, as @tedmoore mentioned he was building (or thinking about building) a similar thing for SC.
///////////////////////////////////////////////////////////////////////
Where I see this becoming particularly complicated is when you have multiple timeframes on one or both ends, specifically if it’s asymmetrical, as would be the case for my main use case (e.g. multiple “offline” analysis windows with only one realtime analysis workflow). For my purposes I intend to use the first/fast offline pipeline, but that may not always be the case.
There’s also potential interface friction with regard to normalization/scaling/overlapping, where you may want to normalize/standardize on one side but not the other, or apply a subsequent bit of scaling to further transform a 2D/3D space.
That could potentially be a post-processing step once a given pipeline is set up, but you could end up back at square one if you have to prune/unpack a dataset to scale different bits differently and then put them back together, with a unique version of this for each pipeline, etc…
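As a rough illustration of the "scale different bits differently" problem, here is one way it could be sketched in plain Python without unpacking the dataset: a per-column plan that gets fitted once on the offline corpus and then reused on realtime frames. The function names, the `plan` format, and the mode names are all made up for this example.

```python
import math

def fit_scaling(rows, plan):
    """Fit per-column scaling on an offline dataset.

    rows: list of equal-length lists of floats (one row per entry).
    plan: hypothetical mapping of column index -> "normalize" or "standardize".
          Columns missing from the plan are left untouched.
    """
    fitted = {}
    for col, mode in plan.items():
        column = [row[col] for row in rows]
        if mode == "normalize":
            fitted[col] = ("normalize", min(column), max(column))
        elif mode == "standardize":
            mean = sum(column) / len(column)
            # Population standard deviation, as an assumption for the sketch.
            std = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
            fitted[col] = ("standardize", mean, std)
    return fitted

def apply_scaling(row, fitted):
    """Apply the stored scaling to one row (offline entry or realtime frame)."""
    out = list(row)
    for col, (mode, a, b) in fitted.items():
        if mode == "normalize":
            span = b - a  # a = min, b = max
            out[col] = (row[col] - a) / span if span else 0.0
        else:
            out[col] = (row[col] - a) / b if b else 0.0  # a = mean, b = std
    return out

# Assumed toy corpus: 3 descriptor columns, only the first two get scaled.
corpus = [[0.0, 10.0, 5.0], [10.0, 20.0, 7.0]]
fitted = fit_scaling(corpus, {0: "normalize", 1: "standardize"})
```

Since `fitted` is just a small dictionary of numbers, it could live inside the same stored pipeline description, which would avoid maintaining a separate prune/scale/reassemble routine per pipeline.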
So yeah, just wanted to make a thread about this as there’s some cool stuff to think about here.