Thanks for having a read! I’m not great at statistics, or even a numbers person really. My approach with narrow curation of statistics is to ‘let the machine figure it out’, which is where dimension reduction comes in. There is always the problem of sausage machining your numbers, though, which makes the output values themselves fairly meaningless, unless it’s PCA, where they can at least be read in terms of standard deviations.
In my approach everything is evenly weighted before it hits the UMAP part, but now that you’ve raised the question, I think it’s likely to produce better results if certain stats are weighted more heavily than others. Min/max might actually be misleading, especially if those values come from a single frame out of thousands, for example. One approach would be to time-weight the maxima/minima, or to take an average/median of the largest few values rather than the single maximum. As you’ve demonstrated before in ag, you could also do content-aware weighting, using the loudness of each frame for example, to get rid of mostly silent stuff. Actually, that really simple code snippet you posted somewhere else on the forum showing weighting with numpy inspired me to put pre/post hooks into FTIS for each stage of analysis, allowing you to do custom weighting.
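
To make those weighting ideas a bit more concrete, here is a rough numpy sketch. It is not the FTIS code and not the snippet from the other thread; the function names (`weighted_mean`, `robust_max`, `weight_columns`) and the example numbers are made up purely for illustration, and I’m assuming the loudness values have already been converted to linear, non-negative weights.

```python
import numpy as np


def weighted_mean(values, weights):
    """Average per-frame values so that louder frames count for more.

    Assumes `weights` are linear and non-negative (not dB).
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if weights.sum() == 0:
        # Entirely silent input: fall back to a plain mean.
        return float(values.mean())
    return float(np.average(values, weights=weights))


def robust_max(values, k=5):
    """Median of the k largest frames instead of one single-frame maximum."""
    values = np.asarray(values, dtype=float)
    k = max(1, min(k, values.size))
    top_k = np.partition(values, -k)[-k:]  # the k largest, in no particular order
    return float(np.median(top_k))


def weight_columns(stats, column_weights):
    """Standardise each column, then scale it so some stats pull harder
    than others in whatever distance-based reduction comes next."""
    stats = np.asarray(stats, dtype=float)
    std = stats.std(axis=0)
    std[std == 0] = 1.0  # leave constant columns alone rather than divide by zero
    z = (stats - stats.mean(axis=0)) / std
    return z * np.asarray(column_weights, dtype=float)


# Hypothetical per-frame spectral centroid with one glitchy, silent frame.
centroid = [500.0, 520.0, 8000.0, 510.0]
loudness = [1.0, 1.0, 0.0, 1.0]

print(weighted_mean(centroid, loudness))  # the silent 8000 Hz frame is ignored
print(robust_max(centroid, k=3))          # less swayed by the single outlier
```

Something like `weight_columns` would be the kind of thing to drop into a hook just before the reduction stage, since UMAP works on distances and a column scaled by 2 contributes twice as much to each pairwise difference.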