After winding down for the day I thought I'd try PCA, since it's the "dumber" of the choices and might suit what I'm trying to do better.
I wanted to keep a sense of the difference in scales between the (currently random) entries, but I guess that would still be the case if I standardized or normalized them anyway(?).
Me neither!
I guess in a simple/pragmatic sense, I could "mix" a bunch of related descriptors together (duration/timecentroid) and then use this single "timeness" metric to bias the query.
So rather than having to chain things together (duration > 1000 and timecentroid > 500), I can just query for timeness > 0.6.
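To make the "timeness" idea concrete, here is a minimal sketch of what I mean. It is not PCA, just a plain average of min-max normalized descriptors, and the function names (normalize, timeness) and the entry values are made up for illustration.

```python
def normalize(values):
    """Min-max scale a list of numbers into the range 0..1."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # All values identical: nothing to scale, so return zeros.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def timeness(durations, timecentroids):
    """Average the normalized descriptors into one value per entry."""
    d = normalize(durations)
    t = normalize(timecentroids)
    return [(a + b) / 2 for a, b in zip(d, t)]


# Hypothetical entries (milliseconds), purely for illustration.
durations = [200, 1500, 900, 3000]
timecentroids = [100, 700, 450, 1200]

scores = timeness(durations, timecentroids)

# Single-threshold query instead of chaining two conditions.
matches = [i for i, s in enumerate(scores) if s > 0.6]
```

Worth noting that the two queries are not equivalent: with these made-up numbers, the chained version (duration > 1000 and timecentroid > 500) would match two entries, while timeness > 0.6 only matches the longest one, since the threshold now sits on a relative 0..1 scale rather than on the raw units.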
This is likely just my entrymatcher-soaked brain trying to make sense of a knn-y world though.
I'm nowhere near confident that it's not a meatspace issue on my end, but if I find it, I will do.