Dimensionality reduction (mixing the rational and irrational)

rodrigo.constanzo · June 29, 2020, 9:28am

After winding down for the day I thought I'd try PCA, since it’s the “dumber” of the choices and might suit what I’m trying to do better.

I wanted to keep a sense of the difference in scales between the (currently random) entries, but I guess that would still be the case if I either standardized/normalized them anyways(?).

Me neither!

I guess in a simple/pragmatic sense, I could “mix” a bunch of related descriptors together (duration/timecentroid) and then use this single “timeness” metric to bias the query.

So rather than having to chain things together (duration > 1000 and timecentroid > 500), I can just query for timeness > 0.6.
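To make the "timeness" idea concrete, here is a rough sketch in Python rather than Max. It is not PCA proper: it standardizes each descriptor, averages the z-scores, and rescales the result to 0..1 so a threshold like 0.6 means something. For two positively correlated, standardized descriptors that average is proportional to the first principal component, so it is a fair stand-in for the two-descriptor case. The function names and the numbers are made up for illustration.

```python
from statistics import mean, pstdev


def standardize(values):
    """Return z-scores (zero mean, unit variance) for one descriptor column."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def timeness(durations, timecentroids):
    """Blend two time-related descriptors into one 0..1 score per entry.

    Hypothetical helper: averages the z-scores of both descriptors,
    then min-max rescales the blend so thresholds are easy to reason about.
    """
    z_dur = standardize(durations)
    z_cen = standardize(timecentroids)
    combined = [(d + c) / 2 for d, c in zip(z_dur, z_cen)]
    lo, hi = min(combined), max(combined)
    if hi == lo:
        return [0.5 for _ in combined]
    return [(x - lo) / (hi - lo) for x in combined]


# Made-up descriptor values in milliseconds, one pair per entry.
durations = [200.0, 800.0, 1500.0, 3000.0]
timecentroids = [90.0, 350.0, 700.0, 1600.0]

scores = timeness(durations, timecentroids)

# One query instead of chaining (duration > 1000 and timecentroid > 500).
matches = [i for i, s in enumerate(scores) if s > 0.6]
```

Note that the 0..1 range here depends on whatever entries are in the dataset at the time, so the same threshold will pick out different things if the corpus changes.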

This is likely just my entrymatcher-soaked brain trying to make sense of a knn-y world though.

I’m nowhere near confident that it’s not a meatspace issue with that, but if I find it, I will do.