Transformpoint for fluid.mds~

Don’t know if this falls into the same category as the read/write messages being missing, but fluid.mds~ doesn’t seem to have a transformpoint message/option.

This would be super useful for applying a pre-computed reduction to a single point.

This is by design. MDS as an algorithm doesn’t really support fitting unseen data, because the process is based on the map of relative distances between the source points. Adding a new point amounts to doing the whole thing again. This is the same in scikit-learn, from which we borrowed the interface.
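You can see the same design choice reflected in scikit-learn itself. A quick sketch (this is sklearn code, not FluCoMa; assumes sklearn and numpy are installed): a fitted PCA exposes `transform` for unseen points, while MDS only offers `fit_transform` on the whole set.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import MDS

rng = np.random.default_rng(0)
X = rng.random((20, 5))  # 20 source points in 5 dimensions

# PCA learns a fixed linear map, so it can project unseen points later.
pca = PCA(n_components=2).fit(X)
new_point = rng.random((1, 5))
print(pca.transform(new_point).shape)  # (1, 2)

# MDS only embeds the set it was given; there is no transform() for new data.
mds = MDS(n_components=2)
embedded = mds.fit_transform(X)
print(embedded.shape)  # (20, 2)
print(hasattr(mds, "transform"))  # False
```

So in sklearn's terms, MDS is a `fit_transform`-only estimator: adding a point means re-running the embedding over all the pairwise distances.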

So that means fluid.pca~ is the only one that will play nicely with a pre-fitted set, which can then get applied to real-time/variable input.

I guess I’ll find out soon enough, but will TSNE allow single-point transforms or not? (Just to know whether to dig into PCA for the long haul, or hold out for TSNE.)

(And as a side question, when @tremblap has mentioned that TSNE is “slower” does that mean in computing the fit, or also when transforming via a pre-computed fit (if it, in fact, allows that kind of behavior)? I don’t mind waiting around for a fit calculation, but trying to keep things lean-and-mean in the real-time processing chain.)

t-SNE doesn’t (in sklearn at least), which might be an argument for doing ISOMAP or UMAP instead / as well, by way of a nonlinear dimensionality reducer.

Well, t-SNE will only do fit_transform (unless there is a way of supplying new points that sklearn doesn’t implement). It is also significantly slower than PCA, though everything will be slower than PCA to some extent, because PCA is the very simplest approach to this (and hence very limited).
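To make the speed point concrete, here's a hedged sklearn sketch (not FluCoMa code): t-SNE, like MDS, has no `transform`, whereas a fitted PCA's `transform` is nothing more than centring plus one matrix multiply, which is why it stays cheap in a real-time chain.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

# t-SNE, like MDS, only offers fit_transform on a whole set of points.
print(hasattr(TSNE(n_components=2), "transform"))  # False

rng = np.random.default_rng(1)
X = rng.random((100, 10))
pca = PCA(n_components=3).fit(X)

# A fitted PCA's transform is just (x - mean) @ components.T:
point = rng.random((1, 10))
by_hand = (point - pca.mean_) @ pca.components_.T
print(np.allclose(pca.transform(point), by_hand))  # True
```

The fit itself can take as long as it likes; per-point projection afterwards is a single small linear operation.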

So fluid.pca~ and I have a bunch of getting friendly to do.

Just re-read this. Having an option to do a transformpoint with a fancy algorithm would be handy, so consider me feature-requested!

Sorry to simply restate what’s been said here, but this is really helpful. I was getting better results with PCA anyway, and since my future entails mapping incoming points to current sets, it seems that’s the best tool.


PCA is generally a good one to start with anyway, considering how much less intensive it is. In your case, because you have a neural net involved (yes?), that can help soak up nonlinearity in the mapping.

Well, as you will see from my presentation, I think it is either NN or PCA, but not really both. I would love to be proven otherwise, but at least how I have them set up, it was redundant and not effective.

Interesting! Very much looking forward…

I’ll stress repeatedly during the session on NNs that what’s effective is always dependent on the interplay between the data, the network, and the job at hand. Certainly there are cases where PCA->NN works well, but it’s bloody difficult to say much a priori about what those cases might be.