I’m trying to run UMAP on tens of thousands of points. I’m pretty sure it was giving me answers within 10 minutes earlier today; since then I added some code that removes equivalent/duplicate entries in the data, and now it is taking forever / not getting to an answer.
This might be something dumb I’ve done, but is there something in the algorithm that requires convergence (or could diverge)? And does how fast it gets to an answer depend on a random state, or is it purely deterministic? Just trying to figure out whether there’s a way I can tweak this to get results faster.
In my particular case I have close to 46,000 data points with 5 numbers each, and it never completes (I get bored before it does). Earlier today I was under the impression that double that size was converging within 10 minutes, although I may have made an error in testing and been using a smaller set; I thought I had checked that, though.
It bails at @iterations whether or not it’s converged, and AFAIK it can’t diverge. A few tens of thousands of points should be well within UMAP’s algorithmic capabilities, although it would take a while, and it’s possible you’re running into some kind of issue with our implementation. What does Activity Monitor say about CPU/memory while it’s failing to return?
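To make the “bails at @iterations whether or not it’s converged” behaviour concrete, here is a minimal illustrative loop (in Python, not the actual implementation; `optimize`, `max_iterations`, `tol`, and `step_fn` are made-up names): the run stops early if the update falls below a tolerance, but otherwise stops unconditionally at the iteration cap, so it always returns in bounded time.

```python
def optimize(start, step_fn, max_iterations=200, tol=1e-6):
    """Apply step_fn repeatedly. Return (value, iterations_used, converged).

    Stops early if successive values differ by less than tol;
    otherwise stops unconditionally after max_iterations.
    """
    x = start
    for i in range(max_iterations):
        new_x = step_fn(x)
        if abs(new_x - x) < tol:
            return new_x, i + 1, True   # converged before hitting the cap
        x = new_x
    return x, max_iterations, False     # hit the cap without converging

# A quickly-converging step (halving toward 0) finishes well under the cap:
result, iters, converged = optimize(1.0, lambda x: x / 2)
```

With this structure the worst case is always `max_iterations` steps, which is why the run can be slow but cannot hang forever or diverge; a fixed random seed would also make the iteration count reproducible.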
Haven’t checked, but I’m not on an official build because I ran into some bugs back in February. Is there an updated official version I might try?