The need for speed (fluid.dataset~ + fluid.datasetquery~)

After finally getting to the bottom of things in the Example 11 verbose thread (thanks for your patience @weefuzzy & @tremblap) I’ve managed to build a super stripped-back version of my onset descriptors stuff.

I’ve created a batch analysis patch that analyzes the first 256 samples of each sample in my Metal Resonances sample library (3094 individual samples) and creates an 8d dataset. It’s almost identical to the process from Example 11, except I only use MFCCs and loudness. I use 20 MFCCs instead of 13, as I got good results with that elsewhere, and I still ignore the 0th coefficient. After that I do the same sanitization and PCA-ing that happens there to end up with an 8d space at the end.
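Outside of Max, that post-analysis step amounts to something like this rough numpy sketch: toy random data standing in for the real MFCC/loudness stats, and a generic standardize-then-PCA, not the actual FluCoMa code:

```python
import numpy as np

def sanitize_and_reduce(features, n_out=8):
    """Standardize each column, then project onto the top n_out principal components."""
    X = np.asarray(features, dtype=float)
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    sigma[sigma == 0] = 1.0                 # guard against constant columns
    Z = (X - mu) / sigma
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)   # PCA via SVD
    return Z @ Vt[:n_out].T

rng = np.random.default_rng(0)
raw = rng.normal(size=(3094, 20))           # stand-in for per-onset MFCC/loudness stats
reduced = sanitize_and_reduce(raw, 8)       # 3094 entries, down to 8 dimensions
```

The shape of the output (3094 × 8) mirrors the dataset described above; everything else here is illustrative.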

For the real-time version, I analyze and post-process everything the same.

The results are pretty good. I haven’t A/B’d it with the vanilla entrymatcher~ version where I use only centroid/loudness/etc…, as I was mainly trying to get this up and running to check out the performance.

The good news is that for a super basic use case, it’s faster than I thought it would be! I got around 3ms average per query, with some spikes above that:

Still about twice as slow as my “full bells and whistles” version that does loads of descriptors, melbands, spectral/loudness compensation, etc… (around 1.5–2ms average) and that’s built around entrymatcher~.

But this speed is with no extra funny business. As in, a pre-fit fluid.kdtree~, with no fluid.datasetquery~ on the real-time side.

If I put in a single fluid.datasetquery~ plus a re-fit of the central fluid.kdtree~, the numbers jump up. Way up:
Screenshot 2020-10-04 at 10.35.09 pm

This is around the range I was initially expecting.

This is also just a single fluid.datasetquery~. In a more real-world context there may be a couple of forks: I may have some metadata (timeness) stuff I want to filter by, and I have a load of melbands (for the spectral compensation) that I’d want to store somewhere and recall with each query. And that’s not even getting into actual biasing and skewing of an analysis, which would require some further data transformation per query.

And on top of this, this is a very modest corpus: only 3000 entries, with 8 dimensions each. I’ve loaded up corpora near the 1-million-entry range, and my “bells and whistles” version has 70+ dimensions (though not all are part of a distance measurement).
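For a rough feel of why the per-query re-fit hurts, here’s a hedged Python/scipy sketch (toy random data, with `data[:, 0] > 0.5` standing in for a real filter condition). It times querying a pre-fit tree versus filtering, rebuilding, and querying on every call:

```python
import time
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
data = rng.random((3000, 8))            # toy stand-in for the ~3k-entry, 8d corpus
point = rng.random(8)

tree = cKDTree(data)                    # pre-fit once
t0 = time.perf_counter()
for _ in range(100):
    tree.query(point)
prefit_s = (time.perf_counter() - t0) / 100

mask = data[:, 0] > 0.5                 # stand-in for e.g. duration > 1000
t0 = time.perf_counter()
for _ in range(100):
    cKDTree(data[mask]).query(point)    # filter + re-fit + query, every time
refit_s = (time.perf_counter() - t0) / 100
```

The rebuild dominates even at this toy size; the absolute numbers are meaningless, but the gap is the point.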

//////////////////////////////////////////////////////////////////////////////////////////////////////////////

I know this will get faster when things get to the optimization stage (similar to what happened in TB1), with regards to the algorithms/functions/etc… That being said, I am dubious that copying everything in a fluid.dataset~ per filter step, per query, and then re-fit-ing a fluid.kdtree~ at the end of that process (again, per query) can get significantly faster than this.

Perhaps there is a way to cleverly form queries that retains the fit of a fluid.kdtree~, as @weefuzzy hints at here, but the bulk of the speed loss here is coming from fluid.datasetquery~ addcolumn-ing the dataset just once (the fit only adds a fraction of the time, on this small dataset).

//////////////////////////////////////////////////////////////////////////////////////////////////////////////

With that, I’d like to re-bump the idea of having some kind of fluid.datasetfilter~ that is meant for filtering (but not copying) data, and that perhaps has a friendlier syntax/interface. Also an easier way to get data in and out of fluid.dataset~s, which are quite monolithic at the moment. This would be for use cases where you have some data that corresponds with each entry (say a buffer~ name, duration, melbands/filters, etc…) that you want to be able to get in/out of a unified storage container, without needing to go through all sorts of intermediary containers/types.

We would be too, if that was how it worked. Fortunately, it’s only on transform / transformJoin that any data gets copied. Anyway, I’m glad you realise things will get faster in future, and that there are reasons why they’re not yet.

What you’re asking for with the filtering is reference semantics, which aren’t terrible things in and of themselves, but are much harder to reason about both for library writers and creative coders, and aren’t necessarily the solution to all performance problems (IOW, there’s not much to be gained by guessing where bottlenecks are occurring).

Meanwhile

  • I can’t remember what it is you’re doing that means you have to re-fit the tree on every query, but I’m pretty sure there’d be a way of reformulating things so that you didn’t.

  • As for getting stuff in, I think there’s a variety of ways this could improve. The dictionary input / output will be much smoother in the next release, because the Max wrapper is now taking care of a lot of the lumpy stuff.


Isn’t that the central thing that fluid.datasetquery~ is doing? Or rather, in a fluid.datasetquery~-as-entrymatcher~-replacement context, where you have some fields that you want to filter by before doing an actual distance match. So anything you do with fluid.datasetquery~ would end up in a transform of some kind.

Fair enough. I’m just poking at bits and gleaning from The Team’s posts on here to figure out what’s going on underneath. You know, with the shadows on the cave’s walls and all that.

In my current workflow, I tend to have steps that narrow down the pool of total available matches before any distance calculations take place.

Like in my test example above, I’ll have 3000 samples, but I’ll apply a filter like duration > 1000 to filter down only to long samples. In entrymatcher~-land, that’s just a step in the process and things carry on. In fluid.datasetquery~-land, if I understand correctly, I would have to transform everything that has duration > 1000 into a new fluid.dataset~ and then fit it from there.

I would also have several of these steps along the way. onsets == 1, duration > 1000, etc…

To use a more @tremblap-esque example, it would be forking off based on analyzed pitch confidence. If the confidence is shit, transform a version of the dataset into one that has no pitch-related columns, and then fit from there.

So in short, there are things I’d want to filter by, and things I’d want to match via distance, and sometimes these will overlap (e.g. pitch > 200 -> distance matching on pitch in the remaining entries).
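That filter-then-match pattern, sketched in numpy on toy data (the `duration` and `onsets` columns here are hypothetical stand-ins for the real metadata):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3000
duration = rng.integers(100, 4000, n)   # hypothetical metadata columns
onsets   = rng.integers(1, 4, n)
descr    = rng.random((n, 8))           # the distance-matched descriptor columns

def match(target, *conditions):
    """Apply all boolean filters, then brute-force nearest neighbour on what's left."""
    keep = np.ones(n, dtype=bool)
    for c in conditions:
        keep &= c
    idx = np.flatnonzero(keep)
    d = np.linalg.norm(descr[idx] - target, axis=1)
    return idx[np.argmin(d)]

best = match(rng.random(8), duration > 1000, onsets == 1)
```

In entrymatcher~-land these conditions are just steps in one query; the sketch shows why they compose cheaply when no copying or re-indexing is involved.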

I still need to wrap my head around this, and what would make the most sense. In my entrymatcher~ workflow I use it as a generic data container too, holding metadata, melbands-for-filters, etc… alongside the descriptors. If dicts are the intermediary format/container, it may very well be worth keeping all of that kind of shit in dicts and using datasets for the distance-match-y stuff. That does get sticky where some things are both.

But yes, dict-ing would be super welcome!

Yes, but I read you as being under the impression that every call to addcolumn, filter, etc. was also copying. They’re not. And, I stress, copying isn’t necessarily a big deal.

I don’t know what order entrymatcher does stuff in internally (although I could look). I can tell you, based on eyeballing the fluid.datasetquery code, that it could well be faster to distance-match a bunch of candidates first and then thin those out with a query. Whether there’s a graceful workflow for that is another matter. (Again, to stress: the fact of making smallish datasets on the fly here shouldn’t necessarily be a concern.)
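The “match first, thin later” idea sketched in Python, with scipy’s kd-tree on toy data (the `duration` filter is a hypothetical stand-in):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(3)
descr = rng.random((3000, 8))
duration = rng.integers(100, 4000, 3000)   # hypothetical metadata
tree = cKDTree(descr)                      # fit once, keep forever

def nearest_passing(target, k=50):
    """Pull k candidates from the pre-fit tree, return the closest one that passes the filter."""
    _, idx = tree.query(target, k=k)
    for i in idx:
        if duration[i] > 1000:             # post-hoc thinning instead of pre-filtering
            return int(i)
    return None                            # nothing passed; widen k and retry
```

The trade-off is that a very strict filter may force a large k, but the tree never has to be rebuilt.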

I don’t know in any great detail what the code for that does, as it’s still relatively fresh out of the oven and we’re still bashing away at the interface. I will say though, that even when you do have the code (and perhaps wrote it) that the sources of performance problems can be surprising.

Sounds like some of this could be reordered, or helped by having prefitted trees that already have e.g. no pitch in them. But I don’t see where the dataset wrangling comes in for real-time analysis: are you building new datasets on the fly?

Right. No, no, I get that those are the same. I just meant in chaining together subsequent queries, requiring new copying (if you need a filter to happen before moving on, to further filter that area).

In a performance/fixed context, something like this would make sense. For example, in Kaizo Snare, throughout the performance I go between filtering by duration (duration > 500, or no filtering on duration) and filtering by onsets (onsets == 1, or no filter by onsets), so I could easily pre-“render” four datasets/kdtrees and select what I need, but this couldn’t cover all the bases.
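The pre-“render” idea, sketched in Python on toy data: one tree per filter combination, all fit up front, selected at query time (column names are hypothetical):

```python
from itertools import product

import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(4)
descr = rng.random((3000, 8))
duration = rng.integers(100, 4000, 3000)
onsets   = rng.integers(1, 4, 3000)

trees = {}                                  # one pre-fit tree per filter combination
for dur_on, ons_on in product([False, True], repeat=2):
    keep = np.ones(3000, dtype=bool)
    if dur_on:
        keep &= duration > 500
    if ons_on:
        keep &= onsets == 1
    idx = np.flatnonzero(keep)
    trees[(dur_on, ons_on)] = (idx, cKDTree(descr[idx]))

def query(target, dur_on, ons_on):
    idx, tree = trees[(dur_on, ons_on)]
    _, j = tree.query(target)
    return idx[j]                           # map back to an index in the full corpus
```

This only works because the filter states are enumerable; a continuous threshold can’t be pre-rendered this way, which is exactly the limitation described above.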

I’m totally down for rethinking the problem and flipping it on its head, but I don’t see how some of the problems could be avoided.

A use case that I’ve been meaning to implement for a while now is mixing descriptor types to widen the navigable descriptor area. For example, taking a slow envelope follower and using that to filter by a timeness descriptor such that the louder/busier I play, the shorter the samples being pulled from are. So rather than having discrete datasets/kdtrees, it would be a continuous query à la duration > $1 where $1 = [scaledOutputOfEnvelopeFollower].
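That continuous-threshold idea, sketched in numpy on toy data (the envelope-to-duration mapping here is a made-up scaling, just to show the shape of it):

```python
import numpy as np

rng = np.random.default_rng(8)
duration = rng.integers(100, 4000, 3000)    # hypothetical per-entry durations (ms)
descr = rng.random((3000, 8))

def continuous_query(target, env):
    """env in [0, 1]: the busier the playing, the shorter the samples allowed through."""
    max_dur = 4000 - 3500 * env             # hypothetical scaling: env 0 -> 4000ms, env 1 -> 500ms
    keep = np.flatnonzero(duration <= max_dur)
    d = np.linalg.norm(descr[keep] - target, axis=1)
    return keep[np.argmin(d)]

busy = continuous_query(np.full(8, 0.5), env=0.9)
```

Because the threshold varies per query, the candidate pool changes every time, which is why discrete pre-fit trees can’t cover this case.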

What I proposed there, though, was pre-sieving: I would have two patch structures/searches/databases, one for pitch and one for no pitch, for instance.

That’s smart.

I presume there may be other things you’d want to filter for though, which would still require querying/copying along the way.

So I’ve revisited this based on the last geekout session.

From what @a.harker and (I believe) @weefuzzy have commented on elsewhere, the way the data is structured in fluid.dataset~ isn’t “ideal” for small datasets, but should theoretically scale up better, particularly once a fluid.kdtree~ has been fit.

I also wanted to do a more apples-to-apples comparison than the numbers I had above, as I was comparing separate patches with different functionalities and setups etc…

Here are my findings, with some test code below.

////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

At first I wanted to test pure matching speed at different sizes, with a single nearest neighbor (trying to be as consistent between the approaches as possible). I’m generating a 10k point fluid.dataset~ with 8 dimensions, along with a corresponding entrymatcher 10000 8. Both are filled with the same exact data.

I chose 10k as a starting-off point since it’s bigger than the 3k above (where entrymatcher was faster), so I wanted to “scale things up” by roughly a factor of three to get the ball rolling.

I’m also comparing the mean of the output and testing hundreds of random points to get a decent average over time.
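For anyone who wants to reproduce the shape of this test outside Max, here’s a hedged Python sketch of the same setup: random 10k × 8 data, mean query time over a few hundred random points, scipy’s kd-tree versus a numpy brute-force search (which is roughly what entrymatcher is reported to do):

```python
import time
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(5)
data = rng.random((10_000, 8))
targets = rng.random((300, 8))             # a few hundred random test points

tree = cKDTree(data)                       # fit once, outside the timed loop

def mean_ms(fn):
    t0 = time.perf_counter()
    for t in targets:
        fn(t)
    return 1000 * (time.perf_counter() - t0) / len(targets)

kd_ms    = mean_ms(lambda t: tree.query(t))
brute_ms = mean_ms(lambda t: np.argmin(np.linalg.norm(data - t, axis=1)))
```

Both approaches must agree on the nearest index; which one wins on wall-clock time depends heavily on implementation details, which is really the whole point of this thread.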

This is what I get for that (10k points, 8d):
10000_8

This is in line with the numbers in my previous posts with regards to raw matching. My “bells and whistles” patch also does extra analysis, lookups, etc… so this is point-for-point the raw matching speed on the same exact dataset with the same exact test point.

Scaled up to 32k I get this:
32000_8

It’s starting to converge slightly, but still significantly slower.

Big jump up to 100k:
100000_8

Again, an improvement, but still slower. There is convergence though, so that gives me hope. I think around here I’m also at a decent “big corpus” real-world number, and 15ms is definitely on the slow side (for real-time percussion stuff).

Things above this point get interesting. I initially tried jumping right up to 500k but that patch kept hanging. After a bit of troubleshooting I discovered that I could not fit a fluid.kdtree~ with that many entries in it. I’ll make a separate bug thread for this with some hang/crash reports.

So I dialled things back to 200k:
200000_8 - wild peaks

This is the best result yet, in terms of raw matching speed difference. The fit did take quite a while here (more on this below), but I suspect if I kept going this way and started getting into @spluta sized 1mil+ fluid.dataset~s, that a fluid.kdtree~ might start to be as fast as or perhaps at some point even faster than entrymatcher.

I should also note that even though the average is 25ms, the actual time was jumping around like crazy.

I was curious where this “changeover” would start to happen, but I’m unable to fit a fluid.kdtree~ big enough to find out. Based on the decreasing difference between the two approaches, though, the trajectory suggests it would happen eventually.

////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

Now, I wanted to compare apples-to-apples. What the above tests/comparisons leave out is the fact that entrymatcher can vary the querying/weights/etc… per query. So I went back and tested things with the fluid.dataset~ + fluid.kdtree~ process also incorporating fluid.datasetquery~, which means the fluid.kdtree~ needs to be re-fit per query. With the patch from my previous post and a smaller corpus (3k points) this was considerably slower, but I wanted to see if/where this would start to balance out and converge, based on the above tests.

For the sake of consistency I tested the same size datasets (10k, 32k, 100k, 200k) and only added a fluid.datasetquery~ + re-fit per query.

10k points:
10000_transform

Right out of the gate we’re orders of magnitude slower. I ran a ton of points here just to make sure that it wasn’t some spike or other thing going on.

32k points:
32000_transform

At this point I was manually banging individual queries, since it was too slow to use a metro consistently.

Big jump up to 100k points:
100000_transform

Over 5s, and at this point I start getting a pinwheel per query.

Finally up to 200k points:
200000_transform

…a little bit slower.

////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

So from my tests here, I can’t scale a fluid.dataset~ + fluid.kdtree~ big enough to mitigate the difference in raw matching speed with entrymatcher using the same data at each step of the way.

It also seems that fit-ing time increases exponentially (?) the bigger the fluid.dataset~ gets, so where a fluid.kdtree~ might start getting faster, having to fit per query bogs that down massively.
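The fit-time trend is easy to probe in isolation. A hedged scipy sketch that only times tree building at increasing sizes (a well-behaved kd-tree build should be roughly O(N log N), which is one way to sanity-check the “exponential” impression against a reference implementation):

```python
import time
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(6)
fit_ms = {}
for n in (10_000, 32_000, 100_000):
    data = rng.random((n, 8))
    t0 = time.perf_counter()
    cKDTree(data)                           # build (fit) only, no queries
    fit_ms[n] = 1000 * (time.perf_counter() - t0)
```

If build time grows much faster than N log N between these sizes, something other than the algorithm’s inherent cost is likely at play.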

I was initially worried that the ‘copying’ step of fluid.datasetquery~ would be too slow for real-time use; I didn’t anticipate that even a pre-fit fluid.kdtree~ would still be significantly slower (in an apples-to-apples comparison, ignoring the variable queries possible with entrymatcher).

////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////

Here is my test code. For the sake of ease I have multiple versions of the files with the different numbers plugged in, so I’ll just post the 10k versions of both and you can tweak the numbers as needed.

Here’s the patch for vanilla matching:


----------begin_max5_patcher----------
5045.3oc6cs0iiiic94t+UHXj4spc3cRkmxNYSlEHyr6fLSvffAAEjsYUslV
VxQRt5tmEa+aO7hjrjsjMkMkc0XbUncqRRTjmOdtwy4H5+9aeyrEYeRVLK3e
I3WCdya96u8MuwbJ8IdS0e+lYqi9zxjnBysMaY150xzxYOXuVo7SklyG8hLO
5YY842jKKT2VTYbV5iIwoxkYaSM2Ip5NR2tNNMQVZdrv5lEUt78woO+XtbYo
cXEFhlCdHfwU+WqefPxCATykPf4ff+2cO1rsk0OWP0YiWYFjYK9s2g.yzm6e
712p+3gKjrKiUTwyyNOZhJ3cnILA9P.ANdZBF5SZBOOHKOnXSz5fnfEaSW99
frmB9+1JyiU7JU2cuyom4r9SYokoQqkld+uHSdQVFuLZ1v3GMjOGQUnDQiUP
t4+HrSBY5N5onkx1cdaXjNq0MVD+6laTOU3QzEMOXYtLpTFjGktJacvlr3zx
.0wFH9yAeLt78Awkup.ZHfMGyZ8KWI8gPBOg6jq.tCav8nfUQkQExxfmhSRj
qrHd0rgh.VHyONatevVNsEOLhA7CVx7HTtXaYYV5XUtAgXrlVvDtgxH0eN.g
U+nrmp7yaj1myrEQJMqMMpM+BdGrlqf7RY9ixznEIlVdNT5SIYpA0nUiC.yw
s+gC0FpLJvEVhGMHYiFhrm8vQHcJ6njdybe95Hyfjc0.CFTLGCHhPl.CoPDg
BGvB2TAMzWqPSu7IWSnAy+phqgwXSnHDFN8fwPFgTmK+yq0vfLOHtHfMW.CT
9PJKBdJpPMR7B9wCMpffg7w6Jo.cNtRlJ+npwGPtEaxU917Tf1V69j927zDP
3T3bHHr9WH.y0LPgmfOZPKP8x+vB8HB8OG.m2KoiFKoCLTIFnl5QBBlQAPAB
DhNOJWIMq496g7IPOR9aBT8ypy1CCgxEJEESoDyuTs+TUqexWy1DPGW3TLpV
hrhJUJJhSjp09Vn7Du0S+Myh1ro0oeSqlngleKy7fDOzbp3T6ofMmJW9Rbc6
YMmMJWAGkJrXat0WyOwpceV+XxVIyS2FadR1SpljpFRloCsOpp01YcfzLqUe
41t8aze.LttwHVWSEszinlteNIa4GjqZoFQAoajowosWbRmKuR9Tz1jxGa6Z
JTuR3dtdsKt8dwF+r+S4wQIMDvy4wqxR0ChNyD5SW2cJlGpg1nsIFycjFsom
Fq3ATvx.WrPQjaKVDkqmnpLffpuXYVVR2K0ztD4SkUWdSbZ5dnXY1lguXd7y
u+HscQl5hqO1y1bkhG2lZu5iJdhxGKhdoKZWFkjTIy18w+onzXkVboV2scoE
MWzZD88EKyyRR5Pu1q7ROWYkhGeo7iwqJeuoiZyLnt83M0LQyZlkWE+rrnr6
4Jidtn6YJJ+rEzacpsKpjgerTtdShhJ5dCJoi3hxh2m8whparlQqM.rKdYsk
oaq.ry42SQnQAWi759J8.stPeJ9vg1XgYU0W+YCi7fJ6bXAVGrHq1CxZ+YZC
Wl6Ock7Ssz0TYWnRsyYBQcrUz0dg1ZgYTOD.BON.RAVEa.a3212RgWAP05Q7
HnnTZWrK9pcQEiJ8frTYP4GyBJeetTF7T117fmheQFTD+IkCXuHSCjZcGCBc
HmfNh0YJR3khcChaBlOwMa+dY7KHLztTMFfGJvgB.2QwOP+jXy5wNegqcFuz
gnZ.URFJRe89ApBEOxx54iJn+gftziRMbYbZis7ecGAnuQmlnF6nPuPdmFE0
C2IYT.u7AQ0Iqc.al1J6pGsdT7XTYYd7hsk14t1dTNJKeJ2vVDkrm4p9LL91
cCN+369ZYTpehXFpajPTNxqDrBO2E+aW5xCZSs8uBFL8bVASWEv6fgkIxn7w
tDNAg2AGHB5.QDBS85xYvLOt5seOI347rsaFM0KrVdw5HfS18KOTABvKInO8
RynyhlOufeGRXGxfSXJl4VIOQsr8IIt3dOr3Ctp8ka1pW.vw46GbkZtGiTHX
NLr0OXjRcwYKQLbPMf8l1jcqM8OTK9GYWgr.0NHhX9jr5e3bZHEJXMyv2iEv
8XA7GkXAbYqMAWU7ClPgZkYQzyY0HCtRekCTx7WhRBhSCVWnCZetbSVdobUf
BOj2x.ArH.5DtMrFhgBwBvnuia8P4fDn30HDfls2HcX0hWGTUYVeoVw9jBsU
gUlxtTrsq47gbF5UC39NkFMmB9xnA1JMADaRQPyoSItRdsgqSlp.h3poIf9G
KMApU+RuZJBX2Dr8xhyuUSYUAT4ho8ym0avPQpaUPgxAzh.aUtZpQQqG.JmA
p7MnWm.PuV.RqH7UAGEm.GiznVEPdTv65EsWGivIcBixJ2ww.ZBGCtFzakku
.3TEyaGGCSYDuINNFfS3X.Oh4hoZLfbMIH2z.+6VXTtcoG32xhSGafgwDp1w
.LEMG1JtvLho1tL905q.gK.uBBDNAXhwFFhmCE69EQHSRfv49rx01n7DPpbH
npj9JFajxqRBPuzND32rdvI9jviV9AKU2rviwl4CbXO0on.3WZl6QZtLXQPx
4NC2mzLDeJwYzQS+wfr3BOR0cJYUkKcahxiKxR00OB.DHFKdTocinqY.Np4W
BVywGdJwcxP3QRbgIcm5+Uer4+6sbFaJRdGsIIWuXuvFejplt8EWGsYSmbOc
dVWFJsq5pc4aBfyC9mflpdwdLpp5Wr+E1VEL1+fXqFF6ePMUEi8XVU0wX+Kt
sJYp9Kw4lZSNYNpsZMBWuPWjeykqXBTma34OSN69Uqg8JQG95U9VHL0rvMU9
VP9JS9tL64mSjiEpoLwbw9k.eUoRb14Uen5EgSuR4UW8nyyBPfwpzoWzfGNm
ztZR.rlPsg7XMGvwytskTT+uAQW4RJhI7YIE8gTYTtZAeAJ40RyqG7neEQ.g
GBJ8pdV8oee6X7samK7htAHkOGiZqUFYewwgp0MCvPlfH3TR3k4R5fRILx0q
Dj5m9g7ddAt8eIHw7ttxyCDXLROuMp.wgkSzT.BnuhKDq9eOd0ARe+RSbJJD
KF9dgXcuPrtWHV2KDq6Eh08Bw5dgXcuPrtWHV2KDq6Eh08Bw5dgXcuPrtWHV
2KDq6Eh08Bw5dgXMhfDtX6SOIy+xtjID7uVDsVYLYzI3z5WKEz8sYMzTwNmL
0RmN4JCuy9E5yrq7TbYPQ75MIRpde.c70nlh7C4TJjJHHBlqijtNxxsiOJBo
y7MyqYVg5yz8mq5SYPchlFc5k5K5589VNKnduJ1n9rXtdJYa7p4eXUYtT9kQ
mjg9xxVuo+2tCEeDX.epr+2HkzOl3yrv9twWWaLkPwA4bp2hfAEdtkw4P4dk
32LNFO9LN1O022ta.6rq5MSRfOhNRhWKFHo7Ce4DYeFOR7n+x.Dh7eNjH9rF
g196wABuTPFvP1bTXq7uCgMaEUmilgNIc9H0l.E328qRk4swukUpn8CqGg9J
8WxDjTQpO2xJq1mv0vfWPgd0RxO6srzA4BP9zGJy955XI+PxfEIH1uUFIg40
JibHZ8HlDPbt0W4d7K7jDaOaKuDh2c96rlCqnqd80ig8tudjareMvPfgmsOM
U92SFL5V6ISE81qoJFZp7cAC7tuK1E3cVduTOm2q+J.1Dr4Cgt09q.YFd4dK
RNxj6fBDd6cPAYScWuFimBWRDuN7Holr6asJ92GD90qHHqRAKch9FHAGdkpv
uySKNwnkBpqju8q4SxjoEGIt4aqhUy585gomeiB8oNq5Hz1xn0EFi1d8OCxg
SWLZgfI36MhnUqrnw2r5jVzOs7P+0+Nm62nz5Ya412EIv4IIz26.BDglZC59
O.sUegc8kSF39SyEz+qmE8h8x43AnEdy0MRUNx412OS9VUYmEiatitIO2BE6
mbvJ.4fjBF1NofCkTxVYJ+.L20NRTkl4i1SXezQ.GHoPezQNzO7t3aV9Jaw1
Cl7dFh5f2M8L7h5YtK7KlY.3EhtbgC7KBhGlG4NMQx8QOwcA87BMwbnmXBez
STWPOeHXyItzSdglvtfd9PqHG5hnrO5HW377P+vbRAAxG8jKhsXefcLW.OpO
XwYtHLw7gBBlKliY39sdAu7t9jvI3ZQjP1TXhl4hJDpOTgvPtvyf8fIYp3ZI
GPcw7EA5CZhdsLTRcgsm4CMilo.zolm7gVDpK7dTenCl5hgRR3T3yqScMZRz
hPAtnklOEZoMc8oXhHdQvvEhbhv2qj2WDWVLCE4qd5jxI9XwLDtqRjWbOwtV
SSNYzVLEbhN00DxTHoSbQ+FyKSjfq05ecwwOBySczozRh7gVRrSQpyGcjSgC
wK8jKZPv9vGYL6p0StXRg5CNOLx0n.dw8DzUuwu3dxEWA39piNICgW.OWlkX
dXkLHgqgl06FubpqoSgsKjKNVA8AqIxIUH9PSOzodxGRAPmb5vGQhABcQvNz
W8zorHi8BM4xBjNLZH1T1s2NojtO1aGTZucOoC24jFdWSZ+cLIyKrmc6hYuz
Dt60ae6p3rexrW+73OHS2ZyGZ8NRTWjXwyOEmjrLKI6fMNp5rWNyd0lMoo56
8WC.yQgDHTnfk4XHliXliTGn2TAg6difMsAV2HBkDBP5akvPbB0bj.goX8Qf
8ZFZWeAfg1d.DJ.D6QpSoKfi1MKJ84p81HdqMnmM4Y5WK75s7o43lW.wnskY
OmGsJt50gt6d14CULR4pqt+tzyr0J.MttMsY7ZlL99sKitzIflA2.yANhKcm
EDMXufCnBg8HAA2ayZlEXTUmnuWl9Gt8H6SZRlDpe8TkIJgCU65P5bDfYGMP
FGYYeD..jzYvnv3NsJDPf5x3SQugTB2PCBBJjaPCJqocs2zHTJA9aajoA+TT
ZQvOIWGuHKY0tr5GsboZf2oavHytln5.RMlokLzmq8fSQXq2uobgBZsyM.nP
XOR8TPntssSiHbJDatUTHVQYlinPEB0sQ5hkvzvGiS05sjMLR3PDjYaVsDJp
MGgWDH9gnxxiIPzd2T6WaYmo8KSrd6mvaCne7OU9noYfiMrLUVRK71TvXJdI
E+C0JOnDi338Y795R314l4Yo6CDPsDXeG0Fd2S9ztstbLQR2Td0cFHjpMxY0
LP3.ybgdp.6pXkgOo9i8G9s3nQdii9W94+iWklbbSi3M2te+LkPFlQLrLPNg
EFVcDH7n7j+kOuJO6YY5Oa.1ivZRAblgz.LJFZ6GHBhbvZAgAg.qWTgHFw.U
zPnRvsaaabFt6bo1yMioSU2QY14GsvuQ+eX+BEPs2dVykZIAysxq86aXVbg2
XwMkBX7xIv01c5PwgbqFPbXkF.8onX+vs1AOEMldO7nyhG+nrxtQY6yJ+cxT
4KQGkIlToWruiNMm3fJJOq93R0GG5ad0us0qqw.9Xbf6EhFtIP0QPeOr9yM6
EZuhFTeWTb5WbbToX2azAc3Q9enU7wXE26wFbGHY.qX018wjLlPupFT+mEIw
qp2+s6eL0+p5bUBeOsZCJW2qMSG6iqHd88xUGCq5iJXNnwD1Cv3ygsZ8b4we
ZYY9H0hLcCH86b1WNtmAiS8wkvG4SJ6ulsRVLVdjyfS2Y+QlZSl+0SqB4Dp.
75nY65EiWe1gFMmfg1OFur7zX0MmA9G0a79ku5713mVlsQNFMFmkL0tfcGh0
+LU7B+bzhwxD3p39siy4mUdyu3jNP2yRR.tZhbBsr+ys9l7x8IELmQ3PqwHB
FZVIIoSFRtgdA+eu9DIV5JBu+RzKxmxxW+E+5pIXN8VBzqhx+v6R0e8Y7Nyx
kmlH5AOA2l2izg2ilW2HcziF.WCfa+AkfznPj2jSjlriHLOlCC4fywS2KbJ+
V0dmJ.TsK93iGEZLIC4ng1ELxDhbrXx4V27UAm59IuXQhTYcylewuMK6CyNA
i4tbYbvAC6eN0arbwoeXbbH6BYLCnEAZeD7xYQZfE0DDPzXzGa5himUhoSy2
UII3mlhOH9utkludyf1o6rdCTwoa1Pb4Oz4ydiv6H5EmueuHl7A6Zx9Eco8b
qcVJQ6Jwocgd+Xi.BIUkRPXHVvqNhKbHuTrF6jzFcz0Ba9itVG8IVy2RbNPS
6Avs8xqG9cQ2EEwqj2OyDM3MBt023DNDVDBBSQgcMVp0.384gNaKGSlqphlj
2e3QGQqMWnqwrG58nqnW.Npq2SbUckNbaVr824atFzsV0zm2CAfc+u4aS1Jq
+dfy4U+ig7pj1uSYTn2UAUM.yy9X5nGgWGkj1Q3+1miF+.DfBo1ZJfywXjUZ
CAUty4+A32kKkmwHzfdMwTCxz923+A2OnzrkVFM5gmf.XLaz9X.0fy.krPXH
0+iw+K4pQO93Md8BaVTEnxhguGe+OxjjrONdHDQskDCWu0SZFfTEWI8T0SEM
D.DSAAf9Zj.Lou36xRNCVjl.mPUdBQsUmTaifdY.tYa9lD4nbNk.r90fn.Vn
gGFInLjvQOz3MkWHARUpML51zZO7pnYtboL9kim4t93YZVa1tAGnc0J5mAmc
IJZ1hQf7mjgn2kaQT5ACqrevqp6ZkPPnWsjTHSWU7ZyBxNP1+xd8Vzp.NDvs
4Zhv.gUk3NG5UdmxnMmpRtaEgquOpLK360qBcl25+hR4SaSRJOYBV1uXNETJ
zFLWFlisE8NSIccrDdoeWW3lak0rn.cDDZpyRuRTillPBtv5mAtgSFiBwgBO
rT8aHdz7svzqj2WfOpilymGetl.ZzxrpSbEHoDtqxMwQJ6x+bbd4mC92eN6T
ADFFFBrdWp3iIUoLmA.XX2B4cPqgNVc7cAVL67.V6qNmICdZhsXS0TnYi+7s
+i29+CLEF0mK
-----------end_max5_patcher-----------

And here’s the patch with fluid.datasetquery~ and re-fit-ing:


----------begin_max5_patcher----------
5109.3oc68r1iiajbed2eEDBweaVk9c2LeHHmyk3.b1WLhcfQfQv.JINyRuT
jJjTytqOb6u8zOnnHkZJ1TpozrvZFfYnHEYWU006p5l+s29lYKx+Tb4rf+of
eM3Mu4u8127F8oTm3M0e9MyVG8okoQk5u1rrsqWDWL6AykjeJIKMtReMX8I2
DUs78IYO+XQ7xJyylQ.yAODfYp+R0GiPyAA+u6eN4aq18fP0m0bppOuI17Tl
M6gfYKhxddVyMlrRCU4K9s2A4yZ.fhn0wUwEOFmEsHUe2.0k96u8sp+7fq3Z
7GkO3cO0p3OowlYo4QqzPwHIBPTngJDJzjAw.jAXejgdIAnY9CKqBVDrXrnH
k.mK.DQHSfgTHhPgRTjnm1wnycZ2PrO0Te34f3qiKKidN9HLuHtLt5gfmRRk
rPAff+4.4zl7GIpDsZUgDHhkmUXk1fN0zOWSAHXyDOH7bm+sSCDdbx+ozsIq
luJpJRRJ9RPYx5MowzxsKjebzr8XgFS4T3bHHb2uP.l+P.jMjx.7PbEOIkGq
TGXmrPt0xDPjPy+ynXMYfLYhAryAUWrspJOazHEDqwFLgqQGxt+5McYPruUm
KYTj.0XwzP.XNt8Obod.FGpPXAeRrjQYmD0qu1S4Eqiz.I6pQLXPwb7g52oJ
xPqevD3zQZnuVIMV4StljFL+qJtFFiMghPX3zSLVludcbV0QVIjmq3yqUjAo
6CIkRK8n4bbPUhzgifmhJqNC+msQA4gZkPvPCIDzKIDbL8Q3SWEK2TjjU8Tf
z4ffCQ9u4oI.ws4IAB4W+oXgdjB8OF.mOV+Esi5FuFw.4TORPvLJ.JjtRhNO
L236jMzm.8H5uIPEyzY6igfNGIwXJkn+khnRungdc1l.l0FPjLpFjrFKkpJR
RieItnLQ5qz9m9alEsYSqS+lV2hhz7a45Gj3glSkjYNEr4TEwujr69YMmMpP
RNpjzhsEZ.e1mXjY6eL4qhKx1lneRlSJmjpAI8zQlToW4lnklaVMqs6xshVS
yKAAZm2pCQmHZoGQNc+bZ9xODupkZDIIcSbVR1FUbRYUQU0vdykWE+Tz1zpG
eJOqpL420P.TE5ikq+TMDZ8hJbPC9+ohjnzFD34hjU4YJfnyLg5z6FNIyCUi
az1Hi9ajEswxMK4AjjkdtXoDI2VtHpPMQUaBAs6hU44ocuTy8kF+TU8k2jjk
c.UrJeS+WrH442eh6cQt7hqO0yVekxG2lYt5iRdhpGKidoK0tJJMsVls6i+S
QYIRs3wJc2ZzEzbQiYz2WtrHOMsC9ZtxKVtxJIO9x3Olrp585ApMyf7qmrYG
SzrlY4UIOGWV08bUQOW18LkUe1PzacpsKpkgerJVF2pDK59EjRGIkUkuO+ik
0ewcLZsI.6SKVaY51J.6b9CTDpUv0HudnROPqKXMhYsfIiaT0u6uMLx8pryg
PrNJLq1.4NOZZStze+rUwepktlZ6B0pcNSRTGaEmLCa8Z0n2zPALJ1.1yylW
IfxHR7HQoaNo5RUzpzCxyhCp9XdP06KhiCdJeaQvSIuDGTl7IoCXuDmEDqzc
zKoC4DoiXblhDdoztdoaBlOoalw8x3WPXnIXMFfGJvgB.2QwOfcTrIhryW3Z
uwqzjr9TIowH00sSnJk7HK2MeTS5eHnK9HUCWkj0XK+W2i.punSSTiEJTgx6
DTrCbmDn.d4.Q8I24.1LkU1UOZ7n3wnpphjEaqLycs8nbTV9jtgsHJ8.yU1L
L918.me7cecbTlexYFpatPjNxqpEx4F9eSZeSxrGACl5yhBrLMNpXrgvIH7N
zAhf1SNgvTuFNCl4wn298zfmKx2tYzXuvX4Uhan5P2z+xCkDA3kj1G6E+5Jl
96PB6XFbBSxLiYM+JCaeRxLN9JUmyMAK2rUE.vo466MRM2yRJDLGF15GLRUt
jyUhn+jZ.aRrt8XS+CUv+HSDxBT6jHh4SRz+v4zPJTvZlgumKf64B3OJ4B3x
hMAqMf.45TgZjYQzyIZjdizW5.UbwKQoAIYAqKUIsuHdSdQU7p.I8H9VlHfE
APmna8qgnuTr.z563FOTNp.JdMCAnYG.o8qV75PUkl0WpTrOoj15zJSYWJss
q479bF5UCw8cRMZNk7kQSXq0DPv0cSEcJoqjWaz0ISU.Qb0zDP+ikl.Yzuzq
lh.1Mg1dY442nortEpbwz94y50apHU2UPozAzRUg5k.XPT1pZO.jNCT6afUm
.PuVHjFQ3qBcTL.cLRQ0pIjmj3c8x1qiY3jNgYYk6HLflPXv0jdKs7E.mpbd
6HLLkY7l3HL.mPX.Oh4hoBFPtVDjaZh+cKMJ2txC7a4IYiMwvXBU4X.lhlCa
kWXFQ2aWZ+Z8UhvEfWAIBm.z4XCCwygh8+hHjIIQ3be14ZajdBDKcHntk9JG
alxqKBfUbGB7aUO39rW+2Ds7CFrtIviwV4Cbnk9TT.7KNy875aH8bmgsIMCw
CINO3pcvNV6yE6RmVVU5R2lnhjx7LU+i.5ac9.GT6FQ0y.bTyuDrhiObHwcR
ezizjRypb4g8Gq+u01YroM4czlT75EGj13Sz2zsu35nMa5T6oyy5RekcU0sK
eS.bdv+.T20KliQ0c+h4SXSWvX9.wzMLlOP0cEi4XVc2wX9D2zkL0eRbtk1j
SliZqVivUA5h7asbESf5bMO+YxYaWsF1qHc3qW4agY4rcSkuEjuxjuqxe94z
3QubRYhiWNo0sJwYWW895WDN8JUWc4itHO.RAiUqiUxAObNoc6j.XODvvdeo
Eywyts8Tj8EQzUtmhXBe1SQeHKNpPFwWfTfsZSdRV0nWiHfviIJV0OK+qeWd
L7a9ZM2lz.jxmiQsUKiTEOkLGJCbFfgLAQvojvoZE3Rtd8fjc7GJweVqe4jI
oGjXdWY44QDXLhkEjJPbb+DMEDAzWwchk8kxqJS5G1ahSQmXwv26Dq6ch08N
w5dmXcuSrt2IV26Dq6ch08Nw5dmXcuSrt2IV26Dq6ch08Nw5dmXcuSrt2IV2
6DqQjjvEae5o3huruXBA+KkQqkFSFcENM90RAcWNqg5V1o14gKo3J8u4940s
w0mRpbXiKEM7F2Y8dtUnvqEPg5yx5WHGy3fc0SZzUQxVRzstZlETu2sZTh22
2Z+vpph33uL5ZIXqXZVKy+f6gq3gpx+o2zZo9rXqua78uFiDZYab1VytfBO2
10ruRrR7agESFegEsi811ECXmc2soq06ITER7ZS+DG+guLPQlwijdXuc+fH+
WpHhO6Ens+dx324tsWo0P1bTXqxrCgMa4TWx1Y8t+2m7AE328kR..L9slRIt
ebaGXqEeISPsCo9bqorPFic9ZMYvKTAqZI4m8VSZubAHe5pjd+acrneHo2lA
D62Nfjv7ZGP1GtdBSBHN23Rb2xiiPtfrV19cIDu672YMGViWV80avlHa795Q
tw90.CAZdVaZp7umLXzs1SlZ70poJFZp7cAC7tuKl33NKuW1Mma0eE.aB1jg
P2Z+UfLMur0dgiL4Nn.g2dGTPlJzY0X7T3Rh30gGI6Paawp3eeP3WwW2H51g
pGuMlfN6CAtRc1WYbZPkbFuT8pH3bSPEyD1k.bouYYrSKvWu445JpSmn2oL3
vqzz54Ys17dxBpZLyCagWxjYsFIt4aSl0y5VijvyqPTnOQ1FA2c4YV8By5By
4LLTSLHg7471svLU2iIb+RM.S0qLr+uswEe9Ki+E9fA4oR+WXchyB6Rtktv2
TXLe52xthwzxw0KrbLViQCxgSW4XffI3cDSzpUFpw2rZPu5GVWo8k5B2uhIT
O6OuYcGBNOsj1VtWPDZpcpmL4ubAUe9L4BruTLoS8aVvatcSpLXN2dar4ayn
cRHm9azsOYLjhC6CfZBxQ0+Orc8+6q+CZ0TLGQyccfD0cTxIGIrOFHfCnTnO
FHGFGdW5adwJy5pAL4iLD0gd2LxvKZj4tvunmAfWH0kKbfeQP7v7H2oIRtOF
ItKTOufSLGFIlvGiD0EpmODr4DWFIufSXWnd9PqHG5hnrOFHWzJx7xH4BOtG
FGlSphP9XjbQAA1GzNlKDOpODlXtH1x7gpHlKF9YX61IgW9POH4DbsPRHaJb
Ff4hxJpOTVwPtvyf8fweshHz.iDxKxAtfRbOMPCwfP8gdQp3ZoCg5Bwi.8.+
.kdsbmg5hJClOrpnmBFhGm5CMvTWjao9v9E0E2YHgSQjINMznIQCr1BxPyjD
uvcBbwLiOBtkdk7Kk3RXdTjuFoA4M8gpdB2UofKdjXWqoIWbxf3CkUDWjiYd
g3AtVQi6hygDlmFnAcaxGJhvNk2PeLPNkbFuLRtH0h8gezX1UajbQMN0GbdX
jq4j7hGInqdcdwijKVa49ZfFjgvKDOWlkXdvicjv0DE6c2wbZnoSQ9FPt3LC
zGrlHmTg3CM8HmxEpOFHWDB39HYMNopxGzNX30BkfgNv4g7wzDTbsRjLj6Zk
EtTkUPmjl7gldnS9Q6ENBnKFuB80HMjWmXufStDm8wY1xTj7C1lBUiwAaOgG
r0Dd71RX+aIgGtcDpWM7l8hsCJL+98NlsqRx+I8Fo2i+Pb1VSGHra69qKkXw
yOkjltLOM+nckwc8KvLyUa1AD28c+0.vbTHABERxxbLDyQL8QxCTM1Fb+1sg
9df6tIBkDBPpuJgg3Dp9HABSwpi.Gban8iE.FZFAPn.PLGIOkpkoZeaQYOWu
wAxas62soHWsmqra+TbNtY08GssJ+4hnUI060Hc2QrenlQpPd0C2B7lsVRPS
1cOsY7ZlL99sKitzIfFfqm4.GoKcmEDMzdAGPEByQBB15s0LKvnxAQ8cYpe3
liLOoIYRX2d+PbpT3PdecPcNBvLPCjwQF1GA..Ic.FIMtycEBHPUyyKw2PJg
qwAAAEx0TCJq49ZuiLIUB7etINK3mhxJC9o30IKxSWsuOZhVtTB3cFFLRukD
KOfrilojLTmqMvIQr0GdqbgjzZla.Pgvbj7ofPcu2N2DgSgX8WEEhkXl9HJT
Rg5dSp1SReiOljozaE2vHgCQPl411IghZyQ3EAheHpp5TBDs2pR+0V1YZuSc
n1am7F.8i+opG02F3TfktWtZQu0snojWRx+PMxCRwHN9PFuutDtc917rz8QB
nFDz1QsIuGHeZ1yzNkHoaJu5NCDRUF4LZFHbfdtPMUfcUrRymr6OGB9s3nQd
ii9W94+8WklbbSi3M2tuclRHCyHZVFHmvBCqOBDdRdx+iOupH+43reVSXOAq
IEvYZTCvnXnYbfHHxAqEDFDBLdQEhXDMohFBkBtcu2Fmg6NWp7bSa5TNbTlY
9QI7q0+GZWn.p71yXtTIIn+p7c980OKtvar35luMY4D3Z6dcn3PtQCHNrVCf
5TTre3V6POEMldO9nyhG+jrxtgYGxJ+cwYwuDcRlXRsdQaGMLmXuJJOqw3R0
GG5ad0us0hmqGeLNx8BQC2Dn9HnuAq+byFM5qHf56hRx9hiPkjcuQGzwG4eP
q7iIRt2SAbGIY.qY01+mIAlPup.p+RYZxpcubKrCS1ipyUI7Czp0qbsUalNN
FWQ502Gu5TzJaXAyAMlPKDFeB1x34JR9zxphQpEY5.H0J.9Km1yfwo93R3i7
Il8WyWEWNVdjyfS2Y+QlZSl+0gUgLfJ.uBMaWuX75yN1n4D.Z+XxxpgoU2bF
3eT8Vso5Um2F+zx7MwiQiwYISsOY2gX0OSEuvOGsXrLAtJte63b9Yo27KFzA
ZKgj.b0D4DZY+ma8dxz8IELmQ3PiwHBFpijjzoBI2Puf+uWOPgkthj2eI5kX
0F1vW7qqlf4zaIgdUTwGdWl5cS06zgKOMYzCN.2l2yzg2ylW2LcXQCfqIv0d
RIHMJD4M0Doo5HB8i43TN3b9z8BmxuUuwjC.06cd93QgFSwPNYpcAirfHmJm
btMLeUvodXwKVjFKstYpu32lm+gYCvXtuVFGcP+9mS8FKWR1GFGGx9TFy.JQ
f1GAubVjFxhbBBHZL5i0CwoqJwzo46pTD7gw3ix+qak4yZEzFdvrlnhgus93
xenyeslg2QLJN+88hXxGLwj8KpV64V6rTpxUhgcg9vbi.BI0sRPXHVvqOhKb
ntTrF6jzFcz6D17GdsN5SrlWAqNfSGPfa6kmE9cQ2fh30x6mYgF7FB2504jC
oEgfvTTXWikJM.dednyFgyj4ppno38GezIzZyEpdL6AqGcE8BvQc8dhqpqzg
ayhsegp5ZR2Z0SedOE.lcbpuMca7tWxpNG8OFxqKZ+dkQgdWETM.Vj+wrQCg
WGkjFH7e8yQiG.AnPpomB3bLFYj1PPo6b9G.+th33y.B0TulbpAYJ+a7Ov8C
RMaYUQiF7DD.iYx1GCHANMojEBCo9GF+uhWMZ3i230KrInJPsECeCe+Owoo4
eb7jPD0zRLb0F9rF.oRtR5P8SEMD.DSABf9ZDAzku36xSOCVjlDmPkdBQMcm
TaifdA.2rsXSZ7nbNk.L90fn.VnlGFInLjvQOz3MsWHARkpMz51TZO7pnYQ7
x3jWNck6rwyzDa1dfCztaE8CvYBQQwVLBJ+fLDVC2hH0CFVa+fW220RgfPuZ
IoLNaU4qMKH6Ix9W1yZSqB3P.2TqIBCDV2h6bnW4cph1LTmb2JCWeeTUdv2q
hBcl2F+xp3m1llVMXAVNrYNETJzjLWFliMM8NSJccpBdoVqKb8Wk0DTfJCBM
8YoWQpQiSHAWX7y.2vIiQg3PgGBU+FROZd2G9JY8B7QU1b973q0DPQszQchq
IRRg65ZSbh1t7OmTT84f+smyGJgvvvPfw6RIeLotj4L..C61Hu8ZMzwtiuKg
EyNOBqYoyoqfmBYK2TOEp2pce6e+s++PokcYY
-----------end_max5_patcher-----------

A little addendum here.

When making what I thought was a bug report thread, I discovered that fluid.kdtree~ just takes a really, really long time to fit larger fluid.dataset~s.

I managed to fit a 1mil-point fluid.kdtree~ and the convergence is still happening. The closest matching so far:
Screenshot 2020-10-09 at 8.53.07 pm

As I noticed in the 200k version above, the duration is super erratic for the kdtree. I guess this is a function of how that tree-based distance matching works, in that some points may just be closer to stuff (?) so they take less time, while other things may take longer to match.

The bad news is that it’s still really slow.

This fit took over 20min of pinwheeling to compute, so the corresponding fluid.datasetquery~ version of this would be super unpleasant to test; at 20min+ per fit, we’re looking at around 36,000 times slower than entrymatcher if you need to filter or touch the data in any way.

Thanks. To be clear: it’s really not surprising in the least that you can’t yet get the same (or even remotely similar) performance from our kdtree as you can with entrymatcher. We’re still in alpha, and still concentrating on questions of interface and architecture, so we have paid no attention at all to making things fast, and we won’t be for a while.

I’m not sure what the takeaway should be from the above: KD-Trees are what they are: pretty scalable and performant for up-to-medium-dimensioned data, and not too slow to build. Ours could be (and will eventually be) more efficient. If you’re coming away from this thinking that KDTrees are somehow inherently slower than whatever entrymatcher does, then don’t, because there hasn’t been remotely enough testing to demonstrate that. Also, I should point out that the tests above have very little at all to do with how a dataset itself is structured: the point of a KDTree (and query-able trees in general) is to provide an alternative index on to data.

Re-fitting the tree on every query seems apples-to-apples from a UX perspective, but absolutely isn’t from an algorithm point of view. A KD Tree purposely frontloads much of its computational cost on to building the index, so that queries have good and predictable average performance. What you want is something like entrymatcher that allows you to combine tailored range and nearest neighbour queries in the same index, and we don’t have that yet. We might (or might not) one day look at something like R-Trees, which are (purportedly) good for combining such things (albeit with different trade-offs elsewhere).

So, looking at the figures above, I’m not much interested in the particular numbers, or the comparison with entrymatcher, but the trends are interesting. It looks like something in the fitting is O(N²), but the queries alone look like they’re sub-linear as N increases, which is what one would expect.


Totally. I get a lot of that.

This is just me poking at the code, my use cases, and reporting back my findings in a… Rod-ish fashion.

Aside from raw speed comparisons (which will obviously improve from alpha code), most of my nudging in this thread (and general line of inquiry) has to do with…

because

This may change with some of the things you’re suggesting, but if any variability of distance matching, biasing, forking, etc… requires copying and re-fitting a tree, I think there may be a big gap in terms of what is possible in a real-time context.

I’m imagining (hoping) that for vanilla distance matching, that a pre-fit kdtree would be fast as shit, and be a super go-to where you only want to find the nearest in a set, with fixed data all around (like for the time-travel/prediction stuff). I could be unique in that sense, but for me this covers only a small part of what I’d want to do with a corpus. So a lot of this is just gesturing towards those possibilities, which circles back to…

Thanks. To be clear: it’s really not surprising in the least that you can’t yet get the same (or even remotely similar) performance from our kdtree as you can with entrymatcher. We’re still in alpha, and still concentrating on questions of interface and architecture, so we’ve paid no attention at all to making things fast, and won’t for a while.

Just a few comments/questions about this:

  • I expect (as I already communicated to Rod) that the implementation may outperform entrymatcher on larger numbers of dimensions, given how close the results are in terms of orders of magnitude for search only - it would be useful to know if that is where it starts winning.

  • I don’t agree that it isn’t at all surprising - I am surprised. The code in entrymatcher, as Rod is calling it, should be a brute-force distance calculation that has to run every time on all items of data in the whole database/set, and might have a sort to do at the end (I forget if that is the case). The only optimisations there are to do with the linear ordering of the data in memory, so given that the KD-Tree’s algorithmic complexity should be preferable, I am surprised it doesn’t overtake even on smaller data sets for finding the nearest neighbour.

  • In terms of this being “alpha” - assuming that you are compiling with compiler optimisations on, in my experience throwing optimisation time at a vanilla implementation isn’t likely to improve the speed by factors of 4 etc. unless there is some fundamental issue (algorithmic, memory copying or layout, etc.) slowing it down. Given that the purpose of the tools is (at least in significant part) real-time usage, I think some indication of whether there is at least a potential optimisation that can be identified here would be useful, both to predict what improvements might come later and to see how this might fit a real-time workflow. I suppose whilst you can say there isn’t enough evidence that KD-Trees in these kinds of cases will be slower, there’s also no evidence in this thread that they aren’t - algorithmic complexity is all well and good, but in the real world the constants matter. At the moment I’m a little unsure of the claim, as sometimes (in other contexts) I find that people assume code can just be made faster by devoting time to that problem (or moving to different languages, etc.), often in cases where that simply isn’t the case.

I’m not particularly interested here either in the fact that the code under comparison is mine, nor in asking for a specific outcome, but I think understanding the pros/cons of the tools against available ones is pretty relevant, and my strong expectation is that KDTrees should have the benefit of being fast, at least for the nearest neighbour part. Any light shed on why this isn’t seemingly the case here would be useful.

1 Like

Indeed, I get the desirability of marrying nearest-neighbour and range queries like you want to. TBH, I can’t remember the chain of decisions that ended us up with different objects; AFAIK, datasetquery is only brute force at the moment, but there’s certainly plenty of low-hanging scope for that to get better once we settle. Another thing that I’m keen on (and which Gerard reminded me today of my keenness for) is trees that support insertion because, like you, I want to do everything on the stage, during the stage show, live, as it’s performed.

1 Like

There may well have been an implicit ‘to me’ in the expression of surprise :wink: The reasons it currently huffs a bit at scale are partly to do with implementation rather than the algorithm itself. Because there’s a lot of recursion, small inefficiencies can pile up quite quickly. The happy news is, of course, that it’s easy to tame those. However, I think the first priority has been on making expressive code that works.

In fact, the performance of KD-Trees begins to fall apart at higher dimensions (because this governs the depth of the tree and, I guess, there’ll be more backing up and so forth), and it starts to approach brute force. Where I would expect the KD-Tree to scale better than brute force is in the number of data points. Were it not for the hitch Rod just reported in fitting a large set (which turned out to be a pass-by-value-in-a-recursion problem), it should have been possible to see that.

This paper is reasonably up front about the KD-Tree’s strengths and limits (although perhaps overly pessimistic) without being too comp-sci.

I’m not entirely sure what you’re after from me in the last point. I’m pretty sure you know I agree that RT usage is important, and that you know I’m not the person who can speak with much authority on the how / when of performance tuning the objects in general. I do know that there are some specific inefficiencies that can be easily addressed (because I just fixed one), and I’ll send a PR to G suggesting some of these. But, as it stands, I don’t think it’s at all unusable in a RT context (as long as one isn’t re-indexing all the time). If you’re hitting a particular performance / ergonomic wall with it, then we should certainly look at that: given that you have a gig to try and do with this stuff, n’all.

Things like adding new tree algorithms are, AFAIK, definitely on a far, possibly conjectural, horizon. Certainly not before the gig.

1 Like

I did a quick test here, upping the dimensions to 20.

Here are the comparisons.

10k + 8d:
10000_8

10k + 20d:
Screenshot 2020-10-10 at 12.14.22 am

100k + 8d:
100000_8

100k + 20d:
Screenshot 2020-10-10 at 12.15.43 am

So it actually looks like fluid.kdtree~ does worse with higher dimensions, everything else being equal.

In terms of the last point I’m asking if there are places where potential inefficiencies might seem obvious, rather than a general “it’s the implementation/we could make it faster” - because I tend to be wary of such claims without specifics.

I thought it would be easiest for me to go and look at the code and see if there is anything obvious there, or differences with entrymatcher. I’m unclear on the probable speed implications of std::priority_queue in this context (it might be overly complex when k is small, compared to an insertion-sort approach), but the other thing is that I remembered that entrymatcher doesn’t calculate the full distance until output, preferring squared distance, as it defers the square root until we know we are interested in the exact value. It looks like changing two lines of code in KDTree.hpp would do that (possibly with some renaming to retain expressivity, as things called distances should not contain squared distances, etc.). Eigen supports .squaredNorm(), so it’s easy to do. The other area is the effect of the tensor view on speed, but all of this is conjecture. All I can say is that changing to a squared distance until output is incredibly unlikely to lose you speed, but how much performance gain you’ll get is unclear without testing.

Happy to make other suggestions, but if you would be willing to humour us with a quick test on a branch that would be interesting - feel free to say no, of course.

1 Like

In other news, Rod’s timings are not being done correctly, so the comparison is invalid - he’s starting both timers before either side calculates and stopping each one after its side returns, which means it’s impossible for the one that goes second to measure faster.

I have made him aware of this, so expect updates from him…

You’re incorrigible :heart:* Because this isn’t my code, and because G finds it unaccountably unsettling if I just go about changing shit, I’ll pass suggestions his way in the form of a PR. I’ve been tweaking based on what I see in Instruments since Rod’s post earlier, and things are already much improved. Good tip about the norm(); I’ll give it a shot, but will limit myself to low-hanging fruit with simple changes that G can make quick verdicts on, because he’s busy, and suffers enough of my nonsense as it is.

*If you’re in the mood for looking at code, I know of two very lovingly crafted PRs currently feeling forlorn and abandoned on a Github repo. And a forgotten branch… :wink:

1 Like

I guess because threading is a mystery of the universe, the results of fixing the timing method are actually the same.

10k/8d:

(the overall speed is faster cuz I’m on my laptop now, but the ratio is the same)

and updated (but messy) code nugget:


----------begin_max5_patcher----------
5200.3oc6cs0iaijc9Y6eEDBYdqs15dQlmxNYSlEHyrYQlIXPvffFTRkZywT
jJjTssmEq+sm5BEEoDuTjpnZ2yptgkYSJxpNe04dcph+s29lEqR+jHeg2+r2
u38l272d6adi9TpS7lx+9MK1E9o0wg45u1hDwGSW8qKdvboBwmJzm9cdvimK
8PQrnn3y6Elm6hnD424+s7p6CKV+9njmdLSrtv7EfH.coOf3Gv7wPJDQnvG7
fXxRDz+zuD9CdnfkT4mnkfpGXxgcQIxFT28PkmLZitWI6ouCgVb5aZ5a5uJT
cx+9aeq5iGtNhuvKxaUuj+CdKVEl7zDfg.5RDkb5WdvCdLZ+n.rET.2NJfbG
JrWH9vW7JD4E6S0zbG3w13zvovPffnkvZHAin3RjmDzGVfaAKHyNGwgeKxyu
SDPyJ7vo+ePADIBbIdPYKQAfpegPEb..8CGsIfPaGNvtCN9CpNFX4UwRPQKw
WnifiWVWEAhPdviPGODvmcNhrvjMo6z3vUnnrUPnMEk7fwCBA2.EkqT+Zsfw
v5LARICIg5yUeRFPWPa5EgyNy+uE6kIdpSZVQm8Ri9bMCMF.0+m+nGXg9ytt
+B6GV6kT8gKYToTrOAR4TFBwLB4XD+zuDj5jjkv.JDfgLehDhHAibjGxcGpr
Nc2NwISdUvR3yhrvmDK5iAFnGcY9xw0Z+.gRMYTC+LXL7yf1oJfCophHYe3I
aHJ.tAQgUVrIvQSTvf4mnvK8Ry7x2GtyKza0gj0u2Kcq2+2AQVjzA4xucbTh
Xc5gjh5RZaSSJRB2oY1W7mEwOKJhVG1C7vfPoOcJE2KAXVseK4qUZxX0fGUC
rMbsnN5LHj0gQcPsmYdzuoelpwBGhjnkdqyDgEBuRadZ2A8jGqgyO68wnh26
EULOfpzeYRcdN.SoWHvMfJ4ECTgUfZn2lvhvbQg21n3XwFCbVB0x90JQli4W
kplTHKw3ZIC3DrjM6P4pCEEoI8aTpeutUDozhKCqIbxwOsWuE9DplIQ7BQ1i
hjvUwh5j506skz8Y4CoWOLZRsaSy1EpISV2zefT+Dt9ObnxNE+j+VixdKkMd
v.8UCXvvvK87tU6aSAZnulgl14SbDzf4+9iqwGMQQHL71.FcYDRdtrOuSQlh
LunbOUBH7jdCJx81FlK6MKFIT3Czf.LfOZeB8QtymvNBpIeelzwksdJasmS5
ey1VI7KYb5yBSqXBmwWBqkVG.lW4Ayn3VXye77+AO30jTm1oezovcQ9DLiBf
9HP.ZjQ8RfyN4u2SRhal1X+QuKHR5DIIVZYZMoZWsfidzlzQfev5cAIipgHK
oRoZrnXgLH07HomRm91uYQ3980N8apcKJn4WS0OH+GpNUTh4TvpSkIdN538y
pNaXlDHJjnvgLiCnehczoZ0iIciHK4PT03jdPprKoGNTNtJCRy3qodT63kOA
sTsrBDncXiYbak3WS4hb39o3z0eProldS4.3dQRTx9LQtTZOrnruWc4MhsgG
hKdrtqo5rO2x0O5MbqWrx46+XVTXbEA7TVzlzDUmnwHg5zGaNIaiV3P84IhQ
+MRB22xMK4LjvRGWLWRjGxWEloFnJMhfNdwhzz3lWp59hEaKJu79njjyPwhz
8cewrnmdeO26pT4E202yVek7GOjXt5iRdhhGyCetIZWDFGWJy17w+ovjHoVb
gR2sIzhpKZLj9970YowwMnWyUdtkqrQxiuV7wnMEuW2P0YFje8n8GYhVTMJu
I5IQdQyyUD9TdyyjW7YCnW6TGVUJC+XgX29XIUz7KHkNhxKxee5GyK+hGYzp
C.mlsu5xz0U.137moHTqBpRdcvvp5T4GVKhx3FM8G+rws0PgGn1EpGf0hl2v
k58LZNJ8moNboeRIaDepltlR6BkpclHD0vVQS6EJqElbgds.HEXTwo+O5EoD
tSKFcF4QWFJcDnHUZmeJQnMQEsJcuzDgWwGS8JdelP3sM8Pl21nmEd4QeR5.
1yhDOgR2QuPmsvFw3LEIXXbC0Nt4ytM3l4wtvFpBIclxjvK.OvGGH8s1NYqN
XMph7p6rzLcgqSFuTonpCURZRUc81ApbIOx5iC8GGT7ZRnR0vEQIU1x+kSjl
5KZ0.0X6EJQJq5EG6tyRu.d8chxSdzArEJqradz3QwigEEYQqNTXF6p6Q4nr
7IcCaUX7YlqZyv3aO04biu66DgICD5hEyLe649fzL2GRm5kxfiO.NL0cyPUS
Evmfg0whvIF8pOCzf1I9zNx6Cd7yHOlM6Qu8awdOkkdX+zmY1.PYbZsTiNT3
3m.dl6Fuu1jeGHCh5BFa4HbiYrRF09jxK9MJs3cF0958GTA.jYapK5LnMqUI
PQrkvfZ+nlqO13EKfr9ixu8XS+Gpf+MoOBZRx5wLKh4yRz+vkz.JzmUMrdOW
.2yEv+njK.qhMAqsNnJgGkXH83miNZD33CFQ56lH64vXunDuc4pj1mI1mlUH
13IwCwKYh.V4AGYN.5VWQm0fiVyG2jafKleEqAdjcoXoe0h2FTUZVesRwduP
aSS6WE1VlgYJaxfK90C39Nup52c9.1RsEDs1BLRWk6fIjdFxqGb8FnJf36DM
Az6ZBtDZ4JdzqWQ.6qLr0U442nnjvs05eG44mO977q5ed4RGPy8L0qpt.DMd
.HcFnz2fVcB.80GPZDguZbzex3XnB0JAxdAuaW1dsLCmzYLKqbK6CnYrOXaR
uk1D8fyUNusrOLmY7lXYe.Ni8A7HFKlq9.x1IA4EMw+1kFkWtoG3WSiRlVZw
wTekOAXl+kqVQNczKeUevqfDgSLohBiBtb43MgDgyCl+EqpzS.gzgfxR5KeZ
i0kS.PqzMDPG8L9vm+Ul59v0evP08rhssgnIfVpSQ+wu7S47axxNLdvUmlUi
zsIUCwie8Xym+0jWiRVU5R29vrn7zD8RPEzypSNNJWO2mp+c7X8+agVOhpnF
quT8vJIgfITNicL2mjwYSRra0YoMtmobp9E2Etee0kmt0ktl1UU0t7Mdvkd+
SPcUuXNFUV8Kl+BapBFyePLUCi4On5phwbLqr5XL+E2TkLk+k+UIfiXsrtho
Lz3mKW+4WqVo5bMO+UY2tc0Z3Qy.O+Ee8MU9N.nG2cl7sO42Ex2EoO8TrXxK
nepOpk84BxkKKzILI6b5K5rrKavrTODX5kfPqfiz9KoA3vqx61nb5.uX9A.W
TfQsuBhbTAFw7m+BL5CIhvLY3eVrwvzOPfZYqunUk0P.dz7CraiSnNXKRf5C
aYuRZd1hDXjudJHoV0F.ArVVs2iWWI6Foq7ZAAFmeoP.EQtr1hlBHf98bYY0
phTpzE2KpVwITVVL78xx5dYYcurrtWVV2KKq6kk08xx5dYYcurrtWVV2KKq6
kk08xx5dYYcurrtWVV2KKq6kk08xx5pSR3pCa2JxpsKy68ujGtSZLwe3oZY3
TLqcTPmopZSvTftXdFZNlZaW8KX9mcksQEd4Q61GKnpcEzINy2p4zOfSoPpO
AQvbNwjU45oFEgTyC93moM57O4+YxGqv63DMM8pXq0rq25Zd1e7oHlRl8B9Y
a7gnMK+vlhLg3KCN8+UxESXZ2ZqJ.LabwiSBA6tMe8tjPz6geSbKZCv6rhPv
iuLXHraPYvTiXaghPAlMLn1jswieyymPl88fQir8.ChCwDWR1sJJyF+jESl+
2kJWyqSGD.qK.41JJ0I79yA+Z48mSIcqdMwbw6EEFZzUnIFbidi4Xreesuyb
NNp2ZApBXi9kjCD8Z6kjiudLt0RgfL9xRFBeE7VwAUl5Ao9sKpTyI7Zvw+Uw
aAmiDMEd4ttyDds23vJ+4pqwES.HvItcyiCdQKfCG8VPqb2aESV1rN+H52uO
i98dl+W06gVTVmKkhIroYwm+s73iAfWynk6CAuUG0f7wWkwv42F9wMI7vMaL
nw2rwJK51HDzd4MxmvKXfahsbSgmCblEcCyPauPWfnID90raO2D+c4amkuXU
dYNBE1EL9Q1hVKBeJX7hGye32WitQJ1xW4FSQUoEAiqeXMmaDCTbdteKAjKm
CfFI+tqjuWayH8BP21VR8ZrSMMB81RPm0RnAZID0AsDiZA5gfthlFbbh3hVh
Zw3Dh4hVhaC50XVpVjlswTMpvqqoQ1zzNANw1zRHW0RCNvE3JVjAEvv2LVDR
CBuhEAbcMcfMMsKzWAsgHg9Nnkv911Rvqklrok3DWgdCZWwE5qXDKZIlKX6s
g0i3BCX91PRtfhT6QDCRRtP2jU7cAsaSAL6srJkksop55rlwsgeQOBbsh0be
K3W7cgXM2pARWn9kai5W2PSLabhyEJ5413XJ2EB1baBff6DZxFOoXtPqH2Fm
scgW8Vw44BKWVofvENgxrQrE6BriYC3QcAKNyFFOpShmzlvRXXGnHm5eqPOp
MJ8HPWPSzak5UpUYMwExS5gfgB0h5BiSzajNOB6V0PVI15OGtBZUSSHyQlMH
VknHW3oMAbq7axlfTHLG0PClWCmDHtUQ34hFxFqjXWX6BytYsjMppntfe.ir
Mltqtkf1Zk7paIfERSbW0PCxP3DvylQIlC7v.4aaf1N2jhUMMcVxUtMw7Ccx
jbXkJDWn+EZUK4jj6Zkq..WzR1LAa7.W0RCYmD6DZxhFhbYTJloE8rMiDUab
1lPxYa.IWt4iz8FOx4a5H507hYGW3roh8zxB8vlnzeTucY73OHRNXld2iapG
MQhUOsMJNdcZb5E68JGmb5ElqVsOmb769Kdfkn.BD5KgkkXHliX5ijGn1Etf
mVTc56Ad7lHTR..o9pDFhSn5i7QXJVcD3raCcps.v.SK.B7ADyQxSopHl52V
XxSkaOH7Z6wE6yRUqrxi6ZJKwUqgmvCEoOkEtIpbEE1bSv6gRFoL4UOeitXw
NIfFc7dpy3UMX78GVG12.fHVnVIiMPUtOiw0vhO.56aNhPAHTCJswhAVxj9e
tWj38igI4d+nXWzpz3MmJpfFOeBmBw5mJJ.6GXNhBArflOeUoKnuwGiRTb3h
pgBb.BxL21wwRD2mfaNTXE2UEx2AClkC5MYw7qXr74.puu4nK5fmyhwnxFQ8
cYpe3liLOoYgC63xWSxDrVceM4BPxADcuAx3HirgO..IMw3v0qOm+AizavYx
CHGIHkLII37QmF2U.f.45aSxRP3bCWGJvvIRYG07ckBD+PXQQeBD02Ph9kZ1
Ypud7TKHkk.G0g9q+whG02FXntUMYMyRuucwq.pRsnAzHbfl6Uw7hsU7B7Ps
OpeG5RHpVaAMeq.4nE0v2JY243W2BgVeaNVJrEAI6zGbNqIpOYLCp01QPmwQ
+y+z+9XM4PXPHvXIN.wH5AaZ.Th68Zx4O+4MYoOIR9IMiTqRCPkiAFUHJQ.s
sB9QWDp8jqbUqIqmxuBstOHBRYFsZJtB8yI3kmI2Ncsu39xzMONEvY51GvnX
nlGGJwZTPu739cyiCYXFw7b3DVPP4Qf.WxhqK2xn0iPi82IRDOG1kGQkpma6
nqSkc6r0c90mn+3mz4hC3FMl3fRiPpSQwtgcrAQ3W4E3kGMIl3A4UscXpIyZ
P27pCAWtjW8aqsjXZmi8RK69UCLfxibd25OUsGB8UTm56BiR9hk8JImSkgkK
Ox8cs7OFIEo6qycAmKrTL+zGyReB8UUm5+HONZywsv1Q3Ihkx3CPI8FZkssw
YZKtTqsKwquWrYrXEqy.EtPnc151x34xh9z5hrQpEY95PpU62WlK1tyXItop
d9KoaD42.4IqMs2lgVmRvCqB4FJf9WNraU+cl1z1boQyYnq8WiVWLSpacZ2T
s2UW7Um2F+35z8B6c23TVqCvpeZ0M9WNP9mBWMVl.aE2eAIJYfWqFsCzs6SP
KAVLuJN9oZuZbrePAKiKmCMFTHXnNnLRiYH4EzK3+6cCLwR2PGM+4vmEaSy1
4v.Flr2ifkTGQbaBy9v6RT6.8uSmCg9Ht1Sx.oRVlWMIMUSWiutC6f7mAGfe
04ocv44NqYZGpRwyHyGcapUFbl.bBmxuVt8CB.kqhYW7nPiYxPVEKjZlMy83
2ll9gECj9qS4U+hC5dxN5MWq.GlCM6ZlWELym6eN0YrbQIeXzQqJIKfekkVL
CHkh5wACK99cwd8PiOg8xUcZBJzM.t9QNH0r1SM2PEpNdZEtPmoECcsaa0ha
rs4oa3657D+NjRcmHl7ASLY+rpzddocVJV4JwvtP2pfakWCzJ8fGYN6K2Hf.
RYwkDDf84kGw8cp0ucgehU8hVpcZZ7yZSK7X9MCOhWJf0I8e1.o6bLTQv01z
1sIfUDlhBZZGSw9Cc83Pis9DWMV3.WS8ql99KOpG0obeUUo8PqGcCMoeM0oV
SoC6FEq+ZSx1jtUql9bt+sl8XnuM9f331zikyvgppD4kyE8IcXA00b4xNXV5
GSFcObPcqNrG9u94vw2AAn.podP3bLFYjcPPoWStuC9cYBwD5gZzqJUbPlx3
t66b+fTyVRQ3n6d9D.iYRRHCH6bZnjE.CntuO9eI1L59Gux4RXU7NfRKFtt+
8+HhiS+33gPD0TYJb0d5otCRkbkCVFVz..veNH.zqQBPO8EeWZ7DXQpxdAU5
+C0TbONur01eHae7zbdfWUMcDHUJ+qURoTCPGvCUBv3XDhBXAZg.jOkg7cIc
kIVKhdt+YtqMdlp.SNQSf5Uena5blPTTrEVawmQjJzBJMDvKqTXI2b.tWz91
vEkKR1j+0lEjSfb+xdmUrnbHfalmIBCDTVP1bHcvfUtMPcQ39QjkvuOrH066
UQgtvYsedgX6g33hAmfkl.KAJ8KT6UXku5pLHbdwtddgM5SoPSZzYXN1r9GX
RQRlaAUCQULoIMx1P0cAdf749FGavUhNXT.NvoZOKpdQl7Ux5E3ipr474Qv3
+mhxJ9r2+1SoKFtNRsrhyaOumvf.fw+RISIobRyY..F1rXoaO0a.ESfNFWb4
XuTOR4DXzsWJX1z.VyRmqiWZ1u8u+1+e.Qcg0TA
-----------end_max5_patcher-----------

What can I say - I like looking at code. I was suggesting trying things on a branch, not changing it without checking. Also, it looks like the distance calculation is allocating memory because of Eigen, and that is actually where I’d look first. I see how Eigen makes this easy to write expressively, and that in theory it might even optimise the calculation with SIMD etc., but the flip side is that it likes to allocate all over the place, leaving no obvious trace in the code that this is probably happening.

I’ve confirmed that even with the changes entrymatcher is returning a lot faster than the kdtree, but I’d think some targeted changes might reduce that difference, if not eliminate it altogether.

I guess the key interesting thing for me is where the algorithms with better complexity actually get faster in practice, assuming the implementation isn’t getting in the way - for a lot of us a couple (or few) thousand items is already a lot of stuff to be sifting, so optimisations that only make sense for millions of points aren’t necessarily that interesting.

1 Like