The interface for this starts getting messy if you have queries that change a lot and want to pass on specific columns/ranges accordingly. Possible, but squirrelly. It also lacks some of entrymatcher's sexy distance-based matches.
The main issue with this approach, for me at the moment, is that the process is really slow. I imagine this will improve once optimizations take place, but I have a hunch that creating multiple copies of a fluid.dataset~, and fit-ing multiple fluid.kdtree~s, will always be “slow” (as in >20ms).
I haven’t fully made sense of this yet, as this kind of paradigm is new to me, but I’m mainly thinking of metadata-esque stuff (e.g. duration, amount-of-attacks) where I’d want to query using this, but not use it for any distance calculations. I guess this is possible now, but it would fundamentally (if I understand correctly) require a fluid.kdtree~ to be filled and fit… per query.
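To make sure I'm describing the workflow clearly: here's a minimal Python sketch of what I mean, using scipy's cKDTree as a stand-in for fluid.kdtree~ (the descriptor names, data, and duration thresholds are all made up for illustration). Metadata like duration gets used as a hard filter first, and only the surviving entries go into the tree that does distance matching:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)

# Hypothetical corpus: per-entry descriptors plus metadata.
descriptors = rng.random((1000, 4))       # e.g. loudness, centroid, flatness, pitch
durations = rng.uniform(0.05, 2.0, 1000)  # metadata in seconds, not a distance dimension

# 1) Filter on metadata first (no distance calculation involved)...
mask = (durations > 0.1) & (durations < 0.5)
subset = descriptors[mask]
subset_ids = np.flatnonzero(mask)  # map filtered rows back to original entry ids

# 2) ...then fill and fit a fresh tree on only the surviving entries.
tree = cKDTree(subset)

# 3) The nearest-neighbour query runs on the filtered subset.
target = rng.random(4)
dist, local_idx = tree.query(target, k=1)
original_id = subset_ids[local_idx]
```

The catch is exactly step 2: a different filter means a different subset, which means a new fit, every time.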
I think most of the use cases would be covered by scaling pre-fluid.dataset~-ing, but there are some cases (one outlined above) where that wouldn’t work. I guess I could dump/iterate, scale each buffer~, then dump/iterate back in, then query/filter, then fluid.kdtree~… again, per query, which probably gets “too slow” even by non-Rod standards.
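In sketch form, the per-query cycle I'm imagining looks something like this (again a Python stand-in with scipy's cKDTree and invented data; min-max scaling here is just one possible choice of scaler):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(2)
# Hypothetical raw descriptors on wildly different ranges
# (e.g. flatness 0..1, centroid in Hz, duration-ish values).
raw = rng.random((200, 3)) * np.array([1.0, 5000.0, 60.0])

# The cycle: dump, scale everything to a common range, re-fit, query.
lo, hi = raw.min(axis=0), raw.max(axis=0)
scaled = (raw - lo) / (hi - lo)  # min-max scale each column to 0..1
tree = cKDTree(scaled)           # the per-query re-fit that worries me

# Query with a target scaled the same way as the corpus.
dist, idx = tree.query((raw[0] - lo) / (hi - lo), k=1)
```

With a couple hundred entries this is instant, but it's the "rebuild everything per query" shape of it that seems like it won't scale.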
Actually, as a more specific sub-question: I take it this is a technical property of a kd-tree, that once it is fit, it can’t be altered or filtered or anything, without recomputing a new fit?
If so, that kind of solves 90% of my problems, in a “you have to keep using entrymatcher” kind of way. My understanding of kd-tree stuff is that the fit is “slow” so that your querying can be fast. A benefit that is lost if you have to fit-per-query.
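That trade-off is easy to see in a generic kd-tree implementation (scipy's cKDTree here, as a stand-in for fluid.kdtree~; the data is random for illustration). The build is a one-off cost, queries against the fitted tree are cheap, and the structure is static, so adding even one entry means building a whole new tree:

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(1)
points = rng.random((5000, 3))

tree = cKDTree(points)  # the "fit": one-off build of the whole structure

# Many cheap nearest-neighbour lookups against the same fitted tree.
queries = rng.random((100, 3))
dists, idxs = tree.query(queries, k=1)

# The tree itself is immutable: to add one entry you re-fit from scratch.
new_point = rng.random(3)
tree = cKDTree(np.vstack([points, new_point]))
```

So if every query implies a fresh fit, you're paying the expensive half of the bargain every time and never collecting the cheap half.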