Can you use fluid.datasetquery~ to filter other datasets?

One of the new things I’d like to add to SP-Tools is the ability to filter datasets as part of the corpus curation process. This is what fluid.datasetquery~ is meant to be used for. My misgivings about its interface aside, there are times when I want to filter by criteria that I’m not including in a queryable data space, for example duration, time centroid, number of attacks, etc. Up to this point I have been keeping these in a separate/parallel coll and pulling up the (meta)data when needed, but now I’d like to filter a dataset based on some of these values.

Towards that end I’ve moved the contents of my coll into a dict and then into a fluid.dataset~ so I can use it as the criteria to filter by. This works OK, except that the dataset I’m running the query on is not the one that I actually want to filter.

I initially thought I would run the query, dump the resulting dataset to a dict to find out which entries had been removed, and then remove those rows from the other datasets individually. But obviously what is left in the dataset is what was not removed, rather than a list of what was.

I could do an ugly thing where I concatenate the column that I want to query on into the same dataset, then just not copy that column over when I do the query itself, but that gets rough since I may want to query with varied criteria. More importantly, I want to do this to a whole bunch of datasets. In SP-Tools, when I create a “corpus analysis file” it has close to 30 datasets in it (for different time scales, descriptor types, and pre-scaled/normalized versions of things), so having to manually concatenate, process, trim, and dump all of these each time I make a query would be brutal.

Is there a way to process a query with fluid.datasetquery~ but somehow get the results of that process in a way that I can then use to manually remove individual rows from a whole load of datasets?

Or is there a way to filter one dataset with the contents of another one being used as a filter?
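To make the question concrete, here is a rough sketch in Python of the behaviour I’m after. It is not FluCoMa code: the datasets are plain dicts keyed by entry identifier, and the names (surviving_ids, filter_by_ids, the "hit-N" identifiers and all the values) are made up for illustration. The point is that the query runs once on the metadata, and the resulting set of identifiers is then reused on every other dataset.

```python
# Hypothetical stand-in for datasets: plain dicts mapping identifier -> row.
# None of this is the FluCoMa API; it only illustrates the desired behaviour.

def surviving_ids(metadata, column, predicate):
    """Return the identifiers whose value in `column` passes `predicate`."""
    return {ident for ident, row in metadata.items() if predicate(row[column])}


def filter_by_ids(dataset, keep):
    """Return a copy of `dataset` containing only the identifiers in `keep`."""
    return {ident: row for ident, row in dataset.items() if ident in keep}


# Metadata that is NOT part of the descriptor space: [duration_ms, attacks]
DURATION, ATTACKS = 0, 1
metadata = {
    "hit-0": [120.0, 1.0],
    "hit-1": [850.0, 3.0],
    "hit-2": [430.0, 2.0],
}

# A couple of the many descriptor datasets that share the same identifiers.
corpus = {
    "mfcc": {"hit-0": [0.1, 0.2], "hit-1": [0.3, 0.4], "hit-2": [0.5, 0.6]},
    "loudness": {"hit-0": [-20.0], "hit-1": [-12.0], "hit-2": [-18.0]},
}

# Run the query once on the metadata...
keep = surviving_ids(metadata, DURATION, lambda ms: ms > 200.0)

# ...then apply the same set of identifiers to every dataset in the corpus.
filtered = {name: filter_by_ids(ds, keep) for name, ds in corpus.items()}
```

With the made-up values above, only the entries longer than 200 ms remain in each filtered dataset, and the metadata columns never have to be concatenated into the descriptor datasets.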

This is what fluid.datasetquery~’s transformjoin message is for, if I correctly understand what you want.


Hmm, it looks like it might.

Definitely a bit of a confusing message name and interface (and help/example), as I looked through the tabs and reference file for a while and couldn’t figure out what I was looking for.


Yeah, I agree it’s very much in that category of names that’s a bit jargony: join is the name of the SQL manoeuvre that would do the equivalent thing. I think we should relabel the help file tab to something better.
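For anyone who hasn’t met the SQL term, here is a small sketch of what a join does, using Python’s built-in sqlite3 module. This is only an analogy for the idea behind the message name, not a description of how fluid.datasetquery~ is implemented; the table names, column names, and values are all invented for the example.

```python
import sqlite3

# In-memory database with two tables that share an "id" column,
# loosely analogous to two datasets sharing entry identifiers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE descriptors (id TEXT PRIMARY KEY, centroid REAL, loudness REAL)")
conn.execute("CREATE TABLE metadata (id TEXT PRIMARY KEY, duration REAL)")

conn.executemany(
    "INSERT INTO descriptors VALUES (?, ?, ?)",
    [("hit-0", 900.0, -20.0), ("hit-1", 2100.0, -12.0), ("hit-2", 1500.0, -18.0)],
)
conn.executemany(
    "INSERT INTO metadata VALUES (?, ?)",
    [("hit-0", 120.0), ("hit-1", 850.0), ("hit-2", 430.0)],
)

# The condition is tested on the metadata table, but the columns that come
# back are from the descriptors table; the join matches rows by identifier.
rows = conn.execute(
    """
    SELECT d.id, d.centroid, d.loudness
    FROM descriptors AS d
    JOIN metadata AS m ON m.id = d.id
    WHERE m.duration > ?
    ORDER BY d.id
    """,
    (200.0,),
).fetchall()

conn.close()
```

Here rows ends up holding the descriptor values for the two entries whose duration exceeds 200, even though duration is never a column in the descriptors table.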