In playing around with some of the IQR stuff (as well as encountering some bugs) it occurred to me that it would (sometimes) be super useful to be able to see the stats for columns inside of a dataset. This can be useful for seeing the state of the contents with regards to the need for normalization/standardization, as well as potentially “seeing” outliers. There are also knock-on effects, like knowing whether some data is malformed or non-changing.
This may also be useful when trying to place multiple corpora onto a single space.
At the moment I guess you can manually poke out each entry, store that somewhere, and transpose it so you can feed each “column” into fluid.bufstats~, but that seems a bit of a pita.
Oh, a perhaps more elegant solution would be to have a stats message for fluid.dataset~ which would dump out a dict (or multichannel buffer) with the info.
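
To make the idea a bit more concrete, here is a rough Python sketch of what such a per-column summary could contain. This is not FluCoMa code and not a proposal for the actual message format: the function name column_stats, the dict layout, and the "constant" flag are all made up for illustration, and the dataset is assumed to be a plain mapping of entry id to a list of numbers.

```python
import statistics


def column_stats(dataset):
    """Per-column summary of a dataset given as {entry_id: [values, ...]}.

    Hypothetical helper for illustration only; it does not mirror any
    existing FluCoMa object or message.
    """
    rows = list(dataset.values())
    if not rows:
        return {}

    # Ragged entries are one kind of "malformed" data, so flag them early.
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all entries must have the same number of columns")

    stats = {}
    # zip(*rows) transposes the rows so that each tuple is one column.
    for index, column in enumerate(zip(*rows)):
        if len(column) >= 2:
            q1, median, q3 = statistics.quantiles(column, n=4, method="inclusive")
        else:
            # Quartiles are not meaningful for a single entry.
            q1 = median = q3 = column[0]
        stats[index] = {
            "mean": statistics.fmean(column),
            "std": statistics.pstdev(column),  # population standard deviation
            "min": min(column),
            "max": max(column),
            "median": median,
            "iqr": q3 - q1,
            "constant": min(column) == max(column),  # non-changing column
        }
    return stats


example = {
    "a": [1.0, 10.0],
    "b": [2.0, 10.0],
    "c": [3.0, 10.0],
    "d": [4.0, 10.0],
    "e": [5.0, 10.0],
}
for index, summary in column_stats(example).items():
    print(index, summary)
```

Something along these lines would show at a glance which columns have wildly different ranges (so need normalization/standardization), and which ones never change at all.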