Easier way to concatenate datasets

In my larger analysis patch I have 4 datasets which I then stitch together at the end of the process. At the moment this requires dumping through various temp datasets, which is clunky as it is, but this post is more about having to specify a range in order to combine things. Because the data goes through a bunch of steps, I need to keep track not only of those steps but also of how large the datasets are at each stage, so that I can change the addrange message that has to be prepended. This is possible programmatically, but it gets a bit messy in terms of message formation and legibility.

I’ll add to that by saying that I couldn’t figure out how to concatenate datasets from the help/reference files. It was only when I looked at Example 11 that I figured out you have to specify a range to join. A handy feature, no doubt, but it should perhaps be an optional argument, and if none is given it concatenates the whole dataset.

So the first part of the request is to be able to specify more than two datasets to join (I’m sure I’ve mentioned this before, perhaps in person). So something like:
transformjoin datasetA datasetB datasetC datasetZ datasetCombined

and fluid.datasetquery~ would just parse it so that the final named dataset is the destination.

The second part is just to be able to send a message like that, sans addrange or anything else. If the number of rows varies between the datasets, it can throw an error like all the buffer-based objects do when you try to use the wrong weights etc…
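To make the request concrete, here is a rough Python sketch of the behaviour I mean. It is not FluCoMa code: a dataset is modelled as a plain dict of identifier to row, and concat_columns is a made-up name. It takes any number of datasets, uses every column of each by default, and raises an error if the rows don't match.

```python
from typing import Dict, List

# Assumption: a dataset is modelled as identifier -> one row of values.
Dataset = Dict[str, List[float]]


def concat_columns(*datasets: Dataset) -> Dataset:
    """Join any number of datasets side by side, taking every column of each."""
    if not datasets:
        return {}
    ids = list(datasets[0])
    for n, ds in enumerate(datasets[1:], start=1):
        # Refuse to join if the identifiers (rows) differ between datasets.
        if set(ds) != set(ids):
            raise ValueError(f"dataset {n} does not have the same rows as dataset 0")
    # Columns are appended in the order the datasets were given.
    return {i: [v for ds in datasets for v in ds[i]] for i in ids}


loudness = {"a": [0.1, 0.2], "b": [0.3, 0.4]}
timbre = {"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0]}
pitch = {"a": [440.0], "b": [220.0]}

combined = concat_columns(loudness, timbre, pitch)
```

No ranges appear anywhere in the call, which is the whole point: the range would only be needed when you want less than the full dataset.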

transformjoin is low-level, yes. Ideal opportunity for abstractions that wrap higher-level functionality. Here’s the start of one for merging a list of datasets. Obvious next additions would be error-checking, and being able to set the output dataset name by attribute.


----------begin_max5_patcher----------
3245.3oc6c01aihjD9yI+JP91OlC0u+xnSmz8869ErZUD1PRXVL3CvyjYVs2
u8q5FrCXC1jX.6YmVilD4tASUOcUUW8SWM4Ot+tEKydMpXg2m79Uu6t6Ot+t
6rMYZ3t5Oe2h0AutJInvdYKVksdcTZ4hGp5qL50Ra64QEQkdAIIdEYayWAeo
0WQbns+rke9uiw7csltccbZRTo8KE+ViYaK20Jpt0MAkqdIN84GyiVUVIpBI
0G8fGVfr+ha+EA9o2uYtm+796M+3gKTkJ9VZ4KQkwq7xCRCyV6EFTFzoZwFG
0hHEU5CeLTqznuBR2QZUoWh2RukcpGj2gdPqaspoxusIpRIVr3AuEKCRed+u
89s9UYExm+fmxpvrJ8k7A02v3Uk9eIN5qcpZ3wYHByvFAliXVYUwLehHF4wn
bPXh7B2tdSm5BpScgzotPN0vzIGZpUUlPYTUg9hFb5QQ23sNJ+4HHrQytVkk
jkWIDHesVfjJFESkDNS+.zDZ+mPXs.2YSXif1A3gzmzP3orzxh3uaAIrwarK
LE2KlddSchTZ.RkM.FQzPJs2RTdMHVihfDEmD8kn7h3rzFO66VDrYSiluqws
Xf9OWAfpG12TbZUS38MkG8k3c2OeeqA4ffWBR81bqZs3Uwtvalulrvn7zsw1
uopFAifZQxNbmFrNpXSvppa1XUrq62fBN2ZSIs1TRo0WRT4Ds6hW9bR1peOJ
rg+H.3ahRiS2XlqIsLnrV122cXzSAaSJer0fHwuy9epVB6rSiNXE++UdbPxd
E3473vrTiPzZjvz7tGG31TG7toxXuhzfMcbyfQI.K8zYAnjaKVFjaFnVlD0v
gF7TxxRZ2096KI5ox5t2Dmld.JVlso+Nyie9kSbuKyfNWepuaaOEOtMsp2GA
ahxGKB9RaztDb4qiIz9q+0fz30AkvztUCADz9NiRC.E8khU4YP7hl5aUOeoi
dBAa7UQeMNr7E6Cpow.b4wa1YDsX+nbX7yQEksaqL34h1sTT9sJPuQSaWV6C
+XYz5MIfVz9B.ui3hxhWx9ZQ8EtyPqI.7V9XM8oaFfsU6mLYl1Qag3Ua1V5E
Vzr2lAGUM6nq3i8OYYeQ83ZADBmvHJrhx4bAhBw8fo5EHLSQnZLkKYDBLmpp
c5N2saVl5fLS.fP9j2myhS8fz77xiVGDCQWd1Ch9stvKNsLy1wYQMw3iZZtu
PalLSvUDEkPYPHEByWvzXhfBsC49pMSffmcPC+IuMYarPySw4EkV7xCxP1aU
1lu4MTTiM9nFSvs1ZJLShvHEhRLvF8l.19OQEEAOaAlOYWIgY8R+iZXxDG3e
50GTgGenBRH1GPCtFVkg.AYOQA2RtV6SkfGKAgUZgPHMIr.WoBSPDoRBnmRZ
rEmUj6eCAM8xdZOnYyv.7O6EuPiOdA4qw3JvDhfXbDAiD1r23HCdQ.6KFWqX
crnsQBeZk+7wKnKoGnf9APBRidN8pF5CrnBe7gQufk1xkBFLO.jArBC3FrzB
rzmnEZMRgwJBSp0MWhwHAcqqb75D61YR82POZlH.RloabTq6EHI8Bj3SBjmC
DkjiAQNS5SgnZv5g3BLEyDfYHfksWZ1bXxsrWSNsbrs41QlvGBxvRjujfUHs
lSYbLlCFdTtcoXyok1uf6CuXytoEix7YCZdR8riSqRhBxevKHLLGF1i7Pd8h
bJwribbg1WRoX.0fvVDvvBb.YbhOClEkhEZoVonZiWIVM2dkUrFsJKoubtTx
O.fc4SGX.MSZYv+oDhlfMoU..lORwIvrABo.L5f1pyJaNM2xidJJGLwd3j3l
Z9MzPbexgFZTHULMUqDHACBzIHRHoM4ram8Tx13P+5oN+e8gY5Ovz.zgLMvS
IYAkCyvqSPjq8QXBGREAAYb.yhhszwA4wAPKVPnPTPwLaFVBA6JdJKeMXJd1
jQT3aiYL5NtGEO+YiT.+yNwQOHlbzMFs1eOTOc0GbB2txLwrkVvkY3EQqETE
UHmWv76IdEIwqh75aVWI6pLIRWqonqTVp16ra0TVDy+5HHcwkzOLorHtNorz
Mn8CTJKh4O2XLLaKhI3vLqLAgffYSuYRYwNEQePE6pPRhRcrEFQv7UBIVJQP
lIDrsQJ+lLAOA95mfW2lbckfGAi7QZBWiXDLipfEgbsRwytiCMSy6rI7wm+0
ezk0YmSZPPWYyyyidhafUhH49ZIURPB.6XP9J5d71wTwbimwkQ8k3Lm+Avtw
2xC7V8kZFSS4vBzTBXlDnwqqg2+caT925K33MPrwtvQNBAKHFw0LklRzBFwr
jCLZR.RqN26VzzedcnIxvpqUPPvGuENTKbPaVMJ1haodGpZti9VsJML50FkC
ynfcUZWuf2DrO8ctcCZVq8CjaJrpaA3YSPZTq8eXYvpe+YXwDogGBBMwM9EB
aKyxCixaUcZDXA8bkhP4DMCSLkhl.yEJHQWFGoUHlso8ejpsjPgO.85xbk3q
HRHsYMEC2kv5ohONLHUw8UZlDtHHsFpfX1YMpBVYLrnXIGwPJrjHvsddlx35
PsChvTWgQx2JSlpRwpsJ+PqebfhjmsIKeekY4yu9i5jejF0YDpOEAeyle.e4
LAquwcDCBvigwcAjgKmBgTuwG2eqd3Rfz05oJmrXho+tMHpJt75P961YVu1x
RXTQYb59xC7WaD0zanSw8tECX9LO7.DCSFUSlTX3laPRgUbmTrXHCI6F6lFo
fMPoPNkRggq9gYddz0sNNbCrtlxZGEFidbwMwTBeSg0ggr43P5bf++CdZ0wS
k2wENU1fC0gbGzLcRwP7DTpoVJHCIpfXJsAUCcDYR8DFJVnlxXSpgFaRMkiH
B8XEU.VL2fb1u1QEDCE2ESoMngyxAYCJlRaPgXnXwTFgjqFKavAZYcsMAUhw
Rg4R8w0UvMnFSGZ1frIzNiNzzvlRgvx2zfDC5T5xwGpi+ThEhgtHAgXpkhgf
E79SPqtwcG.sElSYT3iUmnpGCJKyiWtsrZglMOQcuqS9yyIYKCRN335z0AC5
92Dt24Yir8tF069bzr.WZSGn3cbdP+XmcQAmXqDTT0AtFilhCC51zXyIZb+A
tqsRNRmw5cZB0Vzzb4zbpVMGXb6LFcpJumiY8Ga7hyqOfKnCO6t+7cXSsPPE
NTcByYJ2IM0cRScmzzK7jldhMJ0re78r4UD7UnVps9I5pWYCSyQZ3T03V0KO
DLQ0294IGaHwLwy4PEg8UWAlZQE4U3HDEu+kPxnbhZImCPN70QRulKVDAWM0
4U3DvDDVk5foFg9Ex6+LgNstQDhs38TWoScPBDVirb6S8UTZyd4pT6FIpbiH
ydkTvPicgTLjfGU9Hr8kBvMc8RvlfyOcky.EO5HvjrIo3gtKTrojwQio5vHf
aJIdfL3cjaJwhgxEoUZwSEqXuKo.MkRwP3iBOkjxiG7FT7Wc5n5gTiCXiJpS
lMniwKPt9q0wSQ0QUhr5IgYJX8VfoPo8kxV8aL.6ovnSL.OOuc1vVdrnUIdb
zotardWAZdGz40Mgbn2wXcu5Y063tA8lSiRrj0wmDMsyi7PqiouX9MsqzaVU
5EpKyxtORlC7V5sxKrS+YNaxIVtdn0xj0gGcu4fLVlbx4h0brIbbw53h0wEq
iKVGWrNtXcbw53h0wEqiKVGWr+bxE6kSzzeY3b6lf+waBpXuMXk91fg9wX2J
9ohK1vtH1PM+7UsioiojJ1yPmCexoyg5JsNGcNN5bbz43nywQmiiNGGcNN5b
bz43nywUZctRqyUZctRqyUZcWLcNq5hXiqP4GQmgJq6Lz4Pmb5bHbgiNGGcN
N5bbz43nywQmiiNGGcNN5bbz43nywUcNtpywUcNtpywUcNWHcNK6hXC17Smy
NlNthz4L8u3qvBW043nywQmiiNGGcNN5bbz43nywQmiiNGGcNtpywUcNtpyw
UcNtpy48+ZI+6wPRQc8JPh+Nd83bVRb186lyB0U.XskfCLyNiqRO8uLj7B5P
2IyO+U6n14j7WY+BZOCREPbnaQMbbbrZRS2g9BK7lqvQH9PePFqm29aQSeSK
0VZN3uCIXo8smDVgs+pwm1eG1+1EZtZ1EIrL5.PEFaDfEFeHOIwX7jjC4Iol
sgZ5vGpIB9IFpoyfvxGtvREmxtjLCBqb3BKSfOgvhuLgEgGfEmVbRokPpV.h
zHlUAAse3RMQQCItm8E7373Nbn.cxQM9IMwPWHxHFDxLFA+vH8fdVnQ4YgdG
ViiviBeV0BOBOKq.eViK7nff3gffX7XnVb1fdTiRhI3gEJPOZOqyGK3zwDwU
oDSrICKDj8eZTjtyZ2xOBHpxB8f8Nz7TNXOCOX+BOduB6eeBObOBsK9pu8F7
9+79+Or+7M3A
-----------end_max5_patcher-----------

That definitely helps, though it seems like nothing would be lost by having an unspecified range (the classic “-1”) take the whole dataset by default (e.g. just being able to do transformjoin A B C with no other arguments).

Now that there’s a native merge function for fluid.dataset~, is there any intention of adding a, um, concat function?

It seems like quite a useful thing, to be able to concatenate several datasets which may have had their own separate processing chains (standardization vs robust scaling vs PCA vs etc…).

I’m using the abstraction posted by @weefuzzy here ages ago, but being able to do it natively would be better/easier.

I haven’t been following the helper @weefuzzy mentioned but which dimension does it concatenate? Take the scenario where each dataset has the same keys, does it add columns (problematic) or does it just add unique stuff from B → A?

It adds columns.

The idea would be to have a dataset of processed files for, say, “loudness”, then another one for “timbre” etc…, then to “flatten” them into a single dataset for playing with.

I don’t understand what is missing. Merge in dataset to add items of similar numdims, or datasetquery to extend items of the same ID with more dimensions. That covers everything, no?
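For anyone skimming, the distinction between the two operations can be sketched in Python. This is only an in-memory model, not how fluid.dataset~ implements merge; merge_rows and its overwrite flag are names assumed for illustration. Merging adds rows (new identifiers with the same number of dimensions), so two datasets with identical identifiers gain nothing from it, whereas the join discussed in this thread adds columns to existing identifiers.

```python
from typing import Dict, List

# Assumption: a dataset is modelled as identifier -> one row of values.
Dataset = Dict[str, List[float]]


def merge_rows(dest: Dataset, src: Dataset, overwrite: bool = False) -> Dataset:
    """Row-wise merge: add src's items to a copy of dest; widths must match."""
    merged = {k: list(v) for k, v in dest.items()}
    for key, row in src.items():
        # Every row must have the same number of dimensions as the destination.
        if merged and len(row) != len(next(iter(merged.values()))):
            raise ValueError(f"item {key!r} has a different number of dimensions")
        # Existing identifiers are kept unless overwrite is requested.
        if key not in merged or overwrite:
            merged[key] = list(row)
    return merged


first = {"a": [1.0, 2.0], "b": [3.0, 4.0]}
second = {"b": [9.0, 9.0], "c": [5.0, 6.0]}

added = merge_rows(first, second)                     # gains "c", keeps old "b"
replaced = merge_rows(first, second, overwrite=True)  # "b" is replaced
same_ids = merge_rows(first, {"a": [7.0, 7.0]})       # nothing new is added
```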

To concatenate multiple datasets you need to do a shuffle/dump, and need to keep track of columns etc…
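The bookkeeping in question looks roughly like this in Python (a generic sketch, with column_ranges as a made-up helper name). Given the number of columns in each dataset, you need the start offset and count each one occupies in the combined result, and all of it has to be recomputed whenever a step earlier in the chain changes a size.

```python
from itertools import accumulate
from typing import List, Tuple


def column_ranges(widths: List[int]) -> List[Tuple[int, int]]:
    """Return (start, count) in the combined dataset for each source dataset."""
    # Each start is the running total of the widths that came before it.
    starts = [0] + list(accumulate(widths))[:-1]
    return list(zip(starts, widths))


# e.g. three datasets with 2, 3 and 1 columns respectively
ranges = column_ranges([2, 3, 1])
print(ranges)
```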

The abstraction here helps with that, but I was mainly asking because there’s now a native merge option (something that seems like it would be possible with datasetquery anyway).

Bumping this separately from the other thread, as having the ability to -1 ranges here, or having a way to combine multiple fluid.dataset~s of arbitrary length in a more general way (à la merge, which seems like quite a niche use case), would be super useful now that you no longer need to be so explicit with decisions further up the analysis chain.

Also, these threads/bumps essentially come from the friction I encountered patching stuff where I got things working super nicely, hand-picking descriptors and stats, and then had to slam on the brakes and switch back to “oldschool FluCoMa” coding to combine things and carry on with the process. So the elegance of the “newschool” stuff seems to drop off at a certain point of the processing chain.