Ways to test the validity/usefulness/salience of your data

That’s super helpful!

Also a lot smarter that way. I was initially thinking of doing fittransform[> 0.95][counter]numdimensions and just looping that until it was true.
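For reference, the "keep adding dimensions until 95% is covered" idea can be done from a single decomposition rather than by looping over fits. This is only a rough sketch in Python/NumPy, not FluCoMa code: the helper name `dims_for_variance` and the random test data are made up for illustration, and it assumes the per-component explained variance is taken from the squared singular values of the centred data.

```python
import numpy as np

def dims_for_variance(data, target=0.95):
    """Smallest number of principal components whose cumulative
    explained-variance ratio reaches `target` (hypothetical helper)."""
    centred = data - data.mean(axis=0)
    # Squared singular values of the centred data are proportional to
    # the variance captured by each principal component.
    singular_values = np.linalg.svd(centred, compute_uv=False)
    ratios = singular_values**2 / np.sum(singular_values**2)
    cumulative = np.cumsum(ratios)
    # Index of the first cumulative ratio >= target, converted to a count
    # and clamped in case rounding keeps the last value just under target.
    count = int(np.searchsorted(cumulative, target)) + 1
    return min(count, len(cumulative))

rng = np.random.default_rng(0)
# 200 points, 10 columns, but only 3 underlying directions plus a little noise.
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 10))
data = latent @ mixing + 0.01 * rng.normal(size=(200, 10))
print(dims_for_variance(data, 0.95))
```

Because the count falls out of one decomposition, changing the number of input dimensions upstream doesn't require retuning anything by hand.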

Not sure what the name for it would be, but at the moment I have independent processing chains for each macro-descriptor (e.g. “loudness”, “timbre”, etc…), so each has its own standardization/reduction, and each has a different (and arbitrary, while I’m tweaking parameters) amount of initial dimensions, so didn’t want to have to dive into massaging the numbers manually for each change I make upstream (i.e. type of descriptor, amount of stats/derivs, etc…).

Here’s a chain-able/abstracted version that plays nice with the nameless dataset paradigm.


----------begin_max5_patcher----------
3324.3oc6c01aipiE9ys+JrhFoc5nbyhs4Ey8CqlUqzt+A1uMZUEIwIk6P.V
vzo8N5N+1W+Bj.DHXH.M8tSklop17xwO93y43mis462e2h0QuPSW.9UvW.2c
22u+t6jEIJ3t7+9tEG7dYSfWp7xVDlcXMMYwRUU7+xOLfxj0AyKL1is4I+v8
OlP2vTOaWrwJik.HDZJ9sk7uPnUFf+yomTTFq3QgxKUUD60Xp54rXwRvh0dg
6Wb7F82Jkqn0+1ufftKNJCIdGnLZxizPu0Axa2PT0eb+8h+aotMW523O4hmJ
i9hrAsHF7rWhuW3F5+vKXSVfGyOJrufBAYKAECGIn.Q3qAUZAQrVTV.38bp1
Zdi8tE67CnOSSREx+Iw8tEdwwkJ9tR2h.g9sH4Chr7XQ9gphvGKJg9re8GK+
4lvgAFGCxRjR+hWrMWb5wDsklDl4KuGUg79pbQR1qDx6VSi81ntYQmWQ0m.V
KWEtlCqPIt535dBX4c66Ch17UpDpLJJLJlF5GFmPSogLUeZ4p2R24kEvdbWT
HK0+2kh.j2c0T86xEwFqTzHjx+emqDEbrErOweaTnPHpzUHJt308E.zR1prJ
2XjWQnWbC2LWKfiKsTYJuQlkt1KQzSkOJAUTIKJJnZUGuu.5NVd0w9gg0PQV
Tb6Ul3u+oKbuqi3Ud3ROaYMoOlEpp8QtRA6wTumqh1Luff7wtUe7u3E5eviQ
Y9pt.jwwJUVJdJcSRTPPk1qplmanlsbk7Mzu4uk8j7EUVYfe49wEJQKN1Ku0
eOMkUsLl29zpkjxdUA5kJJac9f3GYzCwbiN0t.9vC+TV5SQeKM+BKTzJC.mr
2WdPcYCgUJulAQksmiiXawbWal7vX4XRGnbDpxof3+OpJetQOiR0TxxlYYQX
SzgC7QrUPC40GtkJaTnhxjl+yMqLPDnhKgZtERn7gXaA7VvVt9Unv7W5PwJt
s6UVKA14tNMOymYCfU4G44dIpbikvRxh4AdXf.PvPgCULCPA.HMpKvlNPCzE
QikW.QrFSDoP6rIH4SfTVh+FVvqfzXp2W4s4kfuQA7AwYAaAa8e1eKEvdhWx
+MyKgtEjxuDd.GI7HPBxnof0uB9nJnLPbjO2wD3W.vG.7A+zkf0YLvd+mog7
GgGS7f4toO3Ev8lHdwdBiw.uu485RfO6ujxqkABozsbqtkj1.+P5lnL0.LyA
1+YBcVYYYa6x+g2KZCcs4i8cUw7v0wcMwXC9ODGGLwvjGjHTa6BM4vsVOJzv
XQs63nKbtl0L0c+uS7BS2w6B.uFkk.R4+2FJXqGyCvh38Q9of79xncf3D+vM
9wdA.9ij69l+PSasaAOvtEGnsxJCYkiMh3PLx+AJFdIi0v1YPcDvl6HbIs1O
Pls9gMYGjAs+LEPeg6IkijaOFR+hqCJMIvygRnsziGZP95bMmKyQ7PVnTPb.
2LvfQA9XYgoY2lPg7I9MLT.+dvMUtWaK9jN3+hXOotobgykhQwXCtMJ9ridJ
h6ahavZGO.KvG41ts.4SuQ524gVsRgFZbiHWoUJK0T5TpQX7fTiLFSLaWPD+
spkxBuX+C7qsxzDZSKxVp3XYQZltjgnEUg1jyBArhU4FYOYYEHnno.WU09OO
5BYavdlh3VoCpBzusdAjdCYcQhe4LgwYauntjbhPlp.PqDyzF4L0t8lHooEh
ZZmrltIroMRapPbiFj2Ti.GRNyMRellVNxeIXIqLx1LGNZviiNb4zIeNZxoy
E30QOtc5femN43oSdd5fqmt46oSNezf2Gc39oO7+bANf5jGnKyEzk4C5xbBc
QdgZianl4GpENhzhmHM4JpY9hpamotM7ypuyfLpZPOgFGkv.vT9+peQs4ZsU
C6J2mXdr2PjUqged4XGNahjmh4ph6spt3lZXZCWwSzNGIDBgWMZ.j6s.9jqF
wiKfOTLULaZX5HAVHWqU1iFZQtEPqCdIeEDvsG.3d5OjB9akBy+iPN7k.LRe
n9s2Vj8WC3B4yXJGbQMEie+.W64.bkMyKpaYnQy2VRDloQ67he4vP0HTzyrj
UWnagF8VnReBgTdThod6ospu9gK6Z.0MdKHWrjmA0zKqGz+ni3NyhB4YSWpJ
386Af8IQYwiDDl65P8WZffnNQvkcghN2.nHCrtBIQCxnWNngUIyPWcvNQP4L
86BEQ3aBTL.rdbPQW7Jhivwgq8XpIdFqImoMRtAvwc.AIbW6HZDhTjpQARpH
aaLLJtKHRvJzk7EYbCfhOSeINgGyH6ieXGjGDzG1gd.74zMdAdIBRN.ijaGH
oHBRs0UuR+N2Ftc3SL85gPakwRGmUXoRpwr42w7cSzjJ0rYMXRh4fhlDNCP5
YKYkqvKiohhNsA0KMyOqQGwpxZnX9YcvcircJttKCmp7TmqFjGH6Rv4s8szT
le3QBW+xoFp3hGTW8PkPtCE8kPG90BmYAzoGxGOVs4GAkuTnlRn4af.Z1mdX
m2JD7lVBc5iVXwf94UBI8PKrXP+7Kg8wRybKfjdMPg7VXrtGBnrO9xFqKUY4
jPtPjknsOpxV1idLVh+5LlxEYSYfcPYpYePzZufZoTosj4b+4MfoNc4daxW6
SzqbEtXpRTNYBSWN7moKWqzkWY+Nfbk8Hll1+Lc4+Lc4+Lc4uuSWdGjpvszj
bsSr0Qsh5MUI2zZVHjBOKIQuqzHkRYb0sqkPJngaI.DSlG.z8OSYRRECgRQb
FSjD17FIQR6FowvDqdoBpchj5l+dL9VImbiP9jT6zHC4urwZBlXsyJmNYUBc
KjoX4bE5xBIVWsSjhOZ0p4bdxpDx58Ck8FyNk8Nt+YmxdENpxi4nPYuw6WJ6
w8gEHxaBQZt8gR72BxHw8gPWj0akDB6iDN248n.VzRBwuE4U.Y+tPBQ5Jgtu
E5gtCXjRsK9f+V0NZNedVPhZARICff.O9GuEfutCwvv2BvuO7oi6N2h+eNe5
mSivoPj2lcHdv66LrbE+A4Sw.gcfHnawOhMF+jQqtyrc3F7OEaMOwYWfbXL3
ah8FJ3PThXWj5Ep1B7bkBewwZf2AwJIWrP8OtUS8SOs4rG8ii.rqbtxVl4Dd
I+EYT2o613YZmtOw6kYrsw4JnPGnbQBNY6sY6YauM+unLkVZRjXSivMOITCY
QLufSJimNl.V+Jf5s4olNdF9DXBzSyOpbfSmdpwMvIxPQmP8i2jOtymsTTSH
PXqcIuifk3sgA1kDc.r0eC6gQ+LwH21PiFlmjyDCKxaukhcAY9aWEuw6GW6A
Vj53rydT1z5seB8LplGNjEv7SC72VIcJ86buvPt8wsLrWgpetWXpXQDBcqgF
o7mcQVofiEJUn6fbZF5vN8Yu9yEQNHvQK06SwMZ4ftmV0R0Zx1acpbMY+I.+
+e3Z8mgTqBaHdxhwBabapcl2zuszGQlu6zGE5h+Uw9Cn0sGvPNAJLyOPPy4K
eJTLQvY5PzQtb+OLT.AaSj6pr7yGLS7j5KAYLSfhHbkUYgwda9Zd7M+5UZKC
QJNXcmNUlYBbR3hDEbUSnNGSLkiercmTklwM9iKPw.O.XvGPK0GYt3xqpwUv
fEpUhHbT6PI7DnX41ZPu3azfdGJxhTKMOBRCrb3Zj3YBOqmuyVy0Y64jyFo8
4+5vzsfF857gENhvy0cB4laCaDNgbgvqFBNcBSeVhIOgJMwRcIr47ki9YoW4
x6ojx8+IaUq.X8Tm6qjYA0Txf0X0eTkhy24RsHEVSnPXoqLLkHA4lPHzUckz
t5Jbx5ifZHXtSH53B0THvNSnTfzcfKZ1so3n6HI7DhOXM6jlxwQt2.3.RWSq
HX0UFw3JE5hDFSoWFcQhoTsT2tC6oTwzF1GkhIUJzYLJx3VvPJAN2d6H5FM.
wdBG65B6iTLYwjnqT31tJadgEKnitVLGeeHeVSt7h2n9B2PIbiyGnIA8iA9a
n.TieUlPc+UYp3SUk03+QYxPQo13zRUTXjcvK9GfOW4COB.C9b.0KILwiQAF
qPfOK1uPRcjT9DwAe9f37QLUrIXrj2aH0e+Sq4ZaofiSftwTp16uwUHC0h2x
Qwmq6HBmv7CF9dBmBU7L+d+M6pfuUaSEwN1cnd.as8zbaw7heAyxqSH6EeM1
J8glY.nPUhGOoUsmpH5pmCdbrUqOAN5fIl1PBARbrbcgDMFI0WnZ7GEkDsNK
kIxnC8GC9q4FxYEwvwwvwjPrv1DrkrzQ2LBzwcL6w24yUpJ95jb7.VrulMcI
pUm4H2WSbFw9ZUhFFh5cQizzr4DLb88oDxnqTK95xjRY8VgNerrqgzRGwXza
rVVkaqxqnJOipld8fixAfFhcsBgHsFDeI5PNCl626xnq2kQko5UKB4g+xq8X
aYt7t3QngBqtoLZETsFi2kq470tb0ACsFilkkNpkPBYLZUUmjTquKmQ4c4ny
6x47gapQ50NTNDulZGBG0N7MN+P2n8Cai5GxFxY4z1gpw8+w8+OvN1TCN
-----------end_max5_patcher-----------

Have to unfortunately zl slice 2 at the end of it, but that’s a different issue.

Be aware that by not naming the fluid.pca~ instances and instead fitting twice (as far as I can work out), you’re potentially doing much more computation than you need to: fitting is much more costly than transforming for PCA.

I think I’d take some more convincing that needing to slice here is as bad as all that. I’m certainly not on board with the idea of only reporting the explained variance for fit and not fittransform.

In this case this is only happening on the corpus-creation side, so cpu usage isn’t that big of a concern. Keeping them unnamed just means I don’t have to #0/--- if I want to abstract it later, so the cpu/time is definitely worth the tradeoff.

At the moment fit doesn’t report it at all, it instead sends a bang, ala oldschool interface. So there’s not really parity between fit and fittransform already.

The issue is more that fluid.pca~ uniquely and specifically breaks the ability to just chain processors.

Screenshot 2022-05-15 at 2.26.50 pm

Obviously this is an absurd processing chain, but there would be no indication that this one object/process needs to be treated differently.

Also, it would keep with the paradigm that the ‘left outlet is for chaining’ and the ‘right outlet is for other stuff’. Amount of variance seems to me to be squarely in the “other stuff” category.

At the moment fit doesn’t report it at all, it instead sends a bang, ala oldschool interface. So there’s not really parity between fit and fittransform already.

So it doesn’t. Which turns out to be because fit doesn’t ‘know’ (or care) about the number of dimensions. Fitting PCA is always an N->N mapping. Which is, IMO, all the more reason not to remove it from [fit]transform or, more generally, sacrifice the idea that messages might return some value as well as filling some output container.
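To make the fit/transform split concrete, here is a minimal NumPy sketch (not the fluid.pca~ implementation; `pca_fit` and `pca_transform` are hypothetical names, and it assumes at least as many rows as columns). The fit keeps the full N x N basis and knows nothing about how many dimensions will be kept; the number of dimensions, and therefore the explained-variance figure, only appears at transform time.

```python
import numpy as np

def pca_fit(data):
    """Fit once: return the mean, the full basis, and per-component variance."""
    mean = data.mean(axis=0)
    _, singular_values, components = np.linalg.svd(data - mean, full_matrices=False)
    variances = singular_values**2 / (len(data) - 1)
    return mean, components, variances

def pca_transform(data, model, num_dimensions):
    """Project onto the first `num_dimensions` components and also report
    the fraction of total variance those components explain."""
    mean, components, variances = model
    projected = (data - mean) @ components[:num_dimensions].T
    explained = float(variances[:num_dimensions].sum() / variances.sum())
    return projected, explained

rng = np.random.default_rng(1)
data = rng.normal(size=(100, 6)) * np.array([5.0, 3.0, 1.0, 0.5, 0.2, 0.1])
model = pca_fit(data)  # the costly step, done once
for k in (1, 2, 6):
    projected, explained = pca_transform(data, model, k)  # cheap, repeatable
    print(k, projected.shape, round(explained, 3))
```

In this sketch the transform returns both the projected data and a value, which is the "return some value as well as filling some output container" pattern described above.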

It would also bring it into parity with fluid.mlpregressor~, which uses the fit message as a way to query how well it fit.

More than anything, breaking the nameless chaining with fittransform is a bummer since there’s no way to know that this object has a different/specific implementation when processing things as you can literally just chain any of the other objects.

I was gonna post this as a separate thread thinking there may be a bug, but it’s possible there’s something in your subpatch that doesn’t like different preprocessing?

Basically if I go:
loudness -> *standardize* -> PCA -> normalize
Everything is happy pappy.

However, if I instead go:
loudness -> *robustscale* -> PCA -> normalize
I get some fucked looking data (basically it tells me 1d is enough to get 100% coverage, and that dimension consists of almost exclusively 0).


----------begin_max5_patcher----------
6882.3oc6cs0iihjr94t+Ujxpk1taUiGxK.IyCyoOZk187CXeazQsv1oqhow
fW.WUWync9sexKfKvlKIPBlpO0HMU0kACQ9EQFQjQjYD+46e2pMwemktB7Kf
eC7t28mu+cuS9QhO3c4+86Vcv+6aC8Sk21pswGNvhxVcm5ZYrumI+b7l0f3n
vm4+fA1Evumzf3n0qA9Q6.GhSy3W5OXIwrz+qhuaXPDaa7oH4C.m+gGSXo7m
ueF+a+0ZuinSGBhBYYR5Al+g6iixh7Ovjzx+CK7QVVvV+h2zQ+rsODDc+WSX
ayTiVpEZs0c.GWn3WPJU9WNqs.+uu7dhOkU7hrJ8h16ukU9kGrS9Zi276+Dw
dUoaLM3Oj2HxZsk3S+Ou+8heb2XwZ+0fiIAubEy.ItVRPv0QBITGwOQiGQHy
.hflFDA6ZuFa45ZYSrPXWBxgZd7AOG3Ce1Y1CAofTF6PJHJN4fe3Omv7Sii7
2DxpcJIpyojDih11zqmQRFO9hlA7Exw2sgL+DPvdPDisisqUH0L.FQAUXhqD
wrrMDhAmCDSXYvO74+fA7AaNEs8AP7dPB2ZQ7Av9DNLjx+E+e+6BqJy.XhfN
qsuSNLESqMDVhcMHVFwdh+LuBJ+iPPZXvVF.Uq1OTKy37jJwbnXwurkiYDpw
wbwSR8QYOejodLqVcG++O+kJM7QVqL23be3ofcqU5t3.4e0pt9lGstH43zCa
9QKbHi18gw7WbeGLtDhjmwGSTgEJWBkZi4bRtLrs0XFZa7itu9gG5koOhInY
rjuxTlOpLgfyejDoybCFExwSzvGO8CeNctgkz2guiq0LL7IsN7M2z7ifG8SB
7i1x969gaOEJ86XnZ1HRHA531AlfacxdiS3sqXPgy4Ti17gKWZHHj8HKQrTn
R7v2sx+3wRe76J8UDXzuGKePz6N+QAQpOBe9iRXOFb4ik+bS3.QFGENknr78
cmB1l3wDuikDcJP9cTeHmakSRR9hvhY5wbyYR1WwkKKsgK4fFDAk+oqm2KfK
myeeX71uw1UZpAGbOxhBhJ6NYkKuis2+TX1WKaoDJVkVMWuvjasW7rc++atb
T34gv8IA6hiDDQEdg3iKdc+F.JMNH94KCF4cD4erluLWJfCLMbwT9f7T5F+D
AqJehBp3hYwwgUuz4uWHaeV9kOFDEcAJlEer4KlDb+Cs7c2Dyu3g1d1xqj90
SQpq9UtTQ1WS8erJZm4GFlO8s5i+69QAb8ebmsNj6oy4KpTV7P51j3vvJiW0
UdrlqriKkuk8TvtrGjunxBC7aO3XgPzpyb4cA2yRyp9YY92mV8SRydVA5k9n
SaxmE+0L1gib8NWbC74GAoYoOD+TZ9MVHnUF.dIbJkmUWVWXkOuMchU0KtKX
aV4KTmpvlTGBwRCjHpUtEB6pZCqQiHozUtRqnfV3Huexym0OVVG4k5Ig1kIb
gzztupl47U+rrjfMmxTv16tBd3RGbShUjR6xF7k2vAt11J1ox0789K+WJSWC
gsIYDMxbrZm43XIsUAUJRwVE+rElCrMlSyrAmxjXwJyJKiK+BQ6XxwLxf.D2
rRp+8rZEr+.dnh0HO4Z1bsHqccPTWpU9+A4h2dZHjOPbjrxfPi581HD.0CBH
VZK7X0vnBsHkNnHLeANNPLzE4hgdblMwihb7rHVPBkPWOToGhxVOgJiDLBit
zGQiIv3tbjWv1tFRdgzKwErAQfVLRxctj6E1N.eDbNIDoCdtEW.giUNJTBRr
lLAD5p4Adx.gfvgBGpEVBQnb+90P+JpUznMWFrMIhTMHjUgjOCR4dfrUjepz
iL+uwGy2AdhA394cJbGXWviA6XfrG3ex+9jeBaGHkeK7kklvWmZ3IVJXyyfO
pV3N3Xb.esKfeB.+Df6eH6N.20Fv8AOxh3OB+LwC9b.rDuXeg+5.+m7e9NPP
1eSDZ9rpAP953dRFH+i.4S8scb73+GmK5.8b3y8yiCFWF2ifwBimTWWtJWB2
OQn15EpaMYWZewxZ0EeiyqxKOdny.69ek3GkJBRC343SIfT9O1x.67y7AYwp
bjjyKi2KSoz1fi9g.9ijuBO9CMsQ1BdfrEWUfmcvzZbhQEqRQ3KF.i.VOivi
1HefNa7gsmNHCsyiL.667EawQxcmC7ypwAkDJ7ZnD53IEzGjsNOxboNhupVF
CbLjqFXvn.Vs9Nu5Pg7nCNLT.+ZvLUtUaa9xorkoocJMS4AmKAih4FbcTILt
wItsItBq8bGr.ejq61FjGALocmO0nVJzP8aTsdJGahJ9XJGGwCRLxxjXVkLF
z9vh+wAG32akHI0jTjiTvw1FUeL0GhTTkXqekKfUzJ2Q3MDPPwPAttp9+J4e
XF73VICpbzenqiOeJKMe47Sme1NqtjRdIn8WF3oFBdeSAv+hudcAxugf42b.
86Nn9MEX+JA2Wi.7eQP9opo4HoMShsL8IDWRUVR8g4WiP8qS396Lj+ZF1+VB
8udg+uiT.zYZ.5LU.cjNftSIPmoEPiTCnS5A5SJBZIMAclpf1SWP6oLn8zFz
ZpCZJ8A0mBgFRifVoRPyzITeJEtTOyk5vu55c5jQUE5IriwIY.XJ++u7lZxz
ZiJ1U6oKL22aHxtQ2Oa22gqVH4K9bUMP9ULwM0vzVtfGaaloPHDdsw.Huk.9
jKFw8KfOULUrZZXpg.Kjm8ZGigVzk.ZcvO4afPt9..2R+gTvuVxM+OB4vWBv
J8SW90axy9w.tPtSX4fKpNe76G35LGf6kYeq0Lv0rC+x.g0Vbwa2MTMbE8JM
YWRzMDF8Fx7xDBoWmQlKxYW6lFPci21vJVFpOccFGwcmEAxqVtzUaXy6ShOc
zPPXtoC0eoABh5DAuqKTzcAfhYfMUBRzfT5kCZXYxLHHMkA6DAkqzuKTDgWD
nXHXiYPQO7ZpqvvgZu2YJIwqhZxURizE.NtGHBB2XmQiPzhTMJPRUv1LgRw8
gwhnB0lsHqE.J9H66GS39Ll8wOrGxcB5C6QeB7kzs9g9Ihfb.LjYGHsvCRsk
UGocmkgYG9BSGOD5RjpIccWicsaJ1nShcGxqFuIUhYypyjTxf7lDNCP5UaYk
QXkIeW6nMn11J+rMNhUMpgh0m0QrajiSw80NbpxSctXPA2Fb8XeGKMKH5b.W
+syVHk27fX0CkBKdo5Qg2JBDpIAxMOdCnP29.gN2BJzoGTnX917Sg8f.EGwm
Ym.60D4aw7jBTYgSg5NSlbCHvBsGZqKDN2DXOnO4focBrzEKmDx9t0+GTlZt
OLdie3EoTooj4LMGGfVSWt+1789Daj6vEhJQ4zILc4v2RWtVoK21ScB7UqW2
SxQHkO83ukt72RW9aoK+0X5x6HnJbMMIicgstpcTOQkbS6YIfT3YII5ckFoT
VFWbarAjBZ4UB.wz4A.89QJSRJeHTBhyXhjvjERhj1an4vT6dIBpchj5N98X
7RImbFHeRpSZjkprVf0DLwZmUNcxpDZIjoX4ZE5RCIVWoSjJdz1tyWVkP1ud
BYu0rGxdWuezCYuBGU4wzHgr250aH6Eba8CE4sHZt3dQg10dyGB1oNEj49lA
opMUgToCEd9Ol6gVeByJ8VDreLd.f+7Sgv9PgycXLw8INl3aQz9kyqQ5Rgd2
JJTaLjbqnPckCKDIlWJz1rX3+OOd5sUeP1c5vwAetyvxccJjuDCD1EhfdE+m
3fwOckCjYq3F7ODGMOQsKPZRF7j3rgBNDmHNEo9Qpi.OWnHPTVC7OH1I4hMp
+4iZZP5KGNaiWNBvdx0JaSxC3k7WTidR2cvyzIcehOKyXGqqEPgtP4lDbxNa
yNy1Ya9exxTRoIwhCMBW8jPLLKNyO7EgwWJS.adFv729PckmgOCl.4TO05Vf
Smbp0BnhLTvDtr7l7w8AY2ItRDPnq8NNiHKwealp7LKpDbex30DibcC0pXdR
pIF1zaulBU8M93V++ZrErHWU4v0HGZ8lqPOFU8vgSgYAogA6pjNk9U2Krj6e
eaKm0nKq6EDUTDgPuKPiT9ytHqTPSgRmqy2z5gtyUE77gR6m0eNIxAANZode
pXiVdAzSqXoZOY6uIUtmr+Lf+yOMV6YHXoJd3j3iE1ZYJclOzWVxiHxqN4Qg
r3OKNe.Md7.FREnffTkhS2oqV.hfyTQzQtc+OLT.A6Pkmpr75CFAOo1RPVyD
nHbWY8oni9a+Vt+M+xH0kkWScgP3zIxLSfSBmjXfQsf5bLgnZtRdSpPiY8+n
kPLvc.F7Azc5iLst8ppcGLXiZLPDtpSnDdBDr7ZzoW7B0o2ghrH0VyqnTuNQ
Rj3YBOGU4lVkSNGzTWros5U8gEtbpnxJcXlnhJCGMD7RSH3pDS9BpTWTpKgM
cuYuaZideUTrWEmrSsCf0ie0aJyQSRi5TM6RFkJ7f8gJrlRp.pAUfcmRpPSn
.Ogz.ktDjJnZKaN2yZzlxHMSYVSCko6LIO3TJ+nIQXOkDgq8BXljit7CDbpo
BczrgrlPpvVayfy9zF8IM3bqqwVSVm2Tx4VBymwVZRENSJUnI6XJoAj1tKM6
RqZSZnYeNNRW4G2qXwKEJCAalzlH9ot1PmRCG0XOngIcnojJzkKYOkDgtRJ8
a1UN0VrUm5ZaN8mComv091Z5xszjh3LYerMIdyozLQtKFbmr0QETKnEx7sx1
hBksnNmpN+s+lrRiqR2S0b9zSXoZnVK26dZpAtiZq2mRJ0z5wcgDvFQhZQA5
JCN9RX6YIfOLztzb9Pry16ZOGhdFuCMKZlJorrgJUSsv0er1GsPM1xjry8Ab
EQEMSly0Cy9xUscTCZaixUwHCxUUoERphbfiQUUs65zAMdNJwjbzsgL+j9NF
IpVqIVUB.vlkOBKxPypM2WRQr0ZrEzyAdWs+qRcsohuFpueu6S72EjGsZ34m
09fvP4y6qEDdt4gZtC4Og8mdu3IblxQ2U5Gs9UZ1XUMeC+n6yKu.tmO26Utg
W1ShpyFuccOjSYwkQr71YkgDKumkMjIe4Yjln1Hy1FVvzcHS73dsk0Pu7tE6
BPpZpkJA6jhe12wQic1bn8L0YyO8GAby2V8lQpF+pp8BYvcx7hS+ZwuENaUK
bXRaGaNsm6zyeAXe2+vwPF32i4y2ddsev9d6cfC5bipi+CWWhnQf5YYSDm7e
xPMsnNBqs036wlzTpezt3C.HhXMX4f740d3gNgtQFuAGnhMeTXvVFn2d6hxy
OMEqkZq96y.zhXbWdO2SG6sSu4CWW0lZtSl5PFtCR.tRa0R6QC1S5omvuOJe
JpkKgRswbdosFKbo0wViSO8fsp8N+ZWzVrlSvnPRdhF93oe3qZHl8d3qLUOw
C+4x18wyGDk+te3VYqqrAOYzP2lZ2TBcFrs7hpXQ8XhypxjvKUrsh3j0PkZq
tpzVQv+po5rUSkYq9pxV6Ujs5pFamOiicTE1VUNNPJLkRy6Nwx+z0q79qt15
uVG0dstp6ZsVy0zndq0PsVq65rVK0XsVquZsVa0Zotp0dMUq05oVG0RstpiZ
5VC0Zn9o0ZsSq45lVy0LslqWZ8KBzWUizpo9n0YsQSi5h100DMCugpG5tEFp
BcR9Nn11RiN0Jos8hnfV3Huexym0O159Gtb4+tuGkZF2j30U6uNN5DkugCbs
sUrSY9iM8n1lnNVNkNiNS4FE0oW6RRzLsou+.dvaBdu71XJolF5smFB4CDGM
ZOpdj6g1bHv.6gVzhT5.454hVOTIj7dX.Qsczgno6zF4tbjIv1tFRlfzKQB7
LUXmSXbOs1A3ifcb2LhDtAmN1CxqBkfjIrDO+ZnktmWO.Q4SWfSZKce1JNGe
Fjx8xXaV3yfziL+uwGy2AdhA39xcJbGXWviA6Xpij++9junDcb4QyeyyfOpV
btp.ejB9I.7SxN.+c.t6Kf6CdjEIp0GYhG74nTIdw9BexA9O4+7cffr+VJ+p
YfHFa2DTpOHP9TeaGGwAZhyEcfdNmOYgx.5RvXgARpqKlZQTkfzAb39axFhU
y0WAqYq9J7uNmrXYYWQsWZ.hjkCxh47nfTPNuLdeck1hTiWkEbgNJsLzZbTY
JpxBdzEPctHu56ymYTpjhTDbmUiCJIT30PILOw2CxVmGYtTGwW4JiANFxUCL
XT.qVCmWcnP88pWMQA7pWOUqGa9Rl3+h5Lolo7lspyy4JvyKsjXtBq8hRK0G
EM8PPdTtj1cZtVvfFpeip0L4nJhn4hQX7fDiL5wLuRVAZeXw+3fC768xpqe8
MhXofisp8Kaajh3xUU63Jt.R6S0efCAECE35p5+qjigYviakLnxQ+wcf0so4
KYe57y14sVohVsREpZZNRZyjnpU0DWxasRk2ZkJu0JUJyVe80JUt1IiK2fzh
cZG.ll0QSpUiJONxlVo+x2f6ms66PMGn+Ynzq2ELskK3w1lYJDJu8waD.xaI
fO4hQb+B3SESEqlFlZHvB4Uz2nMAZQWBn0A+juAB45C.bK8GRA+ZI27+HjCe
I.qzOc4WuIO6GC3B4NgkCtn57wuefqyqlFOARUK3l0NOAbPsQAzBnkQ8AC0B
3OaYn9TxYbD204Go9EErroiYrgQ45tT5xQloalivPUcGTSYPsaXTs2UdVF8J
JSzonJ1KzTQovFpNuQlRRrydDkKcAfi6AhfvM1YzH0YzRjpQARZSludDEzZA
fhppJZPTlrJ29qhJK5mZrzhNBclhNDmxIGskUGocmkgYG9BSGOD5RjpIccWi
csaJ1nShcGxqm1XFb1cljR9QuMlovTUH5LRaLy90aaLyvsJ9InW1a0CJz8FP
fn9zGstIMXo9zJxbcuATHYoSft8oYtcKZDYt8oMeQuEsaN29fgt2h9knKsGS
jgV2JJrO5BmaBj1qYxz25TZiHc4xtI7IYNgF2NbgnRTNcBSWN7szkqU5xUkc
DnpRPg7T0H.hyaoK+szk+V5xecmt7NBpBWSSxXWXqpO2.IpjaZOKAjBOKIQu
qzHkxx3haiMfTPKuR.HlNO.n2ORYRR4CgRPbFSjDlrPRjzdCMGVUA4zVDT6D
I0c76w3kRN4LP9jTmzHKUcZ.qIXh0Nqb5jUIzRHSwx0JzkFRrtRmHU7nscmu
rJgre8DxdqYOj8td+nGxdENpxioQBYu0q2P1a51Q+DDwYmdDIML7VQgKeLDo
KE5cCnPbexcD9VDUbbeRMCx9Fj7HbexMC5VD2dbeBoK8VDUbr2.vvKt4CA6T
m45hpsJUsEtjt3Pgm+i4Nf+8Zn47V7zaOd5sUCPFUiRE60XC+D6MckCjYq3F
7ODGMOQsKPNIA7j3rgBNDmHNEo9Qpi.OWnHPTVC7OH1I4hMp+4iZZP5KGNai
WNBvdx0JaSxC3k7WTidR2cvyzIcehOKyXGqqEPgtP4lDbxNayNy1Ya9exxTR
omqx2Bwvr3L+vWDFeoLAr4Y.ye6C0UdF9LXBjSUE9ZK3zImZs.pHCELgKKuI
ebeP1chqDIa8x2wYDYI9ay.6ShO.DU6sOY7ZhQttgZULOI0DCa5sWSQOaBys
TvhTciEGqIsweaa1N+8ovrfzvfcURmR+p6EVx8uuskyZzk08BhJJhPn2EnQJ
+YWjUJnoPoBYGDsgNTsaeNq+bRjCBbzpnILTtq+L8hkp8js+lT4dx9y.9O+z
XsmgfkppgShOVXqkozY9PeYIOhHu5jGExh+r37Az3wCXHUfBBRUtMcmtZAHB
NSEQG418+vPADrCUdpxxqOXD7jZKAYMSfhvck0mhN5u8a4927KiTWVdcyMue
PLMhLyD3nZwTiZA04XBAUeqlxnBMl0+iVBw.2AXvGP2oOxz51qp1cvfMpw.Q
3pNgR3IPvxqQmdwKTmdGJxhTaMOJRCrb3Rj3YBOGUIkNu6Gfl5BJsUupOrvk
SUSVoCyDUMY3ngfWZz.WkXxEW+h8Gkto6hnwBuH5wxK4dP8xswcuf614K2dD
+UY7soIcSn3NVWYJmojJr0UG3TSD5vN7lRZPWoU3rKspOoM6ywcfZx5PSokC
GXerbLUTgqtykvSHQPWBymo5xO7lTpvQSpX1m0nMkQmaUMTckenNU2GTFkJ7
V.Sj7zUyF1cpoBs4GS1hTbVBREZaEzteyZxI2hs5TWayo+bH88s12VSWtklT
DmIaVsoY9Q67S1Mh1UqiJnVPqIne05btkA2Q2b2fsXcNjjjsWj7IvGf2ATGW
ig0clUmwgZRv335253A0zhaZ7J1+JY8cDhIv0hFuskMwBgcIHGpHFw4kBPyN
bQTSNbSX6YIfOLzdQc9XryVXaeGiXi2GpEcSlTV1PmVSUymoVleVcdd1LDCc
e.WUbQ6z4bEAsu7UaGYBNn1Fku5YYxFFuLwXRiDCbLppqeWmPrQyRoNCZbNv
dIs5HRo1ZrSTujl3LS8RZEScyo86a.IZSIrZWbf8H02vJFMaM+zEantlsHa2
96T6TdPPaiYbmlVy49PjYMshoFb.KNFpuTbX0V9NeDhyqlGWeNn0swfWbPSu
pLvVYD6XRcwV8VDVo1Mevdc4WdbbSSZTUtaRaPd0tMUu1kli5zEyjzz.T5i1
KMxXweV6PFZb+HDySC4K2gE8WCUTF5nbVDQ8ndtdPp6c.OyqtB4tXbhpPME0
cpbhhLELZQEuI8u.eQz3RYIAOlBfCbsOHWxZLwARbQdTJ2GdGGwNKgXdf.il
BjHL9ztHtVNNXj2cFYe2+vwPlDb19.2gS.D7kmBh1E+jXCVvmo6.9xCwGk+g
CA7ku8DST8d3vCvB7krjSLQm8T7u4OA4BLSUeoibqhh657ZRpa2q2WQOD2+c
OjXZGwhP773qTS0M9DhkdlbVWY3WdCU2k.JtwkA0ImmzwQqUiix3U7cceSXj
NuIGC7ljoeA1wap5Ik7hHMO3Wc0CrnF4fY3uIjtLN3XeS1Z.mkqili6Mg55M
QLwaxRGzCZh2DTm2jmIlfQz3MgQlXLQ04MYDobu45MoCaBhMgrGTmYtlPzCp
C3AaPOHb5e0UON8lRELzsGi5wyH6RwHz0DuIcLq.MxTaslFXh2DTK8UF4MUs
lZ2H7YD0HZoazPuK8boxctbopZ5zMkpDjNymwlvuQrN1.vUp6zlZPp0ql3LE
5Kw5L4PRenwhuN53WtIr7gczkSZj2TWtrhMh6cz4xtF1SWzarRDTc3SdFQWo
kd1.vF4coiksyTzndWd5H+4Yap2TWyd8LxXRGkgdNFXosdNZwnLhyhNHcGUi
F+zYwsmomqhlzEkscwa4hxz9Ekm8qKK6MWN1urLrK2sOMU10e++48+e.ANk7
w
-----------end_max5_patcher-----------

Something* about that robust scaled data makes PCA explode as a process. Exactly the same happens if I process it using scikit-learn’s PCA in Python: the first singular value is absurdly huge (~4e16) and this swamps everything.

[*] Something turns out to be an extreme value of -1.1121164977963008e+16, which would explain it
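A quick way to see why a single value of that size is enough to produce the "1 dimension covers 100%" symptom. This is a sketch on made-up random data (the shape, the seed, and the position of the outlier are arbitrary assumptions, not the data from the patch):

```python
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(size=(100, 8))   # otherwise well-behaved data
data[5, 3] = -1.1e16               # one absurd value in one column

centred = data - data.mean(axis=0)
singular_values = np.linalg.svd(centred, compute_uv=False)
ratios = singular_values**2 / np.sum(singular_values**2)

print(singular_values[0])  # on the order of the outlier itself
print(ratios[0])           # effectively 1: one dimension "explains" everything
```

The first component simply points at the outlier, and every other point projects to (almost) the same place on it, which matches a first dimension that is nearly all zeros after normalizing.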

Most peculiar indeed!

I initially thought that my input data was super outlier-y and when removing all that the algorithm was like “most of your signal is zero, bro”, but this random/jongly example shouldn’t produce weird outliers like that as it’s random(ish) selections.

Makes me a bit dubious of robustscale as now I gotta double-check what I get further down the line.

//////////////////////////////////////////////////

On a whim I decided to try leaving out the true-peak channel for the loudness calculation ([fluid.bufstats~ @numchans 1 @numderivs 1] instead in the above example) and it’s not blowing up PCA. In fact, if I do only true-peak stats, it blows up.

This is what print on the first dataset looks like if I do @numchans 1 @startchan 1:

rows: 118 cols: 14
0     -11.26    1.2352   -1.3107       ...   -1.2649         0    3.5794
1    -24.011   0.18792  -0.28868       ...         0         0   0.37973
2    -24.788    0.5882    -1.735       ...   -1.7396         0   0.37447
       ...
115    -28.333    1.4876   -1.7817       ...    -4.387         0     0.623
116    -12.539    2.2324   -1.7566       ...   -5.3919         0     1.509
117    -29.661   0.69361  -0.93246       ...   -1.5607         0   0.14637

I notice a column of zeros there at the end, which I would guess is the first derivative of the median? That seems (statistically) unlikely?

If I chuck a zl nth 13 on the output of fluid.buf2list I see that that column is mostly zeroes with an occasional other value in there.

Is there something funky going on further upstream? (fluid.bufstats~ or true-peak)

I suspect that the problem is some pathological case for robust scale, but knowing that it can be narrowed down to just the peak is helpful.

It’s quite likely to be a column of mostly zeros with the occasional spike.

Ok, it’s a scaler bug when the range of a column is zero. In the case of robust scale it can be hard to spot up front because the range in question will be the difference between the selected quantiles (so a col could have valid min / max but mostly zeros and end up like this).
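The failure mode is easy to reproduce outside Max. A minimal sketch, assuming a robust scaler that centres on the median and divides by the difference between two quantiles; the function name, the 25/75 quantiles, and the "replace a zero range with 1" guard are all assumptions for illustration and not necessarily what the actual fix does:

```python
import numpy as np

def robust_scale(column, low=25, high=75, guard=True):
    """Centre on the median and divide by the inter-quantile range.
    With guard=True a zero range is replaced by 1, so the column is
    only centred (an assumed safeguard, not necessarily the real fix)."""
    q_low, median, q_high = np.percentile(column, [low, 50, high])
    spread = q_high - q_low
    if guard and spread == 0:
        spread = 1.0
    return (column - median) / spread

# Mostly zeros with the occasional spike: min and max differ,
# but the 25th and 75th percentiles are both 0.
column = np.array([0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.5, 0.0, 0.0, 1.5])

with np.errstate(divide="ignore", invalid="ignore"):
    unguarded = robust_scale(column, guard=False)
print(np.isfinite(unguarded).all())            # False: division by a zero range
print(robust_scale(column, guard=True).max())  # 3.5: centred, left unscaled
```

Checking only min and max would not catch this column, since they do differ; it is the quantile range that collapses.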

Fix in process.

Meanwhile, question: do you find, empirically, that there’s much value in using both average loudness and peak together? They will be very strongly correlated (as will most of the derived stats), so it seems like you’d end up producing quite a lot of redundant data.
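One way to check this empirically is to look at the correlation matrix of the descriptor columns before reduction. A small sketch with synthetic stand-ins (the `loudness`, `peak`, and `unrelated` columns are invented data, and the 0.9 threshold is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
# Made-up per-slice descriptors: peak tracks loudness plus an offset and noise.
loudness = rng.uniform(-40.0, -6.0, size=118)
peak = loudness + 3.0 + rng.normal(scale=1.0, size=118)
unrelated = rng.normal(size=118)

features = np.column_stack([loudness, peak, unrelated])
correlation = np.corrcoef(features, rowvar=False)
print(np.round(correlation, 2))

# Flag pairs of columns that are nearly redundant (threshold is arbitrary).
threshold = 0.9
rows, cols = np.where(np.triu(np.abs(correlation), k=1) > threshold)
redundant_pairs = list(zip(rows.tolist(), cols.tolist()))
print(redundant_pairs)
```

Pairs that show up here are candidates for dropping before standardization and PCA, since they mostly add duplicated variance.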

Fix done, this data behaves now. PR on the way.

Woohoo, a bug catch!

Most definitely not. I was just playing with settings and for the sake of simplicity it’s easier to have the fluid.bufstats~ object be the same across the board. If/when I can just @select loudness I’ll likely never see a true-peak again in my life!

Also in testing today, fluid.dataset~ won’t take a name that isn’t a symbol in the first place. It reports:
fluid.dataset~: Shared object given no name – won't be shared!

Which is odd as it’s possible for it to have no name now anyways.

But since the name has to be a symbol, perhaps the fluid.datasetprocess~ objects can do a type check on expected dataset name inputs, and if it sees a float (which can only come from the output of fittransform $1-ing fluid.pca~), it can ignore the float part of the output and interpret fittransform u235352353 0.9352533 as being a <command> <buffer> <unrelated info which I can ignore>.
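In pseudo-Python, the kind of type check being suggested would look something like this. It is purely an illustration of the logic; `parse_dataset_message` is a made-up name and nothing here reflects how the FluCoMa objects actually parse their messages:

```python
def parse_dataset_message(message):
    """Split a message into (command, names, ignored), treating any numeric
    argument as unrelated info rather than as a dataset name."""
    command, *args = message.split()
    names, ignored = [], []
    for arg in args:
        try:
            ignored.append(float(arg))  # numbers can never be dataset names
        except ValueError:
            names.append(arg)           # anything else is treated as a symbol
    return command, names, ignored

print(parse_dataset_message("fittransform u235352353 0.9352533"))
# ('fittransform', ['u235352353'], [0.9352533])
```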

Is it odd though? Neither dict nor buffer~ will take a float as a name either…

Nor should they. It’s the fact that the subsequent object interprets fittransform u35235353523 0.8385385 as a message that contains two dataset names.

Screenshot 2022-05-16 at 2.50.13 pm


----------begin_max5_patcher----------
1069.3ocuX1siahCFF93jqBKTOLMx+fA68n89XUUkSvIkYIlHvzNsUct1WiM
IyLaIDW3aFMRCBiIu9we+Z945UI6peT2lf9Kz+fVs5mqWsxOT+.qFteUxI0i
6qTs9okbR21pNpS1DdlU+n0O9gRqsQYZOT2bB0wYDLNEmyQ3sbIiH3Wdg5Na
k1Z+9YcP0jDzmFdzYkc+WJMG+biduM7TFiuEuAkIw8WnTQ3xV70Wxzcpz39E
8qN5yCFzwOJYXzxB+Jsd2CejHR5G6WqW2+uMQxtQ+M2K+6nW0UVr87d0S2Fx
M2ATJM.ZpzeAOMmjQ4jNBmYygyaYiOpsF0I87LkAxR4dz3bnLjbvMj65NPqJ
as2DS+Cmzqkl1imKFn+RNEJVwPZL2WoUMyLpLWtk63KKfIT1RIj3ctozXmGd
DxKcUYPY9xmCd65r1ZSx0EaiK9ypa9r1n1U4oAeKF2oLGmlyf8KOjTM8x++S
R4LBlPFP5JnTTeB4pkfw2zV1ami.SRl2PJx.xbx.jyteTh3Sfn2Tt44qwh7f
ib9bPlMRBHvy01ZcVXUSQ4Oz.T7jDnFyfp5INEbhMt9iTUvvanqHIX3lQADW
WYzC5lmPsFUiF8Ps4X022pJObSrOTUqrW8xmr9Sn8ARfbBUB0F.iCYIHmyci
8PeFazGHaPgf2YUPJDNSwzPTMCpFJXoPx6qNBvGHyi0AeadtGRAX8VfgrrjS
KMZ5VgiLFlyxBgxygyQbgEYfyYHPdAjJ8Hlh88YjiAhTlDPROq1+uHUQw4ZW
oUT4cYNhn0g1GIbxz.yhtgCHOypEsy8WUzcbDCwLtLJhIw1uACzSuhWzgcFb
eo9afn8QH6ebecmwNg+p+nPCgpQz9HkgCQrY8vlcGyIeTjSGAYxawY0OTorV
sYAsScw8kh2xv44XpPJj4RhH20bETeIFZN3vWnrpVsc4jmJ8AtBv9pSouEFZ
W6T11kCKUv1xRcmUHmJEBBUjk4PmATUIBi9VvdUcWgwkW6Ize2V20rWGZqd4
aFozsRZuWeJN04Ff4o88UClS+K2K7+VIUkl++Gb1u75G+0aPAPG.5xww1fdd
MUnaskFksr17xYkElznVgXkx0tXDJcY8rHotrdmVJgaRjqSptovksu+i+7NH
s6TgubVW0lrLseEP2PZND6u7XfDBKo7OPHX2LiQYWCGipLdwlw6pL.6sQYDA
PGZL5vnP3WxhIaFDJk+dE.vhZyCjjlrXLTzbHjJMhLUBHJ5PiwPwfPIVTgsC
0bVlRw3S3WNzkpTVD1IFD4HXYwt6AhRz6ojDBkDuaIIjwt6sTOBQL1oKc58a
MppNe9q5l1gY6Ew0y9C09hlhM9aKMga8e6hjF8WKuLeeKxIpFWe2VWS2cM9E
VxiYoIgWs1U90zUNT62gmSR+4A5+rhsmUAR7GaX8uV+e.7pEtpA
-----------end_max5_patcher-----------

Ah right, got you, this is still about the slice

Yup, just the small slice thing…

So based on the points raised in the thread about fit vs fittransform and chaining, perhaps the variance should be moved to a separate message altogether? (e.g. variance or something descriptive like that). This would make fit chain-able, add parity between fit and fittransform, and more importantly, remove any accidental chaining/loops if what you want to do is query the variance only.

5 posts were split to a new topic: Making a well-sampled MFCC space for audio querying