Getting negative numbers from fluid.normalize~

This is related to a discussion with @weefuzzy on the Slack. (link here, though I guess it will evaporate in time)

So I’ve been playing with this some more, to see if I can get a version working with a smaller number of dimensions after fluid.pca~, and I was getting dogshit matching again.

When I started poking at the numbers I noticed that I’m getting random negative values after normalization.

One thing is the nans and numbers like 2.404e+303-7.6205e-227-2.2587e-82 (a literal entry), but I would assume/presume that fluid.normalize~ would bound everything between 0.0 and 1.0 no matter what.
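
For reference, here is a minimal sketch of plain min-max scaling in Python. It is not fluid.normalize~'s actual code, and `fit_min_max` / `transform` are made-up names. It only shows that a scaler fitted on one set of values does not automatically bound other inputs to 0.0..1.0: anything outside the fitted range lands outside it, and a nan goes straight through.

```python
import math

def fit_min_max(rows):
    """Learn a (min, max) pair per column from the fitting data."""
    return [(min(col), max(col)) for col in zip(*rows)]

def transform(row, ranges):
    """Scale one row with previously fitted ranges. No clipping is applied."""
    out = []
    for x, (lo, hi) in zip(row, ranges):
        span = hi - lo
        out.append((x - lo) / span if span != 0 else 0.0)
    return out

train = [[0.0, 10.0], [2.0, 20.0], [4.0, 30.0]]
ranges = fit_min_max(train)

inside = transform([2.0, 20.0], ranges)            # inside the fitted range
outside = transform([-1.0, 40.0], ranges)          # outside the fitted range
broken = transform([float("nan"), 20.0], ranges)   # nan in, nan out

print(inside)    # [0.5, 0.5]
print(outside)   # [-0.25, 1.5]
print(math.isnan(broken[0]))  # True
```

So negative output is expected whenever the input to the scaler is below the fitted minimum, or is garbage to begin with.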


----------begin_max5_patcher----------
6924.3oc2ck0jihjj94p9UfkOmop39XeY2tmYmdLa6Zm11pFqs05oszHkHyh
tUJjgP0wN1z+1W2i.gPRHHjDPdTOjhBHhvO+bOBfv+mu8MWcW1WSVcUz+Vzu
D8l27Oe6adi6T3IdS4++MW8X7WmNOdk61tZZ1iOlrn3pq8WqH4qEtyKuaRzh
jGhKR+bRzmimuNYUT5hn3nEY4OFOO8+KYVzr3h3UIE+6aZ77zEISyVuv0Crx
Sde1hhUvs6NGYBo7zKV+X15h4IENxfT6lWD+n6lu5ulL+yIEoSiup1UuOdp6
pzs8S5hMcylSlNy0AY28a2HDaZ8x3hoeJcwC2lmLsvKhDZJPPQJg6Glvh+HT
SHQ+J1l+0aeK9mquPQIaxUsxzcxB7VXAIigDM0Pb+33.TL2qb.8B4.toMk.y
zebvcqKJxVzL0tgv7mp3aKS7DvU2Eu3gqh90JRLGr.KRxuMYQ7cySNEFkoZi
QsBGGpMdqsM+8rXzuD+4j6AWw+nYdUcLd894YwfBt0Clmtx86VYxcqu+9j7J
Oyat4lh73EqRAikIaADtpIoj7PoDMDeRqytlIUdOzd1hVNIB05QIoEeJIOJ6
9H32njuF+3x4qhtOO6wnUEIKYOmg1nrVDiZM2YrQ3d+Iwf.swmDceZgyV.MF
AgXbQzpOkV7bVtwaCOkRjdCOtGUZfhIHlDM8SIS+8nuksNeSfzUQw4IQOjkM
6Ys3qMyNQo4lOnjkLPATi.npYktqIQKyRW7r1hi0Z7OuIFk4+QnGDYFsTlUZ
pc0yDAitUWQunfTStvTuLSLf1lEfV3cWnhdHwfEIeAFvCz+KiAvlJfZmGCHf
olaKdbYzNwyWUDuXVb9r8hnGh7ZqrZG4AuAPDR.xCqKS.tO1Elm6KxTBssZh
SnSjWCQqGRUezTv0DTsq5G0IqAlj1M.GiZ1jJ2FsY81jjWxYkrF.njNO4yI4
qRAs2Vx7MWEubYsS+lZMAkG+VlqiLWWcpzE9SQqNUdxmS2zdY0YiyAJu.H60
4dHtup1jnJ1MYyfLfWm55I+IAMSII4zAHz3pkkHfNU0lKWa5ZdPduIMURbhD
iVsUZ.J4GlmM82SlUyJCTNKSVjtXYdxJPQByJ2S7UWdVx8wqmWbac7bJaRiW
eCHciWrBe+6xSimWwAOjmNKaARD6nJvSuY3.qYomqpyLt6XQ7xFZLX3.xkib
Q.EpX8p6hyQMUoKGayEKxxlu6kpZ27j6KJu7xzEK1SJVjs73WLO8gO0RauKC
t3is02tqr510K7W8Vvnn31UvD0189hmOuzSc2t+qwKReLt.Bp9XY73pK5gc9
zpo4YymuC+5uxma3Jy.i7oIeIcVwmbCTciA31SWtwH5pJs7rzGRVUr64JheX
0tmYUw27B8ZmZ8ckNw2Vj.4jAbwt2.3d.Spb0mx9xpxabigVcAv1ktptSccX
ucN+dveNDpJG1iAyE.d+gPdjZWnNrW8QqwrgcIsH493Xa96NiylD1pKtbCyh
YINtls4btnAk3NmoHB.nVE+PR8woJLwMTxDohYUbmKrUo3rHHa+IZpQIkQpI
FMgJTQ7IvbzY7H8DtfP4lnaTSTJshZvFZj9FRUPuAHcSfovyEvIXlIVijwkf
oHzGJFmJEQvHo4bi.ZvDt1RTZrSzBtENmdByRzvPIlvsJJbMno.rlwZfSdCY
hVSXJFzaTsAZQzMPK3bJUhmhpTVELtSHRARuTncRNdJ4DovPLLr6HSDvfRov
kEZCyJvV.+vsLWKLRCEINCGxGgEwln3Zpf56DgVYfSYEZtgCckxxAAGzNqUP
DZrqJuFHF4H4vmnYLhjiRQAgS3NVh.7tPyPVRJrZgE3XfQjVTPt4hBPfRbiG
nEjL7.PLXUnrE3ViAHT0DPe.xbkvnfqyMTM0hWWxrbICD2vnJURbjDLBnZQB
QgBNTBxDvnXkBNE6E3FAt.tAqlvMPaDns.P0QU2Mj8BmxYH6vIJFSg2j.HCA
HfENcjPaAI.NLTlhpQAmVyDVJbJvvxvcRbIDKzZw6xHXfh9FCnxQAo.3TK1O
LfpTXi.coFt1M7ILCGUXfklTRQiCv7BT..iSHZkVhcrQHgjrvwBU2L3FsdSA
sVRcRNTVpQSLB7KdSVlDT6d4LnzvQhB8h.uIrcnIOHczP9ynPFXEFZi.pVJG
j2vXIgKg17f3gPsBzhPw.yQJJdf+Cz.FUf8BXJQXZTUI33Av0A8ObNzRSPAc
AZgZzVPeSm.NYDC1sBKyBV4vMiRezkBbRUPWBRRPLiR9pqA5OiiytAsjrTmA
JZFfpTvQTBl+nljJ3bQIQA8ta7.1P5XMBShZWPlCbK0I.LJPSgMz.9ufcILh
JjZAeBiw0ODfC3nfTxA0gEU8BBdMTXyAOPTXqHfsBRQflw0uRoFLPbdJRKm4
LDA+bbj3Bs.UHvXhl6nPB714L+nIYnABPXfRPHw6BQd3Xmy.KMqzeWVBQ55K
gxAOPACWmWEG3VsSJYTLD7Bb3LDsSyZ0V2YPbIjjfIqYMNQjjAnfX+nsFsyf
fIEL1NwF1cQIzzyKHUKAnXMGfh1Q.JI0udM9oeHc4uwUhpoZMBgcbvabvlOx
BZ.AwgwywfAnlPAVqfEjcBi.oCgZdDXTA2XDLcBPyq0FvZBcOnHnIQ.fPNKc
gTiPAP2C9vbp6bZkk.H8.xMUJQbSJLCSKWwnvH.8EzVDMG5dI20YBH10FSRf
ffaGLw.2YzLhnMH1H.yAWDiAZ0fuAGgfs.VFEM6QvLzmTvgQzwPP7JiFirYX
JKBWBrO2XTHnEEBs.fqnOIvsTzMALCEFenNFfBi9RJpCJB7k.WI.0GuKHzjK
9CD7fvQd.LDUpR2aLT.f4BN2.WgtNfgOEIOf6f.XHoCw1DBGxGvyJGNI.V4t
GPf4j6PrMzfWAgZA2C7lA.YqEgdTfnf3ccI.9KF2Dj3DD3.tKvQ2fdkfKDvm
NLQ.xUJcCmEuKmhAxF.ccHZgz3vsAIEDG.zUfrvhdbf1gnc3s.fClZxDNnDT
XfQFfTHvaBT4FgCC.B3ocHvbNF2FzbXbEmY.ECofP.vARLVCJSzbOLrURoNB
GBrZbAU.2Z71wn.L.mi6yZf.PvLqC.SI0d9hv0NnEP8YX5xn4RG8.AgDJWVN
BCfloK6DEZy3ursrEPzWgi3LPPLG1EQ4itAvK9gDETf7wCVAAUQfS7xfYlaH
.Mj.MWw3PHPlycvX3JG4JAHKTByMVLHDl0BjjiSDqIf0kC2yBYe3v8Almfou
vUZHpfC2WxPCPfvXXBSd3QH.CpQAjDq6lHP7LsCx1.QIcZK.l0GJDDnbe9aD
HbfBocEjcj25jIMv+wQnBzs.0WfDvYB.ZIgvkMFHg4RjF.XcHxTjSoBjkmm4
LWnevKl6xWBFNlh6XP.fw5xZgYjtNGrJAKZmirjXENaH.DAuaCF0vGtEnpRc
AgxJiQBwOclGbzl3FmDxRcxEv915rgfbjv7u.TFvuP5DUPteLW6.rAkOCIPV
oztFBonYL99B.EQGYHaEiuu.DDL1Gpt.HAmrBBOiYtRPGAWjOBCL+cwfwTKc
F6H4I8If.Yle7fRV4XETRp5HpDSq7KXjK3jzOuIHNPOGUZ7muHsKN2yqBxkL
eQZOJh778wkQjiI.nGQ.X5P.PUbuDv+T35SYv10jBM7OxJM3HI75MKbVksNe
5FafM1xQ6RbyRVUjtnZI59ksbdTn1umJUvBkHFPZfFpjXiHaXnhKWcr0NwsP
WGwPY65jtdVZ1GbqV4suOYw5s1iaVW0lI86d39z4ymlMOKu1MT6NfbjKwatx
eWWW6RUs6WvftVAL8+qw3.TtlobGAGHkWGQqKLJaGcSCEX9hL71gHdPhWtiL
PTHNdDoglx1NlvDa8iDjxEQ3OBNE8ZLb2tME.KKWAW81kgzqeyyVlkWsb1P.
ycZ25hrGximkV5rSptVk9Z6hv6eNM3i43f0mDg0.EQZ89nIqsJk5Ott54Z1o
hbUxb.9B3fczIZHmEqSmPgb77xSHgNp3.wiapPt1da5B7AajTIigzGnNYLqR
QAobI3GJiAyocFdKQP0VrAPpqBs1cDj+l1oig7Ap215q6O3e72VlrH5CwKVE
8gjGSuKa9rq1anB2xsR60hw6IXHsq4qoxn0.4caL9iZT9ru4qBx0yc+J7eZ+
Q9daTrdqINimNEtgcjIbTe6DDhMzI5NimidPnvZsSngIz6DHLLsQ+QP9vpFZ
HX0939iq1.xBu.k.yrzeDPBLVY6GTmt2GWTDpSW8myE9XT2i4p+Xt9EDCbPI
7e56Jt00Mjyj7YghHPtt1e5DBv6D0zQzFjW0P.7O6uWcN8mTSGeu9vwj1UXZ
kXB.dbcALCdm4AHU4MAWLOINuVqodqIKDhP5GaPzn4ig69O+w+xY5tXZ2vGl
CAHGt1Ocek0VdDw1oc+e8ayxydHYwGcVx+59SUwOg45lwX9VtX6TFUp7P1nk
kKTs84oCT3AWd1kyW69O3hA5nIhRxodC.Puv5JpIESZ1mWD5z3zc5MoOGPTS
b0TI9bmsLkvIfjVJfrLFtQtIqmN87bkrCQfGC2p8XqbaIvDdJIuS+ueHYQxm
iCwyqC54hmZE8zYlKxaYeAXYd+GdzE4mErqjnL7QSG0dFnc0vwvW36q+F+0o
GwgQCMUZNR4QiFo+mq8RW8hhv+g3zE+wYP43CacCn6gGMdj+pujB3Lgx.G3x
PKwg19mQktYu3H7+qUySmkjeYKxxo.SsWPpVCfzg7oq7AdNfA9iUu71mANxw
jKMwspCZvXvduOtHO8qSKxm2Sb4nPz3qP9eD9BDeZvjuDrJ+uylkrpeRU8by
X5Yrv4zfD6.NaTn30Od2kggeXhKiH4+SoSKNMY9yVameB+v.JdQl43GllsL4
bQEO+IpTsVfVN9uw116iw2cVN5mBC2j0ZmHkiBym70h6NoIo0vbTujL3FElL
6Abl4CxTh5PIy0Jgl5akfScK8fn7Q5NFr9e+wS3Ag+LPU8y6sYJL.SGgLQdb
vnm1o.OKN+2uYA9wOciaw1BUHTsXgmwyW4HKqmnBSRW8.uqdz2Fmnb.VTcZG
9MC1B8MXKndvKzWmO6nA0x62J+VvHjZun5CQ2yN24bHa8wjd27DHFl+ck36y
x98PV45sOkuCNX3VE6VdlOjWIF26+jM6loGXS6zE+9kENA+1GHlprb33qYun
6m9xI.Be.BdfC4wrrudm+N.OVlSiNeBf2eBdqABSkcNsowk6Kvg6.uwsOkUW
S30OZ3iz76944+y3G3+KgDSmioicZSQZ+0KjXEkuAXVK2nKORaBbhhpp7tjU
flabfFd9+w3uppsCJDbpnm+ysdOU794q2fmmY2YQqKQGFGgyde7BmxJHfekI
1ciFhgLFM85AeMfCitsWmYfo5Ux5vi5H9g1fuQ0W23QOQYP0mui0G5od5VD6
+YlbpK7cs2i8QatK9sviue95j5eksm7pXwo5xWVss.t1QClsjIxy9xhKhKdZ
CV34h+z2huLlfvrR+aukVy4LuONiBItLdLwOjmjbgbgSKTsl1TENgfwiAdOf
runH9hXAifnT9UjWQ.F3Z+WHL0JGO93+IY1EwC5pYXPql8IoLx6XwC+uIymm
8kvYiCeqzw8JASm4Awj921RsgZbSuBZmTHGcFk8ZlQcOp0eHa9EZVVsJeRHu
Uo+kjsdJECJSrbc9x4Wdhe5pumAAUB3aNfZDlS14pivIBhO4WljnrNGTlQpX
ix6padxzjzOG9a.QS1fUKpyVdmT+qDXXY.+znQyvybZocZ.dz0rP.wDrkwj0
keQafymcThNuJYwrUuDiJuUgMr3FG8iSA2ljz9mytPQrkeHhZ5nXuVDu7ThI
Tas8+w3hrneDWElqFbZbUQx8qmOu3jdfw6+QuXjRp+AlgacD9uZTEfLn5RAI
nvTPbS.oZ5l3Z7MNe8BaX9Kh2YFswmuHuxCiyrbq4R48m0qQVAtWM9R76k7K
3pv9syiz4p8H8le1IDTw5VpEdo9jf6TWtm0Yve+0mvGj2dnG+4z7huE8e9PV
HOQPp0R7SxAbWEkuEVJBgSK+JxtbkQYeroCtB2TRmcqeCX813hh7z6VW32pC
puC7dRaTnOLO6t346s6d1z9H5a2t3Ruv1DukArMt2KU2icWhz5aVZVAticOX
ajyTdqrno1Fxsu.Sb1aK2GYyp994qSmMAq.Ir6Vee+vmMoJCYCqV518V0rgf
O8UTk+Hp6RoRMFm0YUcYWi8NkBssKrCHnt8CPbuImaXvzMzZp.2XwP7Okczj
J4IyVOcrDI1PjHBkeitVLZxft2T96aAgIDAgeG5hpGMigwg20Av6Jke2N2Wj
JNad+XH8KySOF21Gv7svfLCu110OePP+Zg6BttJzFKHsXUUfKoGhbwUWFCU+
SdHTyvCs.ujzPZs7EYsSXGi086tb8stLGX2jHLC8nVTq71jM6Vrw5p1gHBg+
AjoCCZwclE8tupa1ICkqZa4bx31A2U0mPVYYCJDvXdqIhWuNyENfbakUOsxo
wK2CioB1ffGmmDOK5eb0696qfYH9t7rY4oOj8tO7sESe2Okm4JjEu6uLe8eJ
68wu6mbyAZ069HVQHx9ZD6ceL8wjnOlCy5Z96pDe2VJTukQ9GWMX1OFSGkXI
T34ywUxouZsfL1.DCTtbHsgF1X5ZR2A0Mjm2A00ztipaEuJipqYADVqQd+UR
XccH40XIuVhqqEcGXePcWaAVtyEhXHPn0xtCxaJqdXWXIg6IBfV0M.spbGO9
YK.sta.Zsh+pDf1D.9Ti79qD.ZlLDA.S9ZAf11M.8f5t1F.cqqI5fj+LoazY
kwutnFyKPzYS.qIlvO+fmsnyl.VTLox7ZDc1DxpB0Hu+JAc1DB5rjSdkfNaT
ciNOntqsgN28SqYPfn0cCQK70TYFULbF82mVTUno2qjSO7qMbax.q+ox2SO0
piY1uWQ1tov1m1js5AuEgp0U+h4W9O+SzhS0itbotCyIjnSOHY3skkm0+NNv
MBONxvHX1we4Yg4BsMuHC2IMTkqMAgLbnqUr5ebh.G36OxYhhJZyf.hXNACY
p7OrAK+kbkoWz1DZsJRMLggovz+LHbQaI8a8OZeIaPCWbbu+mNPQQayivvbS
VP3+gSGP++kSi+in+CfZmk9XBv6YKVEwHiGV.sUr.ucgv3+nw0pWzXAsM8dS
YwXS+pFKfEfQOu7ADN1XAm5zK5iDiZc1D9LhXkycVMfuJn0X1QLK.VaOLJc4
yG1G9mRZl6ci0t0ZPuvX+ZZWoH4fZYmXmx52wJjc1sExtCD6gNRb6XMRtNg0
wHIn8vHYLAvSl9fmv4c2IOgKOykORpP3IceLRxPFIQeMRztFIVeLRzQS5QBv
hf0GVD5P7bcjyEORpQajjAH8v2VfKejDAvS5dQOwCYjX80H0k+jtOPX0jwR5
oBInQOLNg.5o5gwQDTTv9.IRDRTPZuLR5.r6X8Q7VQHHQhdYjnAvSBUeMRii
dJDuVQefDEB3JumFmNM75irj0AvPr9fi3jwZjXg3KwJcCtrQJDvUsoO3oPxn
T2G9RNUPm440G1dL1X405Fotjd79.ckEDlWejoBMjnf79H2UpdrhMQkg54dw
iDMTo2Vylrbrppr4aPeXGZJu4gldYC8YkKneQehWtD2HAJuY2Pb0iw+l+S92
bs6+ltv+ecK9zU4IeNcy8KcmINeJtMILsXct+Ko+qJ+q3xUOlAL3h0ok7n6i
5+J2NAvdKzT0Gg+2sdVZ1GJhKVu512WUZF1aaWnTPzvFm4lsCfl2cl2eOqTP
o98ZabC+0u6UtoXpVsYHr2tqYs8uwpcuZAtCYx2c+I+L2Yladuw736KlGYOw
byVVv0u83a.C6t4KT2tqRY7iqmF2lBn4MzBMin7aFETkl4kKFBgt6du7w2KK
XU6SsaKnBrC2VvOrHhKnZ+l1iUJJ20kDLquJwKUUsauM.i+1xjx8E+Oj7X5c
39EUsgnaKr82RVOuMF0yrzgumYVfaCH8lUVoXpoMKzfJcy6sc+pkTd49.M2T
tiPKoD0dMpo82FsA3Wu.iPMF+QvPypWWRtXGh2GWTzlCww26ZZdeq4hIne56
Jt00LxIPVrP7BO1VV7gtca1g6N7H5d7e8BRQNV9Xew5nEbyFdOsv74u5zpkJ
aZzd6M9k623V.dU5GSf8079zE6m+3e4DLkMG2vjp3kaWwTsPYskGQrsZW9W+
1r7rGRV7QmU2uV+IqcvtgDCyavuI4wnRkGtCsFXk6FRO0F2gAB+jmqxwsskD
sxWMvTRteCMkBxZVaQQ1V8FjUUnJ8lz55HJhPQoDeNcVlRTtKBRM6rCycwl3
kkxqvMys8IfcXkI9Np28mXst+bSU+Tqs8mqk7YUgVC19uSy7.qThmQ8quurU
qWm5a1h8jJTm8EYssFz+LhnpUe46jpBrf50ajV8ZGeyDWvkc8dllXOqHpcp4
6mxjeC0UNvZ7dX028yuX+1WxqpZ3dn9gMwqgWy16Kxd+Zy9YQ88JAUqtqerE
85bp+pO8VH0pm5mZJOmSz4mALb2PHATWz6Mpo128X33YsVyy6KRa+Zadyj2S
t9rdcK+YT1F0qI4AgXbdIa1QMHuu3lpZMd.NLgxHmcsEu2Xp8pg3Aliy4mUP
uR701ty6izYaQozUs.uuXoZ076vbjGPw6d0y69JUyZ0Dvmhoazbc5tYl6bqM
2Wdc49BVXtSoPs1KKJQuunbctnDAUusuXKkCpq18QWwNk7LkG8QTbzZl8ETu
ru3UBKv5i8KAityqNX2ClbU065SAts6xI7YVeqOmZa8oWWquz2Vh.plxiGz3
H7T05VEbp2+4UipO25S8E6lbXcn9oKYoCpwzgs1HgUWoO+5kyEyW6TQZ6H8n
y648b7ZE84Tmn6EFtV8fNjY4EPMftWHqc9RW6OcwEmgYn0y4yrVNOvQ7uj2S
sc8NBSKVu9LG5ht0RMY9hsrNr1KG7r+CpdK2WDX85pbvT33.RdXMSN7EPIj5
jbOQf6TOjOsk3okpsXOQb6UqiCl7Bp9F2SzXs5XbvzWP0t3dh91sFEG5aZ2N
kq2yrT81uL.6kHCbPMENbSjPpivWLAVudAehIODXMB9rpOvWLesac.NbbiPp
8uWNwseM9MnoEDXAh8LpouWLCUq1897IBxg0k29y2qwWN0PpCuWLWUUucONN
Xa0X2dX7anV5Fj8af0O2yn141aL0IySAVWbOqZh6S3762V2ael78BTud1FBI
UuF1dY0u1Ks101ccq8RpYsmlf0+oy4dBdHEsZYodxs4Q81+0a++QuiFoF
-----------end_max5_patcher-----------

You need to load this dataset:
transient_dataset_20.zip (15.2 KB)

Also, a more qualitative comment on this.

I have no way of knowing what the numbers should be after fluid.pca~, so I have no way of verifying that process other than seeing the matching I’m getting. But when I go above a certain number of dimensions in fluid.pca~ the matching drops to nearly 0%, even when feeding it the training data back in.

Using the same patch and process that I outline in the thread about speed, it “works” at something like 152d->8d (with shitty matching percent at 44%), but the moment I move up to 152d->20d (literally only changing that in the patch), the matching drops to 0%.

It seems that either I’m getting a load of dimensions back that are garbage above a certain point, to the point that it ruins the matching, or something is happening in the maths somewhere (hence the nans, funny numbers, and negative numbers after normalization).

This could be cruft from uninitialised memory. We’ll prod it.

By and large, even if your data isn’t standardized before PCA, the range of stuff coming out will tend to be in standardised-data range (i.e. largely ±3)
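
A rough sanity check along those lines, sketched in Python. The thresholds `big` and `tiny` are arbitrary assumptions and `suspicious_entries` is a made-up helper, not part of any FluCoMa API. It flags entries that are non-finite, absurdly large, or absurdly small but non-zero:

```python
import math

def suspicious_entries(rows, big=1e6, tiny=1e-100):
    """Return (row, col, value) for entries that look like garbage.

    An entry is flagged if it is nan/inf, larger in magnitude than `big`,
    or non-zero but smaller in magnitude than `tiny`.
    """
    flagged = []
    for r, row in enumerate(rows):
        for c, x in enumerate(row):
            if not math.isfinite(x) or abs(x) > big or 0 < abs(x) < tiny:
                flagged.append((r, c, x))
    return flagged

# A few values of the kind that show up in a reduced dataset.
rows = [
    [6.7399, 2.0371, 5.1543e+289],
    [-2.8283, -8.6985, 3.6106e-310],
    [7.4867, 4.9098, float("nan")],
]
flagged = suspicious_entries(rows)
print([(r, c) for r, c, _ in flagged])  # [(0, 2), (1, 2), (2, 2)]
```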

That’s useful information. I had no point of reference in terms of expectations, so other than seeing nans and stuff, could only go by what I was getting in the matching.

this looks like 3 entries
2.404e+303 (huge)
-7.6205e-227 (tiny)
-2.2587e-82 (less tiny but still)

but @groma and @weefuzzy will know better than me.
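
A small Python sketch of that reading: a regular expression that pulls signed floats out of a run-together string. `split_numbers` is a hypothetical helper. It only splits reliably when every following number starts with a sign, as in this entry; a positive number glued directly onto the previous exponent (e.g. `e+2885.5557e+278`) is ambiguous and cannot be recovered this way.

```python
import re

# Optional sign, digits with optional fraction, optional exponent; or nan/inf.
NUMBER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?|[+-]?(?:nan|inf)")

def split_numbers(text):
    """Split a run of concatenated float literals into a list of floats."""
    return [float(tok) for tok in NUMBER.findall(text)]

entries = split_numbers("2.404e+303-7.6205e-227-2.2587e-82")
print(entries)  # [2.404e+303, -7.6205e-227, -2.2587e-82]
```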

Based on this, I decided to retest stuff.

So running the patch from above, and print-ing the reduced set, then closing the patch, loading it again, and re-print-ing the reduced set over and over gives me this:

print: 
rows: 10 cols: 20
1     6.7399    2.0371    4.3377       ...5.1543e+289-1.2287e+279-3.4544e+284
10    -2.8283   -8.6985     1.012       ...3.8848e+2885.5557e+278-7.1012e+284
2     7.4867    4.9098   -1.4104       ...-3.6521e+289-9.6743e+2792.8294e+284
       ...
7    -2.1185   -3.1353    8.4738       ...-5.4163e+2893.728e+278-4.5481e+284
8     3.9849  0.014451   -7.0811       ...1.9185e+2894.5642e+279-1.9177e+284
9   -0.83661   -2.9675   -0.4302       ...-1.5829e+289-8.6771e+2796.5456e+284
print: 
rows: 10 cols: 20
1     6.7399    2.0371    4.3377       ...3.6106e-3103.8548e-3107.1734e-292
10    -2.8283   -8.6985     1.012       ...1.7581e-310-3.9964e-3101.1415e-292
2     7.4867    4.9098   -1.4104       ...5.2428e-3107.6327e-3101.5268e-292
       ...
7    -2.1185   -3.1353    8.4738       ...7.4322e-3104.1143e-3117.5282e-292
8     3.9849  0.014451   -7.0811       ...-1.0017e-309-1.9054e-3103.0277e-292
9   -0.83661   -2.9675   -0.4302       ...-8.4878e-3104.8599e-310-3.0456e-292
print: 
rows: 10 cols: 20
1     6.7399    2.0371    4.3377       ...-1.5342e+3075.2035e+306       nan
10    -2.8283   -8.6985     1.012       ...3.7579e+3064.7314e+306       nan
2     7.4867    4.9098   -1.4104       ...-4.6995e+307-1.1269e+307       nan
       ...
7    -2.1185   -3.1353    8.4738       ...-1.5332e+3071.7758e+307       nan
8     3.9849  0.014451   -7.0811       ...      -inf-4.4904e+306       nan
9   -0.83661   -2.9675   -0.4302       ...1.0697e+308-1.258e+307       nan
print: 
rows: 10 cols: 20
1     6.7399    2.0371    4.3377       ...       nan         0         0
10    -2.8283   -8.6985     1.012       ...       nan         0         0
2     7.4867    4.9098   -1.4104       ...       nan         0         0
       ...
7    -2.1185   -3.1353    8.4738       ...       nan         0         0
8     3.9849  0.014451   -7.0811       ...       nan         0         0
9   -0.83661   -2.9675   -0.4302       ...       nan         0         0
print: 
rows: 10 cols: 20
1     6.7399    2.0371    4.3377       ...         0         0         0
10    -2.8283   -8.6985     1.012       ...         0         0         0
2     7.4867    4.9098   -1.4104       ...         0         0         0
       ...
7    -2.1185   -3.1353    8.4738       ...         0         0         0
8     3.9849  0.014451   -7.0811       ...         0         0         0
9   -0.83661   -2.9675   -0.4302       ...         0         0         0

I don’t know if PCA is a deterministic process (the first few dimensions seem to lean towards that as they are always the same), but you can see the last few dimensions get pretttty wiggly.

Ah right. Just seemed weird as there was no space between them (same in my copy/pasted console output above). The first few entries have padding between them, and the last ones are just concatenated funny business.

(I’m also vaguely reminded of something @jamesbradbury posted on here a looong time ago about having numbers that looked like that on the output of some process)

A very useful tidbit of information that @tremblap shared with me today is that the number of dimensions can’t be larger than the number of points in your fluid.dataset~.

So in my case, I have 10 points, so anything above that will give me garbage (which is where the actual “bug” happens, in that I shouldn’t be able to transformpoint at all given those restrictions, and as a result am getting garbage values back).

I’ll create a new training set with more points and get back to testing and report back my findings in the relevant thread.

An addendum to this.

It appears that the number of dimensions you reduce to has to be no bigger than your number of points minus one. If I go from 100d -> 9d (with 10 points), it’s all good. If I go from 100d -> 10d (with 10 points), I get doodoo values.
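
The points-minus-one figure matches a general linear algebra fact: once n points are mean-centred, the rows sum to zero, so they span at most n-1 independent directions. Below is a small Python sketch with exact fractions. It assumes PCA centres the data first (standard PCA does, but I have not checked fluid.pca~'s internals), and `matrix_rank` and `center` are helpers written for this example.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank via Gauss-Jordan elimination, exact thanks to Fraction."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
        if rank == len(m):
            break
    return rank

def center(rows):
    """Subtract the per-column mean from every row."""
    n = len(rows)
    means = [sum(Fraction(x) for x in col) / n for col in zip(*rows)]
    return [[Fraction(x) - mu for x, mu in zip(row, means)] for row in rows]

# 4 points in 5 dimensions.
points = [
    [1, 0, 0, 0, 2],
    [0, 1, 0, 0, 3],
    [0, 0, 1, 0, 5],
    [0, 0, 0, 1, 7],
]
print(matrix_rank(points))          # 4
print(matrix_rank(center(points)))  # 3, i.e. number of points minus one
```

With 10 points the same argument gives at most 9 meaningful components, whatever the input dimensionality.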

@groma has added an error feedback but I think 10 should work, except if it is points-1…

I can test again, but when I tried 10d, I was getting dogshit matching. I don’t think I looked at the numbers though.

As in, matching 0% of the time, whereas with 9d it very clearly was matching the training data 100% of the time and the testing data like 50-60% or whatever it was.

Ok, tried testing this but it’s difficult to make out.

With some datasets (like the one in this patch), I often get “real” values returned for even higher dimension stuff. So asking for 9d and 10d here gives me realistic numbers, whether or not they are valid. Then again, asking for 50d also gives me not garbage numbers…

So if it is an uninitialized memory thing, maybe there happens to be numbers in there that are somewhat reasonable and other times not so much.

If you could give us the larger dataset that generates the problem, that would be ace. It is hard to reproduce here once we respect the size limit of PCA (which will report an error if the number of dimensions requested is larger than size-1).

I’m using the stuff attached above, both code and dataset.

Those bits where the PCA starts spitting out garbage are with that code, and my tests today with the 9/10/50d stuff are also with that code.

edit:
With none of these do I get an error. I either get garbage values (as above), or sometimes get numbers that are sensible and in range (though I guess, because of the algorithm, not actually valid).

I don’t get any garbage with 9 for a dataset of 10 (which is the hard limit), but I don’t have a 50 entry dataset to test with 49.

I was saying that I was asking for 50d down from 150d or whatever the example is, while only having 10 entries. So in effect running the patch above but asking for 50d out of PCA rather than 20d.

I’ll test again but all I was pointing out is that sometimes I get numbers that aren’t obviously garbage.

that limits you to 9 dim max in PCA. In the update you won’t be able to run it so you won’t get garbage to feed your normalise which in turn won’t turn garbage out :slight_smile:
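
As a sketch of that kind of guard in Python: this is only an illustration of the rule discussed in this thread, not the actual check in the update, and both function names are hypothetical.

```python
def max_pca_dims(n_points, n_features):
    """Largest usable number of PCA dimensions under the points-1 rule."""
    return max(0, min(n_points - 1, n_features))

def check_pca_request(requested, n_points, n_features):
    """Return `requested` if it is usable, otherwise raise ValueError."""
    limit = max_pca_dims(n_points, n_features)
    if not 1 <= requested <= limit:
        raise ValueError(
            f"requested {requested} dimensions, but {n_points} points "
            f"with {n_features} features allow at most {limit}"
        )
    return requested

print(max_pca_dims(10, 152))          # 9
print(check_pca_request(9, 10, 152))  # 9
# check_pca_request(20, 10, 152) raises ValueError
```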

Yeah that’s ideal.

Was just me realizing why it took me a bit to figure this out as I would sometimes get “real” numbers.