So I had a Skype chat with a friend of @Angela’s today (well, friend of mine too now) who was a fairly successful (fancy-pants MIT-trained) quantitative trader, who has since retired from the bloodsucking world that that entails.
When he visited last summer we got to geeking and chatting about data-science-y stuff, and the early days of FluCoMa tools (as well as my background with corpus-based music making). There’s obviously a lot of overlap in the approaches in terms of having data sets and navigating things.
But I figured it might be interesting to pick his brains a bit more properly about some of the stuff I've been doing, how they (quant traders) go about things, and what, if anything, could be gained from that.
I’ll kind of summarize some of my takeaway points here, and will update this if/when I chat with him again and get something more concrete going.
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////
First, I'll start with my crude simplification of what he "does" (or, in his case, used to do).
Fundamentally the idea is that you take a sample from the massive pool of available trading data and try to build and train a model on that. So you take, say, all the trades from 2013-2016 and build a model that can predict or generate profits within that window. If you get something that works, you then feed it new information (say 2017) and refine the model. Once you are satisfied with how the model is behaving, you run it on "live" data and voilà, you are doing quant trading.
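Just to make that workflow concrete (this is my own sketch of the general idea, not his actual setup), the train/refine/go-live loop might look something like this in Python, assuming a pandas DataFrame of dated features and a return column (all the file and column names here are made up):

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# hypothetical data: a DatetimeIndex, some feature columns, and a "ret" column
# (the return we're trying to predict)
df = pd.read_csv("trades.csv", index_col="date", parse_dates=True)
features = ["feat_a", "feat_b", "feat_c"]  # placeholder feature names

# 1. build/train on a historical window (2013-2016)
train = df.loc["2013":"2016"]
model = LinearRegression().fit(train[features], train["ret"])

# 2. feed it unseen data (2017), see how it holds up, refine as needed
validate = df.loc["2017"]
print("out-of-sample R^2:", model.score(validate[features], validate["ret"]))

# 3. once happy, point it at "live" data as it arrives
live = df.loc["2018":]                  # stand-in for incoming data
signals = model.predict(live[features])
```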
I don't remember the exact margins here, but the improvement you can hope to gain over "traditional" trading is minuscule. So it's really about min-maxing every bit you can, algorithmically.
Lastly, he’s a young guy, but in terms of the trading world he is a bit ‘oldschool’, which puts him squarely in Python land (vs Matlab or R), but without an overemphasis on cutting edge ML stuff.
///////////////////////////////////////////////////////////////////////////////////////////////////////////////////
time series
Because of the nature of trading, everything is about time series. That's obviously something that has come up a bunch in various discussions here, but in order to make temporal predictions, time series, and specifically how much "lookback" is involved, are central to the overall paradigm.
I didn't drill further into what this means at an algorithmic level, but summary stats weren't as big a deal.
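To give a rough idea of what "lookback" means in practice (my sketch, not his code): rather than summarizing a whole series with stats, you typically hand the model a window of the last N observations as lagged features:

```python
import pandas as pd

def add_lookback(df, column, lookback=5):
    """Turn a single time series column into `lookback` lagged feature columns."""
    out = df.copy()
    for lag in range(1, lookback + 1):
        out[f"{column}_lag{lag}"] = out[column].shift(lag)
    return out.dropna()

# e.g. predict today's value from the previous 5 observations
prices = pd.DataFrame({"price": [10, 11, 10.5, 11.2, 11.8, 12.1, 11.9, 12.4]})
lagged = add_lookback(prices, "price", lookback=5)
print(lagged)
```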
features vs algorithms
This one was somewhat surprising, but he put a lot of emphasis on having meaningful features (descriptors) as a starting point, rather than on the querying/predictive models themselves. So whether it's kNN or an NN (or CNN) was far less significant than having solid and meaningful features.
This is also a bit of where his 'oldschool'-ness comes in, and in a sense a kind of wisdom, where graduates coming out of MIT now would be way more model/algorithm savvy but get worse results without the corresponding "fundamentals".
Specifically, he said he most often works with linear regression-based models, which is a pretty "lo-fi" approach nowadays.
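A toy illustration of the "features over algorithms" point (again, my own sketch with synthetic data): once you have a feature matrix, swapping the model is a one-line change, which is partly why the effort goes into the features rather than the estimator:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

# stand-in for a carefully designed feature matrix
X, y = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=0)

# swapping linear regression for kNN (or anything else) is trivial;
# the hard work is in what goes into X
for model in (LinearRegression(), KNeighborsRegressor(n_neighbors=10)):
    score = cross_val_score(model, X, y, cv=5).mean()
    print(type(model).__name__, round(score, 3))
```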
lookback & context
As mentioned above in the time series stuff, the lookback in terms of the algorithms is critical, so that would be a central parameter to look into. Along with this was an awareness of the overall context, which can make a huge difference.
I’ve spoken with @tremblap a bunch about this, and @hbrown is doing some similar stuff where you have multiple time frames at play and that informs the analysis/querying/etc…
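One way I can imagine translating the "multiple time frames" idea (purely my own guess at it): compute the same descriptor over short and long rolling windows, so whatever does the analysis/querying sees both the immediate behaviour and the longer-term context it sits in:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
series = pd.Series(rng.normal(size=1000).cumsum())  # stand-in for any descriptor over time

context = pd.DataFrame({
    "value": series,
    "short_mean": series.rolling(10).mean(),    # immediate gesture
    "long_mean": series.rolling(200).mean(),    # broader context
    "short_vs_long": series.rolling(10).mean() - series.rolling(200).mean(),
}).dropna()
```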
visualization is king
Sadly we didn't get too into the nitty gritty with this, but he said having a way to visualize the data is incredibly important. I presume this would involve some kind of dimensionality reduction, but in a quant trading context it may mean something more like comparing or looking at visualizations of a single dimension (or a couple of dimensions) of the data, along with using a range of visualization types.
I’ll definitely follow up more on this as this would be quite useful info to tap into.
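I don't know what his actual plots look like, but as a guess at the "one or two dimensions at a time" approach (vs only ever looking at a reduced 2D projection), something like:

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
features = rng.normal(size=(500, 4))                    # stand-in for a 4-feature dataset
names = ["centroid", "flatness", "loudness", "pitch"]   # hypothetical descriptor names

# one histogram per dimension
fig, axes = plt.subplots(1, 4, figsize=(14, 3))
for ax, col, name in zip(axes, features.T, names):
    ax.hist(col, bins=30)
    ax.set_title(name)
plt.show()

# or a couple of dimensions plotted against each other
plt.scatter(features[:, 0], features[:, 1], s=8)
plt.xlabel(names[0])
plt.ylabel(names[1])
plt.show()
```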
simple feature sets work better
This is something that has been echoed elsewhere here, but the idea is that having a smaller set of features is far better than a "slurp everything up" approach. So a direct analog would be using a smaller set of descriptors and stats, and tuning to find which those are, vs taking all descriptors/stats and going with that.
Interestingly, although dimensionality reduction factors in, he said a more important thing was finding out which parameters for any given model or algorithm were "explanatory" or "indicative". Again, I didn't drill too far into this, but I gather the algorithms at play can report back which parameters have more significance within any given model (vs having 200 dimensions which you then reduce down to a generated set of lower dimensions).
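My reading of the "explanatory"/"indicative" bit (hedged, since I didn't get details): rather than reducing 200 dimensions down to a few abstract ones, you inspect which features the model itself actually leans on, e.g. the largest standardized coefficients of a linear model:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

# toy data where only a few of the features actually matter
X, y = make_regression(n_samples=400, n_features=10, n_informative=3, random_state=0)

model = LinearRegression().fit(StandardScaler().fit_transform(X), y)

# features with the largest (absolute) standardized coefficients are the "explanatory" ones
ranking = np.argsort(np.abs(model.coef_))[::-1]
for idx in ranking[:5]:
    print(f"feature {idx}: coef = {model.coef_[idx]:.2f}")
```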
it’s all about the data
This is really obvious in a musical context, as that is a big part of the aesthetics of using corpora, but the data you use to build and/or train a model is paramount.
Although banal, this could have some interesting implications for ideas around the creation and navigation of macro-corpora as part of the creative process.