Monday, December 04, 2006

Artificial intelligence and stock picking

There was an article in the New York Times a short while ago about a new hedge fund launched by Mr. Ray Kurzweil, a pioneer in the field of artificial intelligence. (Thanks to my fellow blogger Yaser Anwar, who pointed it out to me.) The stock-picking decisions in this fund are supposed to be made by machines that "... can observe billions of market transactions to see patterns we could never see". While I am certainly a believer in algorithmic trading, I have become a skeptic when it comes to trading based on "artificial intelligence".

At the risk of over-simplification, we can characterize artificial intelligence as an attempt to fit past data points to a function with many, many parameters. This is the case for some of the favorite tools of AI: neural networks, decision trees, and genetic algorithms. With many parameters, we can certainly capture small patterns that no human can see. But do these patterns persist? Or are they random noise that will never recur? Experts in AI assure us that they have many safeguards against fitting the function to transient noise, and indeed such tools have been very effective in consumer marketing and credit card fraud detection. Apparently, the patterns of consumer behavior and fraud are quite consistent over time, allowing such AI algorithms to work even with a large number of parameters. From my experience, however, these safeguards work far less well in financial market prediction, and over-fitting to the noise in historical data remains a rampant problem. As a matter of fact, I have built financial predictive models based on many of these AI algorithms in the past. Every time a carefully constructed model seemed to work marvels in backtest, it inevitably performed miserably going forward. The main reason for this seems to be that the amount of statistically independent financial data is far smaller than the billions of independent consumer and credit transactions available. (You may think that there is a lot of tick-by-tick financial data to mine, but such data is serially correlated and far from independent.)
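
To see how easily this happens, here is a toy illustration (simulated data only, not any model I actually trade): fit a many-parameter model and a one-parameter model to pure noise and compare in-sample versus out-of-sample accuracy.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
returns = rng.normal(0, 0.01, 1000)  # pure noise: there is no pattern to find

# Predict each day's return from the previous five days' returns.
X = np.column_stack([returns[i:i - 5] for i in range(5)])
y = returns[5:]
split = len(y) // 2
X_tr, X_te, y_tr, y_te = X[:split], X[split:], y[:split], y[split:]

tree = DecisionTreeRegressor().fit(X_tr, y_tr)  # hundreds of effective parameters
line = LinearRegression().fit(X_tr, y_tr)       # six parameters

for name, m in [("decision tree", tree), ("linear regression", line)]:
    print(f"{name}: in-sample R^2 = {m.score(X_tr, y_tr):.2f}, "
          f"out-of-sample R^2 = {m.score(X_te, y_te):.2f}")
```

The unconstrained tree "discovers" the training noise perfectly (in-sample R^2 of 1.00) and then fails out of sample, while the linear model is mediocre everywhere but at least does not fool us.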

This is not to say that quantitative models do not work in prediction. The ones that work for me are usually characterized by these properties:

• They are based on a sound econometric or rational basis, and not on random discovery of patterns;
• They have few or even no parameters that need to be fitted to past data;
• They involve linear regression only, not fitting to some esoteric nonlinear function;
• They are conceptually simple.

Only when a trading model is philosophically constrained in such a manner do I dare to allow testing on my small, precious amount of historical data. Apparently, Occam’s razor works not only in science, but in finance as well.
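
As a minimal sketch of what such a constrained model can look like (simulated mean-reverting data and a single fitted coefficient, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate a mean-reverting price series so the example has something to find.
n = 500
prices = np.empty(n)
prices[0] = 100.0
for t in range(1, n):
    prices[t] = prices[t - 1] + 0.1 * (100.0 - prices[t - 1]) + rng.normal(0, 1)

# Rational basis: if prices mean-revert, today's deviation from the mean
# should predict the next move with a negative sign.
deviation = prices[:-1] - prices[:-1].mean()
next_move = np.diff(prices)

# The entire "model" is one linear regression coefficient.
slope = np.polyfit(deviation, next_move, 1)[0]
print(f"fitted slope (negative under mean reversion): {slope:.3f}")
```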

15 comments:

  1. If you ask me, it's all a big marketing gimmick.

    People love quant trading even though they have no goddamn idea about it, and this apparent fascination makes them fall for all these new sorts of HFs (don't get me wrong, Ray is a smart dude) that try to stand out with advanced methods incorporating AI and quant trading.

  2. Ernie,
    Well said. In my experience, trading is certainly hard enough without throwing in all sorts of esoteric formulae to ponder when the system encounters a drawdown. I have a question, though: what would be an example of a trading system/method which has no parameters? Don't all systems have parameters?
    Thanks for the Blog,
    Steve Halpern

  3. Steve: A trading model with no parameters is one where we have averaged over all possible values of all parameters. In physicists' language, we have "integrated out" all degrees of freedom in the parameter space.
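    A minimal sketch of what I mean (illustrative only, not a production model): average a moving-average signal over every plausible lookback, so the lookback is integrated out rather than fitted.

```python
import numpy as np

def ma_signal(prices: np.ndarray, lookback: int) -> np.ndarray:
    """+1 when price is above its moving average, -1 below, 0 during warm-up."""
    ma = np.convolve(prices, np.ones(lookback) / lookback, mode="valid")
    sig = np.where(prices[lookback - 1:] > ma, 1.0, -1.0)
    return np.concatenate([np.zeros(lookback - 1), sig])

def averaged_signal(prices: np.ndarray) -> np.ndarray:
    """Average over lookbacks of 5 to 100 days: no single lookback is fitted.
    (The 5-100 range is still a choice, but a far weaker one than a fitted value.)"""
    return np.mean([ma_signal(prices, n) for n in range(5, 101)], axis=0)
```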
    -Ernie

  4. Ernie,

    But what about the case where we remove the noise from the data first and then apply AI techniques? For example, wavelets combined with support vector machines (a newer machine-learning tool) would, in my opinion, smoke all this NN crap.
    SVMs are much better than NNs... just my opinion.
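
    Roughly, the pipeline I mean looks like the sketch below, assuming PyWavelets and scikit-learn (the wavelet, lags, and threshold are arbitrary choices of mine):

```python
import numpy as np
import pywt
from sklearn.svm import SVR

def wavelet_denoise(x: np.ndarray, wavelet: str = "db4", level: int = 3) -> np.ndarray:
    """Soft-threshold the detail coefficients, then reconstruct the series."""
    coeffs = pywt.wavedec(x, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745  # noise scale from finest details
    thresh = sigma * np.sqrt(2 * np.log(len(x)))    # "universal" threshold
    coeffs[1:] = [pywt.threshold(c, thresh, mode="soft") for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[: len(x)]

returns = np.random.default_rng(2).normal(0, 0.01, 512)  # stand-in for real returns
smooth = wavelet_denoise(returns)

# Five lagged denoised returns as features; the next raw return as the target.
X = np.column_stack([smooth[i:i - 5] for i in range(5)])
y = returns[5:]
model = SVR(kernel="rbf", C=1.0).fit(X, y)
```

    Denoising first at least removes some of the junk the SVM would otherwise fit.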

  5. Certainly one can find better tools than neural networks, and certainly denoising the data with wavelets is better than not denoising it. However, no matter what AI tools one uses, as long as the number of free parameters is large compared to the amount of statistically independent data, the problem of over-fitting remains.

  6. Interesting commentary on this posting is accumulating over at:

    Data mining useless in finance?

  7. One can of course use SVMs as well, but they are only suitable for offline learning, whereas an NN can respond to the environment while it is still learning.
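
    For instance, scikit-learn's SGDRegressor can be updated one observation at a time (a rough sketch with made-up data and arbitrary parameters):

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

model = SGDRegressor(learning_rate="constant", eta0=0.01)
rng = np.random.default_rng(3)

for t in range(1000):
    x_t = rng.normal(size=(1, 5))  # today's features, e.g. lagged returns
    y_t = rng.normal(size=1)       # today's realized return
    model.partial_fit(x_t, y_t)    # the model learns from it immediately
```

    A classic SVM, by contrast, is typically retrained from scratch on the whole history whenever new data arrives.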

  8. It's reassuring that you don't like the sophisticated techniques, but -- you don't back-test?

    Are you saying people who back-test over-calibrate?

  9. Regarding the comments ... how can you call ANY market movement "noise"? Each one is a real bid or ask.

  10. Chris,
    In my view, we should use a backtest to validate an otherwise reasonable strategy, not to "discover" (i.e., data-mine) a new strategy from scratch.

    "Noise" in the trading context means that it is a non-repeatable pattern.

    Ernie

  11. When I first read a book about quantitative trading, I wondered, "Don't you ever worry about overfitting?" And that was a book that didn't mention machine learning at all, just plain quant/technical algorithms.

    In fact, no AI algorithm is totally immune to overfitting. The big problem is: you aren't either. Even a process built entirely on "human" ideas will overfit historical data if you don't guard against opportunistic data mining.

    Without any practical experience in the field, I would still use genetic programming to at least tweak some parameters of the strategies, all the while bootstrapping it until it breaks. MCMC might also work for finding the spread of the parameters and the uncertainty involved in estimating them. I would not dare to fit the many parameters of an SVM or a deep NN to the very limited amount of useful financial data available.

    Another point: there is a cost to overfitting, but there is a cost to "not enough data" as well. It may well be that not including enough recent data (for CV, bootstrapping, etc.) costs more in certainty than ignoring the overfitting problem in a "stupid" model does.
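
    A rough sketch of that bootstrap idea (entirely made-up data; for serially correlated returns a block bootstrap would be more appropriate):

```python
import numpy as np

rng = np.random.default_rng(4)
signal = rng.normal(size=500)                      # stand-in predictor
next_ret = -0.05 * signal + rng.normal(0, 1, 500)  # weak, noisy relationship

# Refit the single regression parameter on many resampled histories.
estimates = []
for _ in range(1000):
    idx = rng.integers(0, len(signal), len(signal))  # resample with replacement
    estimates.append(np.polyfit(signal[idx], next_ret[idx], 1)[0])

print(f"beta = {np.mean(estimates):.3f} +/- {np.std(estimates):.3f}")
```

    If the spread swamps the point estimate, the data simply do not pin the parameter down, which is exactly the uncertainty I worry about.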

  12. You say that a model should be "...based on a sound econometric or rational basis,..."

    Now what is a "sound econometric or rational basis" for you? I mean, ex ante you can only guess, and ex post your theory is either falsified or not (not to say "verified"). But even a model built on the most rational basis could be successful just by chance. You never know, not even with hindsight. Financial markets do not obey laws the way physics does, and repeatable experiments are not possible.

    So no matter how rational the basis for a model sounds, I will still only shrug my shoulders, because you can never prove that your model is successful because of your "rational basis". You will even have a hard time giving any indication that your model's performance is actually a result of the rationale behind it.

    The only propositions I consider rational and sound when it comes to financial markets are the EMH and, with it, the random walk and martingale models. But building a trading model on them would obviously be contradictory.

  13. Anon,
    Traders like us generally leave debates on the EMH to academia. Fortunately, we don't need to prove to anyone else that the EMH is false: we only need to see our own net worth increase exponentially!
    Ernie

  14. Can you recommend any resources for learning more about wavelets for time-series analysis? I have found very few resources on applying AI to time series. Weka and other tools are designed for static data sets.

    AI is good for data fitting as long as one limits the complexity of the models, uses a very large data set, and generates at least ten trades. Least squares is very vulnerable to a single data point overwhelming all the others, so for both techniques, graphing the results and doing a common-sense check is very important, I feel.
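
    To illustrate the least-squares point with toy numbers (my own made-up data; a robust fit such as scikit-learn's HuberRegressor is one alternative):

```python
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 1))
y = 0.5 * X[:, 0] + rng.normal(0, 0.1, 100)  # true slope is 0.5
X[0, 0], y[0] = 5.0, -20.0                   # one high-leverage outlier

ols = LinearRegression().fit(X, y)
robust = HuberRegressor().fit(X, y)
print(f"true slope 0.5 | OLS: {ols.coef_[0]:.2f} | Huber: {robust.coef_[0]:.2f}")
```

    Graphing the fit, as suggested, would reveal the same distortion at a glance.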

  15. scottc,
    Matlab actually has a wavelet toolbox; you can browse its documentation online for free.

    If you just google "wavelet", I am sure you will find numerous other packages with online documentation too.
    Ernie
