Tuesday, December 26, 2006

Do Factor Models Work in the Short Term?

Besides pair-trading, the “factor model” is the most popular workhorse of the statistical arbitrageur. In a previous article, I discussed the best-known factor model – the Fama-French Three-Factor model, with the general market index return, the market cap of the stock, and the book-to-price ratio as the only three factors driving returns. However, as I explained earlier, this factor model has a very long horizon. For the quantitative trader who needs to make money every month, the natural instinct is to look for a more “sophisticated” factor that works in the short term, or even to develop some kind of model that uses different factors every month in response to “market conditions”. Alas, other than hearsay and second-hand gossip, I have never witnessed an actual success of this approach at a hedge fund or proprietary trading group – at least not one that lasted for more than a year.

I am of course not privy to the current performance numbers of factor models run by some of the most successful hedge funds today. However, there is a class of ETF’s (called “XTF’s”) marketed by PowerShares Capital Management that uses a similar factor approach for its stock selection criteria. According to media reports, each stock in these XTF’s is scored by 25 variables such as cash flow, earnings growth, price momentum, etc. This sounds like a classic factor model to me. The model is reportedly designed by the quantitative unit at the American Stock Exchange. To find out if they have indeed discovered the holy grail of factor models, I looked at the performance of these XTF’s compared to their benchmarks.
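The actual 25-variable scoring formula is proprietary, so the following is only a generic sketch of how such a cross-sectional factor model typically works: z-score each variable across the stock universe, average the z-scores into a composite, and rank. The function names and the equal weighting of factors are my own assumptions for illustration.

```python
import numpy as np

def composite_factor_score(factor_matrix):
    """Cross-sectionally z-score each factor, then average across factors.

    factor_matrix: (n_stocks, n_factors) array; higher raw value = better
    on every factor (flip signs beforehand for 'lower is better' factors).
    Returns one composite score per stock.
    """
    mean = factor_matrix.mean(axis=0)
    std = factor_matrix.std(axis=0)
    z = (factor_matrix - mean) / std        # z-score within each factor
    return z.mean(axis=1)                   # equal-weighted composite

def select_top(scores, tickers, n):
    """Pick the n highest-scoring tickers for inclusion in the portfolio."""
    order = np.argsort(scores)[::-1]
    return [tickers[i] for i in order[:n]]
```

Whether 25 such variables add any predictive power beyond the classic three factors is, of course, exactly the question examined below.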

Here I tabulate the XTF’s in each market-cap and value/growth category, their corresponding benchmark market index ETF’s, and finally the YTD differential returns (XTF minus benchmark) up to December 13, 2006. (PJG and PJM have histories too short for this comparison.)

            Value               Blend               Growth
Large cap   PWV - IVE = 4.8%    PWC - IVV = -3.6%   PWB - IVW = -5.0%
Mid cap     PWP - IJJ = 0.1%    PJG - IJH = N/A     PWJ - IJK = 3.1%
Small cap   PWY - IJS = -0.7%   PJM - IJR = N/A     PWT - IJT = -4.9%

The differential returns are all over the place: some positive, others negative. To me, this is symptomatic of a factor model that does not have predictive power. (After all, if the differential returns were consistently negative, we could have gone long the ETF, short the XTF, and made consistent profits!) At the very least, this factor model may have a horizon much longer than what most traders would be interested in – in which case, why not just use the simple Fama-French model?

This is not to say that exotic, proprietary factor models have no use: they tend to be pretty useful for risk management, as volatilities and correlations are often easier to predict than returns. But beware every time your risk management software vendor tries to sell you an alpha generator!

Tuesday, December 19, 2006

Another limitation of artificial intelligence and data mining

Some time ago I expressed my view that AI and data mining techniques may not be suited for predicting financial markets. Here we have an article from the Chief Scientist at IBM's Entity Analytic Solutions Group, who believes these techniques are not fit for counterterrorism either. Why? The same reason I mentioned: not enough historical data.

Thursday, December 14, 2006

DNA, cryptology, speech recognition, and trading

There is an interesting New York Times article on a mathematician and cryptologist who used to work for the wildly successful hedge fund Renaissance Technologies and is now famous for decoding DNA. This article caught my eye because quite a few of my former colleagues from the speech recognition research group at IBM also went over to Renaissance as researchers and portfolio managers. Renaissance is an extraordinary hedge fund in Long Island that has had an average annual return of 35% since 1989, after charging a 5% management fee and a 44% incentive fee. They profess to hire only scientists, engineers and mathematicians with as little background in finance as possible. They started off trading futures, but have since diversified into equities models, and are reportedly raising a $100 billion fund at the moment.

A lot of people want to know the secrets of their success. From the people they hire, one can always guess. The common thread among DNA decoding, cryptography, and speech recognition is information theory, the discipline founded by legendary Bell Labs mathematician Claude Shannon. There are a few tools in information theory that have found widespread applications: the hidden Markov model is one, the expectation-maximization (EM) algorithm is another, and then of course there is the grandfather of prediction: Bayesian statistics. Needless to say, I have tried them all in my own trading research, but have not met with much success so far. Aside from the limitations of my imagination, I suspect the reason is that these tools work much better with higher frequency data than the daily data that I have thus far worked with. Therefore I am not ready to give up yet. (Readers of my earlier article on artificial intelligence may think that I am being inconsistent here, as I was less than enthusiastic about the application of that discipline to trading. There is, however, quite a big difference between information theory and artificial intelligence. The former is characterized by sophisticated theory with very few parameters; the latter, by simple theory with a lot of parameters.)
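To make the "few parameters" point concrete, here is a minimal sketch of one of these tools – the forward algorithm for a hidden Markov model – applied to a toy two-regime market where the only observation is whether the day was up or down. The transition and emission probabilities below are made-up numbers purely for illustration; in practice they would be estimated, e.g. with the EM algorithm.

```python
import numpy as np

def hmm_forward(obs, trans, emit, pi):
    """Forward algorithm: filtered state probabilities P(state_t | obs_1..t).

    obs:   sequence of observation indices (e.g. 0 = down day, 1 = up day)
    trans: (n_states, n_states) transition matrix, trans[i, j] = P(j | i)
    emit:  (n_states, n_obs) emission matrix, emit[i, k] = P(obs k | state i)
    pi:    initial state distribution
    """
    alpha = pi * emit[:, obs[0]]
    alpha /= alpha.sum()                      # normalize to probabilities
    for o in obs[1:]:
        alpha = (alpha @ trans) * emit[:, o]  # predict, then update
        alpha /= alpha.sum()
    return alpha
```

Note the parameter count: a two-state model with binary observations has only a handful of free parameters, in sharp contrast to a neural network.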

There is one published trading model that is based squarely on research in information theory. It is called Universal Portfolios, created by Stanford information theorist Prof. Thomas Cover. It is an elegant and quite intuitive model, but I don't know how well it performs under realistic conditions. I hope to write about some of my research on this and a related class of models in a future article.
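For readers curious about the mechanics, the idea behind Universal Portfolios is to maintain a large family of constant-rebalanced portfolios (CRP's) and at each period hold the wealth-weighted average of their weight vectors. The Monte Carlo approximation below, which samples CRP's uniformly from the simplex, is my own simplification for illustration – Cover's exact construction integrates over all CRP's.

```python
import numpy as np

def universal_portfolio(price_relatives, n_samples=10000, seed=0):
    """Approximate Cover's universal portfolio by Monte Carlo over the simplex.

    price_relatives: (T, n_assets) array of per-period ratios p_t / p_{t-1}.
    Returns the sequence of portfolio weights and the final wealth
    (starting from 1) of the universal strategy.
    """
    rng = np.random.default_rng(seed)
    T, n = price_relatives.shape
    # sample constant-rebalanced portfolios uniformly from the simplex
    b = rng.dirichlet(np.ones(n), size=n_samples)     # (n_samples, n)
    wealth = np.ones(n_samples)                       # wealth of each CRP
    weights_path = []
    total_wealth = 1.0
    for t in range(T):
        # universal weights = wealth-weighted average of the CRP weights
        w = (wealth[:, None] * b).sum(axis=0) / wealth.sum()
        weights_path.append(w)
        x = price_relatives[t]
        total_wealth *= w @ x                         # universal period return
        wealth *= b @ x                               # update each CRP's wealth
    return np.array(weights_path), total_wealth
```

Transaction costs, which this sketch ignores, are one of the "realistic conditions" that I suspect matter a great deal here, since the portfolio rebalances every period.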

Further reading:

Cover, Thomas M. and Thomas, Joy A. (1991), Elements of Information Theory. John Wiley & Sons, Inc.

Sunday, December 10, 2006

Market-cap and growth-value arbitrage

Predicting whether small-cap or growth stocks will outperform large-cap or value stocks in the next quarter is a favorite pastime of financial commentators. To many financial economists, however, the question was settled long ago by the so-called Fama-French Three-Factor Model. This model postulates that the returns of a stock depend mainly on three factors: the general market index return, the market cap of the stock, and the book-to-price ratio. Furthermore, as an empirical fact, over the long term (i.e. over any 20-year period), small-caps beat large-caps by an average compounded annual rate of 3.12%, and value stocks beat growth stocks by 4.06% (the latter result applies when we confine ourselves to the large-cap universe).
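Estimating a stock's exposure to these three factors is just an ordinary least-squares regression of its excess returns on the factor return series (the series themselves must come from a data source such as Kenneth French's online data library). A minimal sketch:

```python
import numpy as np

def three_factor_regression(excess_returns, mkt, smb, hml):
    """OLS estimate of the Fama-French loadings.

    Fits r_t = alpha + b * MKT_t + s * SMB_t + h * HML_t + e_t
    and returns the coefficients (alpha, b, s, h).
    All inputs are 1-D arrays of equal length.
    """
    X = np.column_stack([np.ones_like(mkt), mkt, smb, hml])
    coeffs, *_ = np.linalg.lstsq(X, excess_returns, rcond=None)
    return coeffs  # [alpha, b, s, h]
```

A positive s loading means the stock behaves like a small-cap; a positive h loading means it behaves like a value stock.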

This model is very convenient to us arbitrageurs. Statistical arbitrageurs generally don’t know how to predict market index returns, but we can still make a living in a bear market by buying a small-cap, value portfolio and shorting a large-cap, growth portfolio, and expect to earn 3-4% (on one side of the capital) a year. For example, despite the much anticipated imminent demise of small-caps over the last year or so, I found that if we had gone long the small-cap value ETF IJS and short the large-cap growth ETF IVW from November 15, 2005 to November 15, 2006, we would have earned about a 10% return. The 3-4% average returns look meager, but note that since this is a market-neutral, self-funding portfolio, your prime broker (if you trade for a hedge fund or a proprietary trading firm) will allow you to leverage this return several times.
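The leverage arithmetic is simple enough to spell out. A sketch, ignoring financing costs and margin interest:

```python
def levered_spread_return(long_ret, short_ret, leverage=1.0):
    """Return on a self-funding long/short portfolio, per unit of
    one-side capital.

    long_ret / short_ret: simple returns of the two legs over the period.
    leverage: gross position permitted against the same equity.
    Financing costs and margin interest are ignored in this sketch.
    """
    return leverage * (long_ret - short_ret)
```

At the historical 3-4% spread, 4x leverage would turn it into roughly 12-16% a year, before costs.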

Some traders will find 20 years a bit too long. Is there any help from academic theory on whether small-cap value will outperform large-cap growth next month, and not over the next 20 years? A recently published article by Profs. Malcolm Baker and Jeffrey Wurgler says there is. (Mark Hulbert wrote a column explaining this in the New York Times recently.) The gist of this article is that when market sentiment is positive, expect small-caps to underperform large-caps by 0.26% a month, and value stocks to outperform growth stocks by 1.24% a month. Conversely, when market sentiment is negative, expect small-caps to outperform large-caps by 1.45% a month, and value stocks to underperform growth stocks by 1.04% a month. How one computes “sentiment” is complicated: it is a linear combination of six variables: the closed-end fund discount, NYSE share turnover, the number of and first-day returns on IPOs, the equity share in new issues, and the dividend premium. (The authors used data from 1963-2001 for this study.) Now, without actually computing all these variables, most would agree that the current sentiment (as of December 2006) is fairly positive. This implies, as Mr. Hulbert noted, that small-caps will underperform large-caps in the coming months, contrary to the long-term trend. However, the other long-term trend, that value will beat growth, will still hold in the near future. It is up to the reader to find a pair of ETF’s that will take maximum advantage of this prediction, but I will help here by tabulating some of the available funds.


Further reading:

Bernstein, William (2002), The Cross-Section of Expected Stock Returns: A Tenth Anniversary Reflection.
O’Shaughnessy, James P. (2006), Predicting the Markets of Tomorrow. Penguin Books.

Monday, December 04, 2006

Artificial intelligence and stock picking

There was an article in the New York Times a short while ago about a new hedge fund launched by Mr. Ray Kurzweil, a pioneer in the field of artificial intelligence. (Thanks to my fellow blogger Yaser Anwar who pointed it out to me.) The stock picking decisions in this fund are supposed to be made by machines that "... can observe billions of market transactions to see patterns we could never see". While I am certainly a believer in algorithmic trading, I have become a skeptic when it comes to trading based on "artificial intelligence".

At the risk of over-simplification, we can characterize artificial intelligence as trying to fit past data points into a function with many, many parameters. This is the case for some of the favorite tools of AI: neural networks, decision trees, and genetic algorithms. With many parameters, we can for sure capture small patterns that no human can see. But do these patterns persist? Or are they random noises that will never replay again? Experts in AI assure us that they have many safeguards against fitting the function to transient noise. And indeed, such tools have been very effective in consumer marketing and credit card fraud detection. Apparently, the patterns of consumers and thefts are quite consistent over time, allowing such AI algorithms to work even with a large number of parameters. However, from my experience, these safeguards work far less well in financial markets prediction, and over-fitting to the noise in historical data remains a rampant problem. As a matter of fact, I have built financial predictive models based on many of these AI algorithms in the past. Every time a carefully constructed model that seemed to work marvels in backtest came up, it inevitably performed miserably going forward. The main reason for this seems to be that the amount of statistically independent financial data is far more limited than the billions of independent consumer and credit transactions available. (You may think that there is a lot of tick-by-tick financial data to mine, but such data is serially correlated and far from independent.)
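This over-fitting problem is easy to demonstrate. The sketch below fits a high-order polynomial – a stand-in for any many-parameter model – to a simulated random walk, which by construction contains no predictable pattern, and compares the in-sample error with the out-of-sample error:

```python
import numpy as np

def overfit_demo(n_train=100, n_test=100, degree=10, seed=42):
    """Fit a high-order polynomial to a random walk, score it out of sample.

    Returns (in-sample MSE, out-of-sample MSE); the gap illustrates how
    a many-parameter model captures noise that does not persist.
    """
    rng = np.random.default_rng(seed)
    walk = np.cumsum(rng.normal(size=n_train + n_test))  # pure noise "prices"
    t = np.arange(n_train + n_test, dtype=float)
    t /= t.max()   # scale time to [0, 1] for numerical stability
    coeffs = np.polyfit(t[:n_train], walk[:n_train], degree)
    fit = np.polyval(coeffs, t)
    mse_in = np.mean((fit[:n_train] - walk[:n_train]) ** 2)
    mse_out = np.mean((fit[n_train:] - walk[n_train:]) ** 2)
    return mse_in, mse_out
```

The in-sample fit looks wonderful precisely because the polynomial has so many degrees of freedom; out of sample, the "pattern" it found evaporates.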

This is not to say that quantitative models do not work in prediction. The ones that work for me are usually characterized by these properties:

• They are based on a sound econometric or rational basis, and not on random discovery of patterns;
• They have few or even no parameters that need to be fitted to past data;
• They involve linear regression only, and not fitting to some esoteric nonlinear functions;
• They are conceptually simple.

Only when a trading model is philosophically constrained in such a manner do I dare to allow testing on my small, precious amount of historical data. Apparently, Occam’s razor works not only in science, but in finance as well.