Quantitative Trading: 2014

Friday, November 14, 2014

Rent, don’t buy, data: our experience with QuantGo (Guest Post)

By Roger Hunter

I am a quant researcher and developer for QTS Partners, a commodity pool Ernie (author of this blog) founded in 2011. I help Ernie develop and implement several strategies in the pool and various separate accounts. I wrote this article to give insights into a very important part of our strategy development process: the selection of data sources.

Our main research focus is on strategies that monitor execution in milliseconds and hold positions for anywhere from seconds to several days. For example, a strategy that trades more than one currency pair simultaneously must ensure that several executions take place at the right price and within a very short time. Backtesting such strategies requires high-quality historical intraday quote and trade data, preferably tick data. Our initial focus was futures, and after evaluating various vendors for the tick data quality and quantity we needed, we chose Nanex data, which is aggregated at 25ms. This means, for example, that aggressor flags are not available. We purchased several years of futures data and set to work.

Earlier this year we needed to update our data and discovered that Nanex prices had increased significantly. We also needed quotes and trades, and data for more asset classes including US equities and options.

We looked at TickData.com, which has good data but is very expensive, and you pay up front per symbol. There are other services like Barchartondemand.com and XIgnite.com where you pay based on your monthly usage (the number of data requests made), which is a model we do not like. We ended up choosing QuantGo.com, where you have unlimited access to years of global tick or bar data for a fixed monthly subscription fee per data service.

On QuantGo, you get computer instances in your own secure and private cloud built on Amazon AWS with on-demand access to a wide range of global intraday tick or bar data from multiple data vendors. Since you own and manage the computer instances you can choose any operating system, install any software, access the internet or import your own data. With QuantGo the original vendor data must remain in the cloud but you can download your results. This allows QuantGo to rent access to years of data at affordable monthly prices.

All of the data we have used so far is from AlgoSeek (one of QuantGo’s data vendors). This data is survivorship bias-free and is exactly as provided by the exchanges at the time. Futures quotes and trades download very quickly on the system. I am testing options strategies, which is challenging due to the size of the data. The data is downloaded in highly compressed form which is then expanded (by QuantGo) to a somewhat verbose text form. Before the price split, a day of option quotes and trades for AAPL was typically 100GB in this form. Here is a data sample from the full Options (OPRA) data:

Timestamp, EventType, Ticker, OptionDetail, Price, Quantity, Exchange, Conditions
08:30:02.493, NO_QUOTE BID NB, LLEN, PUT at 7.0000 on 2013-12-21, 0.0000, 0, BATS, F
08:30:02.493, NO_QUOTE ASK, LLEN, CALL at 7.0000 on 2013-12-21, 0.0000, 0, BATS, F
09:30:00.500, ROTATION ASK, LLEN, PUT at 2.0000 on 2013-07-20, 0.2500, 15, ARCA, R
09:30:00.500, ROTATION BID, LLEN, PUT at 2.0000 on 2013-07-20, 0.0000, 0, ARCA, R
09:30:00.507, FIRM_QUOTE ASK NB, LLEN, PUT at 5.0000 on 2013-08-17, 5.0000, 7, BATS, A
09:30:00.508, FIRM_QUOTE BID NB, LLEN, PUT at 6.0000 on 2013-08-17, 0.2000, 7, BATS, A

I convert these to a more compact format and filter out the lines we don't need (e.g. NO_QUOTE, non-firm, etc.).
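The filtering step described above can be sketched in Python. This is only an illustration under stated assumptions: the field layout is taken from the sample lines shown, `Quote` and `firm_quotes` are hypothetical names, and keeping only FIRM_QUOTE events is an assumed filter rule, not a description of the converter actually used.

```python
import re
from typing import Iterable, Iterator, NamedTuple


class Quote(NamedTuple):
    """Compact record for one firm quote (hypothetical layout)."""
    time: str
    side: str       # "BID" or "ASK"
    ticker: str
    right: str      # "P" for put, "C" for call
    strike: float
    expiry: str     # YYYY-MM-DD
    price: float
    size: int
    exchange: str


# Matches the OptionDetail field, e.g. "PUT at 5.0000 on 2013-08-17".
DETAIL = re.compile(r"(PUT|CALL) at (\d+(?:\.\d+)?) on (\d{4}-\d{2}-\d{2})")


def firm_quotes(lines: Iterable[str]) -> Iterator[Quote]:
    """Yield compact records for FIRM_QUOTE lines, skipping everything else."""
    for line in lines:
        fields = [f.strip() for f in line.split(",")]
        if len(fields) != 8:
            continue  # blank or malformed line
        time, event, ticker, detail, price, size, exchange, _conditions = fields
        parts = event.split()
        # Assumption: only firm bid/ask quotes are kept; the header line,
        # NO_QUOTE lines and ROTATION lines are dropped.
        if len(parts) < 2 or parts[0] != "FIRM_QUOTE":
            continue
        match = DETAIL.fullmatch(detail)
        if match is None:
            continue
        yield Quote(time, parts[1], ticker, match.group(1)[0],
                    float(match.group(2)), match.group(3),
                    float(price), int(size), exchange)


SAMPLE = """\
Timestamp, EventType, Ticker, OptionDetail, Price, Quantity, Exchange, Conditions
08:30:02.493, NO_QUOTE BID NB, LLEN, PUT at 7.0000 on 2013-12-21, 0.0000, 0, BATS, F
09:30:00.500, ROTATION ASK, LLEN, PUT at 2.0000 on 2013-07-20, 0.2500, 15, ARCA, R
09:30:00.507, FIRM_QUOTE ASK NB, LLEN, PUT at 5.0000 on 2013-08-17, 5.0000, 7, BATS, A
09:30:00.508, FIRM_QUOTE BID NB, LLEN, PUT at 6.0000 on 2013-08-17, 0.2000, 7, BATS, A
"""

quotes = list(firm_quotes(SAMPLE.splitlines()))
for q in quotes:
    print(q)
```

Because `firm_quotes` is a generator, it can be fed a file object line by line, which matters when a single day of expanded option data runs to tens of gigabytes.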

The quality of the AlgoSeek data seems to be high. One test I have performed is to record live data and compare it with AlgoSeek. This is possible because the AlgoSeek historical data is now updated daily and is one day behind for everything except options, where the lag varies from two to five days (they are striving for two, but the process involves uploading all options data to special servers, a significant task). Another test is done using OptionNET Explorer (ONE). ONE data is at 5-minute intervals and the software displays midpoints only. However, by executing historical trades, you can see the bid and ask values for options at these 5-minute boundaries. I have checked 20 of these against the AlgoSeek data and found exact agreement in every case. In any event, you are free to contact the data vendors directly to learn more about their products. The final test of data quality (and of our market model) is the comparison of live trading results (at one contract/spread level) with backtests over the same period.

The data offerings have recently expanded dramatically with more data partners and now include historical data from (QuantGo claims) "every exchange in the world". I haven't verified this, but the addition of elementized, tagged and scored news from Acquire Media, for example, will allow us to backtest strategies of the type discussed in Ernie's latest book.

So far, we like the system. For us, the positives are:

1. Affordable Prices. Prices have been kept relatively low because the original vendor data must be kept and used in the QuantGo cloud. For example, to access years of US data we have been paying:
Five years of US Equities Trades and Quotes (“TAQ”): $250 per month
Five years of US Equities 5-minute bars: $75 per month
Three years of US Options 1-minute bars: $100 per month
Three years of CME, CBOT, NYMEX Futures Trades and Quotes: $250 per month

2. Free Sample Data. Each data service offers free demo data, which is real historical data that I can select from within the demo date range. This allowed me to view and work with the data before subscribing.

3. One API. I have one API to access different data vendors. QuantGo gives me a Java GUI, a Python CLI and various libraries (R, Matlab, Java).

4. On-Demand. We can select the data we want "on demand" via a subscription from a website console at any time. You can select data for any symbol, for just a day or for several years.

5. Platform not proprietary. We can use any operating system or software with the data as it is being downloaded to virtual computers we fully control and manage.

Because all this is done in the cloud, we have to pay for our cloud computer usage as well. While cloud usage is continuing to drop rapidly in price, it is still a variable cost and it needs to be monitored. QuantGo does provide close to real-time billing estimates and alarms you can preset at dollar values.

I was at first skeptical of the restriction of not being able to download the data vendor’s tick or bar data, but so far this hasn't been an issue as in practice we only need the results and our derived data sets. I'm told that if you want to buy the data for your own computers, you can negotiate directly with the individual data vendor and will get a discount if you have been using it for a while on QuantGo.

As we use the Windows operating system, we access our cloud computers with Remote Desktop and there have been some latency issues, but these are tolerable. On the other hand, it is a big advantage to be able to start with a relatively small virtual machine for initial coding and debugging, then "dial up" a much larger machine (or group of machines) when you want to run many compute and data intensive backtests. While QuantGo is recently launched and is not perfect, it does open up the world of the highest institutional quality data to those of us who do not have the data budget of a Renaissance Technologies or D.E. Shaw.

===
Industry Update
(No endorsement of companies or products is implied by our mention.)

A new site for jobs in finance was recently launched: www.financejobs.co.
A new software package Geode by Georgica Software can backtest tick data, and comes with a fairly rudimentary fill simulator.
Quantopian.com now incorporates a new IPython based research environment that allows interactive data analysis using minute level pricing data in Python.

===

Workshops Update

My next online Quantitative Momentum Strategies workshop will be held on December 2-4. Any reader interested in futures trading in general would benefit from this course.

===
Managed Account Program Update

Our FX Managed Account program had an unusually profitable month in October.

===
Follow me on Twitter: @chanep

Friday, September 05, 2014

Moving Average Crossover = Triangle Filter on 1-Period Returns

Many traders who use technical analysis favor the Moving Average Crossover as a momentum indicator. They compute the short-term minus the long-term moving averages of prices, and go long if this indicator just turns positive, or go short if it turns negative. This seems intuitive enough. What isn't obvious, however, is that MA Crossover is nothing more than an estimate of the recent average compound return.

But just when you might be tempted to ditch this indicator in favor of the average compound return, it can be shown that the MA Crossover is also a triangle filter on the 1-period returns. (A triangle filter in signal processing is a set of weights imposed on a time series that increases linearly with time up to some point, and then decreases linearly with time up to the present time. See the diagram at the end of this article.) Why is this interpretation interesting? That's because it leads us to consider other, more sophisticated filters (such as the least squares, Kalman, or wavelet filters) as possible momentum indicators. In collaboration with my former workshop participant Alex W., who was inspired by this paper by Bruder et al., we present the derivations below.

===

First, note that we will compute the moving average of log prices y, not raw prices. There is of course no loss or gain in information going from prices to log prices, but it will make our analysis possible. (The exact time of the crossover, though, will depend on whether we use prices or log prices.) If we write MA(t, n1) to denote the moving average of n1 log prices ending at time t, then the moving average crossover is MA(t, n1)-MA(t, n2), assuming n1< n2. By definition,

MA(t, n1)=(y(t)+y(t-1)+...+y(t-n1+1))/n1
MA(t, n2)=(y(t)+y(t-1)+...+y(t-n1+1)+y(t-n1)+...+y(t-n2+1))/n2

MA(t, n1)-MA(t, n2)
=[(n2-n1)/(n1*n2)] *[y(t)+y(t-1)+...+y(t-n1+1)] - (1/n2)*[y(t-n1)+...+y(t-n2+1)]
=[(n2-n1)/n2] *MA(t, n1)-[(n2-n1)/n2]*MA(t-n1, n2-n1)
=[(n2-n1)/n2]*[MA(t, n1)-MA(t-n1, n2-n1)]

If we interpret MA(t, n1) as an approximation of the log price at the midpoint of the time interval [t-n1+1, t], which lies (n1-1)/2 periods before t, and MA(t-n1, n2-n1) as an approximation of the log price at the midpoint of the time interval [t-n2+1, t-n1], which lies (n2-n1-1)/2 periods before t-n1, then [MA(t, n1)-MA(t-n1, n2-n1)] is an approximation of the total return over a time period of n2/2, the distance between the two midpoints: n1+(n2-n1-1)/2-(n1-1)/2=n2/2. If we write this total return as an average compound growth rate r multiplied by the period n2/2, we get

MA(t, n1)-MA(t, n2) ≈ [(n2-n1)/n2]*(n2/2)*r

r ≈ [2/(n2-n1)]*[MA(t, n1)-MA(t, n2)]

as shown in Equation 4 of the paper cited above. (Note the roles of n1 and n2 are reversed in that paper.)
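As a quick sanity check of the estimate for r, consider log prices that lie exactly on a straight line, so that the compound growth rate per period is constant. The sketch below is illustrative only: `moving_average`, `r_true` and the window lengths are hypothetical choices, and for such a linear series the approximation becomes exact.

```python
import numpy as np


def moving_average(y: np.ndarray, t: int, n: int) -> float:
    """Mean of the n log prices y[t-n+1], ..., y[t]."""
    return float(y[t - n + 1 : t + 1].mean())


r_true = 0.001                      # assumed constant log return per period
y = r_true * np.arange(200)         # log prices on a straight line
n1, n2, t = 7, 10, 199

crossover = moving_average(y, t, n1) - moving_average(y, t, n2)
r_est = 2.0 / (n2 - n1) * crossover  # r ≈ [2/(n2-n1)]*[MA(t, n1)-MA(t, n2)]

print(r_true, r_est)
```

On real price series the two numbers will of course differ, since the midpoint interpretation of a moving average is only an approximation when returns are not constant.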

===

Next, we will show why the MA crossover is also a triangle filter on 1-period returns. Simplifying notation by fixing t to be 0,

MA(t=0, n1)
=(y(0)+y(-1)+...+y(-n1+1))/n1
=(1/n1)*[(y(0)-y(-1))+2(y(-1)-y(-2))+...+n1*(y(-n1+1)-y(-n1))]+y(-n1)

Writing the returns from t-1 to t as R(t), this becomes

MA(t=0, n1)=(1/n1)*[R(0)+2*R(-1)+...+n1*R(-n1+1)]+y(-n1)

Similarly,

MA(t=0, n2)=(1/n2)*[R(0)+2*R(-1)+...+n2*R(-n2+1)]+y(-n2)

So MA(0, n1)-MA(0, n2)
=(1/n1-1/n2)*[R(0)+2*R(-1)+...+n1*R(-n1+1)]
-(1/n2)*[(n1+1)*R(-n1)+(n1+2)*R(-n1-1)+...+n2*R(-n2+1)]
+y(-n1)-y(-n2)

Note that the last line above is just the total cumulative return from -n2 to -n1, which can be written as

y(-n1)-y(-n2)=R(-n1)+R(-n1-1)+...+R(-n2+1)

Hence we can absorb that into the expression prior to that

MA(0, n1)-MA(0, n2)
=(1/n1-1/n2)*[R(0)+2*R(-1)+...+n1*R(-n1+1)]
-(1/n2)*[(n1+1-n2)*R(-n1)+(n1+2-n2)*R(-n1-1)+...+(-1)*R(-n2+2)]
=(1/n1-1/n2)*[R(0)+2*R(-1)+...+n1*R(-n1+1)]
+(1/n2)*[(n2-n1-1)*R(-n1)+(n2-n1-2)*R(-n1-1)+...+R(-n2+2)]

We can see the coefficients of R's from t=-n2+2 to -n1 form the left side of a triangle with positive slope, and those from t=-n1+1 to 0 form the right side of the triangle with negative slope. The plot below shows the coefficients as a function of time, with n2=10, n1=7, and current time as t=0. The right-most point is the weight for R(0): the return from t=-1 to 0.

Q.E.D. Now I hope you are ready to move on to a wavelet filter!

P.S. It is wonderful to be able to check the correctness of messy algebra like that above with a simple Matlab program!
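A numerical check of this kind can also be sketched in Python with NumPy (this is not the Matlab program mentioned above). The function name `triangle_weights` and the synthetic random-walk log prices are assumptions made for illustration; the weights themselves are read directly off the last expression for MA(0, n1)-MA(0, n2).

```python
import numpy as np


def triangle_weights(n1: int, n2: int) -> np.ndarray:
    """Weights on R(0), R(-1), ..., R(-n2+2), indexed by lag k = 0..n2-2."""
    k = np.arange(n2 - 1)
    recent = (1.0 / n1 - 1.0 / n2) * (k + 1)   # applies to lags 0..n1-1
    older = (n2 - k - 1) / n2                  # applies to lags n1..n2-2
    return np.where(k < n1, recent, older)


n1, n2 = 7, 10
rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(0.0, 0.01, size=50))  # synthetic log prices; y[-1] is y(0)

# Direct computation: MA(0, n1) - MA(0, n2)
crossover = y[-n1:].mean() - y[-n2:].mean()

# Triangle filter applied to the 1-period returns
returns = np.diff(y)                           # returns[-1] is R(0)
lagged = returns[::-1][: n2 - 1]               # R(0), R(-1), ..., R(-n2+2)
weights = triangle_weights(n1, n2)
filtered = weights @ lagged

print(crossover, filtered)
```

The two printed numbers should agree to floating-point precision, and the largest weight falls on R(-n1+1), the apex of the triangle.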

===

New Service Announcement

Our firm QTS Capital Management has recently launched an FX Managed Accounts program. It uses one of the mean-reverting strategies we have been trading successfully in our fund for the last three years, and is still going strong despite the low volatility in the markets. The benefits of a managed account are that clients retain full ownership and control of their funds at all times, and they can decide what level of leverage they are comfortable with. Unlike certain offshore FX operators, QTS is a CPO/CTA regulated by the National Futures Association and the Commodity Futures Trading Commission.

===

Workshops Update

Readers may be interested in my next workshop series to be held in London, November 3-7. Please follow the link at the bottom of this page for information.


Monday, August 18, 2014

Kelly vs. Markowitz Portfolio Optimization

In my book, I described a very simple and elegant formula for determining the optimal asset allocation among N assets:

F=C^-1*M (1)

where F is a Nx1 vector indicating the fraction of the equity to be allocated to each asset, C is the covariance matrix, and M is the mean vector for the excess returns of these assets. Note that these "assets" can in fact be "trading strategies" or "portfolios" themselves. If these are in fact real assets that incur a carry (financing) cost, then excess returns are returns minus the risk-free rate.

Notice that these fractions, or weights as they are usually called, are not normalized - they don't necessarily add up to 1. This means that F not only determines the allocation of the total equity among N assets, but it also determines the overall optimal leverage to be used. Since each component of F is already a fraction of the total equity, the sum of the absolute values of the components of F is in fact the overall leverage. Such is the beauty of the Kelly formula: optimal allocation and optimal leverage in one simple formula, which is supposed to maximize the compounded growth rate of one's equity (or equivalently the equity at the end of many periods).

However, most students of finance are not taught Kelly portfolio optimization. They are taught Markowitz mean-variance portfolio optimization. In particular, they are taught that there is a portfolio called the tangency portfolio which lies on the efficient frontier (the set of portfolios with minimum variance consistent with a certain expected return) and which maximizes the Sharpe ratio. Left unsaid are

What's so good about this tangency portfolio?
What's the real benefit of maximizing the Sharpe ratio?
Is this tangency portfolio the same as the one recommended by Kelly optimal allocation?

I want to answer these questions here, and provide a connection between Kelly and Markowitz portfolio optimization.

According to Kelly and Ed Thorp (and explained in my book), F above not only maximizes the compounded growth rate, but it also maximizes the Sharpe ratio. Put another way: the maximum growth rate is achieved when the Sharpe ratio is maximized. Hence we see why the tangency portfolio is so important. And in fact, the tangency portfolio is the same as the Kelly optimal portfolio F, except for the fact that the tangency portfolio is assumed to be normalized and has a leverage of 1 whereas F goes one step further and determines the optimal leverage for us. Otherwise, the percent allocation of an asset in both is the same (assuming that we haven't imposed additional constraints in the optimization problem). How do we prove this?

The usual way Markowitz portfolio optimization is taught is by setting up a constrained quadratic optimization problem - quadratic because we want to optimize the portfolio variance which is a quadratic function of the weights of the underlying assets - and proceeding to use a numerical quadratic programming (QP) program to solve this and then further maximize the Sharpe ratio to find the tangency portfolio. But this is unnecessarily tedious and actually obscures the elegant formula for F shown above. Instead, we can proceed by applying Lagrange multipliers to the following optimization problem (see http://faculty.washington.edu/ezivot/econ424/portfolioTheoryMatrix.pdf for a similar treatment):

Maximize Sharpe ratio = F^T*M/(F^T*C*F)^(1/2) (2)

subject to constraint F^T*1=1 (3)

(Note that the 1 on the left hand side is a column vector of ones, not the scalar 1 on the right hand side.)

So we should maximize the following unconstrained quantity with respect to the weights F_i of each asset i and the Lagrange multiplier λ:

F^T*M/(F^T*C*F)^(1/2) - λ(F^T*1-1) (4)

But taking the partial derivatives of this fraction with a square root in the denominator is unwieldy. So equivalently, we can maximize the logarithm of the Sharpe ratio subject to the same constraint. Thus we can take the partial derivatives of

log(F^T*M)-(1/2)*log(F^T*C*F)-λ(F^T*1-1) (5)

with respect to F_i. Setting each component i to zero gives the matrix equation

(1/F^T*M)M-(1/F^T*C*F)C*F=λ1 (6)

Multiplying the whole equation by F^T on the left gives

(1/F^T*M)F^T*M-(1/F^T*C*F)F^T*C*F=λF^T*1 (7)

Remembering the constraint, we recognize the right hand side as just λ. The left hand side comes out to be exactly zero, which means that λ is zero. A Lagrange multiplier that turns out to be zero means that the constraint won't affect the solution of the optimization problem up to a proportionality constant. This is satisfying since we know that if we apply an equal leverage on all the assets, the maximum Sharpe ratio should be unaffected. So we are left with the matrix equation for the solution of the optimal F:

C*F=(F^T*C*F/F^T*M)M (8)

If you know how to solve this for F using matrix algebra, I would like to hear from you. But let's try an ansatz F=C^-1*M as in (1). The left hand side of (8) becomes M, the right hand side becomes (F^T*M/F^T*M)M = M as well. So the ansatz works, and the solution is in fact (1), up to a proportionality constant. To satisfy the normalization constraint (3), we can write

F=C^-1*M / (1^T*C^-1*M) (9)

So there, the tangency portfolio is the same as the Kelly optimal portfolio, up to a normalization constant, and without telling us what the optimal leverage is.
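The relationship between (1) and (9) is easy to check numerically. The sketch below uses NumPy with a made-up mean vector M and covariance matrix C for three assets (both are hypothetical numbers chosen only for illustration), and confirms that the Kelly weights and the normalized tangency weights share the same Sharpe ratio, which no randomly drawn weight vector exceeds.

```python
import numpy as np

# Hypothetical annualized mean excess returns and covariance matrix.
M = np.array([0.08, 0.05, 0.03])
C = np.array([[0.0400, 0.0060, 0.0020],
              [0.0060, 0.0100, 0.0010],
              [0.0020, 0.0010, 0.0025]])


def sharpe(F: np.ndarray) -> float:
    """Sharpe ratio F^T*M/(F^T*C*F)^(1/2) of a weight vector, as in (2)."""
    return float(F @ M / np.sqrt(F @ C @ F))


kelly = np.linalg.solve(C, M)        # F = C^-1*M, equation (1)
tangency = kelly / kelly.sum()       # equation (9): weights sum to 1
leverage = np.abs(kelly).sum()       # overall leverage implied by Kelly

# Compare against many random weight vectors (random long/short mixes).
rng = np.random.default_rng(0)
random_sharpes = [sharpe(w) for w in rng.normal(size=(1000, 3))]

print("Kelly weights:   ", kelly)
print("Tangency weights:", tangency)
print("Kelly leverage:  ", leverage)
print("Sharpe (Kelly, tangency, best random):",
      sharpe(kelly), sharpe(tangency), max(random_sharpes))
```

Solving the linear system with `np.linalg.solve` rather than explicitly inverting C is a numerical convenience; it gives the same F as (1). Note that the normalization in (9) only preserves the Sharpe ratio when the sum of the Kelly weights is positive, as it is for these inputs.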

===

Workshop Update:

Based on popular demand, I have revised the dates for my online Mean Reversion Strategies workshop to be August 27-29.


Wednesday, July 02, 2014

Another "universal" capital allocation algorithm

Financial engineers are accustomed to borrowing techniques from scientists in other fields (e.g. genetic algorithms), but rarely does the borrowing go the other way. It is therefore surprising to hear about this paper on a possible mechanism for evolution due to natural selection which is inspired by universal capital allocation algorithms.

A capital allocation algorithm attempts to optimize the allocation of capital to stocks in a portfolio. An allocation algorithm is called universal if it results in a net worth that is "similar" to that generated by the best constant-rebalanced portfolio with fixed weightings over time (denoted CBAL* below), chosen in hindsight. "Similar" here means that the net worth does not diverge exponentially. (For a precise definition, see this very readable paper by Borodin, et al. H/t: Vladimir P.)

Previously, I knew of only one such universal trading algorithm - the Universal Portfolio invented by Thomas Cover, which I have described before. But here is another one that has proven to be universal: the exceedingly simple EG algorithm.

The EG ("Exponentiated Gradient") algorithm is an example of a capital allocation rule using "multiplicative updates": the new capital allocated to a stock is proportional to its current capital multiplied by a factor. This factor is an exponential function of the return of the stock in the last period. This algorithm is both greedy and conservative: greedy because it always allocates more capital to the stock that did well most recently; conservative because there is a penalty for changing the allocation too drastically from one period to the next. This multiplicative update rule is the one proposed as a model for evolution by natural selection.

The computational advantage of EG over the Universal Portfolio is obvious: the latter requires a weighted average over all possible allocations at every step, while the former needs only to know the allocation and returns for the most recent period. But does this EG algorithm actually generate good returns in practice? I tested it two ways:

1) Allocate between cash (with 2% per annum interest) and SPY.
2) Allocate among SP500 stocks.

In both cases, the only free parameter of the model is a number called the "learning rate" η, which determines how fast the allocation can change from one period to the next. It is generally found that η=0.01 is optimal, which we adopted. Also, we disallow short positions in this study.
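To make the multiplicative update concrete, here is a minimal Python sketch of one common form of the EG rule, in which each weight is multiplied by exp(η * x_i / (w·x)) and the result is renormalized, where x_i is the price relative (1 + return) of asset i over the last period. This form is assumed from the standard online portfolio selection literature; it is not necessarily the exact code behind the results reported here, and the function names and toy price relatives are hypothetical.

```python
import numpy as np


def eg_update(weights: np.ndarray, price_relatives: np.ndarray,
              eta: float = 0.01) -> np.ndarray:
    """One EG step: multiplicative update followed by renormalization."""
    portfolio_relative = weights @ price_relatives
    factor = np.exp(eta * price_relatives / portfolio_relative)
    new_weights = weights * factor
    return new_weights / new_weights.sum()     # long-only, sums to 1


def eg_backtest(price_relatives: np.ndarray, eta: float = 0.01):
    """Run EG over a (periods x assets) array; return final wealth and weights."""
    n_assets = price_relatives.shape[1]
    weights = np.full(n_assets, 1.0 / n_assets)  # start from equal allocation
    wealth = 1.0
    for x in price_relatives:
        wealth *= weights @ x                  # earn this period's return
        weights = eg_update(weights, x, eta)   # then rebalance for the next
    return wealth, weights


# Toy example: column 0 is cash (interest ignored), column 1 is a risky asset.
relatives = np.array([[1.00, 1.02],
                      [1.00, 0.99],
                      [1.00, 1.01]])
final_wealth, final_weights = eg_backtest(relatives)
print(final_wealth, final_weights)
```

With a small η such as 0.01 the weights move very slowly away from their starting values, which is the conservative side of the algorithm described above.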

The benchmarks for comparison for 1) are, using the notations of the Borodin paper,

a) the buy-and-hold SPY portfolio BAH, and
b) the best constant-rebalanced portfolio with fixed allocations in hindsight CBAL*.

The benchmarks for comparison for 2) are

a) a constant rebalanced portfolio of SP500 stocks with equal allocations U-CBAL,
b) a portfolio with 100% allocation to the best stock chosen in hindsight BEST1, and
c) CBAL*.

To find CBAL* for a SP500 portfolio, I used Matlab Optimization Toolbox's constrained optimization function fmincon.

There is also the issue of SP500 index reconstitution. It is complicated to handle the addition and deletion of stocks in the index within a constrained optimization function. So I opted for the shortcut of using a subset of stocks that were in SP500 from 2007 to 2013, tolerating the presence of survivorship bias. There are only 346 such stocks.

The result for 1) (cash vs SPY) is that the CAGR (compound annualized growth rate) of EG is slightly lower than BAH (4% vs 5%). It turns out that BAH and CBAL* are the same: it was best to allocate 100% to SPY during 2007-2013, an unsurprising recommendation in hindsight.

The result for 2) is that the CAGR of EG is higher than the equal-weight portfolio (0.5% vs 0.2%). But both these numbers are much lower than that of BEST1 (39.58%), which is almost the same as that of CBAL* (39.92%). (Can you guess which stock in the current SP500 generated the highest CAGR? The answer, to be revealed below*, will surprise you!)

We were promised that the EG algorithm will perform "similarly" to CBAL*, so why does it underperform so miserably? Remember that similarity here just means that the divergence is sub-exponential: but even a polynomial divergence can in practice be substantial! This seems to be a universal problem with universal algorithms of asset allocation: I have never found any that actually achieves significant returns in the short span of a few years. Maybe we will find more interesting results with higher frequency data.

So given the underwhelming performance of EG, why am I writing about this algorithm, aside from its interesting connection with biological evolution? That's because it serves as a setup for another, non-universal, portfolio allocation scheme, as well as a way to optimize parameters for trading strategies in general: both topics for another time.

===
Workshops Update:

My next online workshop will be on Mean Reversion Strategies, August 26-28. This and the Quantitative Momentum workshops will also be conducted live at Nanyang Technological University in Singapore, September 18-21.

===
Do follow me @chanep on Twitter, as I often post links to interesting articles there.

===
*The SP500 stock that generated the highest return from 2007-2013 is AMZN.

Friday, May 09, 2014

Short Interest as a Factor

Readers of zerohedge.com will no doubt be impressed by this chart and the accompanying article:

Cumulative returns of most shorted stocks relative to SPX

Cumulative Returns of Most Shorted Stocks in 2013

Indeed, short interest (expressed as the number of shares shorted divided by the total number of shares outstanding) has long been thought to be a useful factor. To me, the counter-intuitive wisdom is that the more a stock is shorted, the better its performance. You might explain that by saying this is a result of the "short squeeze", when there is a jump in price perhaps due to news and stock lenders are eager to sell the stock they own. If you have borrowed this stock to short, your borrowed stock may be recalled and you will be forced to buy to cover at this most inopportune time. But this is an unsatisfactory explanation, as this will result only in a short term (upward) momentum in price, not the sustained out-performance of the most shorted stocks. This long-term out-performance seems to suggest that short sellers are less informed than the average trader, which is odd.

Whatever the explanation, I am intrigued to find out if short interest really is a good factor to incorporate into a comprehensive factor model over the long term.

The result? Not particularly impressive. It turns out that 2013 was one of the best years for this factor (hence the impressive chart above). For that year, a daily-rebalanced long-short portfolio (long 50 most shorted stocks and short 50 least shorted stocks in the SPX) returned 6.9%, with a Sharpe ratio of 2 and a Calmar ratio of 2.9. However, if we extend our backtest to 2007, the APR is only 2.8%, with a Sharpe ratio of 0.5 and a Calmar ratio of 0.3. This backtest was done using survivorship-bias-free data from CRSP, with short interest data provided by Compustat.

Here is the cumulative returns chart from 2007-2013:

Cumulative Returns of LS Portfolio based on Short Interest: 2007-2013

Interestingly, trying this on the SP600 small-cap universe yielded negative returns, possibly meaning that short-sellers of small caps do have superior information.

I promise, this will be the last time I talk about factors in a while!

===
Tech Update:

I was shocked to learn that Matlab now offers licenses for just $149 - the so-called Matlab Home (h/t: Ken H.) In addition, its Trading Toolbox now offers API connection to Interactive Brokers, in addition to a few other brokerages. I am familiar with both Matlab and R, and while I am impressed by the large number of free, sophisticated statistical packages in R, I still stand by Matlab as the most productive platform for developing our own strategies. The Matlab development (debugging) environment is just that much more polished and easy-to-use. The difference is bigger than Microsoft Word vs. Google Docs.

A reader Ravi B. told me that there is a website called www.seasonalgo.com if you want to try out different seasonal futures strategies.

Finally, a startup at inovancetech.com offers machine learning algorithms to help you find the best combination of technical indicators for trading FX.

===
Workshops Update:

I am now offering the Millisecond Frequency Trading (MFT) Workshop as an online course on June 26-27. Previously, I have only offered it live in London and to a few institutional investors. It has two main parts:

Part 1: introducing techniques for traders who want to avoid HFT predators.

Part 2: how to backtest a strategy that requires tick data with millisecond resolution using Matlab.

The example strategy used is based on order flow. For more details, please visit epchan.com/my-workshops.

Additionally, I will be teaching the Mean Reversion and Momentum (but not MFT) workshops in Hong Kong on June 17-20.

Thursday, March 27, 2014

Update on the fundamentals factors: their effect on small cap stocks

In my last post, I reported that the fundamental factors used by Lyle and Wang seem to generate no returns on SP500 large cap stocks. These fundamental factors are the growth factor return-on-equity (ROE), and the value factor book-to-market ratio (BM).

I have since studied the effect of these factors on SP600 small cap stocks since 2004, using a survivorship-bias-free database combining information from both Compustat and CRSP. This time, the factors do produce an annualized average return of 4.7% and a Sharpe ratio of 0.8. Though these numbers are nowhere near the 26% return that Lyle and Wang found, they are still statistically significant. I have plotted the equity curve below.

Equity curve of long-short small-cap portfolio based on regression on ROE and BM factors (2004-2013)

One may wonder whether ROE or BM is the more important factor. So I ran a simpler model which uses one factor at a time to rank stocks every day. We buy stocks in the top decile of ROE, and short the ones in the bottom decile. Ditto for BM. I found an annualized average return of 5% with a Sharpe ratio of 0.8 using ROE only, and only 0.8% with a Sharpe ratio of 0.09 using BM only. The value factor BM is almost completely useless! Indeed, if we were to first sort on ROE, pick the top and bottom deciles, and then sort on BM, and pick the top and bottom halves, the resulting average return is almost the same as sorting on ROE alone. I plotted the equity curve for sorting on ROE below.

Equity curve of long-short small-cap portfolio based on top and bottom deciles of ROE (2004-2013)
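The single-factor ranking used for this equity curve can be sketched in a few lines of Python. The function name, the equal weighting within each decile, the scaling to a gross exposure of 1, and the sample ROE values are all assumptions made for illustration; the actual backtest details are not spelled out in this post.

```python
import numpy as np


def decile_long_short_weights(factor) -> np.ndarray:
    """Long the top decile and short the bottom decile of a factor.

    Stocks with a missing factor value (NaN) get zero weight. Each side is
    equal-weighted, and the total gross exposure is scaled to 1 (an assumption).
    """
    factor = np.asarray(factor, dtype=float)
    weights = np.zeros(factor.shape)
    valid = np.flatnonzero(~np.isnan(factor))
    k = len(valid) // 10                       # number of stocks per decile
    if k == 0:
        return weights
    order = valid[np.argsort(factor[valid], kind="stable")]
    weights[order[-k:]] = 0.5 / k              # long the highest factor values
    weights[order[:k]] = -0.5 / k              # short the lowest factor values
    return weights


# Hypothetical ROE values for 21 stocks on one day (one is missing).
roe = np.array([0.30, -0.10, 0.05, 0.12, np.nan, 0.08, 0.22, 0.01,
                0.15, 0.18, -0.02, 0.09, 0.11, 0.07, 0.03, 0.25,
                0.14, 0.06, 0.20, 0.10, 0.02])
weights = decile_long_short_weights(roe)
print(weights)
```

The next day's portfolio return is then the dot product of these weights with the next day's stock returns, and repeating the ranking every day gives the daily-rebalanced strategy.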

Notice the sharp drawdown from 2008-05-30 to 2008-11-04, and the almost perfect recovery since then. This mirrors the behavior of the equity market itself, which raises the question of why we bother to construct a long-short portfolio at all as it provides no hedge against the downturn. It is also interesting to note that this factor does not exhibit "momentum crash" as explained in a previous article: it does not suffer at all during the market recovery. This means we should not automatically think of a fundamental growth factor as similar to price momentum.

My conclusion was partly corroborated by I. Kaplan who has written a preprint on a similar topic. He found that a long-short portfolio created using the ratio EBITA/Enterprise Value on large caps generates a Sharpe ratio of about 0.6 but with very little drawdown unlike the ROE factor that I studied above as applied to small caps.

As Mr. Kaplan noted, these results are in some contradiction not only with Lyle and Wang's paper, but also with the widely circulated paper by Cliff Asness et al. These authors found that the BM factor works in practically every asset class. Of course, the timeframe of their research is much longer than my focus above. Furthermore, they have excluded financial and penny stocks, though I did not find such restrictions to have great impact in my study of large cap portfolios. In place of a fundamental growth factor, these authors simply used price momentum over an 11-month period (skipping the most recent month), and found that this is also predictive of future quarterly returns.

Finally, we should note that the ROE and BM factors here are quite similar to the Return-on-Capital and Earnings Yield factors used by Joel Greenblatt in his famous "Little Book That Still Beats The Market". One wonders if those factors suffer a similar drawdown during the financial crisis.

===

My online Momentum Workshop will be offered on May 5-7. Please visit epchan.com/my-workshops for registration details. Furthermore, I will be teaching my Mean Reversion, Momentum, and Millisecond Frequency Trading workshops in Hong Kong on June 17-20.

Saturday, February 08, 2014

Fundamental factors revisited, with a technology update

Contrary to my tradition of alerting readers to new and fancypants factors for predicting stock returns (while not necessarily endorsing any of them), I report that Lyle and Wang have recently published new research demonstrating the power of two very familiar factors: book-to-market ratio (BM) and return-on-equity (ROE).

The model is simple: at the end of each calendar quarter, compute the log of BM and ROE for every stock based on the most recent earnings announcement, and regress the next-quarter return against these two factors. One subtlety of this regression is that the factor loadings (log BM and ROE) and the future returns for stocks within an industry group are pooled together. This makes for a cross-sectional factor model, since the factor loadings (log BM and ROE) vary by stock but the factor returns (the regression coefficients) are the same for all stocks within an industry group. (A clear elucidation of cross-sectional vs time-series factor models can be found in Section 17.5 of Ruppert.) If we long stocks within the top decile of expected returns and short the bottom decile and hold for a quarter, the expected annualized average return of this model is an eye-popping 26% or so.
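To make the pooling concrete, here is a minimal Python sketch of such a regression. It is not the authors' code: the function names, the inclusion of an intercept, and the use of plain least squares are all my own assumptions. Coefficients are fitted on past (loadings, next-quarter return) pairs pooled within each industry group, then applied to the current loadings to rank stocks into deciles.

```python
import numpy as np

def fit_industry_coefs(log_bm, roe, next_ret, industry):
    """Pooled OLS of next-quarter return on [1, log BM, ROE], one fit per industry group.
    (Hypothetical helper; the intercept term is an assumption.)"""
    log_bm, roe, next_ret = (np.asarray(a, dtype=float) for a in (log_bm, roe, next_ret))
    industry = np.asarray(industry)
    coefs = {}
    for g in np.unique(industry):
        m = industry == g  # all stock-quarters belonging to this group
        X = np.column_stack([np.ones(m.sum()), log_bm[m], roe[m]])
        beta, *_ = np.linalg.lstsq(X, next_ret[m], rcond=None)
        coefs[g.item()] = beta  # factor returns shared by the whole group
    return coefs

def expected_returns(coefs, log_bm, roe, industry):
    """Apply each group's coefficients to the current factor loadings."""
    log_bm, roe = np.asarray(log_bm, dtype=float), np.asarray(roe, dtype=float)
    industry = np.asarray(industry)
    out = np.full(log_bm.shape, np.nan)  # NaN for groups without a fit
    for g, beta in coefs.items():
        m = industry == g
        out[m] = beta[0] + beta[1] * log_bm[m] + beta[2] * roe[m]
    return out

def decile_positions(expected):
    """+1 for the top decile of expected return, -1 for the bottom decile, else 0."""
    lo, hi = np.nanpercentile(expected, [10, 90])
    return np.where(expected >= hi, 1, np.where(expected <= lo, -1, 0))
```

In a real backtest the fit would use only quarters prior to the one being predicted, and the positions would be held for one quarter before refitting.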

I have tried to replicate these results, but unfortunately I couldn't. (My program generated a measly, though positive, APR.) The data requirement and the program are both quite demanding. I am unable to obtain the 60 quarters of fundamental data that the authors recommended - I merely have 40. I used the 65 industry groups defined by the GIC industry classifications, while the authors used the 48 Fama-French industry groups. Finally, I am unsure how to deal with stocks which have negative book values or earnings, so I omit those quarterly data. If any of our readers are able to replicate these results, please do let us know.

The authors and I used the Compustat database for the fundamental data. If you do not have a subscription to this database, you can consider a new, free website called Thinknum.com. This website makes available all data extracted from companies' SEC filings starting in 2009 (2011 for small caps). There is also a neat integration with R described here.

*** Update ***

I forgot to point out one essential difference between the method in the cited paper and my own effort: the paper used the entire stock universe except for stocks cheaper than $1, while I did my research only on SP500 stocks (Hat tip to Prof. Lyle who clarified this). This turns out to be of major importance: a to-be-published paper by our reader I. Kaplan reached the conclusion that "Linear models based on value factors do not predict future returns for the S&P 500 universe for the past fifteen years (from 1998 to 2013)."

===

Speaking of new trading technology platforms that provide historical data for backtesting (other than Thinknum.com and the previously mentioned Quantopian.com), here is another interesting one: QuantGo.com. It provides institutional intraday historical data through its data partners, from 1-minute bars to full depth of book, in your own private cloud running on an Amazon EC2 account for a low monthly rate. They give unlimited access to years of historical data for a monthly data access fee: for example, US equities Trades and Quotes (TAQ) for an unlimited number of years are $250 per month of account rental, OPRA TAQ is $250 per month, and tagged news is $200. Subscribers control and manage their own computer instances, so they can install and use whatever software they want on them to backtest or trade using the data. The only hitch is that you are not allowed to download the vendor data to your own computer; it has to stay in the private cloud.

===

Follow @chanep to receive my occasional tweets on interesting quant trading industry news and articles.

===

My online Mean Reversion Strategies Workshop will be offered on April 1-3. Please visit epchan.com/my-workshops for registration details. Furthermore, I will be teaching my Mean Reversion, Momentum, and Millisecond Frequency Trading workshops in London on March 17-21, and in Hong Kong on June 17-20.

Wednesday, January 08, 2014

Variance Risk Premium for Return Forecasting

Folklore has it that VIX is a reasonable leading indicator of risk. Presumably that means if VIX is high, then there is a good chance that the future return of the SP500 will be negative. While I have found some evidence that this is true when VIX is particularly elevated, say above 30, I don't know if anyone has established a negative correlation between VIX and future returns. (Contemporaneous VIX and SP500 levels do have a very nice linear relationship with negative slope.)

Interestingly, the situation is much clearer if we examine the Variance Risk Premium (VRP), which is defined as the difference between a model-free implied volatility (of which VIX is the most famous example) and the historical volatility over a recent period. The relationship between VRP and future returns is examined in a paper by Chevallier and Sevi in the context of OVX, which is the CBOE Crude Oil Volatility Index. They have found that there is a statistically significant negative linear relationship between VRP and future 1-month crude oil futures (CL) returns. The historical volatility is computed over 5-minute returns of the most recent trading day. (Why 5 minutes? Apparently this is long enough to avoid the artifactual volatility induced by bid-ask bounce, and short enough to truly sample intraday volatility.) If you believe in the prescience of options traders, it should not surprise you that the regression coefficient is negative (i.e. a high VRP predicts a lower future return).
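As a rough illustration of the definition above, the following Python sketch computes VRP for one day from that day's 5-minute prices and the closing value of the volatility index. The function names, the 252-day annualization, and the choice to express realized volatility in annualized percent (so that it is comparable to how VIX and OVX are quoted) are my assumptions; the paper's exact conventions may differ.

```python
import numpy as np

def realized_vol_annualized(prices_5min, trading_days=252):
    """Annualized realized volatility, in percent, from one day's 5-minute prices."""
    log_ret = np.diff(np.log(np.asarray(prices_5min, dtype=float)))
    daily_var = np.sum(log_ret ** 2)  # realized variance for the day
    return 100.0 * np.sqrt(daily_var * trading_days)

def variance_risk_premium(implied_vol_index, prices_5min):
    """VRP as defined in the post: implied volatility minus historical volatility."""
    return implied_vol_index - realized_vol_annualized(prices_5min)
```

For example, `variance_risk_premium(ovx_close, cl_5min_prices)` would give one day's VRP for CL, where both arguments are hypothetical inputs you supply from your own data.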

I have tested a simple trading strategy based on this linear relationship. Instead of using monthly returns, I use VRP to predict daily returns of CL. It is very similar to a mean-reverting Bollinger band strategy, except that here the "Bollinger bands" are constructed out of moving first and third quartiles of VRP with a 90-day lookback. Given that VRP is far from normally distributed, I thought it more sensible to use quartiles rather than standard deviations to define the Bollinger bands. So we buy a front contract of CL and hold for just 1 day if VRP is below its moving first quartile, and short if VRP is above its moving third quartile. It gives a decent average annual return of 17%, but performance was poor in 2013.
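The entry rule can be sketched in Python as follows. This is only an illustration, not my actual backtest program: the function names are made up, and I assume the quartiles are taken over the 90 days preceding the current day (whether to include the current day in the window is a judgment call). Transaction costs and contract rolls are ignored.

```python
import numpy as np

def quartile_band_positions(vrp, lookback=90, sign=1):
    """Position for the next day: +sign if VRP is below its moving first quartile,
    -sign if above its moving third quartile, 0 otherwise.
    sign=1 buys on low VRP; sign=-1 reverses the rule."""
    vrp = np.asarray(vrp, dtype=float)
    pos = np.zeros(len(vrp))
    for t in range(lookback, len(vrp)):
        window = vrp[t - lookback:t]  # the prior `lookback` days (assumption)
        q1, q3 = np.percentile(window, [25, 75])
        if vrp[t] < q1:
            pos[t] = sign
        elif vrp[t] > q3:
            pos[t] = -sign
    return pos

def strategy_returns(pos, daily_ret):
    """Hold each position for one day: the position set at t earns the return of day t+1."""
    pos, daily_ret = np.asarray(pos, dtype=float), np.asarray(daily_ret, dtype=float)
    return pos[:-1] * daily_ret[1:]
```

No position is taken during the first 90 days, since the bands are not yet defined.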

Naturally, one can try this simple trading strategy on the E-mini SP500 future ES also. This time, VRP is VIX minus the historical volatility of ES. Contrary to folklore, I find that if we regress the future 1 day ES return against VRP, the regression coefficient is positive. This means that an increase of VIX relative to historical volatility actually predicts an increase in ES! (Does this mean investors are overpaying for put options on SPX for portfolio protection?) Indeed, the opposite trading rules from the above give positive returns: we should buy ES if VRP is above its moving third quartile, and short ES if VRP is below its moving first quartile. The annualized return is 6%, but performance in 2013 was also poor.
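The sign of the regression coefficient is easy to check yourself. Here is a minimal Python sketch, assuming you have aligned daily series of VRP and close-to-close returns; the function name is hypothetical, and a simple one-variable least-squares fit stands in for whatever regression setup you prefer.

```python
import numpy as np

def predictive_slope(vrp, daily_ret):
    """OLS slope of the next day's return on today's VRP.
    daily_ret[t] is assumed to be the return realized on day t."""
    x = np.asarray(vrp, dtype=float)[:-1]        # VRP known at the close of day t
    y = np.asarray(daily_ret, dtype=float)[1:]   # return realized on day t+1
    slope, _intercept = np.polyfit(x, y, 1)
    return slope
```

A positive slope corresponds to the ES result described above, and a negative slope to the CL result; the sketch says nothing about statistical significance.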

As the authors of the paper noted, whether or not VRP is a strong enough stand-alone predictor of returns, it is probably useful as an additional factor in a multi-factor model for CL and ES. If any reader knows of other volatility indices like VIX and OVX, please do share with us in the comments section!

===

My online Backtesting Workshop will be offered on February 18-19. Please visit epchan.com/my-workshops for registration details. Furthermore, I will be teaching my Mean Reversion, Momentum, and Millisecond Frequency Trading workshops in London on March 17-21, and in Hong Kong on June 17-20.