Monday, April 16, 2007

Out-of-sample test on cointegrating basket of stocks

An anonymous reader "L" posted some thoughtful objections to the way I constructed the basket of stocks that is supposed to cointegrate with XLE. His main objection is that even though my basket shows cointegration with XLE in-sample, this is likely to fail out-of-sample. Actually, I agree with him that the strong statistical relationship discovered in-sample will most likely be weakened out-of-sample, most often because the nature of the component stocks is always changing due to various corporate events (management change, restructuring, change of strategic direction, etc.). However, from a practical trading point of view, I believe that the relationship should not be weakened to the point that the trading signals become spurious, at least over the time scale of a single trade, which is several months to half a year at most.

To demonstrate this, let's break up the dataset over 2 periods: 20010522 - 20030123 and 20030124 - 20070403. In the first in-sample period (with 1,000 data points), we pick our 10 stocks to form the basket, and in the second out-of-sample period we see how well it cointegrates with XLE, and we observe how the spread behaves. I found that in the first period, the t-statistic for cointegration is -3.61934140, indicating the basket cointegrates with over 95% probability. No surprise here. Here is a plot of the spread in this period:

[Figure: plot of the spread in the in-sample period]

Now, let's find out what happens in the out-of-sample period. Here the t-statistic is just -2.72, whereas the critical value for cointegration at 90% probability is -3.03. So indeed the basket fails to cointegrate at the 90% confidence level. Does that mean our trades will therefore be losing out-of-sample? Not necessarily. Take a look at the behavior of the spread out-of-sample:

[Figure: plot of the spread in the out-of-sample period]

Even though it is not nicely symmetric around zero as in the in-sample period, the spread is still clearly bounded around zero. If the basket had completely fallen out of cointegration with XLE, the spread would drift randomly away from zero as time goes on.
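For readers who want to try this kind of in-sample/out-of-sample check themselves, here is a minimal sketch. It is not the code behind the plots above: the data are synthetic, the helper names `fit_hedge_ratios` and `dickey_fuller_t` are made up for illustration, and the test statistic is a plain Dickey-Fuller regression with no lag terms. The hedge ratios are fitted on the first half only and then frozen for the second half.

```python
import numpy as np

def fit_hedge_ratios(basket_prices, etf_prices):
    """Least-squares weights w (no intercept) so that basket_prices @ w ~ etf_prices."""
    w, *_ = np.linalg.lstsq(basket_prices, etf_prices, rcond=None)
    return w

def dickey_fuller_t(spread):
    """t-statistic of rho in: diff(spread)[t] = c + rho * spread[t-1] + e[t]."""
    y = np.diff(spread)
    X = np.column_stack([np.ones(len(y)), spread[:-1]])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return coef[1] / np.sqrt(cov[1, 1])

# Synthetic data, purely for illustration: 3 random-walk "stocks" and an
# "ETF" equal to a fixed combination of them plus mean-reverting AR(1) noise.
rng = np.random.default_rng(0)
n_days, split = 2000, 1000
stocks = 100 + np.cumsum(rng.normal(0, 1, size=(n_days, 3)), axis=0)
noise = np.zeros(n_days)
for t in range(1, n_days):
    noise[t] = 0.9 * noise[t - 1] + rng.normal(0, 0.5)
etf = stocks @ np.array([0.5, 0.3, 0.2]) + noise

# Fit on the in-sample period only, then keep the weights fixed out-of-sample.
w = fit_hedge_ratios(stocks[:split], etf[:split])
spread_in = etf[:split] - stocks[:split] @ w
spread_out = etf[split:] - stocks[split:] @ w

t_in = dickey_fuller_t(spread_in)
t_out = dickey_fuller_t(spread_out)
print("in-sample t:", t_in, "out-of-sample t:", t_out)
```

Because the spread comes from an estimated regression, the statistic should be compared with critical values tabulated for cointegration tests (such as the -3.03 quoted above), not with the ordinary Dickey-Fuller table.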

To show that this is not just good luck based on our specific in-sample period, let's try a longer in-sample period of 1500 days. (A shorter in-sample period won't work, because we need a minimum of 1,000 data points here to construct a reliable basket.) Here the cointegration t-statistic is a bit worse, at -2.62. If we look at the spread:

[Figure: plot of the spread, using the 1500-day in-sample period]

Once again, we see that the spread is bounded, not wandering off to infinity. So in conclusion, I maintain that my method of constructing the basket is good for practical trading, though it does not necessarily guarantee as high a statistical confidence level as the in-sample period might indicate.

26 comments:

  1. Ernie,

    Wow, I went back to add a thought about "pair trading" (there is a popular version of it based on correlation of returns between stocks, not cointegration) and saw you've already performed the test I suggested yesterday.

    Your results for out-of-sample #1 are exactly what I have seen in my own quick experiments:

    - the ADF value for out-of-sample residuals is lower than in sample.

    - the residual series is visually much less mean-reverting. The frequency of zero crossing is much smaller, for example. (The fact that it's "bounded around zero" is of little value - exp(-x) is bounded around zero for x>0 but it never crosses zero.) If someone put on a spread position based on in-sample data, they might be disappointed when the spread isn't closing for what, like 2 years, according to your first out-of-sample graph?

    - the amplitude of residual spikes is much higher, possibly implying much higher volatility/risk. That is, if the out-of-sample residuals are still stationary at all -- which in fact they might not be anymore. At least they don't look convincingly stationary.

    - what's worse, the change in behavior occurs quite rapidly, almost as soon as the out of sample period begins -- this is food for thought...
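The zero-crossing observation in the list above can be turned into a number. The following is only a sketch with made-up series (the function name `count_zero_crossings` is hypothetical); it shows that a series can stay bounded near zero and still never cross it.

```python
import math

def count_zero_crossings(spread):
    """Number of sign changes in the series, ignoring exact zeros."""
    signs = [1 if x > 0 else -1 for x in spread if x != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

# Two illustrative series: one oscillates around zero, one decays toward it.
oscillating = [math.sin(0.5 * t) for t in range(1, 100)]
decaying = [math.exp(-0.05 * t) for t in range(1, 100)]

crossings_osc = count_zero_crossings(oscillating)
crossings_decay = count_zero_crossings(decaying)  # bounded, yet never crosses
print(crossings_osc, crossings_decay)
```

Comparing this count per year of data between the in-sample and out-of-sample spreads gives a rough feel for how long a position might have to be held.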

    I think both of us would agree on these observations -- but I disagree with your final conclusion. In order to improve things, you increased the training period to 1500 points (your last chart) -- that's an additional 2 years of historical data. But:

    (a) in purely historical backtesting, you might have known to do that out-of-sample check even if I didn't suggest it. But what if X is yesterday and you have no real out-of-sample data because you want to start trading the spread tomorrow?

    (b) in general, I agree that a longer training interval is better for stability of predictions. But how do you determine the optimal training duration? Use too few training points and your model will not predict well. Use too many and you run the risk that the statistical properties of XLE, its components, and the market change substantially over the time range spanned by the training data.

    Thanks for doing the experiment,
    L

  2. I guess a trading strategy is all about plausible ideas, not proven ideas. Anyway it is a game. L, you can kill any trading idea by your Scientific Mind...

  3. L,
    Sure, if you wait until the spread reverts back to exactly zero before you exit, you might have to wait forever. But I don't think any real trading strategy would do that. What I discuss here in this blog is not a complete trading strategy: it is a trading idea that can be developed into a complete trading strategy. On one hand, A) there is a cointegration test; on the other hand, B) there is an implementation of a trading strategy based on the cointegration test. It is not that simple to go from A) to B), otherwise every econometrician would become a trader. As every trader knows, real trading results are always inferior to backtest results, therefore a sensible trading strategy needs to take that into account.

    Thanks for the discussion!
    Ernie

  4. Checked back and just wanted to respond to this odd comment:

    "L, you can kill any trading idea by your Scientific Mind..."

    Huh? It is quite obvious that Ernie himself applies a "Scientific Mind" to his ideas. I am merely continuing in the same vein. I wanted to highlight how dangerous it might be to believe a trading idea that's not cross-validated. It is rather easy to come up with a trading idea that looks good in-sample -- simply by construction. We all know about "spurious regression", after all. If the synthetic asset is not substantially mean-reverting out of sample, it is very questionable whether there is an "A) -> B)" process that is profitable.

    I hope you agree that it is only consistent to apply the same quantitative metrics out of sample that were used to build the idea in-sample.

    L

  5. Ernie,

    Nice work, and good points you are making. If you are planning on creating a basket of 10 stocks and trading it against XLE, you might be wiped out by commissions.

    1. Why aren't you trying to take into consideration just
    Exxon Mobil Corp. XOM 21.7%
    Chevron Corp. CVX 13.0%
    ConocoPhillips COP 9.1%

    2. How did you come up with the "weights" each asset has in your basket? Something like...
    XLE = w1*XOM+w2*CVX+w3*COP... where sum(wi)=1?

    http://www.etfconnect.com/select/fundpages/etf_funds.asp?MFID=50829


    3. Do you know a web page where you can look at historical components of XLE?

    4. What type of "index" basket is XLE? I'm desperately looking for a prospectus (computation method) of XLE.

    Thanks for any inputs you might have,

    Max

  6. http://www.etfconnect.com/select/fundpages/etf_funds.asp?MFID=50829

  7. Max,

    Thank you for your interest in my article.

    Trading this basket of 10 stocks vs 1000 shares of XLE costs only $13.25 one way, if your commission is the same as mine (0.5 cent/share) at Interactive Brokers.
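As a quick check of the arithmetic behind that figure, here is a tiny sketch. It assumes, and this is only an assumption, that the $13.25 covers both legs of the trade (the 1,000 XLE shares plus the basket); the implied share counts are backed out from the quoted numbers, not taken from the actual basket.

```python
commission_per_share = 0.005   # 0.5 cent per share, as quoted
one_way_cost = 13.25           # dollars, as quoted
xle_shares = 1000

# Total shares implied by the quoted cost and commission rate.
total_shares = round(one_way_cost / commission_per_share)

# Assumption: the cost includes the XLE leg, so the rest is the basket.
implied_basket_shares = total_shares - xle_shares
print(total_shares, implied_basket_shares)
```

At other per-share rates the same calculation scales linearly, which is the point of Max's concern about more expensive brokers.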

    1) The components of this basket are not determined arbitrarily -- they are based on the process described in a published article as well as my own research. Basically, we pick the N stocks that historically have the highest likelihood of cointegrating with XLE.
    Therefore, we can't include XOM or CVX just because they constitute the highest weights in XLE.

    2) The weights, expressed as number of shares, are determined by a multivariate regression with the prices of XLE. I assume 1,000 shares of XLE on one side.

    3) http://www.sectorspdr.com/spdr/index.cfm?story=composition&symbol=XLE
    lists all XLE components. But I am not sure about historical components.

    4) According to http://www.sectorspdr.com/spdr/index.cfm?story=disclosure&symbol=XLE,
    XLE should track the energy index in the Select Sector Indexes.

    Ernie

  8. Ernie,

    Thanks for the fast comments on my questions.

    Indeed (with IB) you trade at a discount compared to many brokers in Germany (or elsewhere).

    Where can I find a copy of the paper you mentioned, or do you possess a copy that you can send me?

    In (4) I was thinking more about how XLE is calculated. There is a huge difference between the Dow (price weighted) and the S&P (weighted by market capitalization). However, since it tracks the S&P energy index it should be market-cap weighted.

    Best,
    Max

  9. Max,

    You can find references to various articles in my previous post http://epchan.blogspot.com/2007/02/in-looking-for-pairs-of-financial.html

    Best,
    Ernie

  10. Ernie,

    That is a nice piece of work.

    But please don't forget that ETFs have a cash component, and it inevitably leaves you with a statistical tracking error.

    Plus, when you try to do a co-integration analysis you are actually ignoring the fact that *unless* you re-create the history of the ETF by using the current composition you will always have a huge out-of-sample tracking error. This is due to the fact that your current composition is also trying to fit the curve to the historical changing composition.

    In all, I think it's a nice piece of work but it needs a more forward-looking approach.

    - Natty Virk

  11. Dear Natty,

    Thank you for your thoughtful comments. I agree with you that the analysis would be made more solid if I had done it with historical components of XLE. Unfortunately I could not find such data. If you know of such a data source, please share with us!

    Ernie

  12. Dear Ernie,

    My apologies, I wasn't very clear in my post.

    When I said forward looking analysis, what I meant was the following :

    1) The ETF has been rebalanced at least 20 times in the last 5 years, hence the returns would have been different if you had held today's composition of stocks for the last five years than if you had held the ETF for the last five years. And, theoretically, going forward you are going to hold today's composition.

    2) The ETF has a dynamic cash component that causes tracking error, a false tracking error.

    3) The ETF pays dividends quarterly, whereas the underlying stocks pay dividends at different times. The ETF's value does not change when a stock goes ex-dividend but the stock decreases in value, and vice versa: another cause of false tracking error.

    So it means that if you hold all the stocks in the current composition of the ETF and backtest for tracking error, you will perhaps have a huge tracking error.

    Now, if you use some combination of stocks to track this ETF using its historic returns, all you are doing is backward looking analysis.

    So, how to tackle this problem?

    1) Take today's composition of the ETF. Create a synthetic return series using today's # of stocks and their respective historic returns, divided by its creation unit. This also removes tracking error due to dividend mismatch and the cash component.

    2) Create a tracking basket to track this synthetic return, because going forward this will be the ETF composition (at least for a quarter). Rebalance this basket every time the ETF rebalances.

    This also eliminates the need to know historic composition of ETF.
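One possible reading of the two steps above, sketched in code: reprice today's holdings at every historical date and divide by the creation unit size to get a per-share synthetic series. The tickers, share counts, prices, and the 50,000-share creation unit below are all hypothetical placeholders, and the function names are made up for illustration.

```python
def synthetic_etf_series(holdings, price_history, creation_unit_shares):
    """Per-share value of today's holdings repriced at each historical date.

    holdings: {ticker: shares held per creation unit today}
    price_history: list of {ticker: dividend-adjusted close}, oldest first
    """
    return [
        sum(shares * prices[ticker] for ticker, shares in holdings.items())
        / creation_unit_shares
        for prices in price_history
    ]

def simple_returns(series):
    """Period-over-period simple returns of a price series."""
    return [b / a - 1.0 for a, b in zip(series, series[1:])]

# Hypothetical composition and prices, for illustration only.
holdings = {"AAA": 1000, "BBB": 500}
price_history = [
    {"AAA": 40.0, "BBB": 20.0},
    {"AAA": 42.0, "BBB": 19.0},
    {"AAA": 41.0, "BBB": 21.0},
]

values = synthetic_etf_series(holdings, price_history, creation_unit_shares=50_000)
returns = simple_returns(values)
print(values, returns)
```

The tracking basket would then be fitted against `values` (or `returns`) instead of the ETF's own price history, and refitted whenever the published holdings change.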

    Please feel free to comment and/or email me at QuantArtistic (AT) gmail (DOT) com

    Also, I was wondering if you have any posts on technical analysis.

    Thanks

    - Natty Virk

  13. Dear Natty,

    Thanks again for your elucidation. Here are my thoughts:

    1) Though I have used a fixed basket of stocks against the ETF in my cointegration analysis, I should have updated the basket every month based on the historic ETF composition at the time, as well as using the latest price data. (This is what I did for my pairs portfolio.) I skipped this step because of the lack of historical ETF composition data. Going forward, one should certainly update this basket this way, as you have suggested.

    2) The cash component is only about 3% of the market value of XLE. Furthermore, this percentage does not fluctuate very much. Therefore I doubt it will introduce much tracking error in the cointegration analysis.

    3) In any backtest analysis, one always uses dividend-adjusted historical prices. Therefore, the historical stock price will not suffer a drop on the ex-date. Going forward, one should update the basket composition on the ex-date.

    I haven't written anything about technical analysis because I am not a believer in this technique.

    Ernie

  14. Dear Ernie,

    I appreciate your prompt response.

    There are way too many issues in any basket trading and you always have to make a choice on what is worth the time spent, and as far as any Statistical Arbitrage strategy is concerned, there is no perfect answer to any one of them.

    Your posts are interesting and thought provoking. And given that you have a decent population reading and replying to them, it makes it worth reading.

    Again, thanks.

    - Natty Virk

  15. Hi

    I know this thread is old but hopefully you'll see this query.

    What is your view on the appropriate period to use for a cointegration test?
    I have read other sites that talk about pairs trading and correlation (I know not the same as cointegration).

    Some authors propose using a back test period equal to what you expect your average future trade length to be.

    So you might run a rolling cointegration test over say 30 days of data.

    Is that a good idea, or do you like to see a "robust" cointegration holding over a longer period to give you greater confidence going forward?

    My problem with that is I have run cointegration tests on 1000 days of data and found high cointegration. However, when I re-test on rolling 30 days I get (in some cases) poorer cointegration. So cointegration isn't going to be good for some trade periods.

    I'm not sure what the best thing to do is.

    Thanks

  16. Hi anonymous,
    Cointegration is a long-term phenomenon and cannot be discovered with a 30 day test period, in my opinion. I recommend at least 1 year.
    Ernie

  17. Hi

    As a result of following your website and buying your book I have run extensive cointegration tests across lots of stocks.

    In a surprising number of cases I have found high cointegration, with high critical values, across long time periods for pairs of stocks that are in completely different industries.

    So from an economic point of view I can't justify trading these pairs since in theory they have nothing in common except being part of the SP500. However, the stats for their cointegration are robust over long periods.

    Do you have any thoughts on how to approach this type of situation?

    David

    PS. Loved your book

  18. Hi David,
    Even though the stocks are in different sectors, they may have a relationship that is not immediately apparent. Perhaps it is supplier-customer? Substitutes? If they cointegrate very well, it is worth spending some effort in looking for economic relationships, maybe consulting industry experts.
    Ernie

  19. Hi Ernie,

    I recently subscribed to your blog and like your ideas. I have been looking at this strategy and have a couple questions.

    1. The live spreads section for the XLE strategy contains 4 stocks that are not part of the XLE holdings (e.g. RIG, NE, etc.). Is it that this spread includes previous holdings of XLE and has not been updated in a while?

    2. Regarding the number of shares for each component in the basket, once you have identified which stocks to choose for the basket using co-integration, how did you figure out the weights for each stock? Did you fit a linear model on the 10 stocks in the basket and then choose the number of shares based on the coefficient of each stock in this linear model?

    3. Is there any MATLAB/Excel code in your subscription section that sheds light on figuring out the number of shares for each component of the basket?

    Would appreciate your comments regarding this.

  20. Hi Vanes,
    1) Yes, all the stock components and parameters in that model are fixed since 2006 when I first created the portfolio. It is not updated and should not be used for live trading -- it is purely for illustration of the method.
    2) Yes, a multivariate regression was used to find the hedge ratios.
    3) No, I did not include the code, but you can easily construct it following my detailed steps as outlined in my article posted on the premium content section.
    Best,
    Ernie

  21. Hi Ernie, thanks for your reply, a continuation to my earlier question.

    1. What constitutes 1 unit of the basket? In your live spreads section, you have given the number of shares for each of the 10 stocks in the basket. Is one unit of basket the sum of shares of all 10 stocks?

    Suppose this sum is 500 and the price of each stock was appx $100, then each unit of the basket will have a price of $50,000?

    2. If 1 share of XLE is $50 and, as given above, 1 unit of basket = $50,000, and we long 1000 units of basket and short 1000 shares of XLE, the 1000 units of basket will require a capital of $50,000 x 1000 = $50 million? That doesn't sound right.

    3. If you use options, the co-integration tests done on the equities would not hold anymore since the time series for the options will differ significantly compared to the stocks. In this case, how significant are the co-integration tests between the options on XLE and the underlying stocks?

    4. You said that you could implement the XLE position as options to boost the return. Could you use options for the stocks in the basket as well?

    5. If we have a capital of $10 million, how would you choose the trade size on each side? How effective would the Kelly criterion be?

    Thanks!

  22. Hi Vanes,
    1) Yes and yes, your interpretation is correct.
    2) In the table on epchan.com, I specified 1000 shares of XLE against the respective number of shares of its component stocks. You should not trade 1000x this number. Also, XTO is no longer a valid symbol, hence the incorrect closing price.
    3) I have not tested cointegration of options, mainly because I don't have high quality historical data on options. But as various commentators on my post on options suggest, one should not use options to implement pair trading unless the holding period is very short (< 1 week).
    4) Yes, but some stocks' options are very illiquid.
    5) Assuming you have access to unlimited leverage, you should indeed use half-Kelly to figure out your order size. You would need to backtest this strategy and find out the average returns and volatility in order to use the Kelly formula.
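To make the half-Kelly point concrete, here is a small sketch using the commonly quoted continuous-time form of the Kelly formula, leverage = mean excess return / variance of returns. That this is the intended form is an assumption, and the return and volatility figures below are hypothetical stand-ins for what a backtest would supply.

```python
def kelly_leverage(mean_excess_return, return_std):
    """Kelly-optimal leverage under the approximation f = m / s**2."""
    return mean_excess_return / return_std ** 2

# Hypothetical annualized backtest statistics, for illustration only.
annual_mean_excess = 0.10
annual_std = 0.20

full_kelly = kelly_leverage(annual_mean_excess, annual_std)
half_kelly = 0.5 * full_kelly

capital = 10_000_000
gross_position = half_kelly * capital  # dollar exposure at half-Kelly leverage
print(full_kelly, half_kelly, gross_position)
```

Halving the leverage is a hedge against the mean and volatility estimates being too optimistic, which ties in with the point that live results tend to fall short of the backtest.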

    Ernie

  23. Hi Ernie,

    Thank you for the valuable information on your blog.

    I had a quick question for you re: basket construction.

    In the regression, is it imperative to drop the constant term? Essentially adding a constant moves the mean to near zero - no? So in that sense a basket may be cointegrated around and bounded by a value far away from zero, but adding the constant allows you to find baskets that could be tradable.

    I wonder whether by allowing a constant term, the out-of-sample relationship has a greater chance of weakening because you can't take a position in the constant.

    Appreciate some guidance here. Thank you.

  24. Hi Ernie,

    This post is interesting....

    Say you re-estimated the cointegration relationship every day. You could do a rolling window estimation (i.e. 3 years or whatever time period works). This way you can assess what is going on with the basket. If you already have an open position you could re-balance it based on the new information about the mean and component weights.

    Well, one hopes that the basket relationship doesn't completely fall apart, but by re-estimating the mean and weights every day you could avoid data fitting problems. Not sure if the added costs of rebalancing would negate the profit potential and make this an unviable trading strategy.

    I would appreciate your thoughts on this.

    Thank you.

    SeanP

  25. Hi Anon,
    It is not imperative to set the constant offset of the regression to zero, but setting it to zero often gives better results out-of-sample because it reduces the number of free parameters used for fitting.
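The difference between the two regressions can be seen in a small sketch on synthetic data (the function name `fit` and all numbers are made up for illustration). With a constant, the in-sample spread averages zero by construction and the fit is at least as tight; without it, there is one fewer free parameter and the spread's mean is whatever the data give.

```python
import numpy as np

def fit(basket, etf, with_constant):
    """Regress etf on basket prices; return (coefficients, in-sample spread)."""
    if with_constant:
        X = np.column_stack([np.ones(len(etf)), basket])
    else:
        X = basket
    coef, *_ = np.linalg.lstsq(X, etf, rcond=None)
    return coef, etf - X @ coef

# Synthetic example: two random-walk "stocks" and an "ETF" with a true offset of 5.
rng = np.random.default_rng(1)
basket = 100 + np.cumsum(rng.normal(0, 1, size=(1000, 2)), axis=0)
etf = 5.0 + basket @ np.array([0.6, 0.4]) + rng.normal(0, 1, 1000)

coef_c, spread_c = fit(basket, etf, with_constant=True)   # [offset, w1, w2]
coef_0, spread_0 = fit(basket, etf, with_constant=False)  # [w1, w2]

print("mean spread with constant:   ", spread_c.mean())
print("mean spread without constant:", spread_0.mean())
```

Whether the tighter in-sample fit survives out-of-sample is exactly the question raised above, and it can only be answered by holding out data.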
    Ernie

  26. Hi SeanP,
    For pairs of ETFs, it is not essential to recompute cointegration every day because their relationships do not change so frequently. Once a month is sufficient. However, stock pairs, and baskets of ETFs or stocks, do fall out of cointegration fairly quickly, so daily computation is necessary.
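A rolling re-estimation along these lines might look like the sketch below. The window of 750 days (roughly 3 years) and the step of 21 days (roughly a month) are illustrative assumptions, the data are synthetic, and `rolling_hedge_ratios` is a made-up name. Each set of weights uses only data before the day it is keyed to, so there is no look-ahead.

```python
import numpy as np

def rolling_hedge_ratios(basket, etf, window, step):
    """Refit no-intercept weights on the trailing `window` rows every `step` rows.

    Returns {end_index: weights}; weights at `end_index` are fitted on rows
    [end_index - window, end_index) and are meant for use from that day on.
    """
    ratios = {}
    for end in range(window, len(etf) + 1, step):
        w, *_ = np.linalg.lstsq(
            basket[end - window:end], etf[end - window:end], rcond=None
        )
        ratios[end] = w
    return ratios

# Synthetic data, for illustration only.
rng = np.random.default_rng(2)
basket = 100 + np.cumsum(rng.normal(0, 1, size=(1500, 2)), axis=0)
etf = basket @ np.array([0.7, 0.3]) + rng.normal(0, 1, 1500)

monthly = rolling_hedge_ratios(basket, etf, window=750, step=21)
daily = rolling_hedge_ratios(basket, etf, window=750, step=1)
print(len(monthly), len(daily))
```

Comparing how much the weights move between consecutive refits gives a rough idea of the rebalancing cost that daily re-estimation would incur.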
    Ernie
