Some readers have asked whether there is an Excel version of the ADF test for cointegration (mentioned in articles here or here.) You can download one such package here (Hat tip: Bruce H.).

And as always, you can download the Matlab version from spatial-econometrics.com.

## Tuesday, January 19, 2010


## 54 comments:

I realize this is self-serving, but should you wish to post additional resources, ETF Rewind can automatically calculate ADF cross-matrices for over 50 stocks at a time in Excel. That's quite a grid size.

Keep up the terrific work Ernie.

Sincerely, Jeff

Hi Ernie

You can also get an Excel ADF plug-in here:

http://www.web-reg.de/

Don't worry, the site is in English!

Question.

I like to run two ADF tests over the last 504 days, 1:252 then 253:end

You'll often find a year ago (1:252) a pair not cointegrated, but now (253:end) cointegrated.

In your experience do you like to see a couple of years of good cointegration (say tstat > 90%) or are you happy to trade a pair that has only been cointegrated to any significance over the last year (253:end)?

Thanks

Matt

Matt,

Yes, I would often like to see at least 3 years of cointegration before trading a pair. Of course, that's no guarantee that it will continue to cointegrate year-after-year, but I think it is more probable that it would.

Ernie

Jeff,

Thanks for the tip!

Ernie

Hi Ernie

I am starting to study pairs trading using the cointegration model and I have a question: how do I measure the trade return? I have read some articles and books that say the way to measure the trade return is simply to subtract the entry spread from the exit spread, where the spread is ln(asset 1) - (coint. coef. * ln(asset 2)).

Is this the traditional way to measure the return in this kind of strategy?

Thank you

Hi Paulo,

You can simply calculate the returns of each side of the pair, and assuming you have invested equal capital on each, average these 2 returns.

Ernie
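A minimal sketch of the averaging described in the reply above, in Python. The function name, the example prices, and the convention that the short leg earns the negative of its price change are assumptions made for illustration; they are not taken from the thread.

```python
def pair_return(long_entry, long_exit, short_entry, short_exit):
    """Average return of a pair trade, assuming equal capital on each leg."""
    long_ret = (long_exit - long_entry) / long_entry
    # Assumption: the short leg gains when its price falls.
    short_ret = (short_entry - short_exit) / short_entry
    return 0.5 * (long_ret + short_ret)

# Hypothetical prices: long leg 50 -> 52, short leg 100 -> 99.
r = pair_return(long_entry=50.0, long_exit=52.0,
                short_entry=100.0, short_exit=99.0)
print(f"pair return = {r:.4f}")
```

Transaction costs and financing of the short leg are ignored here.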

Hi Ernie,

What input time series data should I use for the Excel tool offered below:

http://www.quantcode.com/modules/mydownloads/singlefile.php?cid=9&lid=410

Should I use:

a) difference of prices of stocks 1 and 2?

b) difference of log prices of stocks 1 and 2?

c) residuals of OLS between prices of stocks 1 and 2?

d) residuals of OLS between log prices of stocks 1 and 2?

e) something else?

Thanks!

Hi Anon,

An adf test should be performed on c).

Ernie

thanks, chan, that's what I am looking for, switching from excel to matlab just for an ADF test annoys me, cheers.

sorry, I should call you Ernie, apologize.

Thanks for the reply Ernie. I get different values for the Dickey Fuller Test Statistic if I use OLS residuals from prices vs. log prices. Is there any reason why you chose c) prices instead of d) log prices?

Thanks

Michael

Michael,

You will usually get different t-stat if you use log instead of raw prices, but I bet the conclusions on whether they cointegrate are the same.

I prefer raw prices to log prices because if you pair trade using log prices, you will have to rebalance your positions daily, whereas if you use raw prices, you can hold a fixed number of shares until you exit the pair. But in terms of testing for cointegration, they should be the same.

Ernie

Hi Ernie,

Other than pairs such as IVV/SPY, I'm finding that the cointegrating relationship is often unstable if a rolling vector/ADF is computed, e.g. the ADF statistic may oscillate from -2 (not cointegrating) to -4 (cointegrating) during the past 3 years. Would you ignore pairs such as these and just focus on ones which are strongly cointegrating at all times?

Thanks, Paul

Hi Ernie, just to add to Paul's question, I have seen many instances of pairs with very negative ADF test statistics (low p-value) when tested from 2006-2009, but their ADF stats for each of 2006, 2007, 2008, 2009 is much less negative (higher p-value). Any thoughts on why this might be? How is it possible that a pair integrate well over the entire 4 years but not so well over each of the individual years?

Thanks

Hi Paul and Michael,

The basic problem with applying the ADF test (or in fact, ANY statistical test) to financial time series is that finance is not like physical science: the statistics change. So it is no surprise that a pair that used to cointegrate can stop doing so for some period, and then become cointegrating again. But yes, if you find a pair that consistently cointegrates over any 1-year period, we can say that it has a better likelihood of repeating the same performance going forward.

Ernie

Thanks Ernie. But why is it possible for a pair to co-integrate with ADF stat of say, -1.50, for each of the INDIVIDUAL years 2006, 2007, 2008, 2009, but to show an ADF stat of -4.50 for the ENTIRE 2006-2009 period? i.e. how is it that the pair co-integrates MUCH better when we look at the entire 4 year period as a whole rather than look at each year individually?

Hi Michael,

That is indeed strange. I have not observed this phenomenon before, but will keep an eye out for it.

Ernie

Thanks for the reply Ernie.

Let's say you have a pair for which the ADF statistic (computed using a rolling 1-year window) has oscillated between -2.5 and -4.0 over the past 3 years.

Would you:

a) consider such a pair to be worth trading continuously

b) not trade the pair at all

c) trade the pair only when it is beyond the 5% critical level, and deactivate it if it drops below that (I guess running the risk of 'chasing cointegration')

thanks, Paul

Hi Michael/Paul,

Actually, if a pair has a long half-life, e.g. over a year, then it is quite possible that it will not show cointegration for each individual year, but can be cointegrating over the longer term, such as 4 years.

So to answer your question, Paul, you should compute the half-life of mean-reversion of the pair, and decide if you can tolerate a holding period (and drawdown) of that duration.

Ernie
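One common way to estimate the half-life mentioned in the reply above is to regress the daily change of the spread on its previous level and convert the slope to a half-life. The sketch below assumes that convention (half-life = -ln 2 / slope), uses a synthetic spread rather than real prices, and the name `half_life` is hypothetical.

```python
import math
import random

def half_life(spread):
    """Half-life of mean reversion from the regression
    spread[t] - spread[t-1] = a + lam * spread[t-1] + noise."""
    lagged = spread[:-1]
    diffs = [b - a for a, b in zip(spread[:-1], spread[1:])]
    n = len(diffs)
    mean_l, mean_d = sum(lagged) / n, sum(diffs) / n
    sxx = sum((l - mean_l) ** 2 for l in lagged)
    sxy = sum((l - mean_l) * (d - mean_d) for l, d in zip(lagged, diffs))
    lam = sxy / sxx
    if lam >= 0:
        return math.inf  # no mean reversion detected
    return -math.log(2) / lam

# Synthetic mean-reverting spread (assumed AR(1) coefficient of 0.9).
random.seed(1)
spread = [0.0]
for _ in range(1999):
    spread.append(0.9 * spread[-1] + random.gauss(0, 1))

hl = half_life(spread)
print(f"estimated half-life: {hl:.1f} bars")
```

The result is in units of the sampling interval, so with daily closes it is a number of trading days.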

Ernie, what is half life of a pair and how does one go about computing it? Do you have any useful links on this?

Thanks much

Michael

Ernie, I've been computing my half-life by fitting an OU process to my spread, and for those spreads which are strongly cointegrated the half-life is indeed short. Which leads me to the question: couldn't I just derive any spread (using the cointegration vector, price ratio or otherwise) and rely solely on the OU fit to determine whether the pair is tradable? Thanks, Paul

Michael,

You can just search for the keyword "half life" on my blog, and you will find several posts on this subject.

It is also explained in detail in my book Quantitative Trading.

Ernie

Paul,

Yes, you can run a profitable short-term mean-reversion strategy on a short-half-life pair, just based on a regression fit.

However, a cointegration test is better if you intend to hold the pair over a longer period, since a short half-life does not guarantee cointegration.

Ernie

Ernie,

To get the OLS residual, how do we decide whether to graph stock A on the X-axis and stock B on the Y-axis or vice versa?

Thanks,

A

A,

This is indeed a good point.

As with cointegration test, the order of X vs Y does make some difference. My general belief is that this difference is not big enough to matter. However, if you really want an order-independent way to construct your spread, or conduct your cointegration test, use Johansen rather than cadf test. You can use the eigenvector from the Johansen test to construct your spread.

Ernie
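The order dependence in the reply above can be seen with a small numerical sketch. The data below are made up; the point is only that the slope from regressing Y on X is not the reciprocal of the slope from regressing X on Y, because the product of the two slopes equals R squared, which is below 1 unless the fit is perfect.

```python
def slopes_and_r2(x, y):
    """Return (slope of y on x, slope of x on y, R^2) for simple OLS."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx, sxy / syy, sxy ** 2 / (sxx * syy)

# Hypothetical price series for two stocks.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b_yx, b_xy, r2 = slopes_and_r2(x, y)
print(f"hedge ratio from Y on X: {b_yx:.4f}")
print(f"hedge ratio implied by X on Y: {1 / b_xy:.4f}")
print(f"R^2: {r2:.4f}")
```

The closer R squared is to 1, the smaller the gap between the two hedge ratios, which is consistent with the view that the difference is often too small to matter.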

Thanks Ernie,

I apologize for my ignorance when it comes to cointegration. But, I was wondering if you can clarify a few points.

My understanding is that the CADF tests the null hypothesis which is that the series is non-stationary or NOT cointegrated. First, a p value of .05 or smaller is considered statistically significant. Therefore, we would reject the null hypothesis if and only if the p-value is smaller than 0.05, if not, I can't reject the null and my series can be considered as non-stationary or NOT cointegrated. If the p value is less than 0.05 we can consider the series to be cointegrated. Am I correct?

Could you please explain the Dickey Fuller Test Statistic, the lag order, and Rho (1=random walk,0=w noise)? (These are the inputs and outputs in the excel add-in that you provided the link to)

Also, how are the results interpreted?

After we have found that the p value is less than 0.05 for the residuals of OLS between prices of stocks 1 and 2, how do you determine the optimum hedge ratio? Do you just use the slope of the linear regression of stock 1 price vs. Stock 2 price?

I know I’ve asked a lot of questions, but any help would be greatly appreciated.

Thanks,

A

A,

You are right about the p-value.

You can try different lag orders to find out which one will lead to cointegration.

I am not familiar with Rho, as I don't personally use this Excel program. I use the Matlab program instead.

The optimum hedge ratio is determined by linear regression between the 2 price series.

For more details of how this is done, you can read examples 3.6 and 7.2 of my book.

Ernie

Hello,

I run Amibroker and have been trying to find a way to code in the ADF test to get a t-stat or something.

I have already programmed the regression coefficient and the half-life but don't know how to do the ADF test.

I've looked online and there is no clear explanation (well, to me anyway) of what you have to do step by step to obtain this stat. Are you aware of any such explanation?

Thanks,

J

Hi Ernie,

Does every cointegrated pair have a mean-reverting residuals process? In my opinion, if the optimal lag order p, given by the Augmented Dickey-Fuller test, is different from 0, then the residual of OLS between prices of stocks 1 and 2 is not a stationary process and therefore the corresponding pair cannot be used as a trading pair. In fact, in that case, the residuals process is an AR(p) and thus does not have a constant mean. Am I right?

Thank you,

James.

Hi James,

p is not the lag, it is the order of polynomial. But if p>0, you are right that you have to allow for a non-zero drift in your time series. But that's not a problem with mean-reverting trade: it just says you should also have a long term long or short position, but also trade around it.

Ernie

Ernie,

Have you ever used the adf.test function in R? I ran it on a few time series in pairs, and it keeps giving DF statistics higher than positive 1.5 and p-values over 0.99. What do you think could be the cause?

BSendero@gmail.com

Hi intuit,

I actually haven't tried any programs in R myself.

Ernie

Hi Ernie,

Is there a way to conduct ADF test on basket of currencies instead of just a pair? I want to test the mean reverting theory on basket of currencies. If there is, then how to implement or what tool to use. Appreciate your reply.

Sincerely,

Kan

Hi Kan,

If you want to find a basket of currencies that cointegrate, you should use the Johansen test instead.

A free software package for running adf and Johansen tests in Matlab is available from spatial-econometrics.com

Ernie

Ernie,

I'm a beginner at the subject and I've downloaded that ADF spreadsheet to test a few stocks for cointegration.

I have the closing price series for these stocks and I would like to know what kind of series I need to enter at that spreadsheet for the ADF-test.

thank you and congratulations for your blog

Michel,

I have not used the Excel adf program myself, but I am guessing that you need to input a price series with ascending order in dates.

Ernie

Ernie,

When I want to use the ADF test to verify whether two stock series (stock1 and stock2) are cointegrated, how should I proceed? Should I run the ADF test on the difference between the stock2 price and the OLS regression line of stock1 and stock2?

Should I use closing price or log of that?

thank you, regards

Michel Caspary

Michel,

You should first use ols regression to find the hedgeRatio between the closing prices of the 2 stocks, then use this hedgeRatio to find the spread of the 2 closing prices, and finally use adf test to find if this spread is stationary.

Of course, if you have the cadf test available, you can run it directly on the 2 closing prices to test for cointegration instead.

Ernie
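The three steps in the reply above (OLS hedge ratio, spread, stationarity test on the spread) can be sketched in plain Python. This is an illustration under stated assumptions, not the Matlab or Excel code discussed in the thread: the prices are synthetic, the helper names `ols` and `df_tstat` are hypothetical, and the test is the simple Dickey-Fuller regression with a constant and no lagged-difference terms, so it is not a full augmented test.

```python
import random

def ols(x, y):
    """Return (slope, intercept) of the least-squares line y = slope*x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def df_tstat(series):
    """t-statistic of b in  s[t] - s[t-1] = a + b*s[t-1] + e[t]."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    b, a = ols(lagged, diffs)
    n = len(diffs)
    resid = [d - (a + b * l) for d, l in zip(diffs, lagged)]
    s2 = sum(r * r for r in resid) / (n - 2)      # residual variance
    ml = sum(lagged) / n
    sxx = sum((l - ml) ** 2 for l in lagged)
    return b / (s2 / sxx) ** 0.5                  # slope / standard error

# Synthetic closing prices: X is a random walk, Y = 2*X + 5 + stationary noise.
random.seed(0)
price_x, noise = [100.0], [0.0]
for _ in range(499):
    price_x.append(price_x[-1] + random.gauss(0, 1))
    noise.append(0.5 * noise[-1] + random.gauss(0, 1))
price_y = [2.0 * x + 5.0 + e for x, e in zip(price_x, noise)]

# Step 1: hedge ratio from OLS on closing prices.
hedge_ratio, intercept = ols(price_x, price_y)
# Step 2: spread built with that hedge ratio.
spread = [y - hedge_ratio * x for x, y in zip(price_x, price_y)]
# Step 3: Dickey-Fuller t-statistic on the spread.
tstat = df_tstat(spread)
print(f"hedge ratio = {hedge_ratio:.3f}, DF t-stat = {tstat:.2f}")
```

A more negative t-statistic is stronger evidence of stationarity. Because the hedge ratio was itself estimated, the statistic should be compared against critical values tabulated for residual-based cointegration tests rather than ordinary t tables.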

Ernie,

should the coint be done on the residual of stk 1 and 2 or the residual of stk 1 and 2 against time?

Anon,

Both should be tried, and usually the results are the same. But to eliminate order-dependence, use Johansen test where this problem does not arise.

Ernie

Hi Ernie, the link to the Excel package is dead... anywhere else we can get it?

Hi anon,

It appears that http://www.quantcode.com/modules/mydownloads/singlefile.php?lid=573

has one.

Ernie

Hello Ernie,

so I have a OLS line of a cointegrated pair.

PriceY = PriceX * slope + intercept

lets suppose I want to trade about $1mm of StockY, so I find the number of shares to trade StockY (sharesY). How do I go about finding the number of shares to trade StockX (sharesX) ?

thank you.

Michel

Hi Michel,

In your equation, 1 share of Y is hedged with slope shares of X.

So sharesY will be hedged with slope*sharesY of X.

Ernie
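As a worked example of the sizing rule in the reply above, assuming a hypothetical price of $50 for StockY and a hypothetical regression slope of 0.8 (neither figure comes from the thread):

```python
def hedge_shares(capital_y, price_y, slope):
    """Shares of Y bought with capital_y, and shares of X that hedge them."""
    shares_y = round(capital_y / price_y)
    shares_x = round(slope * shares_y)  # 1 share of Y is hedged with `slope` shares of X
    return shares_y, shares_x

shares_y, shares_x = hedge_shares(capital_y=1_000_000, price_y=50.0, slope=0.8)
print(shares_y, shares_x)
```

The two legs take opposite sides: if Y is bought, X is sold short, and vice versa.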

thank you very much

regards.

Michel

Ernie,

I'm using this spreadsheet but I am a little confused on the ln(A)-ln(B) values. Can I use intraday prices (ie. every second prices) or do I have to use EndofDay prices so there is one datapoint per day?

Will using more frequent data screw up my cointegration test?

I have access to submillisecond price data and I want to make sure I can use it instead of EoD prices...

Aditya,

There is no point in using intraday data for cointegration tests, unless you are interested in trading intraday. If you do, you have to concatenate all the intraday prices together, and remove the overnight gap, before you can apply cointegration test.

I also seriously doubt that you can apply Excel on sub millisecond prices due to memory and CPU constraints.

Ernie

Hi Ernie,

I downloaded the excel file you shared a few posts above http://www.quantcode.com/modules/mydownloads/singlefile.php?lid=573

According to the description in the post there it says, "To know if they are mean reverting, simply create a series = Log(MSFT price)-Log(GOOG price). Paste the data into Sheet1 in cell A4"

However in the comments section here you mentioned one should take the residuals and use that as the time series. Is the description for the excel file wrong?

(btw if I regress stock A's return on stock B's return, the residual you talk about would be the error variable and not the standard dev of that error term right? And the whole point of the test would be to test if the residuals are stationary and have a mean of 0 and if so the pair are cointegrated?)

And one more dumb question: in the excel file you linked to, in cell A4 for the time series, do I place the most recent data at the top or the oldest data?

Thanks and kind regards,

Chad

Hi Chad,

I did not write the Excel program, nor did I write the note that we should just use log(MSFT)-log(GOOG) as the signal. My strategy would be to use linear regression to find the best hedge ratio, and use the residual of this LR as signal.

The residual can be standardized or not: either can be used as signal.

I never use that Excel code myself, as I am a Matlab-user, so I am afraid I cannot explain that code to you.

Ernie

Hi Ernie,

I am new to all this. Could you please send an Excel file that would help me understand the workings in detail, along with the formulas?

I have been trying really hard to understand the concept.

Hi Ajay,

I don't actually use that Excel program for my own ADF test, as I am a Matlab user. If you want to understand how the ADF test works in general, and to obtain the sample input data files, please see Example 2.1 of my second book, Algorithmic Trading.

Ernie

Hi Ernie,

When you look for the z-score of a mean-reverting pair, do I take the spread of the pair for the z-score calculation, or the residual of that pair?

P.S. i love your books and this blog too : )

Hi,

Thanks for your kind words on my books.

Zscore is based on the spread (i.e. residuals) of prices. One does not take the spread of the Zscores of residuals.

Ernie
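A minimal sketch of a z-score computed on a spread, as described in the reply above. The spread values are made up, and using the full-sample mean and sample standard deviation (rather than a rolling window) is an assumption made to keep the example short.

```python
import statistics

def zscores(spread):
    """Standardize a spread series with its own mean and sample standard deviation."""
    mean = statistics.mean(spread)
    std = statistics.stdev(spread)
    return [(s - mean) / std for s in spread]

# Hypothetical spread (residuals of the price regression).
spread = [1.0, 2.0, 3.0, 4.0, 5.0]
z = zscores(spread)
print([round(v, 3) for v in z])
```

In live trading the mean and standard deviation would have to come from past data only, to avoid look-ahead bias.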
