Some readers have asked whether there is an Excel version of the ADF test for cointegration (mentioned in articles here or here.) You can download one such package here (Hat tip: Bruce H.).
And as always, you can download the Matlab version from spatial-econometrics.com.
59 comments:
I realize this is self serving, but should you wish to post additional resources, ETF Rewind can automatically calculate ADF cross-matrices for over 50-stocks at a time in Excel. That's quite a grid size.
Keep up the terrific work Ernie.
Sincerely, Jeff
Hi Ernie
You can also get an Excel ADF plug-in here:
http://www.web-reg.de/
Don't worry the site is in English!
Question.
I like to run two ADF tests over the last 504 days, 1:252 then 253:end
You'll often find a year ago (1:252) a pair not cointegrated, but now (253:end) cointegrated.
In your experience do you like to see a couple of years of good cointegration (say tstat > 90%) or are you happy to trade a pair that has only been cointegrated to any significance over the last year (253:end)?
Thanks
Matt
Matt,
Yes, I would often like to see at least 3 years of cointegration before trading a pair. Of course, that's no guarantee that it will continue to cointegrate year-after-year, but I think it is more probable that it would.
Ernie
Jeff,
Thanks for the tip!
Ernie
Hi Ernie
I am starting studying pairs trading using the cointegration model and i have a doubt. How do I measure the trade return. I read some articles and books that says the way to measure the trade return is by simple subtracting the exit spread by the enter spread when spread is LN(asset 1) - (coint.coef. * ln(asset 2).
Is this the tradicional way to measure the return in this kind of strategie ?
Thank you
Hi Paulo,
You can simply calculate the returns of each side of the pair, and assuming you have invested equal capital on each, average these 2 returns.
Ernie
Hi Ernie,
What input time series data should I use for the Excel tool offered below:
http://www.quantcode.com/modules/mydownloads/singlefile.php?cid=9&lid=410
Should I use:
a) difference of prices of stocks 1 and 2?
b) difference of log prices of stocks 1 and 2?
c) residuals of OLS between prices of stocks 1 and 2?
d) residuals of OLS between log prices of stocks 1 and 2?
e) something else?
Thanks!
Hi Anon,
An adf test should be performed on c).
Ernie
thanks, chan, that's what I am looking for, switching from excel to matlab just for an ADF test annoys me, cheers.
sorry, I should call you Ernie, apologize.
Thanks for the reply Ernie. I get different values for the Dickey Fuller Test Statistic if I use OLS residuals from prices vs. log prices. Is there any reason why you chose c) prices instead of d) log prices?
Thanks
Michael
Michael,
You will usually get different t-stat if you use log instead of raw prices, but I bet the conclusions on whether they cointegrate are the same.
I like raw prices instead of log prices is because if you pair trade using log prices, you will have to rebalance your positions daily, whereas if you use raw prices, you can have a fixed number of shares until you exit the pair. But in terms of testing for cointegration, they should be the same.
Ernie
Hi Earnie,
Other than pairs such as IVV/SPY I'm finding that the cointegrating relationship is often unstable if a rolling vector/adf is computed. e.g. the adf may oscillate from -2 (not cointegration) to -4 (cointegrating) during the past 3 yrs. Would you ignore pairs such as these and just focus on ones which are strongly cointegrating at all times?
Thanks, Paul
Hi Ernie, just to add to Paul's question, I have seen many instances of pairs with very negative ADF test statistics (low p-value) when tested from 2006-2009, but their ADF stats for each of 2006, 2007, 2008, 2009 is much less negative (higher p-value). Any thoughts on why this might be? How is it possible that a pair integrate well over the entire 4 years but not so well over each of the individual years?
Thanks
Hi Paul and Michael,
The basic problem with applying ADF test (or in fact, ANY statistical tests) to financial time series is that finance is not like physical science: the statistics changes. So it is no surprise that a pair that used to cointegrate can stop doing so for some period, and then become cointegrating again. But yes, if you find a pair consistently cointegrate over any 1 year period, we can say that it has a better likelihood of repeating the same performance going forward.
Ernie
Thanks Ernie. But why is it possible for a pair to co-integrate with ADF stat of say, -1.50, for each of the INDIVIDUAL years 2006, 2007, 2008, 2009, but to show an ADF stat of -4.50 for the ENTIRE 2006-2009 period? i.e. how is it that the pair co-integrates MUCH better when we look at the entire 4 year period as a whole rather than look at each year individually?
Hi Michael,
That is indeed strange. I have not observed this phenomenon before, but will keep an eye out for it.
Ernie
Thanks for the reply Earnie.
Lets say you have a pair for which the ADF (computed using a rolling 1year window) has oscillated between -2.5 and -4.0 over the past 3 years.
Would you:
a) consider such a pair to be worth trading continuously
b) not trade the pair at all
c) trade the pair only when it was beyond the 5% critical level and deactivating it if it dropped below that (i guess running the risk of 'chasing cointegration')
thanks, Paul
Hi Michael/Paul,
Actually, if a pair has a long half-life, e.g. over a year, then it is quite possible that it will not show cointegration for each individual year, but can be cointegrating over the longer term, such as 4 years.
So to answer your question, Paul, you should compute the half-life of mean-reversion of the pair, and decide if you can tolerate a holding period (and drawdown) of that duration.
Ernie
Ernie, what is half life of a pair and how does one go about computing it? Do you have any useful links on this?
Thanks much
Michael
Earnie, I've been computing my lifehalf by fitting an OU process to my spread, and for those spreads which are strongly cointegrated the HL is indeed short. Which leads me to the question, couldn't I just derive any spread (using the cointegration vector, price ratio or otherwise) and rely solely on the OU fit to determine whether the pair is tradable? Thanks, Paul
Michael,
You can just search for the keyword "half life" on my blog, and you will find several posts on this subject.
Also, it is explained in details in my book Quantitative Trading as well.
Ernie
Paul,
Yes, you can run a profitable short-term mean-reversion strategy on a short-half-life pair, just based on a regression fit.
However, cointegration test is better if you intend to hold the pair over a longer period, since a short half-life does not guarantee cointegration.
Ernie
Ernie,
To get the OLS residual, how do we decide whether to graph stock A on the X-axis and stock B on the Y-axis or vice versa?
Thanks,
A
A,
This is indeed a good point.
As with cointegration test, the order of X vs Y does make some difference. My general belief is that this difference is not big enough to matter. However, if you really want an order-independent way to construct your spread, or conduct your cointegration test, use Johansen rather than cadf test. You can use the eigenvector from the Johansen test to construct your spread.
Ernie
Thanks Ernie,
I apologize for my ignorance when it comes to cointegration. But, I was wondering if you can clarify a few points.
My understanding is that the CADF tests the null hypothesis which is that the series is non-stationary or NOT cointegrated. First, a p value of .05 or smaller is considered statistically significant. Therefore, we would reject the null hypothesis if and only if the p-value is smaller than 0.05, if not, I can't reject the null and my series can be considered as non-stationary or NOT cointegrated. If the p value is less than 0.05 we can consider the series to be cointegrated. Am I correct?
Could you please explain the Dickey Fuller Test Statistic, the lag order, and Rho (1=random walk,0=w noise)? (These are the inputs and outputs in the excel add-in that you provided the link to)
Also, how are the results interpreted?
After we have found that the p value is less than 0.05 for the residuals of OLS between prices of stocks 1 and 2, how do you determine the optimum hedge ratio? Do you just use the slope of the linear regression of stock 1 price vs. Stock 2 price?
I know I’ve asked a lot of questions, but any help would be greatly appreciated.
Thanks,
A
A,
You are right about the p-value.
You can try different lag orders to find out which one will lead to cointegration.
I am not familiar with Rho, as I don't personally use this Excel program. I use the Matlab program instead.
The optimum hedge ratio is determined by linear regression between the 2 price series.
For more details of how this is done, you can read examples 3.6 and 7.2 of my book.
Ernie
Hello,
I run Amibroker and have been trying to find a way to code in the ADF test to get a t-stat or something.
I have already programmed the regression coefficient and the hald life but don't know how to do the ADF test.
Ive looked online and there is no clear explanation(well to me anyways) of what you have to do step by step to obtain this stat. Are you aware of anay such explanation?
Thanks,
J
Hi Ernie,
Does any cointegrated pair have a mean reverting reiduals process? In my opinion, if the optimal lag order p, given by the Augmented Dickey Fuller Test, is different from 0 then the residual of OLS between prices of stocks 1 and 2 is not a stationary process and therefore the corresponding pair cannot be used as a trading pair. In fact, in that case, the residuals process is an AR(p) and thus does not have a constant mean. Am I right?
Thank you,
James.
Hi James,
p is not the lag, it is the order of polynomial. But if p>0, you are right that you have to allow for a non-zero drift in your time series. But that's not a problem with mean-reverting trade: it just says you should also have a long term long or short position, but also trade around it.
Ernie
Ernie,
Have you ever used the adf.test function in R. I ran it in a few time series in pairs, and it keeps giving DF statistics of higher than postive 1.5 and p-values of over 0.99. What do you think could be the cause?
BSendero@gmail.com
Hi intuit,
I actually haven't tried any programs in R myself.
Ernie
Hi Ernie,
Is there a way to conduct ADF test on basket of currencies instead of just a pair? I want to test the mean reverting theory on basket of currencies. If there is, then how to implement or what tool to use. Appreciate your reply.
Sincerely,
Kan
Hi Kan,
If you want to find a basket of currencies that cointegrate, you should use the Johansen test instead.
A free software package for running adf and Johansen tests in Matlab is available from spatial-econometrics.com
Ernie
Ernie,
I´m a beginner at the subject and I´ve downloaded that ADF spreadsheet to test a few stocks for cointegration.
I have the closing price series for these stocks and I would like to know what kind of series I need to enter at that spreadsheet for the ADF-test.
thank you and congratulations for your blog
Michel,
I have not used the Excel adf program myself, but I am guessing that you need to input a price series with ascending order in dates.
Ernie
Ernie,
when I want to use the ADF-test to verify if two stocks (stock1 and stock2) series are cointegrated how should I preoceed? Should I test the ADF on the difference between the stock2 price and the OLS regression line of stock1 and stock2?
Should I use closing price or log of that?
thank you, regards
Michel Caspary
Michel,
You should first use ols regression to find the hedgeRatio between the closing prices of the 2 stocks, then use this hedgeRatio to find the spread of the 2 closing prices, and finally use adf test to find if this spread is stationary.
Of course, if you have the cadf test available, you can run it directly on the 2 closing prices directly to test for cointegration instead.
Ernie
Ernie,
should the coint be done on the residual of stk 1 and 2 or the residual of stk 1 and 2 against time?
Anon,
Both should be tried, and usually the results are the same. But to eliminate order-dependence, use Johansen test where this problem does not arise.
Ernie
Hi Ernie, the link to the Excel package is dead...anywhere else we we can get it?
Hi anon,
It appears that http://www.quantcode.com/modules/mydownloads/singlefile.php?lid=573
has one.
Ernie
Hello Ernie,
so I have a OLS line of a cointegrated pair.
PriceY = PriceX * slope + intercept
lets suppose I want to trade about $1mm of StockY, so I find the number of shares to trade StockY (sharesY). How do I go about finding the number of shares to trade StockX (sharesX) ?
thank you.
Michel
Hi Michel,
In your equation, 1 share of Y is hedged with slope shares of X.
So sharesY will be hedged with slope*sharesY of X.
Ernie
thank you very much
regards.
Michel
Earnie,
I'm using this spreadsheet but I am a little confused on the ln(A)-ln(B) values. Can I use intraday prices (ie. every second prices) or do I have to use EndofDay prices so there is one datapoint per day?
Will using more frequent data screw up my cointegration test?
I have access to submillisecond price data and I want to make sure I can use it instead of EoD prices...
Aditya,
There is no point in using intraday data for cointegration tests, unless you are interested in trading intraday. If you do, you have to concatenate all the intraday prices together, and remove the overnight gap, before you can apply cointegration test.
I also seriously doubt that you can apply Excel on sub millisecond prices due to memory and CPU constraints.
Ernie
Hi Ernie,
I downloaded the excel file you shared a few posts above http://www.quantcode.com/modules/mydownloads/singlefile.php?lid=573
According to the description in the post there it says, "To know if they are mean reverting, simply create a series = Log(MSFT price)-Log(GOOG price). Paste the data into Sheet1 in cell A4"
However in the comments section here you mentioned one should take the residuals and use that as the time series. Is the description for the excel file wrong?
(btw if I regress stock A's return on stock B's return, the residual you talk about would be the error variable and not the standard dev of that error term right? And the whole point of the test would be to test if the residuals are stationary and have a mean of 0 and if so the pair are cointegrated?)
And one more dumb question: in the excel file you linked to, in cell A4 for the time series, do I place the most recent data at the top or the oldest data?
Thanks and kind regards,
Chad
Hi Chad,
I did not write the Excel program, nor did I write the note that we should just use log(MSFT)-log(GOOG) as the signal. My strategy would be to use linear regression to find the best hedge ratio, and use the residual of this LR as signal.
The residual can be standardized or not: either can be used as signal.
I never use that Excel code myself, as I am a Matlab-user, so I am afraid I cannot explain that code to you.
Ernie
Hi Erin,
Am new to all this. Request if you can please send an excel file which can help me understand the working in details along with formulas.
have been trying really hard to understand the concept
Hi Ajay,
I don't actually use that Excel program for my own ADF test, as I am a Matlab user. If you want to understand how ADF test work in general, and to obtain the sample input data files, please see my second book Algorithmic Trading Example 2.1.
Ernie
Hi Ernie,
when you look for the z score of mean reversion pair, do i take the spread 0f the pair for the z score calculation or the residual of that pair?
P.S. i love your books and this blog too : )
Hi,
Thanks for your kind words on my books.
Zscore is based on the spread (i.e. residuals) of prices. One does not take the spread of the Zscores of residuals.
Ernie
Hi ernie i am unable to download adf excel plugin.
Kindly email "santoshkhooda@gmail.com"
Any one else having pls email "santoshkhooda@gmail.com"
Hi, everyone.
Unfortunately, 'quantcode.com' site closed down a few years ago.
So, can you somebody please send me the downloaded excel file?
My email address is 'zenithgogo@naver.com'.
I'll thank you in advance.
Hj Kim
Hi Ernie,
when you talking about the residual, is it the equation of :(asset 1) - (coint.coef.*(asset 2)
or the residual of regression of the 2 stock ?
And the residual is tested for Adf test for a Delta diff of the residual and a 1 lag residual result.
Is i am right ?
Roger
Hi Ernie,
when you talking about the residual, is it the equation of :(asset 1) - (coint.coef.*(asset 2)
or the residual of regression of the 2 stock ?
And the residual is tested for Adf test for a Delta diff of the residual and a 1 lag residual result.
Is i am right ?
Roger
Hi Roger,
If you are using ADF test for cointegration of two assets, the residual is that of the linear regression between the two prices.
To simplify matters, you can just use the CADF test instead. There will be no need to run a regression, and no need to test residuals.
Ernie
Post a Comment