Comments on Quantitative Trading: Using R to Test for Cointegration
Ernie Chan
Updated 2024-03-22T10:29:59.088-04:00

2014-06-13T12:06:39.366-04:00
Ernie,
Thanks for that clarification. Yes, I was referring to your 2nd book. I'm working through implementing everything in it in R as a way to learn your methods by coding as much as possible from scratch! I'm only using R since it's basically the programming language I'm most familiar with and most comfortable doing "data wrangling" with, which will be helpful once I attempt applying these pairs trading methods to other datasets which might not be as clean as stock data, etc.

Many thanks!
-Justin

Posted by Justin

2014-06-13T11:34:27.631-04:00
Justin: You could always plug into Matlab and compare the results =)

Posted by cheerful

2014-06-13T11:13:25.433-04:00
Hi Justin,
When you wrote "my book", I am assuming you are referring to my second book. In Chapter 2 of that book, I don't think I specified that the y-intercept of the regression between EWA and EWC should be 0. Indeed, as you have noticed, I have included a column of ones in the independent variable of the ols function, indicating that we expect a non-zero intercept. This differs from Paul's R lm fit, which assumes zero intercept.

Ernie

Posted by Ernie Chan

2014-06-13T10:53:14.468-04:00
Thanks Ernie. One more clarification if you don't mind.

I notice in your ols() expression when computing the hedgeRatio, you include a vector of ones along with the 'x' variable vector, specifically the code in your book below Figure 2.5: ols(y, [ x ones(size(x)) ] )

Is this how you tell ols() to *not* use an intercept with the regression, and effectively fit the model of just:
y = beta * x

I ask because the way to accomplish this in R (from Paul's code) is:

lm( y ~ x + 0 )

and I'm getting confused by how in R I'm effectively using '0's but in Matlab it's apparently '1's! ;)

Posted by Justin

2014-06-13T08:22:10.266-04:00
Hi Justin,
I definitely used adjusted closes for that backtest. However, please note that Yahoo's adjustment is based on a multiplier, and so if you take spreads (differences) using the current data, the spread will depend on when you download the data and how many adjustments have been made since my backtest.

Yes, lm is the R equivalent of ols of spatial-econometrics.com.

Ernie

Posted by Ernie Chan

2014-06-13T01:14:49.746-04:00
Hi Ernie (and Paul),
I'm trying to reproduce the EWC-hedgeRatio*EWA spread chart in Ernie's book (Figure 2.6) in R using Paul's recommended method of computing the hedgeRatio: lm( EWC ~ EWA + 0 ), and my spread plot looks different from Figure 2.6. Mine goes from lows of -3 to highs of +3.5. I'm using Adjusted Close prices from Yahoo between "2006-4-4" and "2012-4-9" as used in the book. Perhaps the prices used to generate the plots in the book were the non-Adjusted prices and thus resulted in a different spread?

Ernie, can you confirm? As well, can you confirm the method Paul recommends for computing the hedge ratio in R is analogous to the ols() method from your Matlab implementation?

Thanks!!

Posted by Justin

2014-05-21T11:19:10.996-04:00
Hi cheerful,
We should only be concerned with Sharpe ratio after transaction costs.
Ernie

Posted by Ernie Chan

2014-05-21T11:10:48.623-04:00
Ernie: Is a Sharpe ratio of 2 before or after transaction costs?

Posted by cheerful

2014-05-21T11:09:57.086-04:00
Dr Ernie,
Is a Sharpe ratio of 2 before or after transaction costs for backtesting?

Posted by cheerful

2013-06-05T18:49:57.617-04:00
@David:
There are serious problems with the code in the 2010 blog post that you cite, and the blog comments confirm that it was problematic.

Rather than cutting and pasting that experimental code, I suggest using the O-U functions of a proven R package, such as:

- sde package: http://cran.r-project.org/web/packages/sde/

- ouch package: http://cran.r-project.org/web/packages/ouch/

Good luck with your modeling.

Paul

Posted by Paul Teetor

2013-06-01T17:49:36.360-04:00
Thanks Ernie,
Do you know of any similar R code?
I tried to re-create the code from your book in R but to no avail.

Thank you in advance.
David

Posted by Anonymous

2013-06-01T09:08:28.538-04:00
Hi David,
The halflife calculation on my spreads page is based on very old data: only the spread itself is updated live, while the average, stddev, and halflife are all computed using 2007 data. This is deliberate, because we want to see how stationary the spread really is.

I have Matlab code for the halflife calculation in my first book, Example 7.5.

Ernie

Posted by Ernie Chan

2013-06-01T00:17:50.283-04:00
Hi Ernie (and Paul):
I am new at this but really enjoying your blog and two books. I too am trying to use R for my analysis, and I found this code for calculating half-lives:
http://pcweicfa.blogspot.com/2010/08/r-implementation-of-ornstein-uhlenbeck.html

This (after running the fixes he mentions) seems to produce very different results for half-lives than what you have on your subscriptions/spreads page. Am I missing something or is the code incorrect? Is there anywhere I can find working Ornstein-Uhlenbeck code?

Many, many thanks. Now back to reading your new book!
David

Posted by Anonymous

2013-03-21T14:12:35.590-04:00
Hi Anon,
Yes, you can trade the exact same spread that you use to generate signals.

In my book, I trade a spread with the same dollar amount on both sides for simplicity only. Strictly speaking, the shares on the two sides should be weighted by the hedgeRatio.

Ernie

Posted by Ernie Chan

2013-03-21T10:26:39.105-04:00
I've read your algo trading book and it's great! I have recommended it as a great overview of the area. The difficult thing is to get an overview. After reading it you can choose for yourself what specialty you want to study more.

I got a fundamental question from a trader: why do I look for a signal on one thing, and trade on a totally different thing (why is the trade signal weighted differently than the spread)?

If I find a cointegrated pair, I also get the hedge ratio. Thus, I can form the
spread = s1 - (hedgeratio * s2)

So I listen to trade signals calculated where s1 and s2 have equal weight (typically the signal is s1/s2 and then we check the std dev).

But I trade the spread, where s1 and s2 have different weights (s2 has a weight of hedgeratio).

Should not the trade signal be calculated from s1 and (hedgeratio * s2)? Now the signal is calculated from s1 and s2. I look for a signal on s1/s2, and trade on a different thing: s1 - hedgeratio * s2?

Posted by Anonymous

2012-11-07T05:40:53.016-05:00
@Paul Teetor/@Ernie Chan:
Thanks for the great post. I like your sites and blog. I'm a new fan of R and quant trading. I understand that the adf.test function removes the intercept AND trend before the unit root test. That also means that if adf.test says it's cointegrated, the mean of the spread reverts to a linear equation (Y = a*X + b). In practice, we need to reconstruct this relation by lm(Y ~ X) and update the mean every day to fit this equation. Is my understanding correct? And just one question, maybe a stupid one: why also update the stds every day (is it because the trading threshold changes every day because of trends in the spread)?

Thanks in advance

Posted by A Lei

2012-04-29T18:33:38.347-04:00
Pete,
My web site has become unstable and occasionally becomes unavailable. Obviously, that needs to be fixed. In the meantime, that code is still available at the site (when it's up), http://quanttrader.info/public/ .

Paul

Posted by Paul Teetor

2012-04-26T22:23:17.271-04:00
Hi Ernie,
I have been following your blog regularly, great work! I don't have Matlab but can use R. I want to use Paul T's R code to test cointegration, but somehow his website seems no longer available.

Do you or anyone else happen to have saved his code/notes?

Thanks a lot,
Pete

Posted by Anonymous

2012-02-09T09:23:17.664-05:00
hi guys,
are there any Java libraries that do the cointegration test using ADF?

I have found one (Suanshu) that uses the Johansen test, but not the above.

regards.
issy.

Posted by Alpha

2011-11-06T16:13:24.230-05:00
Suny:
Thanks for your questions.
It seems your first question is essentially asking, does a shorter half-life indicate a pair is more favorable for trading? It indicates the pair could generate profits more quickly. It does not show the pair is more likely to be mean reverting.

As for correlation, it is not useful for trading mean-reverting pairs.

I am not exactly clear what you mean by "negative statistics". In any event, I would never execute a mean-reversion trade on a pair whose half-life was negative. That strikes me as illogical.

Posted by Paul Teetor

2011-10-08T07:20:50.357-04:00
Dear all,
Great discussion here. I just want your help in understanding this better:

Q1: Are pairs with 1.) a high chance of cointegration (very negative statistics below the critical value in the Matlab CADF test), 2.) high correlation (measured by return correlation) and 3.) small half-life (less than 20 days for a 1-year daily time series) more favorable than pairs with only items 1 & 3? Alternatively, how does "correlation" help in pair selection?

Q2: Someone mentioned that in calculating half-life, the figure could be negative, which means no mean reversion. But is it possible one could get a highly negative statistic combined with a NEGATIVE half-life? What does that mean?

Posted by Suny

2011-03-18T13:57:41.855-04:00
Hi Paul -
As for your comment on detrending: I have read through your guide on cointegration of pairs located here:

http://quanttrader.info/public/testForCoint.html

In your comments on this blog you indicate that you detrend your time series spread before running an ADF test. In your guide you don't explicitly mention anything on detrending. Is it because the ADF test in R automatically detrends?

I'm just trying to better understand. Thank you very much. It would help me a great deal.

Pete

Posted by Anonymous
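
On the intercept question raised earlier in this thread, here is a minimal R sketch of how lm(y ~ x + 0) and lm(y ~ x) can give different hedge ratios, and therefore different spreads. The numbers are made up for illustration (they are not the EWA/EWC prices), and the variable names are assumptions. Per Ernie's reply above, the Matlab call ols(y, [x ones(size(x))]) fits an intercept, so its R counterpart would be lm(y ~ x) rather than lm(y ~ x + 0).

```r
# Hypothetical toy data, not the EWA/EWC series discussed in the thread:
# y is exactly 2 + 3 * x, so the true intercept is 2 and the true slope is 3.
x <- c(1, 2, 3, 4, 5)
y <- 2 + 3 * x

# With an intercept (lm adds the constant term by default).
fit_with_intercept <- lm(y ~ x)

# Without an intercept: the "+ 0" drops the constant term (y = beta * x).
fit_no_intercept <- lm(y ~ x + 0)

# The slope on x plays the role of the hedge ratio in each fit.
hedge_with_intercept <- coef(fit_with_intercept)[["x"]]
hedge_no_intercept <- coef(fit_no_intercept)[["x"]]

# Different hedge ratios give different spreads y - hedgeRatio * x.
spread_with_intercept <- y - hedge_with_intercept * x
spread_no_intercept <- y - hedge_no_intercept * x

print(c(with_intercept = hedge_with_intercept,
        no_intercept = hedge_no_intercept))
print(range(spread_with_intercept))
print(range(spread_no_intercept))
```

On this toy data the fit with an intercept recovers the slope of 3, while the zero-intercept fit absorbs the constant into a larger slope, so the two spreads have different levels and ranges. That is one possible reason a chart built from lm( EWC ~ EWA + 0 ) might not match one built from a regression that includes an intercept; differences in adjusted prices, as Ernie notes, are another.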