Quantitative investment and trading ideas, research, and analysis.
Thursday, May 07, 2009
My pairs trading workshop in London
I will be holding a 2-day, hands-on, pairs trading workshop in London, October 14-15. It will be held in conjunction with the Automated Trading 2009 conference organized by the Technical Analyst magazine. Please see details here.
An off-topic message, if I may: here is a question about applying Kelly's rule that has confused me for a long time.
Kelly's rule tells us how to set the optimal leverage under a normal distribution assumption. In practice, we have restrictions: for example, a single loss should be less than 2% and the maximum loss should be less than 10%. In most cases, not only can we not use leverage, but we should also reduce the capital we put in the market. In this case, how do we combine Kelly's rule with money management?
Thanks a lot.
This seminar sounds pretty cool. Wish I could attend, but flying over to London would be a bit much.
Pairs trading comprises 80% of my trading.
I've always wondered about cointegration and time horizon. I've noticed that the half-lives of some of your example cointegrated pairs are quite long: days and even weeks. How would one develop a high frequency strategy with such a long time horizon?
Is cointegration meaningful in the very short term?
You are right about the Gaussian assumption underlying the simple Kelly formula. We should always impose a stricter constraint on leverage based on our expectation of outlier events. The Kelly formula merely establishes an upper limit on leverage.
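As a rough illustration of treating the Kelly number as an upper limit and then tightening it with a money-management rule, here is a minimal sketch. It assumes the simple Gaussian form of the Kelly formula, f = m / s^2 (mean excess return over variance of returns). The function names, the worst-case return, and all numeric inputs are hypothetical and chosen only for illustration.

```python
def kelly_leverage(mean_excess_return, return_std):
    """Kelly leverage under the Gaussian assumption: f = m / s**2."""
    return mean_excess_return / return_std ** 2


def constrained_leverage(kelly_f, worst_case_return, max_single_loss):
    """Cap leverage so that a hypothetical worst-case one-period return of the
    unleveraged strategy costs at most max_single_loss of account equity."""
    risk_cap = max_single_loss / abs(worst_case_return)
    # Kelly is only the upper limit; the tighter of the two constraints wins.
    return min(kelly_f, risk_cap)


# Hypothetical inputs, chosen only for illustration.
m = 0.12        # annualized mean excess return
s = 0.20        # annualized standard deviation of returns
worst = -0.08   # assumed worst one-period return of the unleveraged strategy

f_kelly = kelly_leverage(m, s)
f_used = constrained_leverage(f_kelly, worst, max_single_loss=0.02)
print(f"Kelly leverage: {f_kelly:.2f}, leverage after the 2% loss cap: {f_used:.2f}")
```

With these made-up inputs the loss cap, not Kelly, is the binding constraint, and the resulting leverage is below 1, which matches the situation described in the question: only part of the capital is put in the market.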
For high frequency pairs trading, the cointegration concept is especially useful. By its very nature, cointegration refers to the long-term behavior of a pair, not the intraday behavior. Hence a pair that is not cointegrating can be a good candidate for intraday trading. On the other hand, a pair that is highly cointegrating can suggest a good candidate for both long and short-term trading.
If cointegration is long-term in nature, why do you say it is especially useful for intraday pairs trading?
If a pair is not cointegrated, what use is that info for intraday trading?
And if it is, and it's long-term, again what is the usefulness of that info?
I meant to say cointegration is not especially useful for intraday pair trading. And a pair that is not cointegrating can still be a good candidate for intraday trading. Thanks for catching these errors.
Would love to get across to the UK for your workshop but can't.
Do you have any plans for maybe a DVD of the workshop or making the powerpoints available on your premium section of your website?
The DVD idea is a good suggestion. I will discuss it with Technical Analyst magazine once the workshop is over.
I wonder what the usefulness of a stationarity test is. In the end, because a market was stationary in the prior period, does that really mean it will be stationary in the next one?
And if a market is non-stationary in the prior period, does that mean it will be non-stationary in the next one?
I find the whole thing a bit overrated. It is similar to looking at a chart and saying that a time series is consolidating in a range, so it's useful to sell tops and bottoms.
As soon as the range is identified, the time series starts to drift. How is stationarity useful in predicting the future properties of a time series?
You are right that past stationarity does not guarantee future stationarity, and vice versa. However, I believe there is a higher probability that the future price series will be stationary if you find that it has been stationary for a sufficiently long past period. After all, this is the whole foundation of pair trading. You can in fact estimate this probability by dividing the data set into 2 parts and backtesting both. If my statement were not true, no pair trading would be successful.
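One way to picture the split-sample check described above is to compute a stationarity statistic on each half of the data and see whether both halves agree. The sketch below is a simplification under stated assumptions: it uses a plain Dickey-Fuller regression with a constant and no lagged difference terms (not the full augmented test), a synthetic spread, and the commonly quoted asymptotic 5% critical value of about -2.86 for a directly observed series. Residuals from a cointegrating regression call for more negative critical values. All names here are hypothetical.

```python
import math
import random


def df_tstat(series):
    """t-statistic of lam in: diff(y)_t = alpha + lam * y_(t-1) + error."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(lagged)
    mean_x = sum(lagged) / n
    mean_y = sum(diffs) / n
    sxx = sum((x - mean_x) ** 2 for x in lagged)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lagged, diffs))
    lam = sxy / sxx
    alpha = mean_y - lam * mean_x
    ssr = sum((y - alpha - lam * x) ** 2 for x, y in zip(lagged, diffs))
    std_err = math.sqrt(ssr / (n - 2) / sxx)
    return lam / std_err


def split_sample_tstats(series):
    """Run the same statistic on the first and second half of the data."""
    half = len(series) // 2
    return df_tstat(series[:half]), df_tstat(series[half:])


# Synthetic mean-reverting spread (AR(1) with coefficient 0.8), illustration only.
random.seed(1)
spread = [0.0]
for _ in range(1999):
    spread.append(0.8 * spread[-1] + random.gauss(0.0, 1.0))

CRITICAL_5PCT = -2.86  # approximate asymptotic value, constant and no trend
t_first, t_second = split_sample_tstats(spread)
for label, t in (("first half", t_first), ("second half", t_second)):
    verdict = "stationary" if t < CRITICAL_5PCT else "not rejected"
    print(f"{label}: t = {t:.2f} ({verdict} at roughly the 5% level)")
```

Repeating this over many pairs and counting how often a pair that passes in the first half also passes in the second gives an empirical estimate of the persistence probability mentioned above.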
I bought your book a few days ago and I have a small question regarding your example 3.6 (pairs trade). You regress the prices of the two ETFs. In other articles I saw that people often regress the log prices. What is the difference? If we assume that the prices are log-normally distributed, isn't it more correct to regress the log prices?
If 2 price series cointegrate, then regressing their raw prices vs. log prices makes no difference, since the spread between them will no longer be log-normal. If the 2 price series do not cointegrate, then again their log prices will not cointegrate either.
If cointegration is not useful for intraday pair trading, what IS useful for intraday pair trading?
For short-term pair trading, just use the backtest returns and/or Sharpe as a guide to whether that pair is suitable.
I would like to ask a question after reading your book. What is the optimal window length, from your experience, for a cointegration test? I would guess that the longer a pair was stationary in the past, the greater the chance it will stay stationary in the future. So another question arises: what is the minimum window length for the test?
Thanks in advance.
The minimum window for cointegration test is about 1 year. As for the optimal window, it really depends on the halflife of mean reversion for the pair. The longer the halflife, the longer the optimal test period.
What data source do you use for your calculations?
Would you consider doing a workshop in Toronto if I organize it?
I use Yahoo Finance for my pairs backtest.
Sure, I would be interested in holding a workshop in Toronto and elsewhere too.
Have you ever used the ActiveX API of Interactive Brokers via MATLAB? Technically it should be easy, I guess. Any comments about the efficiency / reliability?
Yes, the Matlab2IBAPI is great for automating trading. I will be posting an article on this soon to my book / premium content website.
Dr. Chan, do you publish your real P&L?
My fund's performance is published monthly to current and prospective investors.
Look forward to this workshop in Toronto.
Thanks for writing the Quantitative Trading book! I enjoyed it very much and it has given me new inspiration and ideas in my search for a robust automatic trading system. In the chapter about pairs trading you mention using the Dickey-Fuller test to see whether a pair is cointegrated; why not the Johansen test? Check this paper: http://ses.library.usyd.edu.au/bitstream/2123/4072/1/Thesis_Schmidt.pdf. It seems the Johansen test is preferred. Is the ADF test good enough for our purposes of checking for mean reversion?
Thanks, Yaser! Will let you know if this Toronto workshop materializes.
I find that Dickey-Fuller is quite adequate for a pair of stocks. If you have more than 2 stocks, the Johansen test is required. But there is a limitation on how many stocks you can run the Johansen test on (somewhere around 15, I think).
What's the downside of publishing your P&L on the blog? I thought this would be excellent.
The SEC does not look favorably on hedge fund managers marketing to the general public. Hedge fund investors are supposed to be "qualified investors" as defined by SEC regulations.
However, interested "qualified" investors are always welcome to email me privately for further discussions.
Do you have some pictures of your trading room? Just for kicks.
I have understood the pair trading concept but I am wondering how to implement it quantitatively. I have some queries:
1. I would like to know which language would be best for quantitative analysis of pair trading: Matlab or C, or whether Excel is also sufficient.
2. I would like to know how to split one time series into nonstationary and stationary time series (according to the common trend model it is assumed to be the sum of these two terms), because to get the cointegration coefficient I want that stationary part.
Please help me with pair trading and tell me if I am going in the wrong direction.
My trading room is too boring to take pictures of ... just 2 desks with 6 monitors and 1 TV on top.
I would recommend Matlab for pairs.
Regarding testing for cointegration, you should buy my book which comes with detailed instructions on this test, as well as Matlab codes.
Are all the coding suggestions also given in your book? I ask because I am poor at coding.
I have analyzed statistical arbitrage thoroughly but find the coding painful, so please let me know if your book also has coding suggestions for all the time series analysis and arbitrage. If so, I will definitely purchase the book.
Yes, I have included a tutorial on programming in Matlab as an appendix in the book.
If we filter the time series with a Kalman filter, can we get the stationary part of the time series by subtracting the Kalman-filtered value from the time series value?
Can we treat the Kalman-filtered value as the nonstationary part of the time series?
It may be possible to extract the stationary part of a time series by Kalman filter, but no guarantees here. If your time series is inherently non-stationary, like the price of GOOG, then no amount of Kalman filtering will make it stationary. However, you can always run a stationarity test to confirm your findings.
So how could we get the stationary part and the nonstationary part separately from time series data?
If a stationary component does in fact exist (no guarantee of that!), you can use a VAR or "error correction model" to extract it. See spatial-econometrics.com, or the book Market Models (http://www.amazon.com/dp/0471899755?tag=quantitativet-20&camp=14573&creative=327641&linkCode=as1&creativeASIN=0471899755&adid=1YJJCANYM25W60J2MBP9&)
Hi, I built my whole algorithm on the Engle-Granger (EG) cointegration method followed by the ADF unit root test.
When I tested this on one day of second-by-second data for two stocks, it showed that the residual is non-stationary.
I tried many combinations, and all of them came out non-stationary.
Where can I get details regarding the selection of stocks, the amount of data, and whether to use tick-by-tick or day-by-day data?
If you are a reader of my book or a subscriber to my Premium Content, you can get these 2 articles on http://www.epchan.com/subscription on how to construct a basket of stocks that cointegrate with an ETF, where the residuals are stationary:
1. Index tracking and arbitrage using cointegration
2. Arbitrage between XLE and its Component Stocks
Great Blog Ernest. I learned quite a few things here.
I have a query: if I decide my pair trades based solely upon the ADF test, how do I decide my stop loss and target?
The entry and exit of your pair are determined by the z-score of the spread. For example, you might buy when the spread is at 2 std below the mean, and exit when it is at 1 std. For details, you can refer to the GLD-GDX example in my book.
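The z-score rule above can be sketched in a few lines. This is only an illustration under stated assumptions: the mean and standard deviation are taken as given (for instance estimated on a training period), the exit is read as "1 std on the same side as the entry", and a mirror-image short rule is added for symmetry. The function names and sample values are hypothetical and are not the GLD-GDX code from the book.

```python
def zscore(spread, mean, std):
    """Standardize the spread with a mean and std estimated elsewhere."""
    return [(s - mean) / std for s in spread]


def positions_from_zscore(z, entry=2.0, exit_level=1.0):
    """Return +1 (long spread), -1 (short spread) or 0 (flat) for each bar."""
    position = 0
    positions = []
    for value in z:
        if position == 0:
            if value <= -entry:
                position = 1       # spread is 2 std below the mean: buy
            elif value >= entry:
                position = -1      # mirror rule (assumption): short the spread
        elif position == 1 and value >= -exit_level:
            position = 0           # spread has recovered to 1 std below the mean
        elif position == -1 and value <= exit_level:
            position = 0
        positions.append(position)
    return positions


# Made-up z-scores, for illustration only.
example_z = [0.0, -2.1, -1.5, -0.9, 0.5, 2.3, 1.2, 0.8]
example_positions = positions_from_zscore(example_z)
print(example_positions)  # [0, 1, 1, 0, 0, -1, -1, 0]
```

A stop loss is not part of this rule. One possible extension, again an assumption rather than anything stated above, is to exit when the z-score moves further against the position than some larger threshold.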
I have been going through your blog posts and they have been very helpful.
I am modelling the residuals between the pairs using the OU process. I have estimated the parameters using an AR(1) process. Is there any way to find the probability of mean reversion, or the waiting time for mean reversion? The application of half-life to the OU process seems a bit rigid to me (please correct me if I am wrong).
I also had a problem with the frequency of observations. Considering my time span is fixed, should end-of-day and intraday (per-minute) data give me the same rate of reversion?
It would be great if you could help me with these things.
In order to estimate probabilities of mean reversion (i.e. derive the probability distribution of the spread value), you need to solve the stochastic differential equation known as the Ornstein-Uhlenbeck equation. You can google this term and see if you can solve it analytically or numerically.
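For the standard Ornstein-Uhlenbeck process dX = theta * (mu - X) dt + sigma dW, the distribution of the spread at a future time t, given today's value x0, is known in closed form: it is Gaussian with mean mu + (x0 - mu) * exp(-theta * t) and variance sigma^2 / (2 * theta) * (1 - exp(-2 * theta * t)). The sketch below uses that result to compute the probability that the spread lies within a band around its mean after a given time. The parameter values and function names are hypothetical. Note that this is a probability at a fixed horizon, not the waiting (first-passage) time, which is a harder problem.

```python
import math


def ou_conditional_moments(x0, mu, theta, sigma, t):
    """Mean and variance of X_t given X_0 = x0 for an OU process."""
    decay = math.exp(-theta * t)
    mean = mu + (x0 - mu) * decay
    var = sigma ** 2 / (2.0 * theta) * (1.0 - decay ** 2)
    return mean, var


def normal_cdf(x, mean, var):
    return 0.5 * (1.0 + math.erf((x - mean) / math.sqrt(2.0 * var)))


def prob_within_band(x0, mu, theta, sigma, t, band):
    """P(|X_t - mu| <= band) given X_0 = x0."""
    mean, var = ou_conditional_moments(x0, mu, theta, sigma, t)
    return normal_cdf(mu + band, mean, var) - normal_cdf(mu - band, mean, var)


# Hypothetical parameters. An AR(1) coefficient phi fitted on bars of length
# dt corresponds to theta = -ln(phi) / dt (here dt = 1 bar).
phi = 0.95
theta = -math.log(phi)
mu, sigma, x0 = 0.0, 1.0, -4.0

half_life = math.log(2.0) / theta
mean_at_half_life, _ = ou_conditional_moments(x0, mu, theta, sigma, half_life)
p_10_bars = prob_within_band(x0, mu, theta, sigma, t=10.0, band=1.0)
print(f"half-life: {half_life:.1f} bars, expected spread then: {mean_at_half_life:.2f}")
print(f"P(within 1.0 of the mean after 10 bars): {p_10_bars:.3f}")
```

The half-life is simply the time at which the expected deviation from the mean has halved, which is why the expected spread at that point is half of x0 in this example.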
I expect the mean-reversion rate to be different intraday vs interday. Which one to use depends on your desired trading horizon.
In your pair trading, how do you decide which of the 3 critical values of the ADF statistic to use?
Thanks once again for your valuable help.
Generally, 90% probability of cointegration is good enough for a working trading model.
Like I said, I have estimated the rate of reversion of the OU process using AR(1). Now, if I instead regress the change in the spread on the spread itself and take the regression coefficient (which is the rate of reversion), should both methods give me similar values if the data is the same? I feel they should, because both are estimating the rate of reversion. If not, why?
Also, once the above regression is performed, how good should its R-squared value be?
I expect both methods should give very similar half-lives.
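A small numerical check of why the two estimates land close together: regressing the spread on its own lag gives an AR(1) coefficient phi, while regressing the change in the spread on the lagged spread gives lam, and with the same data lam equals phi - 1. The half-lives -ln(2) / ln(phi) and -ln(2) / lam then differ only through the approximation ln(phi) ≈ phi - 1, which is tight when phi is near 1. The synthetic data and function names below are hypothetical.

```python
import math
import random


def ols_slope(x, y):
    """Slope of a least-squares fit of y on x, with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def half_life_ar1(spread):
    """Half-life from the AR(1) coefficient: spread_t on spread_(t-1)."""
    phi = ols_slope(spread[:-1], spread[1:])
    return -math.log(2.0) / math.log(phi)


def half_life_diff(spread):
    """Half-life from regressing the change in spread on the lagged spread."""
    diffs = [b - a for a, b in zip(spread[:-1], spread[1:])]
    lam = ols_slope(spread[:-1], diffs)
    return -math.log(2.0) / lam


# Synthetic mean-reverting spread, for illustration only.
random.seed(0)
spread = [0.0]
for _ in range(1999):
    spread.append(0.95 * spread[-1] + random.gauss(0.0, 1.0))

hl_ar1 = half_life_ar1(spread)
hl_diff = half_life_diff(spread)
print(f"AR(1) half-life: {hl_ar1:.1f} bars, difference-regression half-life: {hl_diff:.1f} bars")
```

The gap between the two numbers widens as mean reversion gets faster (phi further from 1), so for very quickly reverting spreads the choice of formula starts to matter.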
Thanks, Ernie, for replying.
My question, however, was this:
We have three critical values when testing for cointegration:
1. Critical value for ADF with no constant and no trend.
2. Critical value for ADF with a constant but no trend.
3. Critical value for ADF with a constant and a trend.
When we test for cointegration in, let's say, Matlab, we can use three functions to evaluate the three values mentioned above. The question is: which value should be considered the most important one?
You said in a comment above " If my statement were not true, no pair trading would be successful."
I strongly disagree with this.
If you run 50 simulations on a pair with random buying, selling, or doing nothing at each bar and look at the equity curves, the average equity curve (without costs) will be close to zero. But a certain percentage will be highly profitable and a certain percentage will be the opposite (big losers).
The point is that if you look on every pairs trader in the world as a random agent, a certain percentage will be highly profitable just by chance.
I'm doing a PhD in rule-based trading, and the more research and backtesting I do of technical strategies (both sampled and tick-by-tick testing), the more I find myself pondering the role of luck in technical trading.
Insider trading is of course not down to luck, but technical trading is a different story. No profitable technical trader can prove whether his profits are the result of real skill or luck.
I think fully understanding this concept is paramount to being a successful investor. One must understand the role of chance.
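The random-agent argument in this comment is easy to reproduce. The sketch below is only an illustration: it uses 1000 random strategies rather than 50 so the averages are more stable, a synthetic price series with Gaussian changes, and no transaction costs. Every name and number in it is hypothetical.

```python
import random
import statistics

random.seed(42)
N_BARS = 500
N_STRATEGIES = 1000

# One shared synthetic series of per-bar price changes.
price_changes = [random.gauss(0.0, 1.0) for _ in range(N_BARS)]


def random_strategy_pnl(changes):
    """Hold a random position (-1 short, 0 flat, +1 long) over each bar."""
    return sum(random.choice((-1, 0, 1)) * change for change in changes)


final_pnls = [random_strategy_pnl(price_changes) for _ in range(N_STRATEGIES)]
mean_pnl = statistics.fmean(final_pnls)
sd_pnl = statistics.stdev(final_pnls)
big_winners = sum(p > sd_pnl for p in final_pnls) / N_STRATEGIES
big_losers = sum(p < -sd_pnl for p in final_pnls) / N_STRATEGIES

print(f"mean final P&L: {mean_pnl:.2f} (std across strategies: {sd_pnl:.2f})")
print(f"share more than one std above zero: {big_winners:.1%}")
print(f"share more than one std below zero: {big_losers:.1%}")
```

The average sits near zero while a visible fraction of the purely random strategies ends up well in profit, which is the commenter's point about survivorship among traders.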
So essentially what you are saying is that all the traders, every logic applied to decision making in trading, every science (both quantitative and qualitative) is fake?
It all boils down to random luck?
Phew! That's a great accusation that you are making.
Of course I am not saying that. However, the price series is the emergent behaviour of the market, and what is the market? Only an "ecology of competing trading strategies", to quote Lo.
In my opinion, the only strategies that you can guarantee are not largely due to chance are pure arbitrage (e.g. flash order strategies, before they were banned) and trades on inside information.
Models induced from historical data are a different story. Let's take an example. Train a population of trading rules on a dataset, then take the best rule and test it out of sample. Now plot the out-of-sample equity curve. Let's say it looks good. Great! We have a good strategy?
Now repeat the above steps 1000 times, plot the out-of-sample equity curve from each run, and tell me what you see. You will most likely see a distribution centred on 0.
My point is that there are thousands of technical traders worldwide backtesting strategies on the likes of TradeStation, NinjaTrader, and other such software, and trading strategies they think give them an edge. But everyone thinks the same thing, and if you plotted the equity curve of every technical trader in the world, I would bet my bottom dollar that the distribution would look much the same as in my example above.
Now, I'm only including technical traders in my example. There are other agents in the market (ultra high frequency arb funds, market makers, etc.) who have a very different profile. So I am not generalising.
Retail technical traders have an expected value of zero in my opinion, and so long as there is fresh blood coming into the game the brokers are happy, as they are the real winners in the long run, along with the market makers, pure arb guys, etc.
I'm not saying everything above is correct. I'm just playing devil's advocate for a minute.
Food for thought...
You can run all three tests (1, 2 and 3) and see which one generates the best cointegration t-statistic -- the best statistic will determine the best model.
My comments above refer to whether stationarity will persist, not whether a pair trading strategy's profitability will persist. There is a substantial difference here: stationarity tests use daily prices for statistical testing, and thus are much more robust and reliable than backtesting a strategy, which may only have a few trades a month.
Your point about the role of luck in trading certainly has some merit (and it is similar to the argument made in Taleb's book Fooled by Randomness). However, it is highly unlikely that the success of those high frequency traders who make more than 10,000 bets on different symbols a day, and win almost every day (yes, such people exist!), is due to random luck.
Good book, by the way.
In my comment I said I am only including one specific agent: retail technical traders.
Other agents in the game, such as high frequency arb funds, are not taking pure directional bets on models induced from historical data (which is the case for most retail traders). Their strategy is most likely some variation on market making, or a latency-sensitive strategy, rebate strategies, statistical front running: a much more sophisticated species.
Their edge is mainly technological.
They are a very, very different species from the average retail technical trader. The market is much like a natural ecological system: there are different species competing at different levels in the food chain. I would argue that the expected value of the retail technical trader population is zero, but there are always new retail traders coming into the game, believing they can be successful because they have heard about the profitable ones (the upper end of the equity curve distribution I mentioned), when in fact it is largely down to chance whether or not their basic technical rules profit in the long term.
Once again: devil's advocate.
A lot of the so-called technical rules are nothing but market making strategies in various guises. That's the reason they work for a lot of traders.
I also disagree with your assertion that the only edge traders can have is technological. Market-making does not have to occur at the extreme high frequency end to be profitable. There are a lot of ways to reduce/diversify the risk in market-making at different frequencies. Some of these ways are in fact incorporated in the technical trading rules that people follow.
How come you regress the direct price series against each other? Why don't you regress the daily return of each price series instead?
Or perhaps a variation of this question: we regress the price time series against each other to find the spread that we are trying to "arb", but when we actually put on the trade, should we determine our hedge ratio using the coefficient from a daily-return regression instead? Intuitively, it makes more sense to me to regress daily returns to determine the proper hedge ratio if I'm worried about the day-to-day fluctuations of the trade.
As an example, I'm using the timeframe from your book for GLD-GDX. Regressing price yields the coefficient/hedgeratio from your book of 1.6766*GDX, however regressing daily returns gives a coefficient of 0.4436. If I wanted to put $100,000 in the GLD side of the trade, using 1.6766 as the hedgeratio for GDX vs. 0.4436 would seem to make a world of difference (my dates for my regressions are startdate=5/23/2006 enddate=1/30/2008). Am I missing something really simple here in my logic?
Just a clarification of my post immediately above:
when I say "daily returns" I specifically mean "daily return in percent" (not in dollars and cents).
In pairs trading, we do not worry about daily fluctuations of the P&L. We are concerned whether the round trip trade is ultimately profitable. I have discussed the foundation of pair trading in my book and elsewhere on this blog quite extensively -- the foundation of pair trading is cointegration, which is based strictly on prices, not daily returns. In particular, we don't choose a hedge ratio to minimize the standard deviation of daily returns.
Hope this helps.
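To make the price-based hedge ratio concrete, here is a minimal sketch on synthetic data. It assumes an ordinary least-squares fit of one price series on the other with an intercept (whether to include an intercept is a modeling choice, not something stated above), and it also computes the daily-return regression coefficient from the question to show that the two regressions generally give different numbers. The series, seed, and names are hypothetical and are not GLD or GDX data.

```python
import math
import random


def ols_slope(x, y):
    """Slope of a least-squares fit of y on x, with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sxx


def pct_returns(prices):
    return [b / a - 1.0 for a, b in zip(prices[:-1], prices[1:])]


# Synthetic pair: price_a tracks 1.7 * price_b plus noise (illustration only).
random.seed(7)
price_b = [40.0]
for _ in range(499):
    price_b.append(price_b[-1] * math.exp(random.gauss(0.0, 0.01)))
price_a = [10.0 + 1.7 * b + random.gauss(0.0, 0.5) for b in price_b]

hedge_ratio = ols_slope(price_b, price_a)  # shares of B per share of A
return_beta = ols_slope(pct_returns(price_b), pct_returns(price_a))
spread = [a - hedge_ratio * b for a, b in zip(price_a, price_b)]

# Sizing a trade with $100,000 on the A side, using the price-based ratio.
notional_a = 100_000
shares_a = notional_a / price_a[-1]
shares_b = hedge_ratio * shares_a          # shares of B on the opposite side

print(f"price-regression hedge ratio: {hedge_ratio:.3f}")
print(f"daily-return regression coefficient: {return_beta:.3f}")
print(f"shares of A: {shares_a:.0f}, offsetting shares of B: {shares_b:.0f}")
```

The price-regression slope is expressed in shares, while the return-regression slope relates percentage moves, so the two are not interchangeable. It is the price-based spread that a cointegration test is run on.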
I'm trying to build a pair trading model using a Kalman filter. My question is: at what frequency should the fair value of the spread be updated using the filter (daily, weekly, or at the end of the trading period)?
I think daily updates would be appropriate.
Free Pair Trading Tool - http://catalystcorner.com/index.php?m=pair_tool
Go long/short at the same time. Tool is completely free.