Consider the oil futures ETF USO and its evil twin, the inverse oil futures ETF DNO*. In theory, if USO has a daily return of x%, DNO will have a daily return of -x%. In practice, if we plot the daily returns of DNO against those of USO from 2010/9/27 to 2016/9/9, using the usual consolidated end-of-day data that you can find on Yahoo! Finance or any other vendor,
we see that though the slope is indeed -1 (to within a standard error of 0.004), there are many days with significant deviation from the straight line. The trader in us will immediately think "arbitrage opportunities!"
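For readers who want to reproduce the regression, here is a minimal sketch of the slope and its standard error. The return series below are synthetic stand-ins (the noise levels are assumptions, not the actual USO/DNO data), and `ols_slope_with_se` is a hypothetical helper name.

```python
import math
import random

def ols_slope_with_se(x, y):
    """Fit y = intercept + slope * x by ordinary least squares.

    Returns (slope, intercept, standard error of the slope).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residual_ss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, intercept, se

# Synthetic stand-in data (assumption): the inverse ETF returns -x plus small noise.
rng = random.Random(0)
uso_ret = [rng.gauss(0.0, 0.02) for _ in range(1500)]
dno_ret = [-r + rng.gauss(0.0, 0.003) for r in uso_ret]

slope, intercept, se = ols_slope_with_se(uso_ret, dno_ret)
print(f"slope = {slope:.3f} +/- {se:.3f}")
```

With real data you would replace the two synthetic lists with the daily close-to-close returns of each ETF, aligned by date.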
Indeed, if we backtest a simple mean reversion strategy on this pair - just buy equal dollar amounts of USO and DNO when the sum of their daily returns is less than 40 bps at the market close, hold one day, and vice versa - we will find a strategy with a decent Sharpe ratio of 1 even after deducting 5 bps per side as transaction costs. Here is the equity curve:
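A rough sketch of this rule on end-of-day returns follows. It makes several assumptions: the entry condition is read as the return sum falling below -40 bps (and the short entry as the sum rising above +40 bps), returns are measured on gross capital split equally between the two legs, and a round trip is charged two "sides" of cost. The function names are hypothetical.

```python
import math
import statistics

def backtest_pair_reversion(ret_a, ret_b, threshold=0.0040, cost_per_side=0.0005):
    """Daily net returns of a one-day-hold reversion trade on two return series.

    ret_a[t], ret_b[t] are close-to-close returns on day t. The signal is read
    at the close of day t and the position is held until the close of day t+1.
    """
    daily = []
    for t in range(len(ret_a) - 1):
        signal = ret_a[t] + ret_b[t]
        if signal < -threshold:
            position = 1       # buy both legs (assumed reading of the rule)
        elif signal > threshold:
            position = -1      # short both legs ("vice versa")
        else:
            position = 0
        # Equal dollar amounts: each leg carries half of the capital.
        gross = position * 0.5 * (ret_a[t + 1] + ret_b[t + 1])
        # One entry side plus one exit side, charged on total capital.
        cost = 2 * cost_per_side if position != 0 else 0.0
        daily.append(gross - cost)
    return daily

def annualized_sharpe(daily, periods_per_year=252):
    """Annualized Sharpe ratio of daily returns, with zero risk-free rate."""
    sd = statistics.pstdev(daily)
    if sd == 0:
        return 0.0
    return statistics.mean(daily) / sd * math.sqrt(periods_per_year)

# Toy numbers, purely illustrative.
toy_a = [-0.010, 0.004, 0.000]
toy_b = [0.004, 0.002, 0.000]
toy_daily = backtest_pair_reversion(toy_a, toy_b)
print(toy_daily, annualized_sharpe(toy_daily))
```

Note that this sketch takes the consolidated closes at face value, both for the signal and for the fill.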
Looks reasonable, doesn't it? However, if we backtest this strategy again with BBO data at the market close, taking care to subtract half the bid-ask spread as transaction cost, we find this equity curve:
We can see that the problem is not only that we lose money on practically every trade, but that there was seldom any trade triggered. When the daily EOD data suggests a trade should be triggered, the 1-min bar BBO data tells us that in fact there was no deviation from the mean.
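To make the difference concrete, here is a minimal sketch of how a quote-based backtest might compute the signal from midprices while filling buys at the ask and sells at the bid, which is equivalent to paying half the spread on each side. The `Quote` class, the function names, and the prices are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Quote:
    """Best bid and offer at the close of a bar."""
    bid: float
    ask: float

    @property
    def mid(self):
        return 0.5 * (self.bid + self.ask)

def mid_return(prev, curr):
    """Return measured from midprice to midprice, used for the signal."""
    return curr.mid / prev.mid - 1.0

def long_round_trip_return(entry, exit_quote):
    """Buy at the entry ask, sell at the exit bid."""
    return exit_quote.bid / entry.ask - 1.0

def short_round_trip_return(entry, exit_quote):
    """Sell short at the entry bid, cover at the exit ask."""
    return 1.0 - exit_quote.ask / entry.bid

# Illustrative quotes only.
entry = Quote(bid=10.00, ask=10.02)
exit_quote = Quote(bid=10.03, ask=10.05)

mid_ret = mid_return(entry, exit_quote)
long_ret = long_round_trip_return(entry, exit_quote)
short_ret = short_round_trip_return(entry, exit_quote)
print(f"mid: {mid_ret:.4%}  long: {long_ret:.4%}  short: {short_ret:.4%}")
```

In this toy case the long trade earns far less than the midprice return suggests, because the spread is paid on both entry and exit.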
(By the way, the returns above were calculated before we even deduct the borrow costs of occasionally shorting these ETFs. The "rebate rate" for USO is about 1% per annum on Interactive Brokers, but a steep 5.6% for DNO.)
In case you think this problem is peculiar to USO vs DNO, you can try TBT vs UBT as well.
Incidentally, we have just verified a golden rule of financial markets: apparent deviations from market efficiency are allowed only when no one can profitably trade on the arbitrage opportunity.
*Note: according to www.etf.com, "The issuer [of DNO] has temporarily suspended creations for this fund as of Mar 22, 2016 pending the filing of new paperwork with the SEC. This action could create unusual or excessive premiums— an increase of the market price of the fund relative to its fair value. Redemptions are not affected. Trade with care; check iNAV vs. price." For an explanation of "creation" of ETF units, see my article "Things You Don't Want to Know about ETFs and ETNs".
- Quantiacs.com just recently registered as a CTA and operates a marketplace for trading algorithms that anyone can contribute. They also published an educational blog post for Python and Matlab backtesters: https://quantiacs.com/Blog/Intro-to-Algorithmic-Trading-with-Heikin-Ashi.aspx
- I will be moderating a panel discussion on "How can funds leverage non-traditional data sources to drive investment returns?" at Quant World Canada in Toronto, November 10, 2016.
- October 22 and 29, Saturdays, Quantitative Momentum Strategies online workshops.
A senior director at a major bank wrote me: "…thank you again for the Momentum Strategies training course this week. It was very beneficial. I found your explanations of the concepts very clear and the examples well developed. I like the rigorous approach that you take to strategy evaluation."
Hi, any idea about the reason behind this?
For those dates on which you see significant deviations from the regression line, would there be any chance to arb if we used the 1-min BBO data?
How depressing. Thanks Ernie!
The consolidated closing prices are the trade prices at some random exchange at the close. They can be very far from the official, "primary exchange", auction prices that our MOC or LOC orders are guaranteed to get filled at. These consolidated prices are very noisy. As white noise, they show strong mean reversion, but we cannot trade on them because we won't get filled at those prices.
To add, very often those last traded prices are simply trades from a few seconds or minutes ago, if the ETP is thinly traded. So, you are comparing a 4:00pm price of one with a 3:55pm price of another; no wonder they appear to be out of whack!
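The claim that white noise in prices shows up as mean reversion in returns can be checked with a small simulation. This is only a sketch under assumed parameters: the "true" log price is a random walk with 1% daily volatility, and the observed close adds independent noise with a 0.5% standard deviation.

```python
import random

def lag1_autocorr(xs):
    """Sample lag-1 autocorrelation of a series."""
    n = len(xs)
    mean = sum(xs) / n
    num = sum((xs[i] - mean) * (xs[i + 1] - mean) for i in range(n - 1))
    den = sum((x - mean) ** 2 for x in xs)
    return num / den

rng = random.Random(42)
n_days = 5000

# "True" log price: a random walk, so its returns are uncorrelated.
true_log_price = [0.0]
for _ in range(n_days):
    true_log_price.append(true_log_price[-1] + rng.gauss(0.0, 0.01))

# Observed close: true price plus independent noise (assumed 0.5% std dev).
observed = [p + rng.gauss(0.0, 0.005) for p in true_log_price]

true_ret = [b - a for a, b in zip(true_log_price, true_log_price[1:])]
obs_ret = [b - a for a, b in zip(observed, observed[1:])]

print("true returns autocorr:    ", round(lag1_autocorr(true_ret), 3))
print("observed returns autocorr:", round(lag1_autocorr(obs_ret), 3))
```

The observed returns should come out negatively autocorrelated even though the underlying price has no mean reversion at all: a noisy high close today tends to be followed by an apparent drop tomorrow. A backtest that trades at those noisy prices harvests the noise, which a real order cannot do.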
Thanks, Ernie. I think that means we should get intraday price data, e.g. 1-min data, and then use those "about to close" prices as the backtesting data instead.
I kind of disagree with your statement.
There will be a trade at 4pm even for a thinly traded ETP, because all exchanges (primary or otherwise) will run an auction at 4pm for every product. So the closing price reported for the day will be one of those auction prices, just not necessarily the primary one!
Yes, in this research, I just used the closing bid/ask quote of the 3:59pm 1-min bar to generate the signal and determine the execution price.
Thanks for an excellent article. We face the same issue on our website (your example using our backtester: https://www.pairtradinglab.com/backtests/V-EJVnruUX8rjvTi). We use consolidated prices too, which inflates backtests for pairs like this one, so for now we advise our users to stay away from pairs with correlation close to 1 or -1, and to avoid pairs with a low CAGR whose equity curve looks too good to be true.
In the future we want to mitigate this problem by moving to a better data provider (one that provides closing prices of the primary exchange). Unfortunately, no luck yet. You mentioned that you also backtested the example with 1-min bid-ask bars. May I ask who your data provider is? All providers of 1-min data we have seen just assemble 1-min bars from trades, not quotes, so the bid-ask information is lost.
We are researching all the possibilities before signing up for the QuantGo tick data package, which is overkill for us in both price and the amount of storage/processing needed.
Thank you for your kind words.
We download the bid-ask bars from Interactive Brokers' API. They are free for customers.
Thanks Ernie. As an IB customer myself, I have access to that data, but I cannot use them to power the website, because of license restrictions and rather unpleasant IB API access rate limits.
Any experience with ActiveTick? They seem to offer tick data too. I might just process their tick data to create less noisy daily data.
By the way, one late idea about your example: apparently there is no arbitrage opportunity when testing on the last 1-min bid-ask bar, but do you think there may be an opportunity if we backtest on, for instance, the last 10 minutes of every day using raw tick data? Maybe there would be a better opportunity to enter. Also, maybe a smarter entry method would help (such as posting a limit order inside the spread for the less liquid leg and then entering the second leg via a market order). What do you think?
Also, do you think that for markets where there is just one single exchange available (I mean certain European, Indian and Asian stock markets), consolidated closing prices in 1D data are fine to use for modeling mean-reversion strategies?
As someone has commented in your original post (with link below) about low frequency data, you are making inductions from noise trading. The strategy you have described is noise trading, and its performance is naturally affected by frequency. We expect better from you.
If you are trying to sell high frequency data, that is OK with me, but do not do so based on false inductions.
No, I haven't tried ActiveTick myself. And indeed, IB won't allow you to distribute those data!
It won't matter whether you use higher frequency data to backtest. I only used the closing bid-ask of the 1-min bar. I could have used a 10-min bar, or a 1-sec bar, and would get the same bid-ask prices for 4pm ET.
Using limit orders is unlikely to work either: most likely they won't get filled in this situation. But sure, you can try intraday opportunities. Those have not been ruled out by my test.
I am not familiar with markets outside of the US, so I don't know if similar problems exist there.
You don't actually need high frequency data for this backtest if you have access to the primary exchange closing prices, available from Bloomberg. You can also get the closing bid-ask quotes from CRSP historical data.
I believe I have already answered your narrow, and I believe erroneous, argument on noise trading previously. So I won't repeat it here.
Hello Ernie, thanks for the article.
What you demonstrated is true for the particular example you chose where the payoff is of the same order with the bid-ask spread. But this is not a problem when payoff is an order of magnitude larger than bid-ask spread. Obviously you would not backtest a trend-following system based on monthly moving averages with minute data. Maybe for your own purposes and models this is true but it is not true for the majority of low frequency traders. The other day I saw an article by Michael Harris where the opposite was shown to be true, a backtest on minute data inflated performance by little http://bit.ly/2dT0yr4
Yes, I agree that if you are doing arbitrage then backtesting on higher frequency data is necessary, but for the majority of traders and systems in low frequency trading in liquid markets this is really not required when the payoff is a lot larger than the bid-ask spread and slippage.
You are correct. The difference between consolidated close and the midprice mainly causes problems for mean reversion strategies, as I explained in my earlier blog post. The issue is not whether the payoff is big or small, the issue is whether the signal is much bigger than the noise. The signal in this case is the sum of returns of 2 highly anti-correlated instruments, hence the magnitude is small, and comparable to that of the data noise. However, many pair trading strategies suffer from the same problem.
Also, as I explained in a comment above, it is not an issue of whether we use 1-min, 1-hour, or daily data. CRSP provides daily data, with closing bid-ask, and that is perfectly suitable for backtesting this strategy. If you use primary exchange closing prices, those also will not have this problem.
An unrelated question, but I hope it is a proper one to ask here. Ernie, in your commentary email you mentioned that some events, such as the Fed rate announcement and the presidential debate, have some influence on your strategies.
In general, when we set the parameters for our strategies, should we strictly follow quantitative measures such as the Kelly criterion, backtest results, etc.? Or should we put personal "forecasted" discretion into the parameter settings, such as expecting some market fluctuations due to the presidential debate?
Yes, parameters should be set based on quantitative measures: otherwise a backtest is meaningless. However, a risk manager can always override a model, since the model cannot possibly incorporate all new geopolitical developments. For example, we also ceased trading FX prior to the Brexit vote, which saved us a bundle.
Would you please recommend references for market making strategies?
Any mean reverting strategies can be considered market making - you can find examples of that in my book Algorithmic Trading (see the link on my Recommended Books list on the right sidebar on my blog.) For high frequency market making, please refer to the Algorithmic and High Frequency Trading book by Cartea, also on my Recommended Books list.
Hey Ernie, I was reading part of your second book again and saw the mean reversion between XLE and oil futures/USO. I checked the 10-year performance: XLE has risen 14.64% while USO has dropped 79.83%. That is a difference of around 95% in 10 years. How should I think about the strategy? As an intraday or multi-day swing trade with a tight stop loss? Thanks.
The XLE-USO is not a mean reversion strategy. It is a momentum strategy. Hence the divergence between XLE and USO prices should not have a negative impact on it.
Why do you suggest using limit orders won't be helpful to this strategy?
Using limit orders will just mean that most of the time we won't get filled.
I've been reading your blog with great interest and have a question I was hoping you could answer. With the slower models, say with an expected return horizon of 30 days, would you use non-overlapping 30 day returns in the model estimation and, if so, how do you overcome the issue of a very small sample size?
Thanks for your interest, John.
I would re-estimate the model every day with a lookback of 30 days or more. There is no need to use non-overlapping periods: you should just assume you have 30 portfolios, each held for 30 days, but each one updated on a different day.
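One way to picture the "30 portfolios" idea is the sketch below. It assumes each day's model output is reduced to a position of +1, -1, or 0, that each position is held for a fixed number of days, and that capital is split equally across the overlapping portfolios. The function name and the toy inputs are hypothetical.

```python
def overlapping_portfolio_returns(signals, returns, holding_days=30):
    """Daily returns of a strategy made of overlapping sub-portfolios.

    signals[t]: position (+1, -1, or 0) chosen at the close of day t and
                held for the next `holding_days` days.
    returns[t]: asset return from the close of day t-1 to the close of day t.
    Each sub-portfolio gets 1/holding_days of the capital.
    """
    out = []
    for t in range(1, len(returns)):
        start = max(0, t - holding_days)
        # Signals from days t-holding_days .. t-1 are still held during day t.
        active = signals[start:t]
        position = sum(active) / holding_days
        out.append(position * returns[t])
    return out

# Toy example with a 2-day holding period.
toy_signals = [1, 1, -1, 0]
toy_returns = [0.0, 0.01, 0.02, -0.01]
toy_out = overlapping_portfolio_returns(toy_signals, toy_returns, holding_days=2)
print(toy_out)
```

Every day then contributes a fresh estimate, so the sample is not cut down to a handful of non-overlapping windows. The resulting daily returns are serially correlated, which is worth keeping in mind when judging statistical significance.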
I'm a big fan of your blog and books, and I thank you for sharing your knowledge. I have a model that takes positions in ETFs on the official opening print. Is it feasible to expect decent liquidity in the opening auction for ETFs? How would you model slippage and overall trading costs?
I've been told that liquidity for ETFs is horrible around the open, but I have to assume that for issues that trade in large size this shouldn't be a very big problem.
Thanks for your interest in my writings.
Yes, the opening auction for ETFs with large AUM has good liquidity. There is no slippage if you submit MOO/LOO orders. But that's assuming you have access to the primary exchange open and close prices, which are rarely available from data vendors (except Bloomberg and high frequency data vendors that give you the crossing tag.) Otherwise, you have to use midprice at open/close to estimate the execution price.