Quantitative Trading: Reader suggested a possible trading strategy with the GLD

Friday, November 17, 2006

Reader suggested a possible trading strategy with the GLD - GDX spread

Steve Hansen of Raymond James Ltd. in Vancouver, B.C. suggested to me that a good trading signal for the GLD - GDX spread is when it deviates from its mean by more than 2 standard deviations. He observed that these thresholds are roughly +/- $250 based on my definition of the spread, and that there have been 3 such (immensely profitable) signals since the inception of GDX. Here is Mr. Hansen's plot:

This certainly looks like a fairly safe strategy. Of course, if one desires more frequent signals, one can always enter into smaller positions at smaller spread values.
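
The 2-standard-deviation rule described above can be sketched in a few lines of Python. This is only an illustration: the function name `zscore_signals`, the `entry_z` parameter, and the spread values are assumptions made up for the example, not Mr. Hansen's data or method. The mean and standard deviation are computed over the whole sample, which a real backtest would have to avoid.

```python
from statistics import mean, stdev

def zscore_signals(spread, entry_z=2.0):
    """Return -1 (short spread), +1 (buy spread) or 0 (no signal) per bar.

    Assumption: mean and standard deviation use the full series, which is
    fine for illustration but would introduce look-ahead bias in a backtest.
    """
    mu = mean(spread)
    sigma = stdev(spread)
    signals = []
    for value in spread:
        z = (value - mu) / sigma
        if z >= entry_z:
            signals.append(-1)   # spread unusually high: short it
        elif z <= -entry_z:
            signals.append(1)    # spread unusually low: buy it
        else:
            signals.append(0)
    return signals

# Made-up spread values in dollars, for illustration only.
spread = [0.0] * 10 + [300.0] + [0.0] * 10 + [-300.0]
signals = zscore_signals(spread)
```

Lowering `entry_z` gives more frequent signals, at the cost of entering at smaller spread values.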

By the way, just when we were celebrating the reversion of the GLD - GDX spread this morning, the QM - XLE spread plunged to another multi-year low. With crude oil prices down about 30% from their all-time high, XLE, the energy stocks ETF, is still within 5% of its all-time high. Does this make any sense? We shall see after this quarter's earnings from the oil companies are announced ...

33 comments:

NA, Saturday, November 18, 2006 at 4:54:00 PM EST
EC-

What do you think about the historical spread between the SPY ETF and S&P e-mini futures? If it's not too much trouble, could you do the same standard deviation analysis that you did for GLD-GDX? Thanks a lot!

Ernie Chan, Saturday, November 18, 2006 at 5:10:00 PM EST
Yaser: What you suggested is similar to the classic "index arbitrage" between SP500 stocks and stock index futures. Since the index arb desks of investment banks have been doing this for many years, the spread is likely to be very narrow and typically no longer profitable in recent years. Maybe you heard otherwise? Of course, one can also consider trading the spread between a basket of SP500 stocks and the SPY ETF. But I think the portfolio manager of SPY must have taken care to remove most arbitrage opportunities for us. (However, I like your earlier suggestion of using OIH instead of XLE. I will post an analysis of that next week.) -Ernie

Ernie Chan, Monday, November 27, 2006 at 8:27:00 AM EST
I should add that the standard deviation of GOLD-GDM is $207, not the $125 based on the short history of GLD-GDX. So following Mr. Hansen's suggestion of 2 standard deviations, the entry point would be $414.

Unknown, Wednesday, January 19, 2011 at 7:55:00 PM EST
Sir,

I am a newbie to algo trading. I am currently trying to follow your book, and I have some questions.

According to Example 3.6:

% ensure existing positions are carried forward unless there is an exit signal
positions = fillMissingData(positions);

I don't really understand how to carry forward those positions, and I also can't find the fillMissingData function on spatial-econometrics.com.

Thank you very much for your suggestion.
Art

Ernie Chan, Thursday, January 20, 2011 at 8:02:00 AM EST
Hi Art,
The fillMissingData function is available for download from epchan.com/book. See last paragraph of page 34 of my book for password.

It simply copies the previous bar's value (price? position? whatever?) to the current bar *if* the current bar's value is NaN.

Ernie
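
The fill-forward behavior described in this reply can be sketched in plain Python. This is a minimal illustration for a single column of values, not the MATLAB function from epchan.com/book; the name `fill_missing_data` and the sample positions are hypothetical.

```python
import math

def fill_missing_data(values):
    """Copy the previous bar's value into any bar whose value is NaN."""
    filled = list(values)
    for i in range(1, len(filled)):
        if math.isnan(filled[i]):
            filled[i] = filled[i - 1]
    return filled

nan = float("nan")
# Hypothetical position column: NaN means "no new signal on this bar".
positions = [nan, 1.0, nan, nan, 0.0, nan, -1.0, nan]
positions = fill_missing_data(positions)  # the result must be assigned back
```

A NaN in the first bar has no previous value to copy, so it stays NaN. That is why the first row of a position array is usually set to zero before filling.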

Anonymous, Monday, August 20, 2012 at 10:14:00 PM EDT
Hi Ernie,
I've been trying to use the backtesting code for cointegrating pairs in conjunction with data scraped from Yahoo. For most of the pairs that I backtest, sharpetrainset comes up as NaN. Any idea what's going on?

Cheers

Ernie Chan, Tuesday, August 21, 2012 at 7:47:00 AM EDT
Hi Anon,
Most likely you have included non-numeric data in your input.
Ernie

Unknown, Friday, July 4, 2014 at 11:37:00 AM EDT
Hi Ernie,
When I took the GDX.xls and GLD.xls files from http://epchan.com/book/ I got the same error as Anonymous. However, when I took the files straight from Yahoo Finance, I got no NaN result. But I got 3.0 and 5.6 for the test and training Sharpe ratios respectively, which did not match the results in your Matlab code for Example 3.6.

Cheers,
Manuel

Ernie Chan, Friday, July 4, 2014 at 11:55:00 AM EDT
Hi Lu,
Did you use the same periods for training and testing using Yahoo data?

Your Sharpe ratios sound too good to be true.

When you load my .xls files into Matlab, make sure your price arrays do not have NaN.

Ernie
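
One way to check an input series before backtesting is to scan it for entries that are missing or non-numeric, since a single NaN in the prices propagates into the mean and standard deviation and hence into the Sharpe ratio. A minimal Python sketch, assuming the prices have been loaded into a plain list; `find_bad_rows` and the sample values are hypothetical:

```python
import math

def find_bad_rows(prices):
    """Return the indices of entries that are not finite numbers."""
    bad = []
    for i, p in enumerate(prices):
        # Non-numeric entries (e.g. strings) are caught before isfinite runs.
        if not isinstance(p, (int, float)) or not math.isfinite(p):
            bad.append(i)
    return bad

# Made-up price column containing one NaN and one text cell.
gld = [62.1, float("nan"), 62.8, "n/a", 63.0]
bad = find_bad_rows(gld)
```

Rows flagged this way can then be dropped from both series together so that the two price arrays stay aligned by date.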

JPS, Tuesday, May 26, 2015 at 4:55:00 AM EDT
Dear Ernie,
In the book Quantitative Trading you calculate the Sharpe ratio for the spread (for the training set) as

sqrt(252)*mean(pnl(trainset(2:end)))./std(pnl(trainset(2:end)))

I am not using Matlab (as I don't have the software). How can I compute the Sharpe ratio for the spread, which in Excel is in the form ln(x)-0.78765*ln(y), where x and y are the future prices of the two scrips?

Ernie Chan, Tuesday, May 26, 2015 at 8:44:00 AM EDT
Hi JPS,
You can easily compute the average and standard deviation of daily returns in Excel. Taking their ratio and multiplying by sqrt(252) gives you the annualized Sharpe ratio, assuming 0 risk free rate.
Ernie
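
The calculation described in this reply can be sketched in Python as well. This is a minimal example assuming a zero risk-free rate and 252 trading days per year, as stated above; the function name `annualized_sharpe` and the daily returns are made up for illustration.

```python
import math
from statistics import mean, stdev

def annualized_sharpe(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio of daily returns, assuming a zero risk-free rate."""
    # stdev is the sample standard deviation (divides by n - 1).
    return math.sqrt(periods_per_year) * mean(daily_returns) / stdev(daily_returns)

# Hypothetical daily returns of the spread strategy.
daily_returns = [0.01, -0.005, 0.002, 0.007, -0.003]
sharpe = annualized_sharpe(daily_returns)
```

In Excel the same calculation would be SQRT(252)*AVERAGE(range)/STDEV(range) applied to the column of daily returns.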

Unknown, Wednesday, December 30, 2015 at 12:27:00 AM EST
First of all, I want to thank you for this precious book. It is really well written and is among the best I have read so far.

I have a question regarding the pairs strategy spread: why do you regress the prices on each other and then check the spread? Isn't it better to calculate the daily returns, check the absolute spread between them, and finally apply the same strategy that you propose (using the right parameters)? Or could we run a regression on the returns rather than the prices?

Amine

Ernie Chan, Wednesday, December 30, 2015 at 8:03:00 AM EST
Hi Amine,
Thank you for your kind words.
A cointegrated pair of price series means that their price spread will eventually mean revert, but we don't know exactly when. If you look at returns over a fixed lookback, you are forced to look for mean reversion over a fixed timeframe. Please see p.126-133 of my book Quantitative Trading.
Ernie

Juanjo, Wednesday, December 30, 2015 at 1:44:00 PM EST
I also agree with the good comments on the book, Mr. Chan. I appreciate your effort in putting together such a good piece of knowledge (excuse my English as it is not my mother tongue).

I'm having trouble with the program example3_6.m. I have downloaded the GLD and GDX series from Yahoo Finance and checked whether there are any NaNs, but I still get NaN for the Sharpe ratio in both the test and the training set. I have checked the code but I don't know where the problem may be.

While debugging the code, I found that positions is passed to fillMissingData with row 1 (both columns) equal to NaN, so the result is NaN. Could this be the problem? Why are these values NaN in the first place?

Thanks in advance, and have a nice Xmas and my best wishes for the next year!

JJ

Unknown, Thursday, December 31, 2015 at 12:37:00 AM EST
Thanks for the reply,

I have two questions:

1/ I read pages 126-133 again and ran the cointegration test on the pair GDX, GLD using data from 01/01/2014 to 29/12/2015. You can see my results below:

Augmented DF test for co-integration variables: GLD2Y,GDX2Y

CADF t-statistic    # of lags    AR(1) estimate
-3.12366596         1            -0.046204

1% Crit Value    5% Crit Value    10% Crit Value
-3.880           -3.359           -3.038

hedgeRatio = 5.6534

--------------------------------------------------------------

As you can see, they are not cointegrated beyond the 90% confidence level: the t-statistic clears the 10% critical value but not the 5% one. Plotting the residuals shows that the series is not stationary.

However, I tried spread = dailyret(GDX) - dailyret(GLD) without any regression or cointegration test, and the spread series looks mean-reverting and stationary (around 0).

What do you think of this approach?

2/ You mentioned on page 133: "For example, the Canadian dollar/Australian dollar (CAD/AUD) cross-currency rate is quite stationary." How can I trade the spread? I did not get the point. Should I be long CAD/AUD and short AUD/USD, for example? Could you please explain further?

Ernie Chan, Thursday, December 31, 2015 at 8:39:00 AM EST
Hi Juanjo,
Thank you for your compliments.
Yes, you should manually set the first row of the position array to zero, before applying the fillMissingData function.
Ernie

Ernie Chan, Thursday, December 31, 2015 at 8:42:00 AM EST
Hi Amine,
1) Indeed GDX vs GLD sometimes falls out of cointegration, and your test period may be one of those regimes.
But as you observed, even though a pair may not be strictly cointegrating, it can still mean-revert to some extent. You can apply a short term mean reversion strategy to see if it is profitable.

2) AUDCAD is readily available for trading as a single instrument at many FX brokers. Just buy low and sell high, and vice versa.

Ernie

Juanjo, Monday, January 4, 2016 at 10:53:00 AM EST
Hi Ernie,

I have been "debugging" my code and there is a mistake I was stubbornly making over and over again: I didn't assign positions back after calling fillMissingData. In the book, part of the code ended up inside a comment, and that misled me. I should have read more carefully.

I am now debugging the second part (to detect the look-ahead bias). The code on the website is far more "complicated" than the one in the book. Is there a new revision of the book that I'm not aware of?

Kindest Regards,

JJ

Ernie Chan, Monday, January 4, 2016 at 11:03:00 AM EST
Hi Juanjo,
The code on my website sometimes gets updated (maybe due to bugs discovered, or other improvements).

No, I haven't published any revised edition of my books.

Ernie

Juanjo, Sunday, January 10, 2016 at 2:10:00 PM EST
Hi Ernie,

Thank you very much.

Juanjo

Unknown, Thursday, July 28, 2016 at 10:18:00 AM EDT
Hi Ernie:

I used the code in your book to backtest GLD-GDX pair trading, but I got a different Sharpe ratio for the trainset: 0.7856 (your book has 2.3). I think the code that I wrote based on your book is correct, so I wonder whether the data we used are not the same. I used the prices of GLD and GDX from 5/23/06 to 11/30/07, 385 daily prices. Are these the data you used in the example in the book?

Thank you!

Lucas

Ernie Chan, Thursday, July 28, 2016 at 10:38:00 AM EDT
Hi Lucas,
As I have described numerous GLD-GDX strategies in the past, please specify exactly which book and which example you are referring to.
Thanks,
Ernie

David C., Tuesday, August 30, 2016 at 5:41:00 AM EDT
Hi Ernie,
I've been reading your book recently to learn how to become an independent algo trader. It really helps me get to know quant trading quickly. Thanks for sharing your knowledge and experience in the book.
I'm looking forward to your 3rd book (you mentioned it on Quora and said that it will have some A.I. discussions), since I am aiming to build my trading strategy based on A.I.

Cheers,
David

Ernie Chan, Tuesday, August 30, 2016 at 10:34:00 AM EDT
Hi David,
Thank you for your kind words.
Yes, there is a chapter on AI in my 3rd book.
The book is called "Machine Trader", and it is due out in 2017 Q1.
Ernie

Michael Lam, Wednesday, December 29, 2021 at 4:26:00 PM EST
Hi Ernest,
Thank you for generously sharing your time and knowledge in writing such an instructive book.

I have a question on Example 3.6 Pairs Trading of GLD and GDX in the 2nd edition of "Quantitative Trading". Specifically, it concerns the positions taken in the Python code provided:
df.loc[df.zscore>=2, ('positions_GLD_Short', 'positions_GDX_Short')]=[-1, 1] # Short spread
df.loc[df.zscore<=-2, ('positions_GLD_Long', 'positions_GDX_Long')]=[1, -1] # Buy spread

Where spread is defined by
spread = GLD - hedgeRatio*GDX

My understanding is that to short the spread, I need to short GLD and long GDX, and to buy the spread through long GLD and short GDX. Is that correct?

If my understanding is not flawed, I would have expected the code to be
df.loc[df.zscore>=2, ('positions_GLD_Short', 'positions_GDX_Long')]=[1, 1] # Short spread
df.loc[df.zscore<=-2, ('positions_GLD_Long', 'positions_GDX_Short')]=[1, 1] # Buy spread

Clearly I am missing something here as the Matlab, Python and R codes are consistent. Could you please advise? Thank you!

Ernie Chan, Friday, December 31, 2021 at 9:16:00 AM EST
Hi Michael,
Yes, your understanding is correct. When we short a spread, we short GLD and long GDX. The first column in the df is for GLD, and so we apply -1 to it. (We distinguish long and short positions in the spread by _Long vs _Short, but a negative position in either case denotes a short position.)
Hope this helps!
Ernie
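
The sign convention in this reply can be written out as a small Python function. This is a sketch only: `spread_positions` is a hypothetical helper, the entry threshold of 2 is taken from the code quoted in the question, and returning (0, 0) when there is no signal is a simplification made for this example.

```python
def spread_positions(zscore, entry_z=2.0):
    """Return (GLD position, GDX position) for spread = GLD - hedgeRatio*GDX.

    A negative number denotes a short position in that ETF.
    """
    if zscore >= entry_z:
        return (-1, 1)   # short the spread: short GLD, long GDX
    if zscore <= -entry_z:
        return (1, -1)   # buy the spread: long GLD, short GDX
    return (0, 0)        # no entry signal (simplification for this sketch)

short_spread = spread_positions(2.5)
long_spread = spread_positions(-2.5)
```

The first element is always the GLD position and the second the GDX position, whichever side of the spread is being traded.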

plm, Thursday, February 2, 2023 at 1:09:00 AM EST
Hey Ernie,

For the Python Example 3.6, pairs trading of GLD and GDX, in the 2nd edition of "Quantitative Trading":

Per your book, the initial parameters are set with the following code:

df['positions_GLD_Long'] = 0
df['positions_GDX_Long'] = 0
df['positions_GLD_Short'] = 0
df['positions_GDX_Short'] = 0

and then, after going through the z-score parameters, you do

df.fillna(method='ffill', inplace=True)

My question: what purpose does this serve if there are no NaN values, since we set the starting values to zero when creating the columns? Should the columns be set to np.nan first?

Also, are the datasets used by the Matlab, Python, and R code slightly different, and is that why different Sharpe ratios result for the train and test sets? E.g., Python sharpeTrainset = 1.9182, Matlab = 2.0822, R = 2.3278. Pardon my ignorance, as I have only done the Python example, but I am assuming the logic executes the same way in each programming language.

Ernie Chan, Thursday, February 2, 2023 at 6:30:00 AM EST
Hi plm,
You are correct. My intention was to fill the first row of the positions table with 0 only, and leave the rest as nan. Then after setting the entry and exit signals, we fill forward with the sign of the entry or exit signals.
I will look into modifying the code when I have more time, but perhaps you can try doing it yourself as an exercise to see if it reproduces the answer in Matlab?
Best,
Ernie
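
The intended logic described in this reply (first row 0, all other rows NaN, then fill forward after the signals are set) can be sketched in plain Python without pandas. The entry threshold of 2 comes from the code quoted earlier in the thread; the exit threshold of 1, the helper names, and the z-scores are assumptions for illustration only.

```python
import math

NAN = float("nan")

def ffill(values):
    """Replace each NaN with the most recent non-NaN value before it."""
    out = list(values)
    for i in range(1, len(out)):
        if math.isnan(out[i]):
            out[i] = out[i - 1]
    return out

def spread_position(zscores, entry_z=2.0, exit_z=1.0):
    """Net position in the spread per bar: +1 long, -1 short, 0 flat."""
    n = len(zscores)
    long_pos = [NAN] * n    # NaN means "no signal, keep previous position"
    short_pos = [NAN] * n
    long_pos[0] = 0.0       # only the first row starts at zero
    short_pos[0] = 0.0
    for i, z in enumerate(zscores):
        if z <= -entry_z:
            long_pos[i] = 1.0      # enter long spread
        elif z >= -exit_z:
            long_pos[i] = 0.0      # exit long spread
        if z >= entry_z:
            short_pos[i] = -1.0    # enter short spread
        elif z <= exit_z:
            short_pos[i] = 0.0     # exit short spread
    long_pos, short_pos = ffill(long_pos), ffill(short_pos)
    return [l + s for l, s in zip(long_pos, short_pos)]

zscores = [0.0, -2.3, -1.5, -0.5, 1.5, 2.2, 1.4, 0.8]
positions = spread_position(zscores)
```

With the columns initialized to 0 instead of NaN, the bar with z-score -1.5 would hold no position, because the forward fill would find nothing to fill. In pandas the last step would typically be a call to `ffill()` on the position columns.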

Unknown, Monday, May 1, 2023 at 9:32:00 PM EDT
Hey Ernie! Really enjoying your book. On the topic of Example 3.6, I notice we calculate the spread as the difference between GLD and the hedge ratio times GDX. Maybe this is explained somewhere, but what is the logic behind using this ratio? Also, why do we use it for calculating the spread but not in our position sizing?

Thanks!
Mayan

Ernie Chan, Tuesday, May 2, 2023 at 6:22:00 AM EDT
Hi Mayan,
Thanks for your kind words!
Yes, the hedge ratio is calculated based on a linear regression between GLD and GDX prices. As GLD is the dependent (Y) variable, and GDX the independent variable, this hedge ratio represents how many shares of GDX are needed to hedge 1 share of GLD.
Strictly speaking, you should use that for position sizing too, but I just used equal sizes for simplicity, and used the hedge ratio only for determining the spread and generating trades.
Ernie
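
A minimal Python sketch of the hedge ratio calculation and of how it could be carried into position sizing. It assumes a least-squares regression of GLD on GDX without an intercept, which is an assumption of this sketch and not necessarily what the book's code does; `hedge_ratio` and the prices are made up for illustration.

```python
def hedge_ratio(y, x):
    """Least-squares slope of y on x with no intercept: sum(x*y) / sum(x*x)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Made-up closing prices, for illustration only.
gld = [60.0, 62.0, 61.0, 64.0, 66.0]   # dependent (Y) variable
gdx = [36.0, 37.5, 36.5, 38.0, 39.5]   # independent (X) variable

beta = hedge_ratio(gld, gdx)
spread = [g - beta * d for g, d in zip(gld, gdx)]

# Sizing with the hedge ratio: each long share of GLD is hedged
# by shorting beta shares of GDX.
gld_shares = 100
gdx_shares = -beta * gld_shares
```

With equal sizes, as in the simplified example, the hedge ratio affects only the spread and the resulting signals, not the number of shares held.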

BirdEggX, Wednesday, March 5, 2025 at 9:22:00 PM EST
Your book is incredibly well-written, with such rich details. I'm truly grateful.