A reader recently asked me whether setting a stop loss for a trading strategy is a good idea. I am a big fan of setting stop losses, but there are certainly myriad views on this.

One of my former bosses didn't believe in stop losses. His argument was that the market does not care about your personal entry price, so your stop price may be somebody else’s entry point. A stop loss, to him, was irrational. Since he was running a portfolio with hundreds of positions, he didn’t regard preserving capital in just one or a few specific positions as important. Of course, if you are an individual trader with fewer than a hundred positions, preservation of capital becomes a lot more important, and so does a stop loss.

Even if you are highly diversified and preservation of capital in specific positions is not important, are there situations where a stop loss is rational? I certainly think that applies to trend-following strategies. Whenever you incur a big loss on a trend-following position, it usually means that the latest entry signal is opposite to your original entry signal. In this case, it is better to admit your mistake, close your position, and maybe even enter into the opposite side. (Sometimes I wish our politicians thought this way.) On the other hand, if you employ a mean-reverting strategy, and instead of reverting, the market sticks to its original direction and causes you to lose money, does it mean you are wrong? Not necessarily: you could simply be too early. Indeed, many traders in this case will double up their position, since the latest entry signal is in the same direction as the original one. This raises a question though: if incurring a big loss is not a good enough reason to surrender to the market, how would you ever decide whether your mean-reverting model is wrong? Here I propose a stop loss criterion that looks at another dimension: time.

The simplest model one can apply to a mean-reverting process is the Ornstein-Uhlenbeck formula. As a concrete example, I will apply this model to the commodity ETF spreads I discussed before that I believe are mean-reverting (XLE-CL, GDX-GLD, EEM-IGE, and EWC-IGE). It is a simple model that says the next change in the spread is opposite in sign to the deviation of the spread from its long-term mean, with a magnitude that is proportional to the deviation. In our case, this proportionality constant θ can be estimated from a linear regression of the daily change of the spread versus the spread itself (the fitted slope is negative for a mean-reverting spread, and θ is its magnitude). Most importantly for us, if we solve this equation, we will find that the deviation from the mean exhibits an exponential decay towards zero, with the half-life of the decay equal to ln(2)/θ. This half-life is an important number: it gives us an estimate of how long we should expect the spread to remain far from its mean. If we enter into a mean-reverting position, and 3 or 4 half-lives later the spread still has not reverted to its mean, we have reason to believe that maybe the regime has changed, and our mean-reverting model may not be valid anymore (or at least, the spread may have acquired a new long-term mean).
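The regression described above can be sketched in a few lines of Python. This is a minimal illustration and not the code from the book: the function names, the simulated series, and its parameters (`theta=0.07`, `mean=5.0`, `sigma=0.5`) are all assumptions chosen for the example, and the fit includes an intercept.

```python
import math
import random

def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def half_life(spread):
    """Estimate the half-life (in bars) by regressing the change in the
    spread on the previous value of the spread."""
    lagged = spread[:-1]
    change = [b - a for a, b in zip(spread[:-1], spread[1:])]
    slope, _ = ols_slope_intercept(lagged, change)
    if slope >= 0:
        raise ValueError("slope is not negative: no mean reversion detected")
    return -math.log(2) / slope

def simulate_spread(theta, mean, sigma, n, seed=0):
    """Hypothetical test data: each step pulls the series back toward its
    mean by theta times the deviation, plus Gaussian noise."""
    rng = random.Random(seed)
    x = [mean]
    for _ in range(n - 1):
        x.append(x[-1] - theta * (x[-1] - mean) + sigma * rng.gauss(0.0, 1.0))
    return x

spread = simulate_spread(theta=0.07, mean=5.0, sigma=0.5, n=5000)
# If the fit is good, this is in the neighborhood of ln(2)/0.07, about 9.9 bars.
print(round(half_life(spread), 1))
```

On real spreads the estimate is noisy, so it is best read as an order of magnitude for the holding period rather than a precise number.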

Let’s now apply this formula to our spreads and see what their half-lives are. Fitting the daily change in spreads to the spread itself gives us:

These numbers do confirm my experience that the GDX-GLD spread is the best one for traders, as it reverts the fastest, while the XLE-CL spread is the most trying. If we arbitrarily decide that we will exit a spread once we have held it for 3 times the half-life, we have to hold the XLE-CL spread almost a calendar year before giving up. (Note that the half-life counts only trading days.) And indeed, while I have entered and exited (profitably) the GDX-GLD spread several times since last summer, I am holding the XLE-QM (substituting QM for CL) spread for the 104th day!

## 147 comments:

Wondering if you could break down how the half life is calculated...?

Hi,

See the Ornstein-Uhlenbeck formula at http://en.wikipedia.org/wiki/Ornstein-Uhlenbeck_process

If you believe the spread is mean-reverting, this formula will describe its time-evolution. You will notice that the time-evolution is governed by an exponential decay -- hence the notion of half-life.

Ernie
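For readers who want the intermediate steps, here is a short sketch of where the half-life comes from. The symbols μ (long-term mean), σ (noise scale), and λ (fitted regression slope) are notation assumed here for illustration; only θ appears in the post.

$$dx_t = \theta\,(\mu - x_t)\,dt + \sigma\,dW_t$$

Taking expectations removes the noise term and leaves an ordinary differential equation whose solution is an exponential decay of the deviation from the mean:

$$E[x_t] - \mu = (x_0 - \mu)\,e^{-\theta t}$$

The half-life is the time at which the expected deviation has halved:

$$e^{-\theta t_{1/2}} = \tfrac{1}{2} \quad\Longrightarrow\quad t_{1/2} = \frac{\ln 2}{\theta}$$

In the daily regression $\Delta x_t = \lambda\,x_{t-1} + c + \varepsilon_t$, the slope is $\lambda \approx -\theta$ for one-day steps, so the estimate is $t_{1/2} \approx -\ln 2/\lambda$.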

Good article. One question: How do you use the estimated "theta" to trade? What is your trading strategy after the estimation? Do you still apply threshold rule to trade? Thanks

Dear Volat,

The estimated half-life can be used to determine your maximum holding period. Of course, you can exit earlier if the spread exceeds your profit-cap.

Ernie
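A time-based stop combined with an early exit can be sketched as follows. This is only an illustration of the idea in the reply above: the function names, the `exit_band` parameter (the deviation level at which the trade is considered done), and the default multiple of 3 are assumptions for the example.

```python
import math

def max_holding_days(half_life_days, multiple=3.0):
    """Longest holding period allowed before giving up on the position."""
    return math.ceil(multiple * half_life_days)

def first_exit(deviations, half_life_days, exit_band=0.0, multiple=3.0):
    """deviations[0] is the spread's deviation from its mean at entry, and
    later entries are the deviations on the following days.
    Returns (reason, days_held)."""
    limit = max_holding_days(half_life_days, multiple)
    entry_sign = 1 if deviations[0] > 0 else -1
    for day, dev in enumerate(deviations[1:], start=1):
        # The trade bets the deviation shrinks toward zero from the entry side.
        if dev * entry_sign <= exit_band:
            return "target", day
        if day >= limit:
            return "time_stop", day
    return "open", len(deviations) - 1

print(first_exit([2.0, 1.5, 0.9, 0.4, -0.1], half_life_days=10))
print(first_exit([2.0] * 50, half_life_days=10))
```

The first call exits because the spread reverted; the second exits only because the holding period ran out.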

I assume half-life means 1/2 of the time for the spread to revert to its mean. Therefore a half-life of 16 days means that it takes 32 days for the spread to revert to its mean (correct me if I am wrong). And with this number, you basically open the position at day 0 and close the position at day 32, then open the position at day 64, and then close the position at day 96... Is that right?

Volat: Yes, that's right. But as I said, you can exit early due to profit cap.

Ernie

Hi - first I want to say thank you for publishing your OT book - excellent writing.

In this article (as well as in the book), you said "...linear regression of the daily change of the spread versus the spread itself". Looking at the formula and the Matlab example, would this be a more accurate sentence: "...linear regression of the daily change of the spread versus the spread's deviation from mean"?

Hi Anonymous,

Thanks for your compliments. Actually, whether you subtract the mean of the spread or not will yield the same regression coefficient.

Ernie
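A quick numerical check of this point, under one stated assumption: the slope is unchanged by subtracting the mean when the regression includes an intercept. With no intercept, the raw and demeaned regressors generally give different slopes. The data and function names below are made up for the illustration.

```python
def slope_with_intercept(x, y):
    """OLS slope of y on x when a constant term is included."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def slope_no_intercept(x, y):
    """OLS slope of y on x when the fit is forced through the origin."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

spread = [10.2, 10.8, 9.9, 9.5, 10.4, 11.0, 10.1, 9.7, 10.3]  # hypothetical
lagged = spread[:-1]
change = [b - a for a, b in zip(spread[:-1], spread[1:])]
mean_lag = sum(lagged) / len(lagged)
demeaned = [v - mean_lag for v in lagged]

print(slope_with_intercept(lagged, change))    # same as the next line
print(slope_with_intercept(demeaned, change))
print(slope_no_intercept(demeaned, change))    # also the same
print(slope_no_intercept(lagged, change))      # differs: no intercept, raw levels
```

So if a fitting routine does not add a constant term for you, subtracting the mean first is what keeps the estimate of theta meaningful.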

So when you run the regression, which regression coefficient is the theta?

If you regress the change in spread against the spread itself, the resulting regression coefficient is theta.

Ernie

Ah, makes a lot of sense. Of course. I also looked over the code again and saw at the end that you tell the computer that OLS beta = theta. Thanks.

Do you use the adjusted cointegration coefficient as the hedge ratio, or the normalized one? And if you do a cointegration test on the actual instruments and you get little chance of no cointegration coefficients and an over 55% chance of at most one, can you use that, or is it always better to run the test on first differences, where there is always a high probability of two coefficients (using EViews output)?

Once I find cointegration is confirmed, I actually performed my own regression to find the hedge ratio.

Ok, that makes sense. What do you think about playing with the hedge ratios with the upper and lower bound being the Beta from a regression and Beta from a cointegration regression and seeing which number in between gives you the most stationary series?

By the way, thank you for being a fantastic resource and answering questions. I find that unless you are a math major (which I am not) some of the statistical arbitrage literature is impossible to get through. Once properly explained it seems fairly simple.

That is not a bad idea. In reality, however, I am not too concerned about the precise value of the hedge ratio. The optimal hedge ratio going forward is likely to differ from the optimal in a backtest period.

Ernie

Pardon my ignorance but I was under the impression that Brownian Motion (dWt) and Ornstein Uhlenbeck were both modeling processes to simulate expectations...not to analyze past data. If I am wrong please correct me and explain: a) the length of period to use for the mean and s.d. b) how should I calculate dWt using past data (the formulas I see use a random function to generate data points to use...should I simply replace the randomly generated points with the actual historicals?)

I guess my real confusion goes back to my assumption that BM and O-U are forward modeling tools...

Also, once you have calculated an half-life, how should it be applied? Should my calculations produce a new (smaller) half life everyday as the price reverts to the mean? Or will my calculations produce a static (somewhat static) number for the half life and I must then figure out where in the mean reversion process to start counting from?

William,

If you assume your price process follows a mean-reverting random walk, then you can fit the parameters of the O-U equation based on historical data.

The fitting procedure is a simple regression fit, described in detail in my book, as well as explained in the previous comments on this blog post.

Half-life should be fairly constant over time. It should not decrease as the prices mean-revert. You can use any reasonable number of time periods to calculate the half-life. Typically 3 months to a year are suitable.

Hope I answered your question?

Ernie

Perhaps, it isn't necessarily the case that the shortest half life leads to the best trade?

Would a metric like the following:

sdev(spread)/half_life(spread) not be a useful method of ranking spreads? Here, sdev() would need to be expressed as a % of the mean. Alternatively, the percent excursion from a moving average might be used, since that metric is representative of the expected gain?

Hi geegaw,

Thanks for your suggestion. It is an interesting idea. However, the ultimate performance measure for trading a pair is the Sharpe ratio, regardless of holding period.

Ernie

Ernie, how do you calculate Z score for the Ornstein Uhlenbeck process? I could not find it in your book. Thanks

Hi Anonymous,

Ornstein-Uhlenbeck process does not generate zscores. It is used to calculate half-life of mean-reversion. Zscore is simply the value of a spread divided by its standard deviation.

Ernie

Hi Dr Chan,

Thanks for this nice article.

I was initially thinking about fitting AR(1) to the spread and then calculate the half life.

But your method seems more robust.

I have a question about the result you have posted on this article.

What time frame do you use for the GLD-GDX pairs half life calculation?

Also, when I calculate with the data (GLD1, GDX1) posted in your premium content, I get -ln(2)/theta equal to 8.

Thanks,

Vishal

Hi Vishal,

I don't recall what time frame I used for this half-life calculation. But in my book, the same example uses 20060523-20071221, and the half-life obtained is 10 days.

Ernie

Hey Ernie,

I am trying to plug the numbers into the Ornstein-Uhlenbeck formula but I can't seem to get a sensible number. Also, the number varies wildly. I tried regressing the changes in the spread against the spread itself, but the results make no sense. Is there a place where the formula is applied so I can check how to use it?

Thanks,

J

Hi J,

You can look up Example 7.5 in my book on halflife calculation.

Ernie

That's where I am trying to understand it. But it is written in Matlab code. I don't have Matlab. I am trying to do it in an Excel spreadsheet.

i have 101 cells of data.

i make a=yesterdaysclose-avgoflast100yesterdaysclose;

b=today-yesterdaysclose;

then regress a on b for last 100 bars. I get a weird answer all the time.

Hi Anon,

Your regression formula appears correct. But when you mentioned yesterday's "close", are you expecting the price series to be mean-reverting? Most price series are not mean-reverting -- only spreads are.

Ernie

is there a difference between terms "quantitative trading" and "algorithmic trading" ??

and, what are best materials to learn quantitative trading and investment strategies ??

Hi Sunil,

Algorithmic and quantitative trading are basically the same.

One of the better places to learn about the subject is my book Quantitative Trading!

Ernie

Dear Ernie,

If I run an OLS regression of dz vs. prevz-mean(prevz) on pg141 of your book to estimate the theta for the half-life, the regression has a very low R-squared in general. And the t-statistic for theta is usually quite negative, so we would reject the null hypothesis. So does this mean this estimate of half-life is not very accurate in general, as opposed to what you suggest in your text, which is "Since we make use of the entire time series to find the best estimate of Theta, and not just on the days where a trade was triggered, the estimate for the half life is much more robust than can be obtained directly from a trading model."?

Suny,

Half-life estimates do not need to be very accurate to be useful. If you calculate the t-statistic of the mean return of your trading strategy, you will likely find a much worse result.

Ernie

Whenever I short one of the positions in the hedge, it seems that a few weeks later I get a buy-back notice from the brokerage firm so that I have to close out the position. Any suggestions?

Thanks

Anon,

I suggest you find a better broker.

Or short instruments that are in better supply.

Ernie

Please Dr Chan, specify the input data (which stock, which dates) you use when you calculate something in your book. Confusion arises on several blogs because of this:

http://pcweicfa.blogspot.se/2010/08/r-implementation-of-ornstein-uhlenbeck.html

He gets a half-life of 8.8 days. The reason? Because you use more data points than he does. It took me several hours to figure out which code I should trust: his or yours? The discrepancy is because he did not know which input you used.

Now I am trying to find out why your code can not find the half life lambda (which is 3) here

http://www.sitmo.com/article/calibrating-the-ornstein-uhlenbeck-model/

His ansatz gives 3.12 which is close to the correct answer. Your code gives an answer of -20.787, which is totally off. Could you please investigate this further? I am using your code in R version (see the first blog). Why can your ansatz not find the correct halflife, which is 3?

Hi Anon,

Sorry, but I am mystified by why you find the input confusing. The entire set of input data in example 7.2 is available for download (from epchan.com/book) as an Excel file, and there are just 2 ETFs in question: GLD and GDX. I don't see how anyone using this same data could be using a different number of data points.

Ernie

Dr Chan,

I want to calculate the expected time for the current price to revert to the mean. How can I use the OU half life for this?

My reason is because the de-trended (stationary) time series has significant drift. For example 30% per annum. The expected price is the current mean price + time to revert x drift.

Is the z-score also a factor?

Imagine one time series with constant OU half life. At different samples, the price is 0, 1 and 2 stdevs from the mean. Is the expected time to revert constant or sensitive to the z-scores 0, 1, 2?

Ken,

By definition, a detrended price series should have zero returns. How can it be 30%?

The OU halflife is the expected time to revert about half of the current deviation from the mean. It is independent of what the current zScore is. If the zScore is 2, it is the time to get to 1. If it is 1, then it is the time to get to 0.5.

Ernie
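The point that the half-life does not depend on the current z-score can be written out as a tiny calculation. The function names are hypothetical, and the sketch only describes the expected (noise-free) path of the deviation, not any single realized path.

```python
import math

def expected_deviation(current_deviation, days_ahead, half_life_days):
    """Expected deviation from the mean after days_ahead, given that the
    expected deviation halves every half_life_days."""
    return current_deviation * 0.5 ** (days_ahead / half_life_days)

def days_to_reach(current_deviation, target_deviation, half_life_days):
    """Expected time for the deviation to decay from current to target
    (both nonzero and of the same sign)."""
    return half_life_days * math.log2(current_deviation / target_deviation)

# After one half-life, a z-score of 2 is expected to be 1, and 1 becomes 0.5.
print(expected_deviation(2.0, 10, 10), expected_deviation(1.0, 10, 10))
# Decaying from 2 to 0.25 takes three half-lives.
print(days_to_reach(2.0, 0.25, 10))
```

Note that under exponential decay the expected deviation never reaches exactly zero, which is why the question "how long until it reverts to the mean" only has an answer once a target level is chosen.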

Dear Ernie,

In your post about calculating the half life of the spread you talk about linear regression of the daily change of the spread versus the spread itself.

So if yt is my daily spread at time t, am I correct in doing the following regression, where c is the regression symbol?

yt - y(t-1) c y(t-1)

Hi JPS,

Yes, your regression is correct.

Ernie

Dear Ernie,

I have two stocks which are cointegrated as per the Johansen cointegration test. On running the VECM, I checked the residuals of the regression. My observations are as follows.

1. There is no serial correlation among the residuals

2. There is heteroscedasticity in the residuals

3. The residuals are not normally distributed.

Is this model acceptable?

Is there any way of removing heteroscedasticity in VECM?

Hi JPS,

To avoid non-constant variance, try log prices instead of prices. But I don't see why heteroscedasticity is a problem for creating a profitable trading model. We also don't particularly care if the residuals are normal.

Ernie

Hi Ernie,

Thanks a lot for the clarifications. I was reading your blog about creating more efficient pairs using combinations of more stocks. My query regarding creating a combo of this type (for example, a combination of stocks taken from an index to be combined with the index itself) is as follows.

1. Do these stocks individually need to be cointegrated with the index?

2. Do these stocks individually need to be cointegrated with each other also?

In other words, what I am conjecturing is: can two stocks which are not cointegrated on a one-to-one basis become cointegrated when combined with a third stock in a combination of three stocks?

Hi JPS,

For the index arbitrage strategy between stocks and an index instrument, the stocks individually should cointegrate with the index, but not necessarily with each other.

Ernie

Dear Ernie,

While going through some literature on unit root testing (which I implement on stock price time series), I came across PPURootTest (the Phillips-Perron unit root test), which also checks for structural breaks in the data. The null of the test is that the series has a unit root (at level) with a structural break at a particular date. I interpret accepting the null hypothesis to mean that the series is non-stationary at level and that the data has a structural break at the date suggested by the test. But what should I do with the information about the structural break?

Hi JPS,

Yes, failure to reject null hypothesis means the price series may be non-stationary. You need to find out a fundamental reason if possible for the structural break, then you may learn how to devise a trading strategy that would avoid its effects. For example, if the break is due to a central bank interest rate announcement, maybe you can liquidate before such scheduled announcements.

Ernie

Dear Ernie ,

In the book Quantitative Trading you have mentioned (while calculating the Sharpe ratio) that in a dollar-neutral strategy there is no need to subtract the risk-free return, since the portfolio is self-financing. But in some markets, for example the Indian stock markets, short selling of equities is allowed only for intraday trades. So any long-short strategy one follows necessarily has to use futures. What should be the approach to calculating the Sharpe ratio in that case?

Hi JPS,

If you can only hedge a long stock position with short future position, then you do need to subtract the risk free rate when computing Sharpe ratio.

Ernie
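The difference between the two cases comes down to whether a risk-free rate is subtracted from the returns before the ratio is taken. Here is a minimal sketch; the function name, the 252-day annualization, and the sample returns and 4% rate are assumptions for the example.

```python
import math
import statistics

def annualized_sharpe(daily_returns, annual_risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio of daily returns. Leave annual_risk_free at 0
    for a self-financing (dollar-neutral) portfolio; pass the risk-free rate
    when the portfolio ties up capital, e.g. long stock hedged with futures."""
    daily_rf = annual_risk_free / periods_per_year
    excess = [r - daily_rf for r in daily_returns]
    return math.sqrt(periods_per_year) * statistics.mean(excess) / statistics.stdev(excess)

returns = [0.010, -0.005, 0.007, 0.002, -0.003]  # hypothetical daily returns
print(annualized_sharpe(returns))                         # no risk-free adjustment
print(annualized_sharpe(returns, annual_risk_free=0.04))  # lower once the rate is subtracted
```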

Dear Ernie,

I have a mean reverting spread in the form say for example ln(X)-0.7654*ln(Y)+3.41, where X and Y are two scrips.

Can the Sharpe ratio for such a spread be calculated, and if yes, how? Please suggest.

As I was going through the book Quantitative Trading I came across the spread of the form A-B whose calculation is very beautifully explained.

Hi JPS,

It doesn't make sense to compute Sharpe ratio for a spread. You should only compute Sharpe ratio on the returns of a strategy or a portfolio.

Ernie

Hi Ernie,

Thanks a lot for the clarification. My next question is then: what performance parameters (if any) can one use to grade the performance of spreads, if not the Sharpe ratio?

Hi JPS,

A spread as such does not have performance. It is a trading strategy on a spread that has performance. You can certainly measure the Sharpe ratio on a trading strategy, since it generates returns.

Ernie

Dear Ernie,

In the book Quantitative Trading, while writing the code for backtesting, you calculate the Z score for the training set data (I guess I have interpreted it right) by means of the following code

% mean of spread on trainset

spreadMean=mean(spread(trainset));

% standard deviation of spread on trainset

spreadStd=std(spread(trainset));

% z-score of spread

zscore=(spread - spreadMean)./spreadStd;

Now while testing it on the test set, do we need to calculate the Z score of the spread of the test set separately (using the mean and standard deviation of the spread over the test-set period) and then see how it performs on the deviations of the Z score calculated from the trainset?

or

Is the Z score of the spread over the test-set period also calculated using the mean and standard deviation of the training-set period, with performance then tested using the Z score deviations which were calculated from the trainset data?

Hi JPS,

Whether to use the trainset or the testset to determine the mean and std is optional. I would actually recommend using a moving average to determine mean and a moving std to determine std, as in Bollinger bands.

Ernie

Hi Ernie,

Thanks for the prompt reply. What I gather from your suggestion is that the Z scores should be calculated dynamically rather than statically (with a look-back window as per the flavor of the spread). I have tried to jot down what you suggested as follows. Please correct me if I am wrong.

Decide some look-back period = period (say 21 days)

step 1: calculate the moving average of the spread = MA(spread, period)

step 2: Zscore(DYNAMIC) = ((CURRENT VALUE OF THE SPREAD) - (CURRENT VALUE OF MA OF SPREAD)) / STDEV(SPREAD, PERIOD)

Hi JPS,

Yes, you implemented my suggestion correctly.

Ernie
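The two steps above can be sketched as a rolling z-score. The function name and the choice to return `None` before a full window is available are assumptions for the example; the default look-back of 21 bars is the value mentioned in the question. Note that the subtraction happens before the division.

```python
import statistics

def rolling_zscore(spread, lookback=21):
    """z[t] = (spread[t] - moving average) / moving standard deviation,
    both computed over the last `lookback` bars including bar t.
    Entries without a full window (or with zero variance) are None."""
    if lookback < 2:
        raise ValueError("lookback must be at least 2")
    z = [None] * len(spread)
    for t in range(lookback - 1, len(spread)):
        window = spread[t - lookback + 1 : t + 1]
        sd = statistics.stdev(window)
        if sd > 0:
            z[t] = (spread[t] - statistics.mean(window)) / sd
    return z

print(rolling_zscore([1.0, 2.0, 3.0, 4.0, 5.0], lookback=3))
```

Because the window includes the current bar, the z-score on bar t uses only information available at bar t, so there is no look-ahead.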

Dear Ernie,

Can we safely take the half-life of the dynamic Z score of the spread (calculated based on your suggestion in the previous comment) as a parameter for the exit strategy? As you had previously suggested, you consider 3 times the half-life a sufficient indicator to exit the spread position.

Hi JPS,

Yes, some small multiple (<10) of this half life can be used to set the max holding period.

Ernie

Dear Ernie,

As per your recent comment " I would actually recommend using a moving average to determine mean and a moving std to determine std, as in Bollinger bands."

What I gather is that the central line will be the MA of the spread, with the upper and lower envelopes at 2 standard deviations from the MA of the spread, and one trades when the spread is at the upper or the lower envelope. But this tool suffers from the same old problem of the Bollinger band being a laggard, as the actual shape it is going to take will emerge only as time progresses.

And further, in this case, what is the role of the Z score as calculated by the above-mentioned formula?

step 1: calculate the moving average of the spread = MA(spread, period)

step 2: Zscore(DYNAMIC) = ((CURRENT VALUE OF THE SPREAD) - (CURRENT VALUE OF MA OF SPREAD)) / STDEV(SPREAD, PERIOD)

Hi JPS,

If you don't want a lagging volatility, you need to use a volatility prediction model such as GARCH. Similarly, if you don't want to use moving average to predict mean, you need to use a time series model such as ARIMA.

Ernie

Dear Ernie,

Can you suggest any statistical indicator that can be used in conjunction with the Bollinger band of the spread (consisting of the moving average of the spread and the standard deviation at the upper and lower levels)? Basically, I want to use this additional indicator for an early exit if the trade goes against the punt.

Hi JPS,

You can always impose a trailing stop - but I don't recommend it for a mean reverting strategy for the reason I stated in my article and book.

Ernie

Dear Ernie,

While reading your book Algorithmic Trading (the section on trading pairs using price spreads, log price spreads, or ratios) I got a bit confused.

Let's suppose I find that two stocks X and Y are cointegrated, and after regressing X on Y one gets a Y coefficient of 0.5467 and a constant of 26.

If X and Y are futures prices, does this mean that for each lot of X futures held long, one needs to short half a lot of Y (which, after normalizing, means one short lot of Y for every 2 long lots of X)? But what is the significance of the constant 26 in this equation? How does one take care of this in trading terms?

Dear Ernie,

In the book Algorithmic Trading you have mentioned that the values of the eigenvectors give us the proportion of shares for each scrip.

A) Does a negative sign in the eigenvector indicate that that particular scrip needs to be shorted while creating one "unit" of the portfolio? For example, take x-0.577y+9.0z, where x, y and z are the scrips found to be cointegrated and 1, -0.577, 9.0 are the corresponding values of the eigenvector. Does this mean that while creating 1 "unit" one needs to short 0.577 units of y, long 9 units of z, and long 1 unit of x?

B) When, as per the Z score, we short 1 "unit" of the portfolio (say at Z=+1), how does that translate into "short" and "long" in terms of the individual scrips? I am really stuck at this point as to what the trading action will be. Please help me clarify this doubt.

C) What if x, y and z are futures and not shares?

D) If we get all the information from the eigenvectors of the Johansen test, is there any need to proceed to conduct a VECM?

Hi JPS,

You can ignore the y-intercept in your regression. We are trading the mean reversion of the residual - adding a constant to it won't affect your trading if you are using Bollinger bands.

Negative component in an eigenvector means you should short the shares or futures contracts. E.g -20.4 means short 20.4 shares or futures contracts. For futures, make sure you multiply the prices with suitable multipliers first, otherwise the result won't make sense.

Ernie
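To make the sign convention concrete, here is a small sketch that turns eigenvector weights into orders for buying or shorting one "unit" of the portfolio. The function name and the BUY/SELL labels are hypothetical; the weights are the ones from the question. For futures, the weights are assumed to have been estimated on prices already multiplied by their multipliers.

```python
def unit_portfolio_orders(weights, side):
    """weights: eigenvector component per instrument.
    side: +1 to go long one unit of the portfolio, -1 to short one unit.
    Returns {instrument: (action, quantity)}."""
    if side not in (1, -1):
        raise ValueError("side must be +1 or -1")
    orders = {}
    for name, w in weights.items():
        qty = side * w
        if qty == 0:
            continue  # nothing to trade for a zero weight
        orders[name] = ("BUY" if qty > 0 else "SELL", abs(qty))
    return orders

weights = {"x": 1.0, "y": -0.577, "z": 9.0}
print(unit_portfolio_orders(weights, side=+1))  # long x, short y, long z
print(unit_portfolio_orders(weights, side=-1))  # every leg flipped
```

Shorting a unit of the portfolio simply flips every leg, so a component with a negative weight is bought when the portfolio as a whole is shorted.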

Dear Ernie ,

Many thanks for the clarifications. When you talk about multiplying by a suitable multiplier (in the case of futures), do you intend this so that the futures lots come out in whole numbers (or near whole numbers)?

JPS,

By futures "multiplier", I meant the dollar per points. E.g. 50 for ES.

We don't really care about whole numbers or not in backtest.

Ernie

Dear Ernie,

In the book Algorithmic Trading, while calculating the Kalman filter based dynamic regression, the expression for R(t | t − 1) is cov(β(t) − β̂(t | t − 1)), which eventually helps in calculating the value of K.

1) What is β(t) here? Is it the actual value of β at time t that one gets after regressing the dependent variable on the independent variable?

2) Also, what should be the period of regression (I mean, how many data points should one choose to calculate β(t), β̂(t), etc.)?

Dear Ernie,

In the calculation of the kalman filter while calculating Q(T)

Q(t)=x(t, :)*R*x(t, :)’ WHAT DOES THE TERM x(t, :)’ DO?

and what is the dimension of Q(t)?

Hi JPS,

Beta is a 2x1 vector which denotes both the intercept and the slope of the linear relation between the two instruments. The slope is often called the hedge ratio. Beta is updated at every time step using the Kalman filter. The whole point of using the Kalman filter is that we do not need to use linear regression with a fixed lookback to compute the hedge ratio. Instead, it adapts to every new data point, and uses the entire history.

Ernie

Hi JPS,

Since R is a 2x2 covariance matrix, and Q is just a scalar variance of forecast errors, we need to multiply R on both sides with a vector to turn a matrix into a scalar.

As a variance of prices, Q has the dimension of price squared.

Ernie

Hi Ernie,

Thanks for the explanation. Is R a 2x2 diagonal matrix with the starting value as Delta (which in the code we have taken as 0.0001)?

So from what you said above, can I conclude that the term

x(t, :)*R*x(t, :)’ in the code is mathematically equivalent to Square of Price X Det R (determinant of R)?

Dear Ernie,

Further to my earlier query I guess X(t) has been made a Tx2 matrix so that x(t, :)*R*x(t, :)’ is equivalent to a 1x2 matrix multiplied by 2x2 ( R the covariance matrix)and then 2x1 ( transpose of x(t) ) which results in a scalar.

and further K is also a 2x1 matrix like beta

Am i correct in my interpretations?

Hi JPS,

R starts off as a zero matrix, then gets updated in the for loop as displayed on page 79. It is generally not a diagonal matrix.

I am not sure what you meant by Square of Price X Det(R), since Price is a vector, and the result Q should be a scalar.

Ernie

JPS,

Yes, x*R*x' = Q is a scalar.

You can easily verify all of this by checking the numerical results in Matlab or R.

Ernie
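The dimension bookkeeping discussed in this exchange can be checked with plain lists. This is a sketch of the algebra only, not the book's code: the function names and the numbers in `R` are made up, and the extra measurement-noise variance `Ve` added to the forecast-error variance in `kalman_gain` is an assumption of this example.

```python
def mat_vec(R, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(R[i][j] * v[j] for j in range(len(v))) for i in range(len(R))]

def quadratic_form(x_row, R):
    """x R x': a 1x2 row times a 2x2 matrix times a 2x1 column is a scalar."""
    return sum(xi * ri for xi, ri in zip(x_row, mat_vec(R, x_row)))

def kalman_gain(x_row, R, Ve):
    """Gain K = R x' / Q, with Q = x R x' + Ve (Ve is assumed here).
    K has one entry per component of beta (slope and intercept)."""
    Q = quadratic_form(x_row, R) + Ve
    return [ri / Q for ri in mat_vec(R, x_row)]

x_row = [20.0, 1.0]             # [price of the independent instrument, 1 for the intercept]
R = [[0.5, 0.1], [0.1, 0.2]]    # hypothetical 2x2 covariance of the beta forecast error
det_R = R[0][0] * R[1][1] - R[0][1] * R[1][0]

print(quadratic_form(x_row, R))          # a single number
print(x_row[0] ** 2 * det_R)             # price squared times det(R): a different number
print(kalman_gain(x_row, R, Ve=1.0))     # two entries, like beta
```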

Dear Ernie,

Many thanks for the earlier replies! They have indeed propelled me to study the details of the books further.

I have a few other queries regarding the sections of the book where you have described trading ETFs and triplets.

1. Is the formation of a long-only portfolio (logMktVal_long=sum(log(yN),2)) to check that it cointegrates with SPY necessary in all cases, or is it a requirement only of a long-only portfolio?

2. What if one discovers that Stock1(futures), Stock4(futures) and Stock5(futures) cointegrate individually with the index, and I want to create a long-short portfolio among these 4? In this case, can't I straight away run the cointegration test between the series consisting of [stock1(futures) stock4(futures) stock5(futures)] and [index], get the weights from the eigenvector, and create a long-short portfolio?

3. If after getting the weights from the eigen vectors one gets a Spread of the Form of :

stock1(futures) + 0.90* stock4(futures) - 2.89 * stock5(futures)

Please correct me IF I am wrong ( with respect to the aforementioned portfolio)

LONG PORTFOLIO TRADING DECISION :

Long 1 unit of stock1(futures), Long 0.90 unit of stock4(futures) and Short 2.89 units of stock5(futures)

SHORT PORTFOLIO TRADING DECISION :

Short 1 unit of stock1(futures) , Short 0.90 unit of stock4(futures) and Long 2.89 units of stock5(futures)

Hi JPS,

1) The strategy is based on the mean reversion of the hedged portfolio. If we don't carefully select the stocks that cointegrate with SPY in the long side, we cannot expect mean reversion.

2) Yes, but no cointegration test allows you to test more than 12 stocks. Furthermore, many of those stocks will have negative weight. We don't want a long portfolio with short stock components.

3) That is correct.

Ernie

Dear Ernie,

Greetings for the Day!

Thanks for the prompt reply.I have got an interesting situation where the components stocks don't co-integrate individually with the index but as a combination( component stocks and index) are getting co-integrated.

In your opinion is it prudent to go ahead and create a spread between index and the components stocks to trade knowing well that they ( component stocks) are not co-integrating with the index on one-to-one level?

Hi JPS,

Yes, you can do that - but make sure none of the components have a negative capital allocation. Otherwise, you will be unduly concentrated on those stocks. We want a long-only stock portfolio to arbitrage against an index ETF/future.

Ernie

Hi Ernie....I keep current on all your publications/blogs/lectures and your recent podcast. I am spending considerable time and effort on cointegrating "triplets" using the Johansen Test. One issue that doesn't seem to get any attention is the normalized eigenvector values that make up the resulting proportions of the synthetic portfolio elements. Those particular eigenvector values outputted from the Johansen test are NOT adaptive and represent the whole time period in question. As we well know the Beta of a normal regression will determine a similar proportion when only 2 components are being evaluated and of course that Beta changes throughout the time period in question. Kalman filters can produce an adaptive Beta. How can we approach the same situation when using the Johansen test to determine an "adaptive" proportion of components in a triplet?

Thank you

LB

Hi Larry,

Yes, the Johansen test will give you a stationary portfolio in the historical period, but does not guarantee stationarity in the future.

The Kalman filter can be applied to any number of time series and will give you an adaptive beta. Also, you can just run the Johansen test on a moving lookback period. Even if it suggests that the statistical significance is not high enough to determine cointegration, you can still use the eigenvector as the components of your portfolio.

Ernie

Dear Ernie,

I am facing a practical problem while applying the adaptive beta (Kalman filter) strategy to futures.

Let's suppose I have a triplet and the hedge ratios at the start of the trade are

1 0.5 0.71 ( 1 being for the dependent variable and 0.5 and 0.7 being for the independent variables). But the dependent and independent variables come in different lot sizes (the minimum quantity that one can buy, which is also the $/point); for example, for the dependent it is 260 and for the two independents it is 1400 and 600.

Can we simply multiply the hedge ratios by these lot sizes? Will that not change the ratios suggested by the filter?

If we cannot do a simple multiplication like the above, what is the way out?

Hi JPS,

If you are applying these methods to futures with different multipliers, you need to multiply the futures' points by their multipliers in order to convert them to dollar values first. For example, for ES on Globex, you need to multiply the points by 50.

Ernie

Dear Ernie,

What I gather from your suggestion is that the futures price series should first be multiplied by their respective multipliers before being analyzed.

Am I correct in my interpretation?

Hi JPS,

Yes, you are correct.

Ernie

Dear Ernie,

Greetings for the Day!

As you suggested in response to my earlier query about applying the Kalman filter strategy to futures, one needs to multiply the price series by the respective futures multipliers to arrive at dollar values.

Exchanges (at least the Indian stock exchange) change the multiplier values periodically when the price of the underlying goes very high or low, and in case of splits. In the case of a split of the underlying stock the total dollar value does not change, but in the other cases the change produces a sudden spike in the dollar value of the futures.

1. Does this impact the accuracy of the analysis in terms of the hedge ratios that are arrived at, and might it give us unrealistic results?

2. Is there any way to get around this problem?

Hi JPS,

Naturally, you need to use the multiplier that was in force at the time a price point was recorded in your historical time series. So you need a historical time series of the multiplier itself.

Ernie

Dear Ernie,

Thanks for the prompt reply. I do have the historical multiplier series, but my concern is about the jumps in the dollar values when a new multiplier comes into force.

Do these sudden spikes in dollar values impact the accuracy of the analysis?

Hi JPS,

If you apply the historical multipliers to the historical prices correctly, there should be no jumps in the adjusted price series that you feed into various algorithms. A jump in such adjusted series indicates that your historical multipliers were not correct.

Ernie
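
A minimal Python sketch of the point in the reply above: multiply each quoted price by the multiplier that was in force on that date. The prices and multipliers are made-up numbers for a hypothetical contract whose underlying splits 2-for-1 on the third day, and `dollar_values` is a name introduced here for illustration only.

```python
# Hypothetical data: quoted futures prices (in points) and the contract
# multiplier in force on each date. On day 3 the underlying splits 2-for-1,
# so the quoted price roughly halves and the multiplier doubles.
points = [100.0, 102.0, 51.5, 52.0]
multipliers = [250, 250, 500, 500]


def dollar_values(points, multipliers):
    """Dollar value of one contract on each date: points times the multiplier in force."""
    if len(points) != len(multipliers):
        raise ValueError("need exactly one multiplier per price point")
    return [p * m for p, m in zip(points, multipliers)]


adjusted = dollar_values(points, multipliers)
print(adjusted)
```

In this made-up example the adjusted series moves by about 1% across the split date, while applying a single fixed multiplier to every date would show a spurious move of roughly 50%.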

Dear Ernie,

Greetings for the day!

The code that has been used in most of the strategies in the books for PnL is as follows:

positions=repmat(numUnits, [1 size(y2, 2)]).*[-beta(1, :)' ones(size(beta(1, :)'))].*y2; % [hedgeRatio -ones(size(hedgeRatio))] is the shares allocation, [hedgeRatio -ones(size(hedgeRatio))].*y2 is the dollar capital allocation, while positions is the dollar capital in each ETF.

pnl=sum(lag(positions, 1).*(y2-lag(y2, 1))./lag(y2, 1), 2); % daily P&L of the strategy

Consider a scenario where the price series is the dollar value instead of the pure price.

I guess pnl is the daily return matrix of the strategy, which is used to calculate the APR and annualized return.

This matrix does not tell us the net cash inflow or outflow (if one follows the long and short trade signals), which means the sum of pnl does not give us the net cash inflow/outflow even if the series under analysis are dollar-value series.

Am I right in my interpretation of the code?

Hi JPS,

The pnl in the code above is not the portfolio return. It is the portfolio P&L. In order to get the return, you have to divide that by the gross market value of the portfolio, given by sum(abs(lag(positions, 1)), 2).

In our backtest, we are generally not interested in cash flow. I.e. we don't care if the P&L is realized or unrealized. We compute the total P&L, which is the sum of realized + unrealized P&L. The same goes for returns computation.

Ernie
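
The distinction between P&L and return in the reply above can be sketched in Python as follows. This is an illustrative translation, not the book's code: `daily_pnl_and_return` is a hypothetical helper, positions are assumed to be dollar capital per instrument, and the two-instrument numbers are invented.

```python
def daily_pnl_and_return(positions, prices):
    """positions[t][i]: dollar capital in instrument i at the close of day t.
    prices[t][i]: price of instrument i at the close of day t.
    Returns (pnl, ret), each with one entry for days 1..T-1."""
    pnl, ret = [], []
    for t in range(1, len(prices)):
        # P&L: yesterday's dollar position times today's fractional price change
        day_pnl = sum(
            positions[t - 1][i] * (prices[t][i] - prices[t - 1][i]) / prices[t - 1][i]
            for i in range(len(prices[t]))
        )
        # Gross market value: sum of absolute dollar positions held overnight
        gross = sum(abs(p) for p in positions[t - 1])
        pnl.append(day_pnl)
        ret.append(day_pnl / gross if gross > 0 else 0.0)
    return pnl, ret


# Hypothetical example: long $1000 of A, short $500 of B
prices = [[10.0, 20.0], [11.0, 19.0]]
positions = [[1000.0, -500.0], [1000.0, -500.0]]
pnl, ret = daily_pnl_and_return(positions, prices)
print(pnl, ret)
```

Both numbers include unrealized gains and losses; neither says anything about cash flow.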

Dear Ernie,

Greeting for the Day!

You mentioned in the book Quantitative Trading that when calculating the Sharpe ratio of a dollar-neutral strategy there is no need to subtract the risk-free return, as the portfolio is self-financing. Does the same principle apply to a dollar-neutral portfolio using futures? On Indian exchanges one needs to maintain margin for both the short and long futures positions, and one does not get any interest on the maintained balance from the exchange.

If the answer to the above question is yes, then, since it is a double cost for maintaining the portfolio, does one need to subtract double the risk-free interest rate from the portfolio return when calculating the Sharpe ratio?

Hi JPS,

There is never any need to subtract the risk-free rate from the raw returns when computing a futures strategy's Sharpe ratio in the US. I am not familiar with Indian futures exchanges, but are you sure that you cannot hold government bonds as collateral, and thus earn interest on those?

Even if you don't collect interest on the collateral, you don't have to pay interest on your nominal futures positions. So you still don't have to subtract interest rate from raw returns.

Ernie

Dear Ernie,

Greetings for the Day!

While reading "Kalman Filter as Market-Making Model" in the book Algorithmic Trading, I saw that you mention making Ve a function of t as well:

"If we denote the trade size as T and the benchmark trade size as Tmax"...

My question is: what are T and Tmax (the benchmark trade size), especially when one is dealing with futures?

Hi JPS,

T or Tmax refers to trade size, which in futures markets refers to the number of contracts traded.

Ernie

Dear Ernie,

Thanks for the reply!

So we are directly incorporating the volume of contracts traded into the equation, but what should the benchmark trade size (Tmax) be?

Is it the maximum number of contracts traded for that contract historically?

Hi JPS,

As I wrote in my book, Tmax "... can be some fraction of the total trading volume of the previous day, for example, where the exact fraction is to be optimized with some training data."

Ernie

Dear Ernie,

Greetings For the Day!

I am trying to understand "Kalman Filter as Market-Making Model" as given in the book Algorithmic Trading. I have a few queries, as the model seems a bit different from the model described in the previous example (using the hedge ratio).

1. Are all the quantities in this model, such as P, R, and K, single-valued variables, as opposed to matrices in the Kalman filter for pairs and triplets?

2. The value of Q(t) = var(m(t)). Does this mean the variance of all the values of m up to time t?

3. Ve has been made dependent on time and volume, but Vw has not. With Ve now time-varying, does the role of Vw change, or is it still a fixed constant that needs to be tuned for optimum output?

Hi JPS,

1) Yes.

2) No. This is merely a definition of Q, which is updated at every step.

3) Vw is set to 0 here.

Ernie

Dear Ernie,

Thanks a lot for the reply!!!

I would greatly appreciate it if you could throw some light on how to calculate Q(t) in the single-series Kalman filter case.

Here we have only R available and no other independent variable (like x(t)) with which we could calculate Q(t) = x(t,:)*R*x(t,:)' + Ve.

Hi JPS,

All the required iterative equations are displayed in Eqs. 3.14-3.20. We have just used m(t) instead of x(t). The KF equations in Box 3.1 are valid irrespective of the dimension of the variables, which in this case is just 1.

Ernie

Dear Ernie,

So can we safely say that m(t) is the 1 x N vector (N being the total number of observations) of the mean values of the price series, where the mean at a particular time t can be calculated as (H+L)/2, with H and L the high and low prices of the observable variable at time t?

Hi JPS,

m(t) is indeed a 1xN matrix, but it has nothing to do with the high or low at any time. It is a quantity to be deduced from the Kalman update equations.

Ernie

Dear Ernie,

Thanks for the clarifications. Just one more doubt: what would the state covariance prediction be in this case? Since we have set Vw=0, R=P+Vw is not actually doing anything.

Hi JPS,

What you wrote is incorrect. Please look at Eq. 3.19.

Ernie

Dear Ernie,

Greetings for the day, and thanks for all the wonderful help along the way. I really appreciate it.

My doubt stems from the fact that Eq. (3.19) is the state variance update

R(t | t) = (1 - K(t)) R(t | t - 1), which is akin to the equation P = R*(1-K).

But what is the parallel to R = P + Vw, the state covariance prediction (Equation 3.8)?

Since R = 0 initially, R will not increment with each iteration if we have Vw = 0.

Hi JPS,

I noticed an error in one of my previous responses to you: m(t) is akin to beta(t) in Eq. 3.7, not to x(t). It is the hidden variable.

Indeed, the state variance prediction is just the identity relation here. That does not mean, however, that R(t) remains constant at every t: it is still updated through the state variance update step (after an observation).

Ernie
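
For readers following this exchange without the book at hand, here is a rough Python sketch of a scalar Kalman filter with Vw = 0, in which the hidden variable m(t) is observed directly with noise variance Ve(t). It follows the standard scalar Kalman recursions rather than reproducing the book's equations verbatim. The function name, the starting values `m0` and `r0`, and the example data are assumptions made for illustration, and a non-zero starting variance is assumed because with R = 0 and Vw = 0 the gain stays at zero.

```python
def kalman_mean_1d(y, ve, m0, r0):
    """Scalar Kalman filter for y(t) = m(t) + noise, noise variance ve[t],
    with state noise Vw = 0 (so the prediction step is the identity).
    m0, r0: assumed starting mean and starting state variance."""
    m, r = m0, r0
    means, variances = [], []
    for obs, v in zip(y, ve):
        # Prediction with Vw = 0: m(t|t-1) = m(t-1|t-1), R(t|t-1) = R(t-1|t-1)
        q = r + v                  # Q(t): forecast variance of the observation
        if q <= 0.0:
            raise ValueError("forecast variance must be positive")
        k = r / q                  # Kalman gain K(t)
        m = m + k * (obs - m)      # state update
        r = (1.0 - k) * r          # state variance update
        means.append(m)
        variances.append(r)
    return means, variances


# Hypothetical observations with constant measurement variance
means, variances = kalman_mean_1d([10.0, 12.0], [1.0, 1.0], m0=10.0, r0=1.0)
print(means, variances)
```

In this example R shrinks from 1 to 0.5 to 1/3 through the update step alone, even though the prediction step leaves it unchanged.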

Dear Ernie,

Greetings for the day!

While discussing the time series momentum strategies in Chapter 6 of the book "Algorithmic Trading", you mentioned that

"If look-back is greater than the holding period, we have to shift forward by the holding period to generate a new returns pair. If the holding period is greater than the look-back, we have to shift forward by the lookback period."

But in this piece of code (Finding Correlations between Returns of Different Time Frames) we are doing just the reverse. Is there any specific reason for that?

if (lookback >= holddays)

indepSet=[1:lookback:length(ret_lag)];

else

indepSet=[1:holddays:length(ret_lag)];

end

Hi JPS,

You are right - there is an error in the book. lookback and holddays should be reversed after [1:.

Thanks for pointing that out!

Ernie
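
As a cross-check on the correction above, the quoted rule amounts to stepping forward by the smaller of the two periods. A short Python sketch of that rule follows; the function name is hypothetical, and the indices are 0-based, unlike MATLAB's 1-based `indepSet`.

```python
def independent_indices(n_obs, lookback, holddays):
    """Indices of the returns pairs to keep, following the rule quoted from
    the book: step by holddays when lookback >= holddays, else by lookback."""
    step = holddays if lookback >= holddays else lookback
    return list(range(0, n_obs, step))


print(independent_indices(10, 5, 2))  # lookback longer: step by holddays
print(independent_indices(10, 2, 5))  # holding period longer: step by lookback
```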

Dear Ernie,

Greetings for the day!!!

Thanks for the reply to my earlier query. I have two small queries regarding the time series momentum strategies, Example 6 (Chapter 6) of the book "Algorithmic Trading".

1. You mention a slight modification to the strategy adopted by Moskowitz, Yao, and Pedersen (2012): "Instead of making a trading decision every month, we will make it every day, each day investing only one twenty-fifth of the total capital". What is meant by making the trading decision each day? If we are doing this on a daily basis, what is the significance and purpose of the holding period (25 days in this case)? I suppose that once we take a position we hold it for holddays days.

2. I am confused about how the returns of the strategy are calculated. The return is calculated as follows.

ret=(backshift(1, pos).*(cl-backshift(1, cl))./backshift(1, cl))/holddays

Let's say, for example, that the holding period is 3 days. Then the pos matrix can have the values (assuming at time 0 it has +1 due to a 1 in the long matrix) 3, 1 or -1 after the completion of the loop "for h=0:holddays-1". But when calculating the return we multiply the pos matrix by the return of only the previous day. What about the returns of the previous 2 days?

Hi JPS,

1) On each day, you decide whether to buy, short, or do nothing, based on the momentum entry rule. Once entered, you hold the position for 25 days.

2) Yes, if holddays=3, pos can be anywhere from -3 to +3. Hence the unlevered return that we compute using the formula you wrote must be divided by 3. That return captures all 3 positions, whether they are entered on the same day or not.

Ernie
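
To make the overlapping positions concrete, here is a small Python sketch of the bookkeeping discussed above. It is an illustrative reading of the strategy, not the book's MATLAB code: the entry rule (close above or below the close `lookback` days ago), the function name, and the price series are assumptions.

```python
def momentum_positions_and_returns(cl, lookback, holddays):
    """Each day's signal is held for `holddays` days, so the net position
    on day t is the sum of the last `holddays` signals (between -holddays
    and +holddays). Returns (pos, ret)."""
    n = len(cl)
    signal = [0] * n
    for t in range(lookback, n):
        if cl[t] > cl[t - lookback]:
            signal[t] = 1       # momentum up: buy one unit
        elif cl[t] < cl[t - lookback]:
            signal[t] = -1      # momentum down: short one unit
    # Net position: sum of the signals that are still being held
    pos = [sum(signal[t - h] for h in range(holddays) if t - h >= 0)
           for t in range(n)]
    # Yesterday's net position times today's return, scaled by 1/holddays
    # because each daily entry uses only that fraction of the capital
    ret = [0.0] + [pos[t - 1] * (cl[t] - cl[t - 1]) / cl[t - 1] / holddays
                   for t in range(1, n)]
    return pos, ret


cl = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0]   # hypothetical closes
pos, ret = momentum_positions_and_returns(cl, lookback=1, holddays=3)
print(pos, ret)
```

Once three entries overlap, the daily return equals the full one-day return of the instrument, because three one-third allocations are all long at the same time.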

Dear Ernie,

Greetings for the Day!!!

Thanks a lot for the wonderful help you give novices like us. Thanks for writing a wonderful book and for being a constant source of guidance.

Dear Ernie,

Greetings for the Day!

While reading the book Algorithmic Trading (for the second time :) ) I came across the section where you explain the calculation of the Hurst exponent (page 45) using the MATLAB function genhurst:

"This function computes a generalized version of the Hurst exponent defined by ⟨|z(t + τ) − z(t)|^(2q)⟩ ∼ τ^(2H(q)), where q is an arbitrary number. But here we are only interested in q = 2, which we specify as the second input parameter to genhurst."

But from the definition of the Hurst exponent used to identify a stationary series, we are interested in an exponent value of 2 in the above expression, by which logic the value of q should be 1.

Am I correct in my interpretation?

Hi JPS,

Yes, you are right - q should be 1. Thanks for pointing that error out!

Ernie

Dear Ernie,

Greetings for the Day!

How can one implement a stop loss for the time series momentum strategies for stock futures (as described in the book "Algorithmic Trading") alongside the exit/reversal condition defined by the holding period, even when the holding period is short (for example, 1 or 2 days)? This matters for small traders, for whom the protection of capital is of utmost importance.

Is there any quantifiable mechanism for that?

Hi JPS,

You can always put in a stop order once you have entered into a position.

However, since the futures market is closed over the weekend, stop order is of little use to prevent a gap up or down once the week begins.

Ernie

Hi Ernie,

Thanks so much for the blog. I've been trying to comprehend as much as possible, but I am not a math major so it is difficult. I'm pairs trading based on 20-day, 2-standard-deviation Bollinger Bands. Am I wrong to assume the half-life is shorter for 20-day Bollinger Bands than for 50- or 100-day Bollinger Bands? If so, which time frame is your half-life calculation based on? Thanks!

Hi Syk,

Half life of a pair is independent of what lookback you choose for your Bollinger Band. In fact, sometimes it is good to choose the lookback to be the same as the half life. Half life is calculated based on a linear regression fit between the one day change of the spread against the spread itself. See Example 7.5 of my book Quantitative Trading. It is an intrinsic property of the spread, and not something you can adjust.

Ernie
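
The regression described in the reply above can be sketched in plain Python as follows. This is a minimal illustration rather than the code from Example 7.5: the function name is hypothetical, the OLS slope is computed by hand to avoid dependencies, and the test series is a synthetic, noise-free decay.

```python
import math


def half_life(spread):
    """Regress the one-bar change of the spread on the lagged spread
    (with intercept) and return -ln(2)/slope."""
    z_lag = spread[:-1]
    dz = [b - a for a, b in zip(spread[:-1], spread[1:])]
    mean_x = sum(z_lag) / len(z_lag)
    mean_y = sum(dz) / len(dz)
    sxx = sum((x - mean_x) ** 2 for x in z_lag)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(z_lag, dz))
    if sxx == 0.0:
        raise ValueError("spread is constant; slope is undefined")
    slope = sxy / sxx
    if slope >= 0.0:
        return float("inf")     # no mean reversion detected
    return -math.log(2) / slope


# Synthetic spread that loses 10% of its deviation every bar
spread = [10.0 * 0.9 ** t for t in range(50)]
print(half_life(spread))
```

Nothing in the function depends on a Bollinger Band lookback, which is the point of the reply: the estimate is a property of the spread series and its sampling frequency only.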

Dear Ernie,

Greetings for the day!

In the book "Algorithmic Trading", Chapter *, while describing how to use the Kelly formula to find the optimal leverage, you give an inline function for calculating the compounded growth rate based on leverage f and return per bar R, as follows.

g=inline('sum(log(1+f*R))/length(R)', 'f', 'R');

Here R is nothing but "ret" (the one-column matrix of daily returns) that one has obtained in the backtest.

So what we are doing here is multiplying each entry of the "ret" matrix by f, taking the sum of the logs of one plus each such product, and dividing the sum by the length of the ret matrix.

But I am facing a problem calculating g, as the term log(1+f*R) is undefined when (1+f*R) comes out negative.

Am I interpreting this formula correctly?

Hi JPS,

If 1+f*R is negative, that means at some point your account equity is 0. Your leverage is too high for this strategy.

Ernie
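
A Python sketch of the same growth-rate calculation, with the ruin case made explicit, might look like this. Returning negative infinity when 1+f*R is zero or negative is a convention chosen for this sketch, not something prescribed by the book, and the return series is invented.

```python
import math


def growth_rate(f, returns):
    """Average log growth per bar at leverage f: mean(log(1 + f*R)).
    Returns -inf if any bar would take equity to zero or below."""
    total = 0.0
    for r in returns:
        equity_factor = 1.0 + f * r
        if equity_factor <= 0.0:
            return float("-inf")   # leverage too high: account wiped out
        total += math.log(equity_factor)
    return total / len(returns)


returns = [0.10, -0.25, 0.05]      # hypothetical per-bar returns
for f in (0.5, 1.0, 5.0):
    print(f, growth_rate(f, returns))
```

With this convention an optimizer searching over f simply never selects a leverage that would have ruined the account in the backtest.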

Dear Ernie,

Greetings for the Day!

My question stems from the chapter on optimizing Leverage in the book Algorithmic Trading.

Is there any specific reason why, when optimizing the growth rate (g) and the leverage (f), our focus is on maxDD and we do not consider maxDDD (the number of days in the drawdown) at all?

Hi JPS,

Most investors are more concerned about the % loss than the duration of a drawdown. Also, it is relatively easy to limit a drawdown % (just stop trading!), but there is no easy way to limit a drawdown duration, as the market determines that.

Ernie

Hi Ernie...

I would like to use the half-life as my time stop loss. Can you help me with the algorithm for that?

Hi Salvatory,

Sure, just exit whenever you have been in a position for more than two times halflife.

Ernie
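
One way to read the rule in the reply above, sketched in Python. The details are assumptions of this sketch: positions are +1, -1 or 0, time is counted in bars, and after a time stop the strategy stays flat until the raw signal itself changes (otherwise it would re-enter on the next bar).

```python
def apply_time_stop(signal, halflife, multiple=2.0):
    """signal[t]: raw desired position (+1, -1 or 0) from the entry/exit rule.
    Forces the position flat once a trade has been held for more than
    multiple * halflife bars."""
    max_bars = multiple * halflife
    out = []
    held = 0          # bars spent in the current trade
    stopped = False   # True after a time stop, until the raw signal changes
    prev = 0
    for s in signal:
        if s != prev:             # raw signal changed: new trade or flat
            held = 0
            stopped = False
        if s != 0 and not stopped:
            held += 1
            if held > max_bars:
                stopped = True
        out.append(0 if (s == 0 or stopped) else s)
        prev = s
    return out


raw = [1, 1, 1, 1, 1, 1, 0, -1, -1]       # hypothetical raw positions
print(apply_time_stop(raw, halflife=2))   # exits the long after 4 bars
```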

Dear Ernie!

I just read through the time series momentum strategy as described in your book. My question is that the strategy specifies a holding period but does not have any stop loss mechanism. What if the market goes against one's trade? Does one still wait out the holding period before exiting the position?

Hi JPS,

Sure, you should backtest a stop loss for that strategy.

Ernie

Hello Ernie,

I am not quite sure that OLS is a robust method for estimating theta in an OU process. One can see this very quickly with a simple sine function. Here we know exactly which half-life to expect: 1/4 of the period. If we simply play a bit with the frequency parameter, we will see that ordinary least squares does not provide the right results in all cases. OK, a pure sine will trace out a circle in the dz-versus-zprev space, which is difficult for ordinary linear regression. But adding heavy random variance to the sine function does not give accurate results either.

What do you think about that?

A linear regression fit only makes sense if the data you are fitting are actually randomly distributed around a straight line. In many cases where they are not, the linear regression will output a nonsensical answer. The most famous example of that is Anscombe's quartet.

Hence this calculation of half life already assumes that your data follow the OU process. If they don't, it is a nonsensical calculation.

To test whether your data fit the OU process, you can use the ADF test.

Ernie

Hi Ernie,

Thanks for your answer. I mean a stationary mean-reverting stochastic process. We can hypothetically assume that the sine function describes our spread (or z-score). As you wrote, in special cases (nicely deterministic behaviour), OLS will fail. But even when we add randomness to this function, we do not reliably get the expected half-life.

Since half-life is an important parameter for the exit strategy, I was wondering if there is any better method to find it. Maybe one can try to fit the PDF?

Thanks

A.

Hi A.,

Adding Gaussian noise to a sine wave does not necessarily turn it into an OU process. There is no point in fitting that to the SDE that describes an OU process, as it won't fit. You will have to derive the SDE for such a process from scratch, and then derive the solution and find out whether an exponential decay describes the time dependence of its expected value.

I am not that skillful at solving SDEs analytically, but if you are, I would be interested in learning about the solution to the sine wave + noise!

Ernie

Hi Ernie, you are absolutely right. A sine can be one possible path, but it is not the whole stochastic process.

Thanks

A.

Hi Ernie,

First of all, thanks for your great blog and books. I have learned a lot about algorithmic trading from them. Actually, I rewrote most of the strategy examples from your second book in R (as I use R) to understand them better.

Now I have a question about coding stop-losses. I haven't found stop-losses in your MATLAB code, though you mentioned them. Do you know if there is a way to code them in a vectorized manner, as in most of your code? If possible, I'd like to see how you code them in MATLAB.

thank you

Pavel

Hi Pavel,

Thank you for your kind words.

To compute a stop loss, you can compute the cumulative return of a position and send an exit signal when the cumulative return drops below the stop level. This can be done in a vectorized way: just reset cumret to 0 every time the position is 0, and set the position to 0 every time cumret < stoplevel.

cumret=backshift(1, position).*(price-backshift(1, price))./backshift(1, price);

position(cumret < stoplevel)=0;

Hope this helps.

Ernie
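
For readers who want to see the bookkeeping spelled out, here is a loop-based Python sketch of the rule described above. It is not a vectorized implementation and not a translation of the two MATLAB lines: it compounds the return of each trade, and it assumes that a stopped-out trade stays flat until the raw position changes. The function name and data are hypothetical.

```python
def apply_stop_loss(position, price, stop_level):
    """position[t]: raw desired position (+1, -1 or 0) held from bar t to t+1.
    stop_level: negative threshold on a trade's cumulative return, e.g. -0.05.
    Returns the positions with stopped-out trades set to 0."""
    out = []
    cumret = 0.0      # cumulative return of the current trade
    stopped = False   # True once the current trade has been stopped out
    for t in range(len(price)):
        if t > 0 and out[t - 1] != 0:
            # Return earned over the last bar by the position actually held
            daily = out[t - 1] * (price[t] - price[t - 1]) / price[t - 1]
            cumret = (1.0 + cumret) * (1.0 + daily) - 1.0
            if cumret < stop_level:
                stopped = True
        if t == 0 or position[t] != position[t - 1]:
            cumret, stopped = 0.0, False   # raw signal changed: fresh trade
        held = 0 if stopped else position[t]
        if held == 0:
            cumret = 0.0                   # reset cumret whenever flat
        out.append(held)
    return out


price = [100.0, 98.0, 93.0, 95.0, 96.0]    # hypothetical closes
position = [1, 1, 1, 1, -1]                # hypothetical raw positions
print(apply_stop_loss(position, price, stop_level=-0.05))
```

In this example the long is down 7% by the third bar, so it is closed there and stays closed until the raw signal flips short on the last bar.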

Hi,

Yes, this helps. Thank you.

Pavel

I trade the wheat - corn spread quite actively for mean reversion. This one is strongly cointegrated (>95% Johansen). It has a half-life of about 50 days.

What puzzles me: when I back-test a strategy, the fast-moving strategies (with a short moving average and low standard deviation) perform by far the best. For example, a 10 day moving average with a 1 standard deviation barrier. These trade a lot, mostly every week. The slower-moving strategies (for example, 40-50 days) perform a lot worse and often don't have a positive Sharpe ratio.

I don't understand how this discrepancy between half life (which seems to favour slow-moving trading) and actual/backtested trading (which is fast) can be reconciled.

Any views are greatly appreciated!

Hi Simon,

This may have something to do with the sampling frequency.

If you compute the halflife using daily bars, you may get 50 days.

If you compute it using hourly bars, you may get a shorter halflife.

Mean reversion may happen at different time scales - something not captured by the usual OU process, but is quite apparent when you plot the Hurst exponent as a function of sampling frequency. See my blog post on the term structure of volatility: http://epchan.blogspot.com/2016/04/mean-reversion-momentum-and-volatility.html

Ernie

Thanks Ernie,

I forgot to specify this: I use daily (closing) price data for both, that is, for calculating the half-life and for my strategy, where the trading signal is only generated once per day. I get your point, but I don't think it can explain the discrepancy in time horizons here...

Simon,

Halflife just gives a ball park figure for mean reversion.

There is no reason it has to be the optimal lookback or holding period for your specific mean reversion strategy.

Ernie

I know, Ernie, that they are two different concepts... still, you have to admit that there is some, equally ballpark, relationship between the two. One would expect a strategy on a market with a half-life of 200 to trade less frequently than one on a market with a half-life of 20.

Simon,

I agree that ordering should be preserved.

Ernie

Hi Ernie

I am looking at calculating the half-life of mean reversion. I feel I have the steps correct, but my question is: how long a lookback do you use for the calculation? If I use SPY, for example, over the entire sample from inception to present, the half-life is 939.657. Do you typically shorten the lookback used for the half-life calculation?

This is my procedure:

1. Lag the closing price by one day.

2. Subtract yesterday's (lagged) close from today's close.

3. Subtract the mean of the lagged close from yesterday's close.

4. Perform a linear regression of (today's close - yesterday's close) on (yesterday's close - mean of the lagged close).

5. -log(2)/coef(result)[2]

Hi Andrew,

It doesn't surprise me that the halflife for SPY is 939 days. Nobody says SPY is mean reverting on a daily basis!

You should use at least 100 data points to compute halflife - i.e. 100 days. Otherwise, it won't have much statistical significance.

Ernie

Hi Ernie

Thanks for the swift response. I took the last 100 days and the half life of mean reversion was 11.24267 days for the SPY.

This is my R code below:

Note: as I am using vectors, I have made length adjustments when shifting to create lags so that all vectors stay the same length:

# Calculate y(t-1) and the one-day difference

y.lag <- c(random.data[2:length(random.data)], 0) # Set vector to lag -1 day

y.lag <- y.lag[1:(length(y.lag) - 1)] # As shifted vector by -1, remove anomalous element at end of vector

random.data <- random.data[1:(length(random.data) - 1)] # Trim data by one element to make vectors the same length

y.diff <- random.data - y.lag # Subtract today's close - close from yesterday

y.diff <- y.diff[1:(length(y.diff) - 1)] # Adjust length of vector

prev.y.mean <- y.lag - mean(y.lag) # Subtract the mean of the lagged closes from yesterday's close

prev.y.mean <- prev.y.mean[1:(length(prev.y.mean) - 1)] # Adjust length of vector

final.df <- data.frame(y.diff = y.diff, prev.y.mean = prev.y.mean) # Pair the vectors row by row (merge() on two vectors would cross-join them)

# Linear Regression With Intercept

result <- lm(y.diff ~ prev.y.mean, data = final.df)

half_life <- -log(2) / coef(result)[2]

half_life

# Linear Regression With No Intercept

result <- lm(y.diff ~ prev.y.mean + 0, data = final.df)

half_life1 <- -log(2) / coef(result)[1]

half_life1

Hi Ernie,

I'm studying Kalman Filter from your book. When trying to find appropriate delta for determining V_w for new pair, I learned that MLE and Bayesian inference are possible solutions. However, they suffered from local maximum and other restrictions, then particle filter came out as a better solution. Unfortunately, MCMC and particle filter are too hard for me to learn. So I'm seeking some help.

1. Are those techniques necessary for implementing Kalman Filter for profit?

2. Do we have simpler ways to find delta? ALS mentioned in your book?

3. Some webpages said Kalman Filter has parameters too sensitive which prevents it from being traded in reality. What's your opinion?

Hi DC911,

1) I don't believe the MLE solution for Kalman parameters suffers from too many local minima. MLE should be quite sufficient, as shown in my examples in Chapter 3 of my third book, "Machine Trading".

2) Again, the ssm (state-space model) functionality in MATLAB's Econometrics Toolbox has a pre-packaged solution for estimating a KF, as shown in my third book.

3) That really depends on which pairs you are applying KF to. I have known traders who deployed this technique successfully.

Ernie

Thank you for the response. I'll experiment, learn more about using MLE in KF, and study your third book. By the way, your previous books have been a really big help, and being successful in trading is painfully hard. ;)

David

Hi DC911,

I agree - it is hard and getting harder! (Not just my opinion, but D.E.Shaw's too.)

Ernie
