$$F = C^{-1} M \tag{1}$$

where F is an N×1 vector indicating the fraction of the equity to be allocated to each asset, C is the covariance matrix, and M is the mean vector for the excess returns of these assets. Note that these "assets" can in fact be "trading strategies" or "portfolios" themselves. If these are in fact real assets that incur a carry (financing) cost, then excess returns are returns minus the risk-free rate.

Notice that these fractions, or weights as they are usually called, are not normalized - they don't necessarily add up to 1. This means that F not only determines the allocation of the total equity among N assets, but it also determines the overall optimal leverage to be used. The sum of the absolute values of the components of F (that is, the total gross market value divided by the total equity) is in fact the overall leverage. Such is the beauty of the Kelly formula: optimal allocation and optimal leverage in one simple formula, which is supposed to maximize the compounded growth rate of one's equity (or equivalently the equity at the end of many periods).
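As a minimal sketch of formula (1), the snippet below computes F and the implied leverage with NumPy. The three-asset mean vector and covariance matrix are made-up illustrative numbers, not data from this post.

```python
import numpy as np

# Hypothetical per-period mean excess returns and covariance matrix
# for three assets (illustrative numbers only).
M = np.array([0.0005, 0.0003, 0.0004])
C = np.array([
    [1.0e-4, 2.0e-5, 1.0e-5],
    [2.0e-5, 2.0e-4, 3.0e-5],
    [1.0e-5, 3.0e-5, 1.5e-4],
])

# Kelly allocation F = C^{-1} M; solve() avoids forming the inverse explicitly.
F = np.linalg.solve(C, M)

# Overall leverage: sum of absolute fractions of equity.
leverage = np.abs(F).sum()

# Normalized allocation (sums to 1), i.e. the same portfolio at leverage 1
# when all weights are positive.
weights = F / F.sum()

print("F        =", F)
print("leverage =", leverage)
print("weights  =", weights)
```

The unnormalized F carries both pieces of information: its direction is the allocation, and its size is the leverage.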

However, most students of finance are not taught Kelly portfolio optimization. They are taught Markowitz mean-variance portfolio optimization. In particular, they are taught that there is a portfolio called the *tangency portfolio* which lies on the efficient frontier (the set of portfolios with minimum variance consistent with a certain expected return) and which maximizes the Sharpe ratio. Left unsaid are

- What's so good about this tangency portfolio?
- What's the real benefit of maximizing the Sharpe ratio?
- Is this tangency portfolio the same as the one recommended by Kelly optimal allocation?

I want to answer these questions here, and provide a connection between Kelly and Markowitz portfolio optimization.

According to Kelly and Ed Thorp (and explained in my book), F above not only maximizes the compounded growth rate, but it also maximizes the Sharpe ratio. Put another way: the maximum growth rate is achieved when the Sharpe ratio is maximized. Hence we see why the tangency portfolio is so important. And in fact, **the tangency portfolio is the same as the Kelly optimal portfolio F**, except for the fact that the tangency portfolio is assumed to be normalized and has a leverage of 1, whereas F goes one step further and determines the optimal leverage for us. Otherwise, the percent allocation of an asset in both is the same (assuming that we haven't imposed additional constraints in the optimization problem). How do we prove this?

The usual way Markowitz portfolio optimization is taught is by setting up a constrained *quadratic* optimization problem - quadratic because we want to optimize the portfolio variance, which is a quadratic function of the weights of the underlying assets - and proceeding to use a numerical quadratic programming (QP) program to solve this, and then further maximizing the Sharpe ratio to find the tangency portfolio. But this is unnecessarily tedious and actually obscures the elegant formula for F shown above. Instead, we can proceed by applying Lagrange multipliers to the following optimization problem (see http://faculty.washington.edu/ezivot/econ424/portfolioTheoryMatrix.pdf for a similar treatment):

Maximize the Sharpe ratio

$$\frac{F^{T} M}{(F^{T} C F)^{1/2}} \tag{2}$$

subject to the constraint

$$F^{T} \mathbf{1} = 1 \tag{3}$$

(To emphasize that the $\mathbf{1}$ on the left-hand side is a column vector of ones, I used boldface.)

So we should maximize the following unconstrained quantity with respect to the weights $F_i$ of each asset $i$ and the Lagrange multiplier $\lambda$:

$$\frac{F^{T} M}{(F^{T} C F)^{1/2}} - \lambda (F^{T} \mathbf{1} - 1) \tag{4}$$

But taking the partial derivatives of this fraction with a square root in the denominator is unwieldy. So equivalently, we can maximize the logarithm of the Sharpe ratio subject to the same constraint. Thus we can take the partial derivatives of

$$\log(F^{T} M) - \tfrac{1}{2} \log(F^{T} C F) - \lambda (F^{T} \mathbf{1} - 1) \tag{5}$$

with respect to $F_i$. Setting each component i to zero gives the matrix equation

$$\frac{1}{F^{T} M} M - \frac{1}{F^{T} C F} C F = \lambda \mathbf{1} \tag{6}$$

Multiplying the whole equation by $F^{T}$ on the left gives

$$\frac{1}{F^{T} M} F^{T} M - \frac{1}{F^{T} C F} F^{T} C F = \lambda F^{T} \mathbf{1} \tag{7}$$

Remembering the constraint, we recognize the right hand side as just λ. The left hand side comes out to be exactly zero, which means that λ is zero. A Lagrange multiplier that turns out to be zero means that the constraint won't affect the solution of the optimization problem up to a proportionality constant. This is satisfying since we know that if we apply an equal leverage on all the assets, the maximum Sharpe ratio should be unaffected. So we are left with the matrix equation for the solution of the optimal F:

$$C F = \frac{F^{T} C F}{F^{T} M} M \tag{8}$$

If you know how to solve this for F using matrix algebra, I would like to hear from you. But let's try an *ansatz* $F = C^{-1} M$ as in (1). The left-hand side of (8) becomes $M$. On the right-hand side, $F^{T} C F = M^{T} C^{-1} M = F^{T} M$ because $C$ is symmetric, so it becomes $(F^{T} M / F^{T} M) M = M$ as well. So the ansatz works, and the solution is in fact (1), up to a proportionality constant. To satisfy the normalization constraint (3), we can write

$$F = \frac{C^{-1} M}{\mathbf{1}^{T} C^{-1} M} \tag{9}$$

So there, the tangency portfolio is the same as the Kelly optimal portfolio, up to a normalization constant, and without telling us what the optimal leverage is.
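The conclusion can be checked numerically. The sketch below, using made-up inputs and NumPy only, compares the Sharpe ratio of the normalized Kelly portfolio (9) with that of many random portfolios whose weights sum to 1, and confirms that scaling F does not change the Sharpe ratio. It is a sanity check under these assumed inputs, not a proof.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs (illustrative numbers only).
M = np.array([0.0005, 0.0003, 0.0004])
C = np.array([
    [1.0e-4, 2.0e-5, 1.0e-5],
    [2.0e-5, 2.0e-4, 3.0e-5],
    [1.0e-5, 3.0e-5, 1.5e-4],
])

def sharpe(F):
    """Sharpe ratio of a portfolio with weights F, as in equation (2)."""
    return (F @ M) / np.sqrt(F @ C @ F)

ones = np.ones(len(M))
kelly = np.linalg.solve(C, M)          # equation (1)
tangency = kelly / (ones @ kelly)      # equation (9), weights sum to 1

# Random portfolios rescaled so that their weights also sum to 1.
candidates = rng.normal(size=(10_000, len(M)))
candidates /= candidates.sum(axis=1, keepdims=True)
best_random = max(sharpe(F) for F in candidates)

print("Sharpe of Kelly F      :", sharpe(kelly))
print("Sharpe of tangency     :", sharpe(tangency))
print("Best random portfolio  :", best_random)
print("sqrt(M' C^-1 M)        :", np.sqrt(M @ kelly))
```

No random candidate should exceed the tangency portfolio's Sharpe ratio, which equals $\sqrt{M^{T} C^{-1} M}$.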

===

**Workshop Update:**

**Based on popular demand, I have revised the dates for my online Mean Reversion Strategies workshop to be August 27-29.**

===

## 75 comments:

This is a true PHD level post:）

-HK

I never thought about Kelly in these terms; I should have done it before. Really elegant solution to the problem.

http://nightlypatterns.wordpress.com

Thank you for this post. I too have always wondered why Markowitz is taught but not Kelly portfolio optimization.

Hi Ernie,

Great post.

I have a question about sharpe ratio. Let's say I have an intraday fx strategy and I'm using hourly data. fx trading is 24 hours but my strategy is only active 8 hours a day. Is the correct way to calculate sharpe:

sqrt(252*8) * mean(ret)/std(ret)

or

sqrt(252*24) * mean(ret)/std(ret)

where

ret = hourly return vector (24 hours). Data is actually from 5pm EST Sundays until 4pm EST Fridays

Many thanks

You should use sqrt(252*8)*mean(ret)/std(ret),

where the ret has only 8 hourly bars. Otherwise the zeros in the other 16 hours will distort the true picture.

Ernie
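A minimal sketch of the recommended calculation follows. The function name `annualized_sharpe` and the synthetic return series are assumptions for illustration; the standard deviation uses the N-1 normalization (`ddof=1`), which is what MATLAB's `std` does by default.

```python
import numpy as np

def annualized_sharpe(ret, periods_per_year):
    """sqrt(periods_per_year) * mean(ret) / std(ret), with sample std (ddof=1)."""
    ret = np.asarray(ret, dtype=float)
    return np.sqrt(periods_per_year) * ret.mean() / ret.std(ddof=1)

# Synthetic hourly returns for the 8 active hours of each of 252 trading days
# (made-up data; a real series would come from the strategy's backtest).
rng = np.random.default_rng(1)
ret = rng.normal(loc=0.0001, scale=0.001, size=252 * 8)

# Only the active bars are included, so the annualization factor is 252 * 8.
sharpe = annualized_sharpe(ret, periods_per_year=252 * 8)
print("annualized Sharpe:", sharpe)
```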

Hi Ernie,

Is it possible to trade intraday US stocks pairs?

Could we get better sharpe ratio?

How long do we need to do backtesting for intraday strategies?

Thanks.

Sure, intraday stock pairs may improve Sharpe ratio as it may have more trades than interday pairs.

Even with intraday pairs, we need to backtest whether the strategy held up during extreme events, so it is best to start with 2007.

Ernie

Hi Ernie,

For pairs trading, how do we size every trade to have a potential equal dollar impact on our portfolio?

Hi Ernie,

I find there are 1 second, 5 secs, 15 secs, 30 secs, 1 min , 5 mins bars for intraday.

Do we need to test all of them?

Thanks.

You can set each side of a pair trade to equal dollar value, or you can use a hedge ratio derived from linear regression or Johansen test to set the number of shares. The returns and risks differ for each approach.

Ernie
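The two sizing approaches mentioned in this reply can be sketched as follows. The price series are synthetic, and the capital and share amounts are hypothetical; the regression here is a plain least-squares fit of one price on the other (the Johansen test is not shown).

```python
import numpy as np

# Synthetic prices for two related stocks (made-up data): y is roughly 1.5 * x + 10.
rng = np.random.default_rng(2)
x = 50.0 + np.cumsum(rng.normal(0.0, 0.5, size=500))
y = 1.5 * x + 10.0 + rng.normal(0.0, 1.0, size=500)

# Approach 1: equal dollar value on each side (long y, short x).
capital_per_leg = 10_000.0                 # hypothetical amount
shares_y_equal = capital_per_leg / y[-1]
shares_x_equal = -capital_per_leg / x[-1]

# Approach 2: hedge ratio from a linear regression of y on x.
# np.polyfit with degree 1 returns [slope, intercept].
hedge_ratio, intercept = np.polyfit(x, y, 1)
shares_y_hedged = 100.0                    # hypothetical position in y
shares_x_hedged = -hedge_ratio * shares_y_hedged

print("equal-dollar shares :", shares_y_equal, shares_x_equal)
print("hedge ratio         :", hedge_ratio)
print("regression shares   :", shares_y_hedged, shares_x_hedged)
```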

Ideally, you can backtest at the highest frequency. If you desire longer holding period, there are many ways to accomplish that without artificially limiting yourself to lower frequency data. There is nothing magical about 5-min bars vs. 1-min bar. The latter is a superset of the former.

Ernie

Hi Ernie,

Thank you for response.

I mean, to size trades from different stocks pairs in our portfolio.

It seems some pairs have higher returns while some have lower returns. Do we need to balance them to smooth equity curve?

It seems this is more important for trend following strategies.

Allocating capital to different pairs is similar to allocating capital to different stocks. So the topic of my current article is relevant here. Other practitioners prefer allocating capital inversely proportional to volatility, resulting in a minimum variance portfolio. All these different approaches are discussed comprehensively in Prof. Ang's book, at the top of my Recommended Books list on this blog.

Ernie

Hi Ernie,

So we can just use Kelly formula to allocate capital to different stock pairs in our portfolio?

Yes.

Ernie

Hi Ernie,

Could we trade strategies on IB TWS if the holding period is 2 mins?

Hi Ernie,

Would you use options in some strategies? It seems like a lot of hedge funds use options with stocks/futures for statistical arbitrage.

-HK

Yes, IB's latency is short enough for you to trade at 2-min bars.

Ernie

Hi HK,

Yes, I have considered using options to implement statarb strategies. (See an earlier blog article of mine on this topic.)

However, I generally find that the bid-ask spread is too large for my strategies.

Ernie

Hi Ernie,

To compute returns in Kelly formula,

usually, how long is "one-period"?

Could we set risk-free rate as zero?

Or where could we get that number?

We typically take one period to be one trading day, but of course it can be one minute, one hour, or one week depending on your strategy.

Risk-free rate is zero if your portfolio or strategy is self-financing. Otherwise, you need to look up the 3-month Treasury rate on the Federal Reserve's website (for US investors).

Ernie

Hi Ernie,

How long is the lookback window to compute expected returns and variance in Kelly?

Pairs trading is self-financing if I hold dollar neutral positions?

Minimum of 3 years. Ideally, the lookback will include periods of market stress, just as in a backtest.

Pairs trading is self-financing.

Ernie

Hi Ernie,

If "one-period" is one day,

we need to calculate leverage, F* every day using moving windows, 3 years? Drop one return, add one every day?

Btw, IB only has one year of BID, ASK one-minute bars. Where could we get longer historical BID, ASK bars?

Thanks.

Yes.

Follow the link called "High Frequency Historical Data" in the Links section of the right side bar of my blog. That is the cheapest source. (See also the Tech Update section of my article Short Interest as a Factor.)

Ernie
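The daily recalculation confirmed above (drop the oldest return, add the newest) could look like the sketch below. The function name `rolling_kelly`, the 756-day lookback (roughly 3 years of trading days) and the synthetic returns are all assumptions for illustration.

```python
import numpy as np

def rolling_kelly(excess_returns, lookback):
    """Return one Kelly vector F = C^{-1} M per day, each estimated from the
    trailing `lookback` rows of `excess_returns` (rows = days, columns = assets)."""
    R = np.asarray(excess_returns, dtype=float)
    allocations = []
    for end in range(lookback, R.shape[0] + 1):
        window = R[end - lookback:end]        # slides forward one day at a time
        M = window.mean(axis=0)               # mean excess returns in the window
        C = np.cov(window, rowvar=False)      # sample covariance, columns = assets
        allocations.append(np.linalg.solve(C, M))
    return np.array(allocations)

# Synthetic daily excess returns for two strategies (made-up data).
rng = np.random.default_rng(3)
returns = rng.normal(loc=0.0004, scale=0.01, size=(1000, 2))

F_path = rolling_kelly(returns, lookback=756)
print("number of daily estimates:", F_path.shape[0])
print("latest F:", F_path[-1])
```

Each row of `F_path` uses data only up to and including that day, so it would be applied to the following day's positions.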

Hi Ernie,

Thanks for the reply. There are many inactive ETF in Hong Kong market which let me trade worldwide index. For example, there is Brazil index ETF. Most of the time the bid-ask spread is still reasonable with one official unit different, sometimes it would be two official unit different. I guess the ETF management companies would not take advantage in the price since they suppose to earn from management fee.

So is there any disadvantage to trade inactive ETF?

-HK

Hi HK,

Besides the bid-ask spread, one should pay attention to the bid-ask sizes as well. Are they big enough to support your proposed order? If so, there is no reason not to trade these ETFs.

Ernie

Hi Ernie,

I just find that we can buy Quote Booster packs to get 4 years historical data in IB. So we may get 4 years 1 min BID/ASK bars in IB.

Have you heard about that?

Good to know that - thanks!

Ernie

Hi Ernie,

If we trade intraday stock pairs,

we need to read companies news every day?

Yes, if you want to avoid pairs that dis-cointegrate.

Ernie

Hi Ernie,

Stocks Pairs trading is a little bit like gap trading.

I mean, every morning, we have stocks gap up, gap down, which generate buy/sell signals for stock pairs.

Dear Ernie,

What do you think about LPPL model for bubble burst forecasting? Is there any easier way to understand it and implement it with coding?

-HK

Hi HK,

I have not studied the LPPL model, and it has a low priority for me because we tend to trade market-neutral models, so bubbles in asset prices are not of major concern to us.

Ernie

Hi Ernie,

For stocks sectors and industries categories, there are some different lists. I am a little bit confused. Would you please recommend one?

There are many more industry groups than sectors. Industry group is a fine-grained categorization, while sector is more coarse-grained.

Which one to use depends on your strategy.

Ernie

Hi Ernie,

Thank you for quick response.

Would you please recommend websites or documents which provide US stocks lists for different industry groups and sectors?

I find that, to some extent, they are all different. For example,

sectors and groups in the IB TWS filter are different from those in Yahoo Finance.

Ideally, in the same groups or sectors, companies have similar business activities.

Thanks.

Yahoo Finance has list of stocks in various industries or sectors. E.g. http://biz.yahoo.com/p/821conameu.html

or http://biz.yahoo.com/p/515conameu.html

Ernie

Hi Ernie,

For a stock pair, I got two sets of statistic. The first has Sharpe ratio 3.36, 33% return, 147 trades a year(69% wins), max drawdown 3.95%.

The second has Sharpe ratio 3.45,20% return, 33 trades a year(79% wins),

max drawdown 1.12%.

Which one would you pick to trade in real-time?

Thanks

For a mean reverting strategy, I am concerned about tail risk, so I like to look at Calmar ratio as well.

The first one has a Calmar ratio of 8.4, the second 17.9 (20/1.12). So the second one has the higher Sharpe and Calmar ratios, and I prefer that.

Ernie
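Taking the Calmar ratio as annual return divided by maximum drawdown, which is the definition the first figure in this reply is consistent with, the comparison can be sketched as follows. The helper name `calmar_ratio` is hypothetical.

```python
def calmar_ratio(annual_return_pct, max_drawdown_pct):
    """Annual return divided by maximum drawdown (both in percent)."""
    return annual_return_pct / max_drawdown_pct

# Statistics quoted in the question above.
first = calmar_ratio(33.0, 3.95)    # 33% return, 3.95% max drawdown
second = calmar_ratio(20.0, 1.12)   # 20% return, 1.12% max drawdown

print("first :", round(first, 1))
print("second:", round(second, 1))
print("prefer second:", second > first)
```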

Hi Ernie,

I just read your managed accounts.

You only have one big drawdown.

May I ask what happened in the market on Sep. 2011 causing that drawdown?

Do you still trade ETFs pairs or stock pairs? Or you focus on fx.

In Sept 2011, there was a strange day for the Mexican Peso, which moved more than 2.5% in a few hours. Their central bank since then has decided to intervene in the markets to keep peso from changing more than a certain band daily and therefore this tail risk was eliminated.

We still trade long-short stock portfolios. Not trading ETFs at the moment, but may start again soon.

Ernie

Hi Ernie,

What is the difference between

long-short stock portfolios and stock pairs trading?

I find some people trade stock pairs in the opposite way (not mean-reverting, but directional). Do you have any comments about that?

Thanks.

Long-short stock portfolios involve many long and short stocks, not just a pair. We bet on "cross-sectional" mean reversion or momentum.

An example where we can trade a pair using a momentum strategy is merger arbitrage.

Ernie

Hi Ernie,

Is it ok to trade 40 stock pairs at the same time if we find they are profitable in backtesting?

Why not?

Ernie

Great post Ernie! Just what I was looking for.

Many thanks, Tom

Hi Ernie, I don't want to waste too much of your time but if you have a minute, would you care giving a little more detail of the step where you go from the derivative of the log() to the next expression. You mention that you set $F_i = 0$, wouldn't that set the whole expression to zero? I would be grateful for some more explanation. Thanks so much.

Hi Tom,

When I wrote "...we can take the partial derivatives of

$\log(F^{T} M) - \tfrac{1}{2} \log(F^{T} C F) - \lambda (F^{T} \mathbf{1} - 1)$ (5)

with respect to $F_i$. Setting each component i to zero ..." I did not mean setting $F_i$ to zero. I meant setting the expression that results from taking the partial derivative w.r.t. $F_i$ to zero.

Hope this helps.

Ernie

Doh! Of course, now I get it. Thank you so much for your help. Really good work!

Tom

Hi Ernie,

I am reading through your 2nd book. Like your style, code and details. You have provided nice examples with code. However, I was sad to see no code for example 8.1, though it was implicitly referred in another part of the book. I wanted to see your implementation.

Regards

Anon

Example 8.1 does not require code. I have already displayed the entire arithmetic calculation on what amount of stock to sell under constant leverage. The arithmetic calculation can be done on a simple calculator or by hand.

Ernie

I meant something on p164 for, "The APR of trading xxx is 15 percent with a Sharpe ratio of 1.8 from October 12, 2011, to October 25, 2012"

Regards,

Anon

The strategy described on p. 164 is very simple to backtest - hence no code was provided. You can do this on Excel.

Ernie

Hi Ernie

What happens if there is a positive value constraint for portfolio allocation?

Hi Yifan,

Generally speaking, imposing inequality constraints makes the optimization problem insolvable analytically. So you would have to resort to numerical solutions. See http://en.wikipedia.org/wiki/Karush%E2%80%93Kuhn%E2%80%93Tucker_conditions

Ernie

After reading some theory (e.g. Estrada, 2010, "Geometric mean optimisation"), it seems to me that the Kelly criterion does not lead to sharpe maximisation, but to growth maximisation, and that the two generally lead to different outcomes. I.e. if you use CAPM, you maximise sharpe, if you use Kelly, you maximise the expected geometric return.

Hi manuka,

The formula for the maximum growth rate is g=r+S^2/2, where r is the risk free rate and S is the Sharpe ratio (see http://www.edwardothorp.com/sitebuildercontent/sitebuilderfiles/KellyCriterion2007.pdf). So maximum growth rate coincides with maximum Sharpe ratio.

Ernie

Ernie,

Thanks for your excellent comment, and sorry for my very tardy reply - I was travelling and could not access the blog. Having read the paper you refer to, you are absolutely right. Intuition failed me: I thought if Kelly tells us how much to allocate to each investment, then surely we should just look to maximise Kelly, i.e. we could use Kelly both for sizing and for choosing between investments...no doubt this is wrong. In fact, playing around with some simple examples shows that a higher Sharpe ratio does not always lead to a higher Kelly: for example suppose the risk free rate is 0, and we have two strategies A and B. mu A = 0.3, std A=0.15; mu B=0.2, std B=0.11. Here strategy A has a higher Sharpe, but Kelly guides us to invest a higher portion into B. And this is the path that leads to maximised expected geometric growth. Another very interesting thing about the link you referred me to is that Thorp's derivation shows a reason for maximising the Sharpe ratio (i.e. maximum geometric growth of wealth) which has nothing to do with minimum variance portfolio theory...i.e. he derives the capital asset pricing model key result with a whole different set of assumptions. Thanks for the help.

Hi manuka,

I think you may be a little confused. We can indeed use Kelly to choose the optimal leverage, and to optimally allocate investments (see my book Quantitative Trading's chapter 6 for examples.)

The maximum growth rate formula I quoted before only works when we are levered at exactly the Kelly leverage. Otherwise maximizing Sharpe ratio does not in general lead to maximum growth. You can definitely simultaneously optimize leverage and the Sharpe ratio: Kelly ratio of a portfolio will tell you how much you need to leverage, it is independent of the internal asset allocations. But if you compute the Kelly ratio of each individual asset taking into account their covariances of returns, then each individual Kelly ratio will tell you the asset allocation as well as the individual (and portfolio level) leverage to use.

Ernie

Ernie,

Many Thanks again for your insightful reply. I did the derivation for half Kelly and found that it implies a geometric (per period) growth of r+3/8*S^2, i.e. a bit lower than the one for the full Kelly, but still implying that the growth is maximised when sharpe ratio is maximised. Hence my current understanding is that in order to maximise the geometric growth: 1. Always pick investment strategies with the highest sharpe ratios 2. Allocate between them using Kelly (or fractional Kelly), 3. Leverage according to Kelly (relatively). I think this is in line with what you are saying in both your books, but part of my confusion was the wrong view that I can forget about sharpe ratios altogether and just pick those strategies that show the highest Kelly. Thanks again, and I hope I got it right now.

Hi Manuka,

Yes, I agree with your latest statements.

Ernie
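The numbers in this exchange can be reproduced with the single-strategy growth formula $g(f) = r + f m - f^{2} s^{2}/2$, where $f$ is the leverage, $m$ the mean excess return and $s$ its standard deviation. This is the standard continuous-time approximation behind $g = r + S^{2}/2$; the function names below are hypothetical.

```python
def kelly_leverage(m, s):
    """Optimal leverage f* = m / s^2 for a single strategy."""
    return m / s ** 2

def growth_rate(f, m, s, r=0.0):
    """Compounded growth rate at leverage f: g = r + f*m - (f*s)^2 / 2."""
    return r + f * m - 0.5 * (f * s) ** 2

# Strategies A and B from the comment above, with risk-free rate 0.
m_a, s_a = 0.3, 0.15
m_b, s_b = 0.2, 0.11

sharpe_a, sharpe_b = m_a / s_a, m_b / s_b
f_a, f_b = kelly_leverage(m_a, s_a), kelly_leverage(m_b, s_b)

g_full_a = growth_rate(f_a, m_a, s_a)        # equals S^2 / 2 at full Kelly
g_half_a = growth_rate(f_a / 2, m_a, s_a)    # equals (3/8) * S^2 at half Kelly
g_full_b = growth_rate(f_b, m_b, s_b)

print("Sharpe A, B         :", sharpe_a, sharpe_b)
print("Kelly leverage A, B :", f_a, f_b)
print("growth A full, half :", g_full_a, g_half_a)
print("growth B full       :", g_full_b)
```

B receives the larger Kelly leverage, yet A, with the higher Sharpe ratio, achieves the higher growth rate once each is levered at its own Kelly leverage.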

Hi Ernie,

The Kelly formula may give negative weight to a strategy. For a long-only portfolio, can the Kelly formula still apply with minor revision? Thank you!

Wen

Hi Wen,

Yes, you can add the positivity constraint when you maximize the Sharpe ratio in formulas 2-3 above.

Ernie
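One way to impose the positivity constraint numerically is sketched below. Note the swap in technique: instead of maximizing the Sharpe ratio (2)-(3) directly, it maximizes the Kelly growth objective $F^{T} M - \tfrac{1}{2} F^{T} C F$ subject to $F \ge 0$ by projected gradient ascent. Because the Sharpe ratio is unchanged by scaling F, the direction of this solution is also the long-only maximum-Sharpe direction (provided some long-only portfolio has positive expected excess return). The function name and inputs are hypothetical.

```python
import numpy as np

def long_only_kelly(M, C, iters=5000):
    """Maximize F'M - 0.5 * F'CF subject to F >= 0 by projected gradient ascent."""
    M = np.asarray(M, dtype=float)
    C = np.asarray(C, dtype=float)
    step = 1.0 / np.linalg.eigvalsh(C).max()   # safe step size for this quadratic
    F = np.zeros_like(M)
    for _ in range(iters):
        gradient = M - C @ F
        F = np.maximum(F + step * gradient, 0.0)  # project back onto F >= 0
    return F

# Hypothetical annualized inputs; the third asset has a negative mean excess return.
M = np.array([0.08, 0.05, -0.01])
C = np.array([
    [0.04,  0.006, 0.0],
    [0.006, 0.09,  0.01],
    [0.0,   0.01,  0.0225],
])

F_unconstrained = np.linalg.solve(C, M)   # formula (1), may contain shorts
F_long_only = long_only_kelly(M, C)

print("unconstrained F:", F_unconstrained)
print("long-only F    :", F_long_only)
print("long-only weights (sum to 1):", F_long_only / F_long_only.sum())
```

In this example the unconstrained formula shorts the third asset, while the constrained solution sets its weight to zero and recomputes the other two.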

Hi Ernie,

Am I right that this formula F=C^(-1)*M implies equal leverage for different components of a portfolio? For example, if we have two components with F = (2.5, 4.5) then overall leverage will be 7.0 and weights 2.5/7.0=0.36 and 4.5/7.0=0.64. So each component has the same leverage 7 and weights 0.36 and 0.64 summing up to 1. If we have $100 then we have to buy component one by value 100*7*0.36=$252 and component two by value 100*7*0.64=$448, summing up our portfolio to $700 according to leverage 7.

What if we calculate leverage using Kelly formula for each component separately before building portfolio? And then use leveraged returns as input into portfolio optimization with constraint on weights summing them to 1? My intuition tells me that in this case we will get different weights (from the first approach) and as we used different leverages for each component, the overall leverage of portfolio will remain the same (as in the first approach where it was 7 in the example). What do you think?

thank you

Pavel

Hi Pavel,

No, F=C^(-1)*M most certainly does not imply equal leverage. The whole point of using this formula is to find out what the optimal leverages are for each instrument in the portfolio. Those leverage are given by F. So if F=(2.5, 4.5), you should apply 2.5 leverage to the first stock, and 4.5 leverage to the second. Remember, leverage is with respect to account equity, which of course is the same for all stocks in the account.

Assuming that account has $1M, you would trade stock 1 with market value $2.5M, and stock 2 with market value $4.5M.

Ernie
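The arithmetic in this reply, written out as a short sketch with the F and account equity quoted above:

```python
import numpy as np

equity = 1_000_000.0              # account equity from the reply above
F = np.array([2.5, 4.5])          # Kelly fractions for stock 1 and stock 2

market_values = F * equity        # dollar position in each stock
overall_leverage = np.abs(F).sum()
normalized_weights = F / F.sum()  # same allocation rescaled to sum to 1

print("market values     :", market_values)
print("overall leverage  :", overall_leverage)
print("normalized weights:", normalized_weights)
```

Each component of F is that instrument's own leverage relative to account equity; the overall leverage of 7 is their sum, not a common multiplier applied to each.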

David Varadi posted a study using the Kelly maximization formula for multiple trading strategies with Spearman's rank correlation alongside Pearson. Apparently Spearman worked better, which I can see the reasoning for (taking non-linear relationships into account) but it seems to lack a mathematical justification. Do you think it would make sense to use Spearman in the same formula derived using standard correlation?

Emil,

The simple formula for Kelly assumes normality of returns. If one has to use Spearman correlation because returns are not normal, then the entire derivation won't work.

Ernie

Hello Ernie, I got to your blog through some Youtube videos I saw. I must say that it's the first time I see someone with such a firm theoretical mathematical approach to these topics, I really like the style and for us "engineer-minded" folks it really helps.

Sorry to come back to such an old post but by no means outdated, let me cut to the chase.

I am trying to find good information about asset allocation for active strategies, but I am stuck with buy and hold approach...

How would you adapt this analysis for strategies such as mid-term trend following, where the distribution of assets keep changing constantly, but still want to have a large invested ratio of the total capital?

If I optimize the portfolio (either Kelly or Markowitz) for each asset "Under surveillance", then I will get a distribution of assets that will then not be real since I may not be invested in all of them if the signals are not triggered for the strategy. On the other hand, recalculating and re-balancing too often seems impractical and not too cost effective.

I would appreciate your insight on this topic and any reference on where to get further data on how to apply asset allocation for active strategies.

Thanks!

Hi Franco,

The returns discussed in this post do not have to refer to "assets". It can refer to a portfolio of strategies. So the capital allocation would be to different strategies. If you only have one strategy, then Kelly optimization would just give you the optimal leverage to apply on that strategy.

I actually discuss this in some details in the first chapter of my new book "Machine Trading".

Ernie

Having only one strategy but applied to different assets should bring a matrix of optimal leverages for that mix, since the means and variances of returns of the same strategy applied to different assets should be substantially different if the correlations are low and the assets have different dynamics, shouldn't it? Eg, if I want to trade oil, gold, S&P on the same strategy, then each one will have its optimal leverage ratio (same strategy on different underlying is actually like a new strategy).

So this means that every time one asset has an entry signal or exit signal leverages need to be recalculated for the set of assets that are currently with open positions for the one strategy being applied?

I will try to get a hold of your book in the meanwhile...

Thanks for the comments.

Hi Franco,

Yes, if you apply same strategy to different assets, Kelly formula will determine both asset allocation and overall leverage simultaneously.

However, this doesn't mean that you should recalculate the leverage on each asset every time there is an entry or exit. The leverage applies to the maximum position you should hold, irrespective of whether your strategy actually recommends a position at any given moment.

Ernie

I am a little confused: what if the covariance matrix is singular (or almost singular), as is usually true in "real life" market applications for multiple securities. Your formula blows up, which is a bit counterintuitive...

The formula at the beginning of the post blows up if the covariance matrix is singular (or close to it). This seems counter-intuitive (in particular, if you have two strategies, with the returns of one simply equal to the returns of the other plus, say, 5 bp/day, you should obviously allocate all your money to the "better" strategy; there is no obvious instability created).

Hi igrivin,

The singular case where 2 assets are perfectly correlated is indeed interesting, but the allocation scheme in that case isn't what you described. It would be to short the asset with the lower return by an infinite amount, and use the money to long the other by an infinite amount. Hence the mathematical difficulty.

This difficulty can be avoided by taking the limit of the correlation going to 1. For e.g. correl(1, 2)=0.99, var(1)=0.1, var(2)=0.1, M=[1 0.1]', hence C=[0.1 0.99*sqrt(0.1*0.1); 0.99*sqrt(0.1*0.1) 0.1], inv(C)*M=[453 -447]'. If correl(1,2) -> 0.999, inv(C)*M -> [4503 -4497]'. You see the pattern?

Ernie
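The limiting behaviour can be reproduced with the sketch below, using the variances and mean vector from this reply. The quoted values [453 -447] and [4503 -4497] correspond to correlations of 0.99 and 0.999 respectively; the function name is hypothetical.

```python
import numpy as np

def kelly_two_assets(rho, var1=0.1, var2=0.1, M=(1.0, 0.1)):
    """Kelly allocation C^{-1} M for two assets with correlation rho."""
    cov = rho * np.sqrt(var1 * var2)
    C = np.array([[var1, cov],
                  [cov, var2]])
    return np.linalg.solve(C, np.array(M))

# The long and short legs grow without bound as the correlation approaches 1.
for rho in (0.9, 0.99, 0.999, 0.9999):
    print(rho, np.round(kelly_two_assets(rho)))

F_099 = kelly_two_assets(0.99)
F_0999 = kelly_two_assets(0.999)
```

Each extra 9 in the correlation multiplies both legs by roughly ten, while their difference stays close to the same size.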
