Hi, 
Cumulative returns are more meaningful. P&L depends on the size of your orders, so its scale is arbitrary.
Ernie
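
To illustrate the point above, here is a minimal Python sketch. The price series and share counts are made-up numbers and the function names are hypothetical; it only shows that cumulative P&L scales with position size while cumulative return does not.

```python
def cumulative_pnl(prices, shares):
    """Running P&L in currency units for a constant long position of `shares`."""
    pnl, out = 0.0, []
    for prev, curr in zip(prices, prices[1:]):
        pnl += shares * (curr - prev)
        out.append(pnl)
    return out


def cumulative_return(prices):
    """Compounded return prod(1 + r_t) - 1; it does not depend on position size."""
    growth, out = 1.0, []
    for prev, curr in zip(prices, prices[1:]):
        growth *= curr / prev
        out.append(growth - 1.0)
    return out


prices = [100.0, 102.0, 99.0, 105.0]      # hypothetical closing prices
print(cumulative_pnl(prices, 10)[-1])     # P&L with 10 shares
print(cumulative_pnl(prices, 1000)[-1])   # 100x larger with 1000 shares
print(cumulative_return(prices)[-1])      # same for any position size
```

The two P&L figures differ by a factor of 100 purely because of the order size, whereas the cumulative return is the same in both cases.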
Hi Ernie,

When we do backtesting, should we use cumulative P&L or cumulative return to evaluate performance? What are their strengths and weaknesses?

Thanks.
Hi,
The typical way is to subtract some fixed percentage from each trade's return. My first book, Quantitative Trading, has detailed examples of this.
Ernie
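
A minimal sketch of the idea, in Python. The 0.1% cost per trade and the trade returns are assumptions chosen for illustration, not figures from the book, and the function names are hypothetical.

```python
def net_of_costs(trade_returns, cost_per_trade=0.001):
    """Subtract a fixed fractional cost (assumed 0.1% here) from each trade's return."""
    return [r - cost_per_trade for r in trade_returns]


def compound(returns):
    """Cumulative return from a sequence of per-trade net returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth - 1.0


gross = [0.010, -0.005, 0.020]   # hypothetical per-trade returns before costs
net = net_of_costs(gross)

print(compound(gross))  # cumulative return ignoring costs
print(compound(net))    # cumulative return after costs (lower)
```

The cost should reflect your own commissions, bid-ask spread, and slippage; a single fixed percentage is only a first approximation.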
Hi Ernie,

Thank you for the quick response!

When we evaluate performance via profit and loss (P&L) and cumulative P&L, it is straightforward to include transaction costs.

When we evaluate performance via net return and cumulative return, how do we include transaction costs?

Many thanks.
Hi,
Log return = log(price(t)) - log(price(t-1))

Net return = price(t)/price(t-1) - 1

Ernie
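
The two definitions above can be sketched in Python as follows. The price series is made up and the function names are hypothetical. The sketch also shows two standard consequences of the definitions: log return equals log(1 + net return), and log returns add over time while net returns compound.

```python
import math


def net_return(prev_price, price):
    """Net return = price(t)/price(t-1) - 1."""
    return price / prev_price - 1.0


def log_return(prev_price, price):
    """Log return = log(price(t)) - log(price(t-1))."""
    return math.log(price) - math.log(prev_price)


prices = [100.0, 102.0, 99.0, 105.0]   # hypothetical prices
pairs = list(zip(prices, prices[1:]))
net = [net_return(a, b) for a, b in pairs]
logs = [log_return(a, b) for a, b in pairs]

# Log returns are additive: their sum is the log return over the whole period.
total_log = sum(logs)

# Net returns must be compounded to get the return over the whole period.
total_net = 1.0
for r in net:
    total_net *= 1.0 + r
total_net -= 1.0

print(total_log, math.log(prices[-1] / prices[0]))
print(total_net, prices[-1] / prices[0] - 1.0)
```

For small moves the two are close, but they diverge as returns get larger.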
Hi Ernie,

What is the difference between log return and net return?

Thanks.
Hi,
Yes, for signal generation, but not for performance evaluation. The latter needs to conform to the industry norm, which uses net return.
Ernie
Hi Ernie,

Do you use log return during backtesting?

Thanks.
Hi Alex,
1) SVM, as a classification algorithm, can only be used to predict the direction of price moves.
2) Random forests can be built from either classification or regression trees. The latter can be used to predict the magnitude as well as the direction of moves.
3) There is no reason to believe one is superior to the other with respect to various out-of-sample (OOS) performance measures.
4) Daily price data are good enough if you aggregate >1,000 stocks and build the same model for all of them.
5) If a feature was found to work OOS, then of course one should retain it in the library. There aren't that many features that work OOS, so no worries about "too many". We can only have too many unproven features, which leads to data-snooping bias.
6) My book references many other books and articles. In terms of AI in trading, I recommend following @carlcarrie on Twitter, who posts numerous relevant articles daily.

Ernie
Hi Ernie,

I've listened to a couple of your talks online and had a few questions after reading your blog. My background is fundamental investing at asset management and hedge funds and only more recently have I tried to pick up machine learning. 

In one of your lectures, you referenced SVM and random forest as two algorithms that work well for stock prediction. However, I wasn't clear whether this is a classification or a regression problem. In other words, which is more appropriate: predicting the actual stock price, the actual return, or whether the stock will be up or down? I've seen a lot of academic papers with a mix of approaches, but no explanation of the pros and cons of using one over the other. Perhaps you could enlighten me here.

My second question is regarding one of your comments about insufficient data. Are you saying that daily price data for your y label is not enough and you need minute by minute data? 

And my third question has to do with one of your slides where you say feature-rich data sets are a curse. I've read about this before, so I'm not questioning what you're saying. However, I remember listening to a talk given by James Simons, and he said that once they find a good predictor they just leave it in, in case it comes back. If that's the case, then over time they would likely have a substantial number of features in the model. Are they perhaps doing something different?

My last question is about your recent book, or perhaps you have a suggestion of other resources. I'm trying to find a resource that clearly shows the right way to clean and transform feature data and to cross-validate in order to use it for prediction. I've not seen a source that accurately describes best practices.

Thank you,

Alex
Hi Ever,
Yes, as I have shown in another blog post (https://epchan.blogspot.ca/2014/08/kelly-vs-markowitz-portfolio.html), Kelly is essentially the same as the widely adopted Mean Variance optimization method, except that Kelly also suggests an overall leverage of a portfolio.

It is unclear what you mean by periods where Kelly "fails". Fails meaning the portfolio is not profitable? But that has nothing to do with Kelly itself. Fails meaning the portfolio underperforms an equal-weighted one? Certainly! But again, Kelly does not promise that it always outperforms other allocation methods in all periods. It only promises that, in the long run, it generates the maximum growth rate. The long run, of course, can be very long.

Ernie
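
A minimal Python sketch of the relationship described above, for two assets. It assumes the standard Gaussian Kelly result F = C^(-1) M, where M is the vector of mean excess returns and C is the covariance matrix; that formula is not quoted in this thread, and the input numbers and function name are hypothetical.

```python
def kelly_allocation_2x2(mean_excess, cov):
    """Kelly allocation F = C^(-1) M for two assets (assumed standard Gaussian result)."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    m0, m1 = mean_excess
    # Multiply the explicit 2x2 inverse of C by M.
    return [(d * m0 - b * m1) / det, (-c * m0 + a * m1) / det]


M = [0.08, 0.05]                          # hypothetical annual mean excess returns
C = [[0.04, 0.006], [0.006, 0.0225]]      # hypothetical annual covariance matrix

F = kelly_allocation_2x2(M, C)

# Normalizing F to sum to 1 gives the mean-variance (tangency) weights;
# the sum of F itself is the overall leverage that Kelly adds on top.
leverage = sum(F)
weights = [f / leverage for f in F]

print(F)
print(weights, leverage)
```

In this sketch the relative weights are what a mean-variance optimizer would also produce, and the only extra information from Kelly is the leverage. Scaling F by 0.5 gives the half-Kelly allocation with the same relative weights.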
Based on all your books, it seems that Kelly is the ideal allocation method for all trading strategies. Just to confirm: the Kelly formula has flaws when in use. I know you covered half Kelly, but beyond that, have you encountered periods where you didn't use Kelly at all, or periods where you used Kelly and it failed? Keep in mind I am not referring to the periods where you've recommended QT investors to opt out of strategies based on regime changes or unknowns. I am aware that this specific does focus on Kelly so we can call it a side note.

Hi Emil,

The compound growth rate at any finite T asymptotically approaches the formula given in this post (which is exact at T-> infinity). This is true for most theoretical derivations involving time averages in finance.

Ernie
Very interesting: log utility just happens to maximize the time average, which is what we really should be focusing on. But what about the case when we have a fixed time horizon? Then taking the limit as T -> infinity does not make sense, and we don't get rid of the stochastic component in the growth rate.

Hi,
ivolatility.com, quandl.com, optionmetrics.com.
Ernie
Hi Ernie,

Where can we get historical option data (put, call, all strikes and expiry)?

Thanks.
Hi,
Certainly if ev is an eigenvector, -ev is one too. So you are free to multiply all components by -1. Does that work for you?
Ernie
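
Since the sign is free, one way to keep it consistent in a walk-forward PCA is to pick a convention and apply it at every step. The Python sketch below shows two such conventions; the function names are hypothetical, and the eigenvectors are assumed to come from whatever PCA routine you already use (e.g. the first column of the output of pcacov).

```python
def align_sign(vec, reference):
    """Flip vec if it points away from reference (negative dot product)."""
    dot = sum(v * r for v, r in zip(vec, reference))
    return [-v for v in vec] if dot < 0 else list(vec)


def walk_forward_align(vectors):
    """Align each window's eigenvector with the previous (already aligned) one."""
    aligned = []
    for vec in vectors:
        aligned.append(align_sign(vec, aligned[-1]) if aligned else list(vec))
    return aligned


def fix_by_largest_component(vec):
    """Alternative convention: make the largest-magnitude component positive."""
    pivot = max(vec, key=abs)
    return [-v for v in vec] if pivot < 0 else list(vec)


v = [0.54, 0.23, -0.55, 0.90]
flipped = [-x for x in v]
print(walk_forward_align([v, flipped, v]))   # all three end up with the sign of v
print(fix_by_largest_component(flipped))     # flipped back so that 0.90 is positive
```

Either convention only chooses between ev and -ev, so the result is still an eigenvector. Aligning with the previous window tends to be more stable when the largest component changes from one window to the next.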
Hi Ernie,

I mean the signs of the components of the eigenvector. Let's say we have n=4 time series; then, for example, the first eigenvector may look like [.54 .23 -.55 .90] or [-.54 -.23 .55 -.90]. If one does a PCA on a set of economic time series, one might want to preserve the economic interpretation of the first principal component, i.e. the signs matter. So when doing a PCA walk forward, it would be nice to guarantee that the vector "points in the same direction" for each time t. So my question was just whether you have come across a quick code trick for this. One suggestion I've come across is, at each time t, to find the largest (in absolute terms) element in EIG(:,1) and make sure this value is positive or negative, depending on which direction one wants to orient the first eigenvector. But I'm not sure this is theoretically correct.
Hi,
Thanks for your kind words.

When you said "signs of eigenvectors are not unique", did you mean the signs of their eigenvalues, or the signs of the components of an eigenvector?

If the former, that's not possible, since the eigenvalues of a covariance matrix are all >= 0 (the covariance matrix is positive semidefinite.)

If the latter, I don't see what's wrong with that. It just means that the hedge portfolio that corresponds to each eigenvector has both long and short positions. You cannot simply ignore the components with negative values, as the result would no longer be an eigenvector of the covariance matrix.

Ernie
Hi Ernie,

Thanks for a great blog and great books. I have a PCA question.

As you know, the signs of eigenvectors are not unique. When using for example pcacov in Matlab, the sign can change from time to time. 

When backtesting, I need to make sure the eigenvectors are pointing in the same direction for each time t. Do you have a quick code trick for how to guarantee this?

Thanks
Hi Kin Wa,
Jensen's inequality can certainly explain why E[exp(x)] != exp(E[x]), but it doesn't give the whole picture.
Ernie
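
A small numerical check of the inequality, in Python. The Gaussian distribution and its parameters are assumptions made for illustration. For Gaussian x, the known closed form is E[exp(x)] = exp(mu + sigma^2/2), which is larger than exp(E[x]) = exp(mu).

```python
import math
import random

random.seed(0)                       # fixed seed so the run is repeatable
mu, sigma, n = 0.05, 0.2, 100_000    # hypothetical mean, volatility, sample size

xs = [random.gauss(mu, sigma) for _ in range(n)]

mean_x = sum(xs) / n
mean_exp = sum(math.exp(x) for x in xs) / n   # sample estimate of E[exp(x)]
exp_mean = math.exp(mean_x)                   # exp of the sample mean, exp(E[x])
gaussian_value = math.exp(mu + sigma ** 2 / 2)

print(mean_exp, exp_mean, gaussian_value)
```

The sample value of E[exp(x)] exceeds exp(E[x]) and sits close to the Gaussian closed form. This only demonstrates the inequality itself, not the rest of the argument in the post.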
Hi Ernie, 

Is this something that can be explained by Jensen's inequality?

thanks 

Kin Wa
Hi,
Mainly E-minis, but with a small allocation to agricultural and energy futures.
Ernie
Hi Ernie,

Thank you for the quick response.

For the futures momentum strategy in your pool, which futures do you trade: commodities or stock index futures? Many thanks.
Hi,
We trade mainly FX mean reversion and futures momentum strategies in our pool. There are a number of other strategies as well that have lower allocations.
Ernie