Friday, May 03, 2013

Nonlinear Trading Strategies

I have long been partial to linear strategies due to their simplicity and relative immunity to overfitting. They can be used quite easily to profit from mean-reversion. However, there is a serious problem: they are quite fragile, i.e. vulnerable to tail risks. As we move from mean-reverting strategies to momentum strategies, we immediately introduce a nonlinearity (stop losses), but simultaneously remove certain tail risks (except during times when markets are closed). But if we want to enjoy anti-fragility and are going to introduce nonlinearities anyway, we might as well go full-monty, and consider options strategies. (It is no surprise that Taleb was an options trader.)

It is easy to see that options strategies are nonlinear, since options payoff curves (value of an option as function of underlying stock price) are plainly nonlinear. I personally have resisted trading them because they all seem so complicated, and I abhor complexities. But recently a reader recommended a little book to me: Jeff Augen's "Day Trading Options" where the Black-Scholes equation (and indeed any equation) is mercifully absent from the entire treatise. At the same time, it is suffused with qualitative ideas. Among the juicy bits:

1) We can find distortions in the 2D implied volatility surface (implied volatility as z-axis, expiration months as x, and strike prices as y) which may mean revert to "smoothness", hence presenting arbitrage opportunities. These distortions are present for both stock and stock index options.

2) Options are underpriced intraday and overpriced overnight: hence it is often a good idea to buy them at the market open and sell them at market close (except on some special days! See 4 below.). In fact, there are certain days of the week where this distortion is the most drastic and thus favorable to this strategy.

3) Certain cash instruments have unusually high kurtosis, but their corresponding option prices consistently underprice such tail risks. Thus structures such as strangles or backspreads can often be profitable without incurring any left tail risks.

4) If there is a long weekend before expiration day (e.g. Easter weekend), the time decay of the option value over 3 days is compressed into an intraday decline on the last trading day before the weekend.

Now, as quantitative traders, we have no need to take his word on any of these assertions. So, onward to backtesting!

(For those who may be stymied by the lack of affordable historical intraday options data, I recommend Nanex.net.)

===

There are still 2 slots available in my online Mean Reversion Strategies workshop in May. The workshop will be conducted live via Adobe Connect, and is limited to a total of 4 participants. Part of the workshop will focus on how to avoid getting hurt when a pair or a portfolio of instruments stops cointegrating.

Thursday, April 04, 2013

An Integrated Development Environment for High Frequency Strategies

I have come across many software platforms that allow traders to first specify and backtest a strategy and then, with the push of a button, turn the backtested strategy into a live trading program that can automatically submit orders to their favorite broker. (See all my articles on this topic here.) I called these platforms "Integrated Development Environments" (IDEs) in my new book, and they range from the familiar and retail-oriented (e.g. MetaTrader, NinjaTrader, TradeStation), to the professional but skills-demanding (e.g. ActiveQuant, Marketcetera, TradeLink), and finally to the comprehensive and industrial-strength (e.g. Deltix, Progress Apama, QuantHouse, RTD Tango). Some of these require no programming skills at all, allowing you to construct strategies by dragging-and-dropping; others use simple scripting languages like Python; and yet others demand full-blown programming abilities in Java, C#, or C++. But which of these allow us to backtest and execute high frequency strategies?

To state the obvious: backtesting HF strategies is quite hard. The volume of data is one issue. But in addition, the execution details are very important to such strategies: details such as the exact exchange/venue to which we are routing our orders, the precise state of the order book that triggers our orders, the order types we are using, and finally the probability of getting filled if we use non-marketable orders. Mess up one of these details and the backtest will be far from realistic. I often tell people that it is easier to paper trade an HF strategy than to backtest one. While many of the platforms I reported above do allow backtesting using tick data, I don't know that they enable backtesting using the full order book and choice of execution venue. With this background, I am happy to report I have recently come across just such a platform called Lime Strategy Studio.

First, the bad news. LimeTrader is useful only to traders who trade with Lime Brokerage, as it is configured to send live orders to Lime only. [UPDATE: I have since learned that there are adapters available for 3rd party brokers.] However, if you are going to trade HF stocks and futures strategies, why not go with Lime, since they provide you with a comprehensive API, direct ultra-low latency feeds from the exchanges, and allow (nay, insist on) colocation either at the exchanges or at their data center at a reasonable fee? (Full Disclosure: I have no current business relationship with Lime, though I was a customer.) Another piece of bad news: the specification of the strategy must be in C++.

But once you get over these two hurdles, the benefits are manifold. Every detail that you can specify for a live trading strategy can be specified for the backtest and paper trading. As I said, these details may include order type, trading venue, state of order book, and even statistics of the order book, not to mention fundamental data such as earnings, corporate actions, and other user-provided data such as news. A fill simulator is included for your non-marketable orders. As with other IDEs, once you have backtested a strategy in its every detail and are satisfied with the performance metrics, you can go live (either for paper or production trading) with the push of a button.

If any reader knows of other IDEs that have similar features and are useful for backtesting HF strategies, please let us know!

===

Speaking of HF strategies, traders often lament the ultra-high secrecy around them and the difficulty of gathering knowledge in this field. A friend (hat tip: Dave) referred me to this paper by Prof. Dragos Bozdog et al. that gives a flavor of what sort of modeling may be involved. I find it very readable and thought-provoking.

===

There are still 2 slots available in my online Mean Reversion Strategies workshop scheduled for May.



Thursday, March 14, 2013

What Can Quant Traders Learn from Taleb's "Antifragile"?

It can seem a bit ironic that we should be discussing Nassim Taleb's best-seller "Antifragile" here, since most algorithmic trading strategies involve predictions and won't be met with approval from Taleb. Predictions, as Taleb would say, are "fragile" -- they are prone to various biases (e.g. data snooping bias) and the occasional Black Swan event will wipe out the small cumulative profits from many correct bets. Nevertheless, underneath the heap of diatribes against various luminaries ranging from Robert Merton to Paul Krugman, we can find a few gems. Let me start from the obvious to the subtle:

1) Momentum strategies are more antifragile than mean-reversion strategies.

Taleb didn't say that, but that's the first thought that came to my mind. As I have argued in many places, mean-reverting strategies have natural profit caps (exit when price has reverted to mean) but no natural stop losses (we should buy more of something if it gets cheaper), so they are very much subject to left tail risk, but cannot take advantage of the unexpected good fortune of the right tail. Very fragile indeed! In contrast, momentum strategies have natural stop losses (exit when momentum reverses) and no natural profit caps (keep the same position as long as momentum persists). Generally, very antifragile! Except: what if during a trading halt (due to the daily overnight gap, or circuit breakers), we can't exit a momentum position in time? Well, you can always buy an option to simulate a stop loss. Taleb would certainly approve of that.
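To make the option-as-stop-loss idea concrete, here is a minimal sketch for a long position protected by a put held to expiration. The function name and all the prices are made up for illustration, and it ignores commissions and the option's time value before expiration. (A short position would use a call in the same way.)

```python
def long_stock_with_put(entry_price, strike, premium, final_price):
    """P&L per share of a long stock position plus a long put, at the put's expiration."""
    stock_pnl = final_price - entry_price
    put_payoff = max(strike - final_price, 0.0)   # the put pays off only below the strike
    return stock_pnl + put_payoff - premium

# Hypothetical numbers: stock bought at 100, put struck at 95 costing 2.
for final_price in (60.0, 90.0, 95.0, 100.0, 130.0):
    pnl = long_stock_with_put(100.0, 95.0, 2.0, final_price)
    print(f"final price {final_price:6.1f} -> P&L {pnl:6.1f}")
```

However far the stock gaps down, the loss is floored at strike - entry price - premium, while the upside remains uncapped. This is exactly the kind of nonlinearity a plain stop order cannot guarantee when the market is closed.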

2) High frequency strategies are more antifragile than low frequency strategies.

Taleb also didn't say that, and it has nothing to do with whether it is easier to predict short-term vs. long-term returns. Since HF strategies allow us to accumulate profits much faster than low frequency ones, we need not apply any leverage. So even when we are unlucky enough to be holding a position of the wrong sign when a Black Swan hits, the damage will be small compared to the cumulative profits. So while HF strategies do not exactly benefit from right tail risk, they are at least robust with respect to left tail risk.

3) Parameter estimation errors and vulnerability to them should be explicitly incorporated in a backtest performance measurement.

Suppose your trading model has a few parameters which you estimated/optimized using some historical data set. Based on these optimized parameters, you compute the Sharpe ratio of your model on this same data. No doubt this Sharpe ratio will be very good, due to the in-sample optimization. If you apply this model with those optimized parameters on out-of-sample data, you would probably get a worse Sharpe ratio, but one which is more predictive. But why stop at just two data sets? We can find N different data sets of the same size, calculate the optimized parameters on each of them in turn, and each time compute the Sharpe ratios over the remaining N-1 out-of-sample data sets. Finally, you can average over all these Sharpe ratios. If your trading model is fragile, you will find that this average Sharpe ratio is quite low. But more important than Sharpe ratios, you should compute the maximum drawdown based on each set of parameters, and also the maximum of all these max drawdowns. If your trading model is fragile, this maximum of maximum drawdowns is likely to be quite scary.

The scheme I described above is called cross-validation and was well known before Taleb, though his book reminds me of its importance.
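Here is a minimal sketch of this scheme. Everything specific in it is an assumption for illustration: the toy momentum rule in `run_strategy`, the candidate lookbacks in `fit`, and the simulated random-walk data stand in for your own model and data.

```python
import math
import random
import statistics

def sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio of per-period returns (risk-free rate taken as 0)."""
    sd = statistics.pstdev(returns)
    if sd == 0:
        return 0.0
    return statistics.fmean(returns) / sd * math.sqrt(periods_per_year)

def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve, as a positive fraction."""
    equity, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst

def run_strategy(lookback, market_returns):
    """Toy rule (hypothetical): hold the sign of the trailing `lookback`-period return."""
    out = []
    for t in range(lookback, len(market_returns)):
        signal = 1.0 if sum(market_returns[t - lookback:t]) > 0 else -1.0
        out.append(signal * market_returns[t])
    return out

def fit(market_returns, candidates=(5, 10, 20, 40)):
    """In-sample optimization: pick the lookback with the best Sharpe ratio on this data set."""
    return max(candidates, key=lambda lb: sharpe(run_strategy(lb, market_returns)))

def cross_validate(data_sets):
    """Optimize on each data set in turn, then score on the other N-1 data sets."""
    sharpes, drawdowns = [], []
    for i, train in enumerate(data_sets):
        params = fit(train)
        for j, test in enumerate(data_sets):
            if j == i:
                continue  # skip the in-sample data set
            pnl = run_strategy(params, test)
            sharpes.append(sharpe(pnl))
            drawdowns.append(max_drawdown(pnl))
    return statistics.fmean(sharpes), max(drawdowns)

# Usage with simulated data: N = 5 data sets of 500 daily returns each.
rng = random.Random(0)
data_sets = [[rng.gauss(0.0, 0.01) for _ in range(500)] for _ in range(5)]
avg_sharpe, worst_drawdown = cross_validate(data_sets)
print(f"average out-of-sample Sharpe: {avg_sharpe:.2f}")
print(f"maximum of max drawdowns:     {worst_drawdown:.1%}")
```

Since the simulated market here is pure noise, any good in-sample Sharpe ratio is an artifact of optimization, which is the situation the averaged out-of-sample numbers are meant to expose.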

4) Notwithstanding 3) above, a true estimate of the max drawdown is impossible because it depends on the estimate of the probability of rare events. As Taleb mentioned, even in the case of a normal distribution, if the "true" standard deviation is higher than your estimate by a mere 5%, the probability of a 6-sigma event will be about 5 times your estimate! So really the only way to ensure that our maximum drawdown will not exceed a certain limit is through Constant Proportion Portfolio Insurance: trading risky assets with Kelly-leverage in a limited liability company, putting money that you never want to lose in an FDIC-insured bank, with regular withdrawals from the LLC to the bank (but not the other way around).
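The 6-sigma claim is easy to check numerically. The sketch below assumes a normal distribution and a one-sided tail; the variable names are mine.

```python
import math

def upper_tail(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

estimated_sigma = 1.0
true_sigma = 1.05 * estimated_sigma     # true volatility is 5% higher than our estimate
threshold = 6.0 * estimated_sigma       # what we believe to be a "6-sigma" move

p_estimated = upper_tail(threshold / estimated_sigma)
p_true = upper_tail(threshold / true_sigma)
ratio = p_true / p_estimated

print(f"probability under our estimate: {p_estimated:.3g}")
print(f"probability under true sigma:   {p_true:.3g}")
print(f"ratio:                          {ratio:.1f}")
```

The ratio comes out between 5 and 6, in line with Taleb's figure: a 5% error in the volatility estimate translates into a more than fivefold error in the tail probability.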

5) Correlations are impossible to estimate/predict. The only thing we can do is to short at +1 and buy at -1.

Taleb hates Markowitz portfolio optimization, and one of the reasons is that it relies on estimates of covariances of asset returns. As he said, a pair of assets that may have -0.2 correlation over a long period can have +0.8 correlation over another long period. This is especially true in times of financial stress. I quite agree on this point: I believe that manually assigning correlations with values of +/-0.75, +/-0.5, +/-0.25, 0 to entries of the correlation matrix based on "intuition" (fundamental knowledge) can generate as good out-of-sample performance as any meticulously estimated numbers. The more fascinating question is whether there is indeed mean-reversion of correlations. And if so, what instruments can we use to profit from it? Perhaps this article will help.

6) A backtest can only be used to reject a strategy, not to predict its success.

This echoes the point made by commenter Michael Harris in a previous article. Since historical data will never be long enough to capture all the possible Black Swan events that can occur in the future, we can never know if a strategy will fail miserably. However, if a strategy already failed in a backtest, we can be pretty sure that it will fail again in the future.

===

The online "Quantitative Momentum Strategies" workshop that I mentioned in the previous article is now fully booked. Based on popular demand, I will offer a "Mean Reversion Strategies" workshop in May. Once again, it will be conducted in real-time through Skype, and the number of attendees will be similarly limited to 4. See here for more information.






Monday, February 18, 2013

A workshop, a webinar, and a question

There is a workshop on the 25th of February titled "Market turbulence; monetization; and universality" by Mike Lipkin at Columbia University that promises to be interesting to those traders who have a physics background. Mike is a former colleague of mine at Cornell's Laboratory of Atomic and Solid State Physics, and I fondly remember the good old days when we all hunched over the theory group's computers while day-dreaming of our future. Mike has since gone on to become an options market-maker at the American Stock Exchange and an Adjunct Associate Professor at Columbia. He published some very interesting research on the "stock pinning" phenomenon near options expirations, i.e. stock prices often converge to the nearest strike prices of their options just before expirations.

---

If we want to trade directly on various FX ECNs such as HotspotFX or EBS, perhaps because we want to run some HFT strategies, we will need to be sponsored by a prime broker. However, since the Dodd-Frank Act has been in full force, no prime brokers that I know of are willing to take on customers with less than $10M in assets. (I often feel that the CFTC's primary goal is to prevent small players like myself from ever competing with bigger institutions. Of course, their stated goal is to "protect" us from financial harm ....) The only exception may be CitiFX TradeStream ECN. Has any reader ever traded on this market? Any reviews or comments will be most welcome.

---

I am now offering an online workshop "Quantitative Momentum Strategies" to a select number of traders and portfolio managers. It will be conducted in real-time through Skype, and the number of attendees will be limited to 4. See here for more information.

Sunday, February 03, 2013

A stock factor based on option volatility smirk

A reader pointed out an interesting paper that suggests using option volatility smirk as a factor to rank stocks. Volatility smirk is the difference between the implied volatilities of the OTM put option and the ATM call option. (Of course, there are numerous OTM and ATM put and call options. You can refer to the original paper for a precise definition.) The idea is that informed traders (i.e. those traders who have a superior ability in predicting the next earnings numbers for the stock) will predominantly buy OTM puts when they think the future earnings reports will be bad, thus driving up the price of those puts and their corresponding implied volatilities relative to the more liquid ATM calls. If we use this volatility smirk as a factor to rank stocks, we can form a long portfolio consisting of stocks in the bottom quintile, and a short portfolio with stocks in the top quintile. If we update this long-short portfolio weekly with the latest volatility smirk numbers, it is reported that we will enjoy an annualized excess return of 9.2%.
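The ranking step can be sketched as follows. The tickers and implied volatilities are made up, the function names are mine, and the choice of exactly one OTM put and one ATM call per stock glosses over the paper's precise definition.

```python
def smirk_portfolios(smirks):
    """Split tickers into a long list (lowest-smirk quintile) and a short list (highest-smirk quintile).

    `smirks` maps ticker -> (OTM put implied vol - ATM call implied vol).
    """
    ranked = sorted(smirks, key=smirks.get)   # ascending smirk
    n = max(1, len(ranked) // 5)              # number of stocks in a quintile
    return ranked[:n], ranked[-n:]

def dollar_neutral_weights(longs, shorts):
    """Equal weights within each leg: +1 in total on the long side, -1 in total on the short side."""
    weights = {t: 1.0 / len(longs) for t in longs}
    weights.update({t: -1.0 / len(shorts) for t in shorts})
    return weights

# Hypothetical (OTM put IV, ATM call IV) pairs for ten made-up tickers.
implied_vols = {
    "AAA": (0.32, 0.25), "BBB": (0.28, 0.26), "CCC": (0.45, 0.30),
    "DDD": (0.22, 0.21), "EEE": (0.38, 0.29), "FFF": (0.27, 0.24),
    "GGG": (0.50, 0.33), "HHH": (0.31, 0.27), "III": (0.26, 0.20),
    "JJJ": (0.35, 0.30),
}
smirks = {t: put_iv - call_iv for t, (put_iv, call_iv) in implied_vols.items()}
longs, shorts = smirk_portfolios(smirks)
weights = dollar_neutral_weights(longs, shorts)
print("long: ", longs)
print("short:", shorts)
```

In the weekly version described above, this ranking would simply be rerun each week with fresh implied volatilities and the portfolio rebalanced to the new weights.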

As a standalone factor, this 9.2% return may not seem terribly exciting, especially since transaction costs have not been accounted for. However, the beauty of factor models is that you can combine an arbitrary number of factors, and though each factor may be weak, the combined model could be highly predictive. A search of the keyword "factor" on my blog will reveal that I have talked about many different factors applicable to different asset classes in the past. For stocks in particular, there is a short term factor as simple as the previous 1-day return that worked wonders. Joel Greenblatt's famous "Little Book that Beats the Market" used 2 factors to rank stocks (return-on-capital and earnings yield) and generated an APR of 30.8%.

The question, however, is how we should combine all these different factors. Some factor model aficionados will no doubt propose a linear regression fit, with future return as the dependent variable and all these factors as independent variables. However, my experience with this method has been unrelentingly poor: I have witnessed millions of dollars lost by various banks and funds using this method. In fact, I think the only sensible way to combine them is to simply add them together with equal weights. That is, if you have 10 factors, simply form 10 long-short portfolios each based on one factor, and combine these portfolios with equal capital. As Daniel Kahneman said, "Formulas that assign equal weights to all the predictors are often superior, because they are not affected by accidents of sampling".
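In code, the equal-weight combination amounts to averaging the position weights of the single-factor portfolios. This is a minimal sketch; the three factor portfolios and their tickers are hypothetical.

```python
def combine_equal_weight(factor_portfolios):
    """Give each single-factor long-short portfolio an equal share of capital.

    `factor_portfolios` is a list of dicts mapping ticker -> weight, one dict per factor.
    """
    k = len(factor_portfolios)
    combined = {}
    for portfolio in factor_portfolios:
        for ticker, weight in portfolio.items():
            combined[ticker] = combined.get(ticker, 0.0) + weight / k
    return combined

# Three made-up dollar-neutral portfolios, each built from one factor.
smirk_pf = {"AAA": 0.5, "BBB": 0.5, "CCC": -0.5, "DDD": -0.5}
reversal_pf = {"AAA": 0.5, "CCC": 0.5, "BBB": -0.5, "EEE": -0.5}
value_pf = {"DDD": 0.5, "EEE": 0.5, "AAA": -0.5, "CCC": -0.5}

combined = combine_equal_weight([smirk_pf, reversal_pf, value_pf])
for ticker in sorted(combined):
    print(f"{ticker}: {combined[ticker]:+.3f}")
```

Note that no parameter is fitted anywhere: positions on which the factors disagree net out, and there are no regression coefficients to be thrown off by accidents of sampling.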


Wednesday, January 02, 2013

The Pseudo-science of Hypothesis Testing

Backtesting trading strategies necessarily involves a very limited amount of historical data. For example, I seldom test strategies with data older than 2007. Gathering longer history may not improve predictive accuracy since the market structure may have changed substantially. Given such scant data, it is reasonable to question whether the good backtest results (e.g. a high annualized return R) we may have obtained are just due to luck. Many academic researchers try to address this issue by running their published strategies through standard statistical hypothesis testing.

You know the drill: the researchers first come up with a supposedly excellent strategy. In a display of false modesty, they then suggest that perhaps a null hypothesis can produce the same good return R. The null hypothesis may be constructed by running the original strategy through some random simulated historical data, or by randomizing the trade entry dates. The researchers then proceed to show that such random constructions are highly unlikely to generate a return equal to or better than R. Thus the null hypothesis is rejected, thereby impressing upon you that the strategy is somehow sound.
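For readers who have not seen the drill in code, here is a minimal sketch of the entry-date randomization variant. The function names, the one-day holding period, and the simulated market are all assumptions for illustration.

```python
import random

def total_return(market_returns, entry_days):
    """Sum of the one-day returns earned on the chosen entry days.

    Assumes market_returns[d] is the return captured by entering on day d.
    """
    return sum(market_returns[d] for d in entry_days)

def randomized_entry_p_value(market_returns, entry_days, trials=2000, seed=0):
    """Fraction of random entry-date sets whose return equals or beats the strategy's return R."""
    rng = random.Random(seed)
    r_strategy = total_return(market_returns, entry_days)
    all_days = range(len(market_returns))
    hits = 0
    for _ in range(trials):
        random_days = rng.sample(all_days, len(entry_days))
        if total_return(market_returns, random_days) >= r_strategy:
            hits += 1
    return hits / trials

# Simulated market of 1000 days, and a "strategy" with perfect hindsight (the 50 best days).
rng = random.Random(42)
market = [rng.gauss(0.0, 0.01) for _ in range(1000)]
best_days = sorted(range(len(market)), key=market.__getitem__)[-50:]
p_value = randomized_entry_p_value(market, best_days)

# For comparison: entry days chosen with no regard to returns at all.
p_arbitrary = randomized_entry_p_value(market, list(range(0, 1000, 20)))
print(f"hindsight strategy: {p_value:.4f}")
print(f"arbitrary strategy: {p_arbitrary:.4f}")
```

The number this produces is an estimate of the probability of seeing a return of at least R if entry dates were random. Keep that in mind for what follows.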

As statistical practitioners in fields outside of finance will tell you, this whole procedure is quite meaningless and often misleading.

The probabilistic syllogism of hypothesis testing has the same structure as the following simple example (devised by Jeff Gill in his paper "The Insignificance of Null Hypothesis Significance Testing"):

1) If a person is an American then it is highly unlikely she is a member of Congress.
2) The person is a member of Congress.
3) Therefore it is highly unlikely she is an American.

The absurdity of hypothesis testing should be clear. In mathematical terms, the probability we are really interested in is the conditional probability that the null hypothesis is true given an observed high return R: P(H0|R). But instead, the hypothesis test merely gives us the conditional probability of a return R given that the null hypothesis is true: P(R|H0). These two conditional probabilities are seldom equal.
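Bayes' rule shows how far apart the two can be. The sketch below makes a simplifying assumption that is not part of the argument above: there is a single alternative hypothesis H1 ("the strategy has a real edge"), and we can put numbers on P(R|H1) and on the prior P(H0). All the numbers are made up.

```python
def posterior_null(p_r_given_h0, p_r_given_h1, prior_h0):
    """P(H0 | R) by Bayes' rule, assuming H1 is the only alternative to H0."""
    numerator = p_r_given_h0 * prior_h0
    return numerator / (numerator + p_r_given_h1 * (1.0 - prior_h0))

p_r_given_h0 = 0.05   # what the hypothesis test reports: "significant at the 5% level"
p_r_given_h1 = 0.50   # hypothetical chance of seeing R if the strategy really has an edge

for prior_h0 in (0.50, 0.95):
    posterior = posterior_null(p_r_given_h0, p_r_given_h1, prior_h0)
    print(f"prior P(H0) = {prior_h0:.2f} -> P(H0 | R) = {posterior:.2f}")
```

With an even prior, P(H0|R) is about 9%. But if 95% of the strategies we test are worthless to begin with (not an unreasonable guess), P(H0|R) is about 66%, even though the test reported P(R|H0) = 5%.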

But even if we can somehow compute P(H0|R), it is still of very little use, since there are an infinite number of potential H0. Just because you have knocked down one particular straw man doesn't say much about your original strategy.

If hypothesis testing is both meaningless and misleading, why do financial researchers continue to peddle it? Mainly because this is de rigueur to get published. But it does serve one useful purpose for our own private trading research. Even though a rejection of the null hypothesis in no way shows that the strategy is sound, a failure to reject the null hypothesis will be far more interesting: it tells us that the strategy's return is indistinguishable from luck, which is reason enough to discard it.

(For other references on criticism of hypothesis testing, read Nate Silver's bestseller "The Signal and The Noise". Silver is of course the statistician who correctly predicted the winner of all 50 states + D.C. in the 2012 US presidential election. The book is highly relevant to anyone who makes a living predicting the future. In particular, it tells the story of one Bob Voulgaris who makes $1-4M per annum betting on NBA outcomes. It makes me wonder whether I should quit making bets on financial markets and move on to sports.)

Thursday, November 29, 2012

The Importance of 2 (as Sharpe Ratio)

A reader ezbentley recently pointed out a little-noticed fact in the derivation of Kelly's formula: if we apply the optimal Kelly leverage, then the standard deviation of the annualized compounded growth rate of your equity is none other than the Sharpe ratio (Sdev=S). This fact is of mild interest in itself, but its implication has relevance to another interesting fact of behavioral finance, so I will reproduce our discussions here.

Suppose our strategy has an annualized Sharpe ratio of 2. According to the above result, Sdev=2 as well. This may startle some of us: a standard deviation of 200% of our compounded growth rate g - wouldn't ruin be very likely? But check out g itself: g=S^2/2, so g=2 when S=2, which means that g itself is exactly 200%. An Sdev of 200% here means that if the growth rate drops one standard deviation below its mean, we will still manage not to lose money for the year. Another way to put this is that there is an 84.1% chance that our annual return will be greater than 0, based on the Gaussian distribution.

It gets better if S goes above 2. For example, at S=3, g=4.5, but Sdev is just 3. So you can see that as S goes above 2, a 1 standard deviation fluctuation of g below the mean will still get you a positive number: profitable for the year.
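These numbers can be reproduced in a few lines, using only the two relations quoted above (g=S^2/2 and Sdev=S at the optimal Kelly leverage) plus the Gaussian assumption. The function name is mine.

```python
import math

def kelly_growth_stats(sharpe):
    """Growth rate g, its standard deviation, and P(growth > 0) at optimal Kelly leverage."""
    g = sharpe ** 2 / 2.0        # g = S^2 / 2
    sdev = sharpe                # Sdev = S
    z = g / sdev                 # how many standard deviations g sits above zero (= S / 2)
    prob_positive = 0.5 * math.erfc(-z / math.sqrt(2.0))   # Gaussian CDF evaluated at z
    return g, sdev, prob_positive

for s in (1.0, 2.0, 3.0):
    g, sdev, prob = kelly_growth_stats(s)
    print(f"S = {s:.0f}: g = {g:.2f}, Sdev = {sdev:.2f}, "
          f"g - Sdev = {g - sdev:+.2f}, P(growth > 0) = {prob:.1%}")
```

Since g - Sdev = S^2/2 - S, a one standard deviation shortfall still leaves a non-negative growth rate exactly when S is at least 2.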

This is a very interesting result: this means that S=2 is really an important threshold in more ways than I realized. From behavioral finance experiments, we already know that humans demand $2 of profit for $1 of risk. Given the universal desire of portfolio managers not to lose money on the year, it turns out that the demand for a Sharpe ratio of at least 2 is quite rational!

===

Now, time for a couple of public service announcements:

1) Those who are looking for a way to connect Matlab to Interactive Brokers should check out undocumentedmatlab.com. The creator of this product has an accompanying book, and the documentation for the product is excellent.

2) NAG sells high performance Matlab toolboxes for those who prefer alternatives to the native ones.

3) Here is the Twitter feed for FIXGlobal Online, the magazine from the creator of the FIX Protocol, an order submission standard. Interesting breaking news from the global finance scene.

Thursday, October 25, 2012

A leveraged ETFs strategy

In a post some years ago, I argued that leveraged ETFs (especially the triple leveraged ones) are unsuitable for long-term holdings. Today, I want to present research that suggests leveraged ETFs can be very suitable for short-term trading.

The research in question was just published by Prof. Pauline Shum and her collaborators at York University. Here is the simplest version of the strategy: if a stock market index has experienced a return >= 2% since the previous day's close up to the current time at 2:15pm ET, then buy this index (via its futures, ETFs, or stock components) right away, and exit at the close with a market-on-close order. Vice versa if the return is <= -2%. The annualized average return from June 2006 to July 2011 was found to be higher than 100%.
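The entry rule is simple enough to write down directly. This is a sketch of the simplest version as described above, not the authors' code: the function names and example prices are mine, and transaction costs and slippage are ignored.

```python
def intraday_signal(prev_close, price_at_1415, threshold=0.02):
    """Return +1 (buy), -1 (short) or 0 (no trade), decided at 2:15pm ET."""
    ret = price_at_1415 / prev_close - 1.0
    if ret >= threshold:
        return 1
    if ret <= -threshold:
        return -1
    return 0

def trade_return(prev_close, price_at_1415, close_price, threshold=0.02):
    """Return from entering at the 2:15pm price and exiting with a market-on-close order."""
    side = intraday_signal(prev_close, price_at_1415, threshold)
    return side * (close_price / price_at_1415 - 1.0)

# Hypothetical index levels: (previous close, 2:15pm price, today's close).
examples = [(100.0, 103.0, 104.0), (100.0, 97.0, 96.0), (100.0, 101.0, 102.0)]
for prev_close, at_1415, close in examples:
    signal = intraday_signal(prev_close, at_1415)
    pnl = trade_return(prev_close, at_1415, close)
    print(f"signal {signal:+d}, return {pnl:+.4f}")
```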

Now this strategy is actually quite well-known among institutional traders, although this is the first time I have seen the backtest results published. The reason why it works is also quite well-known: it has to do with the fact that every leveraged ETF needs to rebalance at the market close in order to keep its leverage constant (at x2 or x3, depending on the fund). If the market index goes up, the fund needs to buy the component stocks; otherwise, it needs to sell stocks. If there is major market movement (with absolute return >= 2%) since the previous close, then the amount of stocks that need to be bought or sold will be correspondingly larger, resulting in momentum in all those stocks near the close. This strategy aims to front-run this rebalancing to take advantage of the anticipated momentum.
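The size of the required rebalancing follows from simple arithmetic, sketched below. This is my own back-of-the-envelope calculation and ignores fees, financing costs, and subscriptions or redemptions during the day; the NAV figure is made up.

```python
def rebalance_amount(nav, leverage, index_return):
    """Dollar exposure a leveraged fund must add (+) or shed (-) at the close.

    Before the move: NAV = nav, exposure = leverage * nav.
    After the move:  NAV = nav * (1 + leverage * r), exposure = leverage * nav * (1 + r).
    The fund must bring its exposure back to leverage * (NAV after the move).
    """
    nav_after = nav * (1.0 + leverage * index_return)
    exposure_after = leverage * nav * (1.0 + index_return)
    return leverage * nav_after - exposure_after

# Hypothetical fund with a NAV of $100M.
for leverage in (2, 3):
    for r in (0.02, -0.02):
        amount = rebalance_amount(100.0, leverage, r)
        print(f"x{leverage} fund, index return {r:+.0%}: trade {amount:+.1f} $M of stocks")
```

The difference simplifies to leverage * (leverage - 1) * r * NAV: the fund buys when the index is up and sells when it is down, and the amount scales with both the size of the move and the fund's NAV.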

It has been estimated that if the market moves by 1%, the rebalancing could account for up to 16.8% of the market-on-close volume, so the induced momentum can be substantial. Now who is paying for these profits of the momentum traders? Why, the buy-and-hold investors, of course. This loss for the ETFs shows up as their tracking errors, resulting in a cost of as much as 5% per annum for the buy-and-hold investors. Yet another reason we should not be one of those investors!

As Prof. Shum pointed out, if you trade this strategy live today, you will likely get a lower return, because of all those momentum traders who drive up the price way before the close. However, there may be an ameliorating factor at work here: this momentum is proportional to the NAV of the ETFs. As their NAV goes up with time (either due to additional subscriptions or positive market returns), the returns of this strategy should also increase.

===
Now for some public service announcements:

1) A company called Level 3 Data Corp sells proprietary data indicating buying and selling pressure on stocks. Their internal backtests show that adding these data to some common stock trading strategies essentially doubles their returns. An explanatory video is available, and I heard they are offering 3-month free trials.

2) The London Systematic Traders (LST) Club has asked me to say a few words about their new initiative to build a London centric collaborative community of traders, developers and researchers.

LST aims to be at the intersection of traders, developers and quants, with a strong emphasis on community building and knowledge exchange, providing a trading network with a very specific focus on systematic, algorithmic (i.e. automated) or quantitative trading.

Membership is free and open to everybody with an interest in the above topics.

http://www.meetup.com/London-Systematic-Traders/

On Friday, Nov 23, I expect to be hosting a Q&A session with members of the LST (see 2 above) at the Apex Hotel in London. All are welcome. Please visit their website for details.

3) I will be conducting my Backtesting and Statistical Arbitrage workshops in London, Nov 19-22, and look forward to seeing some of our readers there!