Friday, February 02, 2018

FX Order Flow as a Predictor

Order flow is signed trade size, and it has long been known to be predictive of future price changes. (See Lyons, 2001, or Chan, 2017.) The problem, however, is that it is often quite difficult or expensive to obtain such data, whether historical or live. This is especially true for foreign exchange transactions, which occur over the counter. Recognizing the profit potential of such data, most FX market operators guard them as their crown jewels, never to be revealed to customers. But recently FXCM, an FX broker, has kindly provided me with their proprietary data, and I have made use of that to test a simple trading strategy using order flow on EURUSD.

First, let us examine some general characteristics of the data. It captures all trades transacted on FXCM in 2017, time stamped in milliseconds, with their trade prices and signed trade sizes. The sign of a trade is positive if it is the result of a buy market order, and negative if it is the result of a sell market order. If we take the absolute value of these trade sizes and sum them over hourly intervals, we obtain the usual hourly volumes, aggregated over the one-year data set:



















It is not surprising that the highest volume occurs between 16:00-17:00 London time, as 16:00 is when the benchmark rate (the "fix") is determined. The secondary peak at 9:00-10:00 is of course the start of the business day in London.

Next, I compute the daily total order flow of EURUSD (with the end of day at New York's midnight), and I establish a histogram of the last 20 days' daily order flow. I then determine the average next-day return of each daily order flow quintile. (I.e. I bin a next-day return based on which quintile the prior day's order flow fell into, and then take the average of the returns in each bin.) The result is satisfying:


We can see that the average next-day returns are almost monotonically increasing with the previous day's order flow. The spread between the top and bottom quintiles is about 12 bps, which annualizes to about 30%. This doesn't mean we will generate 30% annualized returns, since we can't arbitrage between today's return (if the order flow is in the top or bottom quintile) and some previous day's return when its order flow was at the opposite extreme. Nevertheless, it is encouraging. Also, this is an illustration that even though order flow must be computed on a tick-by-tick basis (I am not a fan of the bulk volume classification technique), it can be used in low-frequency trading strategies.
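The quintile binning described above can be sketched in a few lines of Python. This is an illustrative sketch, not the code used for this post: the function names are hypothetical, and it assumes that the trailing 20-day window includes the current day and that `returns[t]` is the return realized on day `t`.

```python
from collections import defaultdict


def rolling_quintile(flows, window=20):
    """Quintile (0 = lowest, 4 = highest) of each day's flow within the
    trailing `window` days, current day included (an assumption).
    Returns None for days before the window is full."""
    out = []
    for t, x in enumerate(flows):
        if t + 1 < window:
            out.append(None)
            continue
        hist = flows[t + 1 - window : t + 1]
        below = sum(1 for v in hist if v < x)  # values strictly below today's
        out.append(below * 5 // window)        # integer arithmetic, range 0..4
    return out


def mean_next_day_return_by_quintile(flows, returns, window=20):
    """Average of returns[t + 1], grouped by the flow quintile of day t."""
    quintiles = rolling_quintile(flows, window)
    bins = defaultdict(list)
    for t in range(len(flows) - 1):
        if quintiles[t] is not None:
            bins[quintiles[t]].append(returns[t + 1])
    return {q: sum(r) / len(r) for q, r in sorted(bins.items())}
```

As a rough check on the annualization quoted above, scaling a 12 bps daily spread by an assumed 252 trading days gives 0.0012 × 252 ≈ 0.30, in line with the 30% figure.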

(One may be tempted to also regress future returns against past order flows, but the result is statistically insignificant. Apparently only the top and bottom quintiles of order flow are predictive. This situation is actually quite common in finance, which is why linear regression isn't used more often in trading strategies.)

Finally, one more sanity check before backtesting. I want to see if the buy trades (trades resulting from buy market orders) are filled above the bid price, and the sell trades are filled below the ask price. Here is the plot for one day (times are in New York):



















We can see that, by and large, the expected relationship between trade and quote prices holds. We can't really expect it to hold 100% of the time: on rare occasions the quote moves in the sub-millisecond after the trade and the change is reported as synchronous with the trade, or the reporting of either a trade or a quote change is delayed.
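A minimal sketch of this sanity check, assuming each trade has already been paired with the prevailing bid and ask at its time stamp. The tuple layout and function name are hypothetical, and "above" and "below" are read here as "at or above" and "at or below".

```python
def quote_consistency(trades):
    """Fraction of trades consistent with the quotes.

    `trades` is an iterable of (signed_size, price, bid, ask) tuples
    (a hypothetical layout). A buy (positive size) should fill at or
    above the bid; a sell (negative size) at or below the ask.
    """
    ok = 0
    total = 0
    for size, price, bid, ask in trades:
        if size == 0:
            continue  # unsigned trades cannot be checked
        total += 1
        if size > 0 and price >= bid:
            ok += 1
        elif size < 0 and price <= ask:
            ok += 1
    return ok / total if total else float("nan")
```

A value close to, but not exactly, 1.0 would be consistent with the occasional reporting mismatches described above.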

So now we are ready to construct a simple trading strategy that uses order flow as a predictor. We can simply buy EURUSD at the end of day when the daily flow is in the top quintile among its last 20 days' values, and hold for one day, and short it when it is in the bottom quintile. Since our daily flow was measured at midnight New York time, we also define the end of day at that time. (Similar results are obtained if we use London or Zurich's midnight, which suggests we can stagger our positions.) In my backtest, I have subtracted 0.20 bps commissions (based on Interactive Brokers), and I assume I buy at the ask and sell at the bid using market orders. The equity curve is shown below:



















The CAGR is 13.7%, with a Sharpe ratio of 1.6. Not bad for a single factor model!
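For readers who want to reproduce the logic on their own data, here is a hedged sketch of the daily rule in Python. It is not the backtest code behind the numbers above. It assumes end-of-day bid and ask quotes, a trailing window that includes the current day, a commission of 0.20 bps charged on both entry and exit, and positions that are closed and reopened every day even when the signal persists. All names are hypothetical.

```python
def flow_signal(flows, window=20):
    """+1 / -1 / 0 when the day's flow is in the top / bottom / middle
    quintiles of its trailing `window` days (current day included)."""
    signal = []
    for t, x in enumerate(flows):
        if t + 1 < window:
            signal.append(0)  # not enough history yet
            continue
        hist = flows[t + 1 - window : t + 1]
        quintile = sum(1 for v in hist if v < x) * 5 // window
        signal.append(1 if quintile == 4 else -1 if quintile == 0 else 0)
    return signal


def backtest(flows, bids, asks, window=20, commission=0.00002):
    """Net daily returns: enter at day t's close, exit at day t + 1's close."""
    signal = flow_signal(flows, window)
    pnl = []
    for t in range(len(flows) - 1):
        if signal[t] > 0:
            # Long: buy at today's ask, sell at tomorrow's bid.
            gross = bids[t + 1] / asks[t] - 1.0
        elif signal[t] < 0:
            # Short: sell at today's bid, buy back at tomorrow's ask.
            gross = 1.0 - asks[t + 1] / bids[t]
        else:
            pnl.append(0.0)
            continue
        pnl.append(gross - 2 * commission)  # commission on entry and exit
    return pnl
```

Because every trade crosses the spread twice, the result is sensitive to the quotes used at the end-of-day cutoff.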

Acknowledgement:  I thank Zachary David for his review and comments on an earlier draft of this post, and of course FXCM for providing their data for this research.

===

Industry update

1) Qcaid is a cloud-based platform that provides traders with backtesting, execution, and simulation facilities. They also provide servers and data feeds.

2) How Cadre Uses Machine Learning to Target Real Estate Markets.

3) Check out Quantopian's new tutorial on getting started in quantitative finance.

4) A new Matlab-based backtest and live trading platform for download here.

5) A nice resource page for open source algorithmic trading tools at QuantNews.

My Upcoming Workshops

February 24 and March 3: Algorithmic Options Strategies

This online course focuses on backtesting intraday and portfolio option strategies. No pesky options pricing theories will be discussed, as the emphasis is on arbitrage trading.

June 4-8: London workshops

These intense 8-16 hour workshops cover Algorithmic Options Strategies, Quantitative Momentum Strategies, and Intraday Trading and Market Microstructure. Typical class size is under 10. They may qualify for CFA Institute continuing education credits. (Bonus: nice view of the Thames, and lots of free food.)


38 comments:

Unknown said...

Thanks for this Ernie, thought provoking stuff. I have always ignored volume in my systems, may have to re-visit. I wonder if the order flow system would transfer to intra-day movements? Say hourly orderflow?

David

Ernie Chan said...

Thanks, David. Order flow should translate to intraday return prediction, but my study of hourly flow on EURUSD didn't work.

Ernie

Anonymous said...

Interesting article, though I don't think you can draw too many conclusions on a 1 year backtest. Does the strategy hold on other FX rates?

Thanks,

Daniel

Ernie Chan said...

Hi Daniel,
Agreed - unfortunately there is only 1 year of data available. This also doesn't work as well for other rates.
Ernie

TC said...

The average Joe can't obtain this data from an OTC broker.
Would it work using tick data of CME FX futures (classifying buys or sells by either a big data job of matching up timestamped actual trades with the just prior bid/ask ... or determining buys and sells by the inferior, but more tractable, "bulk classification")?

Ernie Chan said...

TC,
Yes, if you have true futures tick feed (not like the feed provided by IB which is sampled at 250ms), you can apply the tick rule to estimate order flow. If you are paying for the expensive direct MDP feed from CME, they will tell you the order flow explicitly (via the aggressor tag on each trade).

I have determined that results from bulk volume classification are inferior to a tick-by-tick computation, hence I have disavowed it.

Ernie

Unknown said...

Thank you for the article, Dr Chan!

A quick question regarding the logic as a whole: you summed all the trades into net volume, with sell trades negative and buy trades positive, sorted the daily sums into quintiles over a moving window of 20 trading days, and then found an almost linear relationship between the top quintile (buy skewed) and positive returns on the next trading day, and vice versa for the lowest quintile. Right?

Thanks!

Edit for typos

Ernie Chan said...

Hi Eduardo,
You summarized correctly!
Ernie

Unknown said...

Tks Dr! Definitely worthy of further testing.

Unknown said...

Thanks. It is insightful. I think only Oanda supplies order flow information, but I don't think it can be downloaded. Maybe retail traders have to trade manually.

Ernie Chan said...

Good to know about Oanda - thanks.

Ernie

Ernie Chan said...

By the way, FXCM told me that they can offer anyone free historical order flow data for 6 months in 2017. Just email premiumdata@fxcm.com.

Ernie

Ernie Chan said...

Hi M,
Yes, order book imbalance has also been shown to be predictive of future price change.
See Cartea 2015 (the HFT book on my Recommended Books list on right sidebar).
Ernie

Unknown said...

Hi Ernie,

I enjoyed your class you taught for MSPA. Interesting post. Quandl also apparently offers data from CLS which may be an even better indicator for volume since it takes into account large institutional trades. I don't subscribe, but the service does seem to offer volume hourly.

Brad

Ernie Chan said...

Hi Brad,
You are right about Quandl. The cost, however, is beyond the reach of most retail investors.
Ernie

R said...


Inform yourselves about the moral problem of trading :

https://sites.google.com/site/tradingonlineamoralproblem/

Ernie Chan said...

Hi M,
Presumably you are only summing all the buy orders for a buying pressure?
Ernie

Dave said...

Do you know if there is any sort of similar feed for CME futures data? Are there any techniques to infer buy vs. sell that are reasonably accurate? Most algos I’ve seen just assume buys are at the offer and sells are at the bid.

Ernie Chan said...

Dave,
Yes, if you subscribe to CME's MDP feed, it has an aggressor tag for each trade that enables you to compute order flow. My 3rd book talked about it.

Yes, there is also a technique called the Tick Rule (based on whether a trade is an uptick or a downtick; an uptick is counted as positive flow) and the BVC method (complicated). All these methods are described in my 3rd book.

Ernie
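For illustration, the Tick Rule mentioned in the reply above can be sketched as follows. This is a common textbook form of the rule rather than code from the book: a trade at an unchanged price inherits the previous trade's sign, and the first trade is left unclassified (both are assumptions of this sketch). The function name is hypothetical.

```python
def tick_rule_flow(prices, sizes):
    """Estimate net order flow from unsigned trades using the tick rule.

    An uptick counts as a buy (+size), a downtick as a sell (-size).
    A trade at an unchanged price keeps the previous trade's sign.
    The first trade has no prior price and contributes zero.
    """
    flow = 0.0
    sign = 0
    prev_price = None
    for price, size in zip(prices, sizes):
        if prev_price is not None:
            if price > prev_price:
                sign = 1
            elif price < prev_price:
                sign = -1
            # Equal price: sign carries over from the previous trade.
        flow += sign * size
        prev_price = price
    return flow
```

Summing these signed sizes over a day gives an estimated daily order flow that can stand in for the broker-supplied signed trades when only unsigned tick data is available.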

Anonymous said...

What is the nature of the signal you trade in the mini S&P strategy? It's long vol it seems - how do you get that kind of profile when designing a strategy?

Thanks!

Ernie Chan said...

Hi,
Short term momentum strategies are typically long volatility.

Ernie

Anonymous said...

Hi Ernie

Great article. When conditioning on high past volume, are you perhaps picking up short term momentum in prices? I.e have you tried controlling for price momentum and see if the order flow effect is still there?


Thanks
Hank

Ernie Chan said...

Hi Hank,
Thanks.

Yours is a good question, and I have not tried to control for price momentum in this study. I do expect positive correlation between contemporaneous price return and order flow in the past period. However, from a trading perspective, it is those occasions when one reverts in the next period and the other doesn't that are most interesting (and profitable). The theory is that a price change in the absence of strong order flow will likely mean revert (i.e. exhibit serial anticorrelation), while a price change in the presence of strong order flow will trend.

Ernie

Anonymous said...

Quick question. In previous comments you mentioned that your trading records were public at epchan.com/accounts. However I get an error accessing that. Is this no longer public?

Ernie Chan said...

It has been moved to www.qtscm.com/accounts.

By the way, where did you see that comment that said epchan.com/accounts - I need to correct it!

Thanks,
Ernie

Anonymous said...

Hi Ernie

Do you know what is the difference between CQG API and CQG Web API (continuum Connect)?

Thanks

George

Ernie Chan said...

Hi,
Ever since I found that CQG is missing ticks in its historical data, I have stopped using them. So I am afraid I don't know anything about their APIs.
Ernie

Anonymous said...

The regression of flows to returns was insignificant which is a huge warning sign. I think you're just picking up on the strong dollar downtrend (especially against EUR) that was persistent throughout 2017.

Ernie Chan said...

Yes, it is quite possibly a chance correlation in 2017, which is why we need to confirm this effect in other years pending data availability.

I don't, however, regard the lack of statistical significance in the regression as a major issue. Most of the time in recent years, if we see a significant regression for a single predictor in finance, the effect is either overfitted or has look-ahead bias. If life were so easy with a single factor ...

Ernie

P_Ser said...

Dear Ernie,

In your third book you write that one could use online decoding of an HMM algo so we do not use all the previous data for next step prediction. Could you please shed some light on how to do that? I couldn't find any relevant Matlab libs for this particular purpose.

Thank you!

Ernie Chan said...

Hi P-Ser,
Check out http://articles.ircam.fr/textes/Bloit08a/index.pdf
Ernie

Unknown said...

Hi Ernie,

I wasn't sure where to post this question! I'm wondering about the look forward bias test that you mention in your Quantitative trading book. You mention to run a strategy with all data, then truncate the recent data, then truncate the results of the first!

I've designed a very successful FX strategy that performs exceptionally well on pretty much every timeframe above 15minutes and does not have a look ahead bias according to this test. Do you know of any way that a strategy could pass this test and still have a look ahead bias?

Thanks,
Brandon

Unknown said...

nice

Unknown said...

Dear Dr Ernie:
First of all, I want to thank you. I learned a lot from your third book. It is very clear, and your code is easy to test.
I have a question about the data you used in that book. In your dataset named fundamentalData, there are 27 cross-sectional factors. I checked them one by one, but I still cannot find the taxefficiency and dilutionratio factors in my Quandl account. Could you explain the formula used to compute them? I appreciate it.
Kind regards.
Peng.

Ernie Chan said...

Hi Unknown,
Sharadar may have dropped taxefficiency and dilutionratio in their latest version. But if you download fundamentalData.mat from epchan.com/book3, you will find those quantities used in my code.
Ernie

Anonymous said...

Hi Ernie,

I just saw your twitter that you have added python code for BOOK1, which is great! Just wondering if/when you plan to add python code for BOOK2 and BOOK3? Thanks.

Ben

Ernie Chan said...

Hi Ben,
Yes, the Python code for book2 will be available shortly.
There is no plan for book3 codes yet.
Ernie