I am a quant researcher and developer for QTS Partners, a commodity pool Ernie (author of this blog) founded in 2011. I help Ernie develop and implement several strategies in the pool and various separate accounts. I wrote this article to give insights into a very important part of our strategy development process: the selection of data sources.
Our main research focus is on strategies that monitor execution in milliseconds and that hold positions for anywhere from seconds to several days. For example, a strategy that trades more than one currency pair simultaneously must ensure that several executions take place at the right price and within a very short time. Backtesting such strategies requires high-quality historical intraday quote and trade data, preferably at tick level. Our initial focus was futures, and after evaluating various vendors for the tick data quality and quantity we needed, we chose Nanex data, which is aggregated at 25ms. This aggregation means, for example, that aggressor flags are not available. We purchased several years of futures data and set to work.
Earlier this year we needed to update our data and discovered that Nanex prices had increased significantly. We also needed quotes and trades, and data for more asset classes including US equities and options.
We looked at TickData.com, which has good data but is very expensive, and you pay up-front per symbol. There are other services, like Barchartondemand.com and XIgnite.com, where you pay based on your monthly usage (the number of data requests made), a model we do not like. We ended up choosing QuantGo.com, where you have unlimited access to years of global tick or bar data for a fixed monthly subscription fee per data service.
On QuantGo, you get computer instances in your own secure and private cloud built on Amazon AWS, with on-demand access to a wide range of global intraday tick or bar data from multiple data vendors. Since you own and manage the computer instances, you can choose any operating system, install any software, access the internet or import your own data. With QuantGo the original vendor data must remain in the cloud, but you can download your results. This restriction is what allows QuantGo to rent access to years of data at affordable monthly prices.
All of the data we have used so far is from AlgoSeek (one of QuantGo’s data vendors). This data is survivorship bias-free and is exactly as provided by the exchanges at the time. Futures quotes and trades download very quickly on the system. I am testing options strategies, which is challenging due to the size of the data. The data is downloaded in highly compressed form which is then expanded (by QuantGo) to a somewhat verbose text form. Before the price split, a day of option quotes and trades for AAPL was typically 100GB in this form. Here is a data sample from the full Options (OPRA) data:
Timestamp, EventType, Ticker, OptionDetail, Price, Quantity, Exchange, Conditions
08:30:02.493, NO_QUOTE BID NB, LLEN, PUT at 7.0000 on 2013-12-21, 0.0000, 0, BATS, F
08:30:02.493, NO_QUOTE ASK, LLEN, CALL at 7.0000 on 2013-12-21, 0.0000, 0, BATS, F
09:30:00.500, ROTATION ASK, LLEN, PUT at 2.0000 on 2013-07-20, 0.2500, 15, ARCA, R
09:30:00.500, ROTATION BID, LLEN, PUT at 2.0000 on 2013-07-20, 0.0000, 0, ARCA, R
09:30:00.507, FIRM_QUOTE ASK NB, LLEN, PUT at 5.0000 on 2013-08-17, 5.0000, 7, BATS, A
09:30:00.508, FIRM_QUOTE BID NB, LLEN, PUT at 6.0000 on 2013-08-17, 0.2000, 7, BATS, A
I convert these to a more compact format and filter out the lines we don't need (e.g. NO_QUOTE, non-firm, etc.).
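The conversion and filtering step can be sketched roughly as follows. This is only an illustration built from the six sample lines above: the `Quote` record layout, the `parse_line` helper and the choice to keep only `FIRM_QUOTE` events are assumptions for the example, not the format we actually use, and the full OPRA feed contains event types (trades, for instance) that the sample does not show.

```python
import re
from typing import NamedTuple, Optional

# Matches the OptionDetail column, e.g. "PUT at 7.0000 on 2013-12-21".
DETAIL_RE = re.compile(r"^(PUT|CALL) at (\d+(?:\.\d+)?) on (\d{4}-\d{2}-\d{2})$")


class Quote(NamedTuple):
    """Hypothetical compact record; the field layout is an assumption."""
    time: str      # timestamp kept as text, e.g. "09:30:00.507"
    side: str      # "BID" or "ASK"
    ticker: str
    right: str     # "P" or "C"
    strike: float
    expiry: str    # YYYY-MM-DD
    price: float
    size: int
    exchange: str


def parse_line(line: str, keep_events=("FIRM_QUOTE",)) -> Optional[Quote]:
    """Return a compact Quote, or None if the line should be dropped."""
    fields = [f.strip() for f in line.split(",")]
    if len(fields) != 8:
        return None  # malformed line
    time, event, ticker, detail, price, qty, exchange, _conditions = fields
    tokens = event.split()  # e.g. ["FIRM_QUOTE", "ASK", "NB"]
    if len(tokens) < 2 or tokens[0] not in keep_events:
        return None  # header row, NO_QUOTE, ROTATION, ...
    if tokens[1] not in ("BID", "ASK"):
        return None
    match = DETAIL_RE.match(detail)
    if match is None:
        return None
    right, strike, expiry = match.groups()
    return Quote(time, tokens[1], ticker, right[0], float(strike), expiry,
                 float(price), int(qty), exchange)


sample = [
    "08:30:02.493, NO_QUOTE BID NB, LLEN, PUT at 7.0000 on 2013-12-21, 0.0000, 0, BATS, F",
    "09:30:00.500, ROTATION ASK, LLEN, PUT at 2.0000 on 2013-07-20, 0.2500, 15, ARCA, R",
    "09:30:00.507, FIRM_QUOTE ASK NB, LLEN, PUT at 5.0000 on 2013-08-17, 5.0000, 7, BATS, A",
]
kept = [q for q in map(parse_line, sample) if q is not None]
for q in kept:
    print(q)
```

Only the firm quote survives the filter here. Passing a different `keep_events` tuple would keep other event types.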
The quality of the AlgoSeek data seems to be high. One test I have performed is to record live data and compare it with AlgoSeek. This is possible because the AlgoSeek historical data is now updated daily and is one day behind for everything except options, which lag by two to five days (they are striving for two, but the process involves uploading all options data to special servers, which is a significant task). Another test uses OptionNET Explorer (ONE). ONE data is at 5-minute intervals and the software displays midpoints only. However, by executing historical trades, you can see the bid and ask values for options at these 5-minute boundaries. I have checked 20 of these against the AlgoSeek data and found exact agreement in every case. In any event, you are free to contact the data vendors directly to learn more about their products. The final test of data quality (and of our market model) is the comparison of live trading results (at the one contract/spread level) with backtests over the same period.
The data offerings have recently expanded dramatically with more data partners and now include historical data from (QuantGo claims) "every exchange in the world". I haven't verified this, but the addition of elementized, tagged and scored news from Acquire Media, for example, will allow us to backtest strategies of the type discussed in Ernie's latest book.
So far, we like the system. For us, the positives are:
1. Affordable Prices. The reason that the price has been kept relatively low is that original vendor data must be kept and used in the QuantGo cloud. For example, to access years of US data we have been paying:
- Five years of US Equities Trades and Quotes (“TAQ”): $250 per month
- Five years of US Equities 5-minute bars: $75 per month
- Three years of US Options 1-minute bars: $100 per month
- Three years of CME, CBOT, NYMEX Futures Trades and Quotes: $250 per month
2. Free Sample Data. Each data service has free demo data, which is real historical data that I can select from within the demo date range. This allowed me to view and work with the data before subscribing.
3. One API. I have one API to access different data vendors. QuantGo gives me a Java GUI, a Python CLI and various libraries (R, Matlab, Java).
4. On-Demand. We can select the data we want "on demand" via a subscription from a website console at any time. You can select data for any symbol and for just a day or for several years.
5. Platform not proprietary. We can use any operating system or software with the data as it is being downloaded to virtual computers we fully control and manage.
Because all this is done in the cloud, we have to pay for our cloud computer usage as well. While cloud usage continues to drop rapidly in price, it is still a variable cost and it needs to be monitored. QuantGo does provide close to real-time billing estimates and alarms that you can preset at dollar values.
I was at first skeptical of the restriction of not being able to download the data vendor’s tick or bar data, but so far this hasn't been an issue as in practice we only need the results and our derived data sets. I'm told that if you want to buy the data for your own computers, you can negotiate directly with the individual data vendor and will get a discount if you have been using it for a while on QuantGo.
As we use the Windows operating system, we access our cloud computers with Remote Desktop, and there have been some latency issues, but these are tolerable. On the other hand, it is a big advantage to be able to start with a relatively small virtual machine for initial coding and debugging, then "dial up" a much larger machine (or group of machines) when you want to run many compute- and data-intensive backtests. While QuantGo is recently launched and is not perfect, it does open up the world of the highest institutional quality data to those of us who do not have the data budget of a Renaissance Technologies or D.E. Shaw.
===
Industry Update
(No endorsement of companies or products is implied by our mention.)
- A new site for jobs in finance was recently launched: www.financejobs.co.
- A new software package Geode by Georgica Software can backtest tick data, and comes with a fairly rudimentary fill simulator.
- Quantopian.com now incorporates a new IPython based research environment that allows interactive data analysis using minute level pricing data in Python.
Workshops Update
My next online Quantitative Momentum Strategies workshop will be held on December 2-4. Any reader interested in futures trading in general would benefit from this course.
===
Managed Account Program Update
Our FX Managed Account program had an unusually profitable month in October.
===
Follow me on Twitter: @chanep
104 comments:
Hi Ernie,
A question on hedge ratios. Say I am long asset X and want to hedge it with asset Y.
Do I simply run an OLS regression of returns of X onto returns of Y? What does the slope then represent, is it the percentage of the portfolio I should hold in Y?
Thanks
If Y = hedgeRatio*X + constant, and X and Y represent returns (not prices), then hedgeRatio represents the dollars to be invested in a long position of X vs. 1 dollar invested in a short position of Y.
Ernie
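As a rough sketch of the regression described in the reply above: the slope of an ordinary least-squares fit of the returns of Y on the returns of X gives the hedge ratio. The return series below are synthetic (a true ratio of 0.8 plus noise) and `ols_fit` is a hypothetical helper written for this example. A real test would use actual return series and would usually check how stable the ratio is over time.

```python
import random


def ols_fit(x, y):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Synthetic daily returns: Y moves 0.8 for every 1.0 move in X, plus noise.
rng = random.Random(0)
ret_x = [rng.gauss(0.0, 0.01) for _ in range(500)]
ret_y = [0.8 * rx + rng.gauss(0.0, 0.002) for rx in ret_x]

hedge_ratio, constant = ols_fit(ret_x, ret_y)
# Dollars long X per 1 dollar short Y.
print(f"hedge ratio: {hedge_ratio:.3f}")
```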
Does QuantGo have access to your AWS instance where they provide you the data?
No, they do not. You set it up yourself and set the passwords and any other security settings.
Morning Ernie,
I am backtesting an arbitrage strategy between two futures and wondering how to calculate the profit/loss ratio of each trade for the Sharpe ratio.
For example
future A is currently at 100 points, and each point is worth 10 dollars.
future B is currently at 200 points, and each point is worth 20 dollars.
assume arbitrage between 1 lot of future A and 1 lot of future B.
I earn 50 dollars after an arbitrage trade; is the profit % equal to 50/((100*10)+(200*20))?
If this is true, then normally the Sharpe ratio of an arbitrage trade between 2 derivatives would be much lower than that of a momentum trade with 1 derivative, because of dividing by two asset values?
Thanks,
HK
Hi HK,
Yes, your calculation of gross returns looks correct.
However, Sharpe ratio is unaffected by how you divide the profits to calculate returns. Even though the returns look lower, the standard deviation of returns is lowered by the same factor.
Ernie
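A small numerical check of this point, using HK's numbers for the notional values. The per-trade P&L figures are invented for illustration, and the sketch assumes the same notional on every trade and a zero risk-free rate. Under those assumptions, changing the denominator rescales the mean and the standard deviation by the same factor, so their ratio is unchanged.

```python
import math
import statistics


def sharpe(returns):
    """Per-trade Sharpe ratio: mean return over sample standard deviation."""
    return statistics.mean(returns) / statistics.stdev(returns)


POINT_VALUE_A, POINT_VALUE_B = 10, 20
notional_a = 100 * POINT_VALUE_A   # future A at 100 points
notional_b = 200 * POINT_VALUE_B   # future B at 200 points

# Hypothetical dollar P&L of six arbitrage trades.
pnl = [50.0, -20.0, 35.0, 10.0, -5.0, 40.0]

returns_both_legs = [p / (notional_a + notional_b) for p in pnl]
returns_one_leg = [p / notional_a for p in pnl]

print(sharpe(returns_both_legs), sharpe(returns_one_leg))
```

The two printed values agree up to floating-point rounding, even though the individual returns differ by a factor of five.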
Hi - I'm wondering what's to stop someone subscribing for a month, writing lots of data to a file and then downloading that file and keeping it without paying the monthly fee? Is this against TOS?
Hi Lee,
Any large data download from your AWS instance will trigger an alert.
If it is determined that the download is of data rented from QuantGo, I believe you will be charged the full purchased data cost.
Ernie
"Affordable" seems relative in this case. As long as the industry racket continues, quality data will remain out of reach of retail traders.
Hi Ernie,
Say I predict returns with a variable X and I get a negative sign, meaning high X predicts negative returns tomorrow.
Now say I have a trading model in which I go long for high values of X (let's ignore the short leg for this example)
Can I still make money on this strategy despite the regression telling otherwise?
I find yes but I am really surprised by this. Have you come across this before? I can't find any errors in my code.
In data, high values of X is followed by high returns 58% of the time so I guess the negative sign in the regression must come from some influential observations, or?
What are your thoughts on this?
Thanks.
I beg to differ.
$250 for the full TAQ of US equities at 1ms is very affordable for retail traders.
Ernie
Hi Ernie,
If I have a long/short position of a suddenly delisted US stock in IB, how I can close this position in IB?
Even though a stock is delisted, it may still trade OTC on bulletin boards. But this is a question best directed at IB.
Ernie
Hi Ernie,
In your book, for PEAD day trading, you use 90 days standard deviation of close to open returns to generate buy/sell signals for S&P 500 stocks.
May I ask if you have other suggestions for indicators or stocks universe or holding period?
Many thanks.
You can always try different lookback period or stock universe to optimize your backtest results - though if the results are very sensitive to lookback, they are probably not robust.
This indicator is crucial to this strategy though. If you pick another indicator, it will be an entirely different strategy.
Ernie
Hi Ernie,
For PEAD day trading, how about choosing 1x, 2x, 90 days standard deviation?
Why not backtest those lookbacks and see which one performs best?
Ernie
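To make the two tunable parameters in this exchange concrete, here is a minimal sketch of a signal based on the standard deviation of close-to-open returns. The entry rule (trade when today's overnight return exceeds a multiple of the trailing standard deviation) and the default values are assumptions for illustration only and should not be read as the exact rule from the book. In practice the check would be applied only to stocks that have just announced earnings.

```python
import statistics


def overnight_returns(opens, closes):
    """Return from the previous day's close to each day's open."""
    return [opens[i] / closes[i - 1] - 1.0 for i in range(1, len(opens))]


def pead_signal(opens, closes, lookback=90, multiple=1.0):
    """+1 to buy, -1 to sell, 0 for no trade (hypothetical entry rule)."""
    rets = overnight_returns(opens, closes)
    if len(rets) < lookback + 1:
        return 0  # not enough history
    history = rets[-lookback - 1:-1]   # the `lookback` returns before today
    today = rets[-1]
    band = multiple * statistics.stdev(history)
    if today > band:
        return 1
    if today < -band:
        return -1
    return 0


# Toy data: flat closes, small alternating gaps, then a large gap up today.
closes = [100.0] * 92
opens = [100.0 + (0.1 if i % 2 else -0.1) for i in range(92)]
opens[-1] = 105.0
print(pead_signal(opens, closes, lookback=90, multiple=1.0))
```

Backtesting then amounts to looping over candidate values of `lookback` and `multiple` and checking whether the results are stable across neighbouring values.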
Hi Ernie,
Where can we find historical earning dates (after mkt close and before mkt open)?
I cannot find earnings.com any more.
Thanks.
You have to go to biz.yahoo.com/research/earncal
Ernie
Hi Ernie,
Where can we get real-time news feed?
Btw, today, oil price drops dramatically, shall we avoid stocks pair trading today?
Many vendors provide elementized real-time newsfeed: Bloomberg, Thomson-Reuters, Dow Jones, Newsware, NewsEdge, etc.
Unless your stock pair has some direct relationship to oil price, I don't see why you should exit a pair just because oil prices drop.
Ernie
Hi Ernie,
Usually how many stocks pairs does a trader trade?
And how much % capital does he/she allocate on each pairs?
The more pairs, the merrier.
Just divide your total buying power by the maximum possible number of pairs.
Ernie
Hi Ernie,
To trade more pairs, is it good to trade intraday stock pairs in different stock markets, including Hong Kong, US, Singapore, Australia, etc.?
There may be some currency risks.
Thanks.
You can certainly trade cross-border pairs, but as you said, you need to convert their values using their spot FX rate. It gets complicated, but if you are a good programmer, you can do it!
Ernie
Hi,
I was looking at the track record for your QTS managed account. Very impressive. Presumably you're trading spot FX, what platform/broker do you use?
Thanks.
We use IB.
Ernie
Hi Ernie,
In your book, you use Index as stocks universe for long-short model.
Could we use individual sector (such as Financial, Healthcare, and Services) instead for long-short equity strategy?
Sure, using sector as universe will make your long-short portfolio sector-neutral in addition to market-neutral.
Ernie
Hi Ernie,
Do you believe that long-short equity strategy is better than stocks pairing strategy?
Yes, in general long-short is better because we do not require cointegration.
Ernie
Hi Ernie
May I know if the trading program running in your Managed FX account is fully automated? Or is human decision-making frequently involved, given that so much economic news and data is released every day?
Thanks
Yes, our FX program is fully automated with no human intervention. Nowadays, it is also very easy to subscribe to machine-readable news.
Ernie
Hi Ernie
Thank you for your reply. Does your Managed FX trading account also take machine-readable news as one of the factors for making trade decisions?
Thanks
Ha! That is proprietary information that I certainly won't tell you!
Ernie
Ernie,
Congratulations on your QTS Strategy. Very impressive.
Just a quick note on data collection especially for millisecond trading strategies.
Buying or renting data from outside vendors is attractive obviously to most of us because you can get vast amounts of data all at one time and see how you strategy performs under wildly different conditions.
But millisecond trading strategies are very latency and execution dependent and the sources of the data really should match up with where you plan to execute. There is little sense in getting your hopes up on a back test when the ultimate execution and fill rate will very much depend on the liquidity provider.
Elliott L Shifman
Hi Elliott,
Very true.
The one utility of backtesting millisecond strategies is to reject obviously hopeless ones.
Ernie
Hi Ernie,
Do gold and silver cointegrate? I looked at the XAU PHLX Gold/Silver index; it looks like a stock price movement and does not seem to cointegrate at all. If it does not cointegrate, how should we think about it in terms of a fundamental explanation?
-HK
Hi HK,
I won't expect gold and silver to cointegrate. We only need to find a fundamental explanation when we suspect cointegration, not when we don't expect cointegration.
Ernie
Hi Ernie,
For intraday trading in Hong Kong Exchange, usually, what is the margin requirement? We can get leverage more than 4x, or less than that?
I don't have direct experience trading HK stocks, so I can only tell you what IB wrote on their website.
It seems that the margin requirement depends on whether you are trading the HK stocks through a US broker. If so, you are still subject to the same portfolio margin limit, which means the leverage can be higher than 4. However, if you trade through another country's brokerage, that's a totally different question that I cannot comment on.
Ernie
Why do people keep asking questions that should be directed to IB or other vendors? Please increase the quality of the questions.
Hi Ernie,
A great observation on pruning out solutions that have a low probability of succeeding.
Elliott Shifman
Hi Ernie,
Roughly speaking,if we trade 100 stocks pairs, how many pairs will not work out?
Thanks.
For stock pairs, expect a large % of not working out, possibly > 50%
Ernie
Hi Ernie,
50% is large! Then how do people make money by trading stock pairs?
Many of the stock pair traders study the fundamentals in addition to the technicals of each pair. So they will know when not to trade or to liquidate.
Ernie
Hi Ernie,
For fundamentals analysis, what do we need to take a look at?
P/E ratio? ROE?
Thanks.
By fundamental data in regard to pairs, I meant primarily corporate news and SEC filings.
Ernie
Hi Ernie,
What is the good source to read SEC filings before the market open?
Do they provide SEC filings in IB?
Thanks.
Hi Ernie,
I just find SEC filings in IB, which is linked to EDGAR webpage. However, could we know SEC filings dates in advance? just like earnings announcement dates we can know in advance.
Some of the SEC filings are scheduled, such as quarterly filings. Others are not.
Ernie
Hi Ernie,
The Hong Kong HSI index ETF 2800 normally pays around 4% interest per year. If I short an equal value of HSI futures and buy the 2800 ETF for the whole year, do I gain the 4% without risk?
-HK
Hi HK,
No, the future return took dividend into account, and will not be identical to the ETF return.
Ernie
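One way to see why, under textbook cost-of-carry assumptions: if the future trades at its fair value F = S * exp((r - q) * T), where r is the funding rate and q the dividend yield, the expected dividends are already priced into the future as a discount. The sketch below assumes a continuous dividend yield that is reinvested, a fairly priced future, and no margin, fees or daily mark-to-market effects. All numbers are hypothetical, and the real HSI future and 2800 ETF will deviate from this idealization.

```python
import math


def fair_futures_price(spot, rate, div_yield, t):
    """Cost-of-carry fair value with a continuous dividend yield."""
    return spot * math.exp((rate - div_yield) * t)


def hedged_return(spot, terminal_spot, rate, div_yield, t):
    """Return of long ETF (dividends reinvested) plus short one future."""
    units = math.exp(-div_yield * t)   # grows to 1 unit after reinvesting
    cost = units * spot
    f0 = fair_futures_price(spot, rate, div_yield, t)
    # One ETF unit at expiry plus the P&L of the short future.
    payoff = terminal_spot + (f0 - terminal_spot)
    return payoff / cost - 1.0


# Hypothetical inputs: 1% funding rate, 4% dividend yield, one year.
for terminal in (80.0, 100.0, 130.0):
    print(terminal, hedged_return(100.0, terminal, 0.01, 0.04, 1.0))
```

In this idealized setting the hedged position earns the funding rate (about 1% here) whatever the terminal price, not the 4% dividend yield.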
Hi Ernie,
When working through Millisecond Frequency Trading Strategies, there are of course a wide variety of strategies and parameters associated with those strategies. Any suggestions on working groups where quants exchange ideas on strategies and techniques?
Thanks,
Elliott Shifman
LinkedIn
www.linkedin.com/pub/elliott-shifman/34/262/714?trk=pub-pbmap
Facebook
https://www.facebook.com/elliott.shifman.7
Hi Elliott,
I don't know that there is a specific forum for HFT, but I sometimes look up and post questions to elitetrade.com and keep tabs on Tabbforum.com for these issues.
Generally speaking, I go through the publications list of many finance professors to see if any articles are relevant.
Ernie
Hi Ernie,
On first line of page 19 of your new book, how to calculate to get 2.93?
In
mean(ret)/std(ret)*sqrt(length(ret))
I think mean(ret)/std(ret) is 1 based on the result in chapter 6, but I am not sure how to get the sqrt(length(ret)) part.
-HK
Hi HK,
The test statistic is not the Sharpe ratio. As I wrote on page 17 and Table 1.1, it is sqrt(sample_size)*Daily_Sharpe_Ratio.
Ernie
Hello Ernie,
thank you for sharing the QuantGo-data-website with us.
BTW, congrats to the extraordinary nice returns on your managed accounts!
Thanks.
BR,
QT
Thanks, QT!
Ernie
Hi Ernie,
For that example, is that sqrt(trading days of a year)*1? I am not sure what the difference is between the daily Sharpe ratio and the Sharpe ratio.
-HK
Hi HK,
No, daily Sharpe ratio for that strategy is not 1. The annualized Sharpe ratio is 1.
Sharpe ratio has volatility in its denominator. Assuming i.i.d. returns, volatility scales as the square root of time, while returns scale linearly with time. Hence Sharpe ratio scales as the square root of time. So to annualize a Sharpe ratio, you multiply a daily Sharpe ratio by the square root of 252.
Ernie
Hi, Ernie,
I am having some difficulty in locating the data files you used in your code. The data file "inputData_USDCAD" you used in your
"stationarityTest.m" is missing. Could you help. Thanks
Tom
Hi Tom,
If you email me privately I will send you the link to download.
Ernie
Ernie.
Thankx anyway. Just verified a story that's spreading in the Ether.
Tom
hi Ernie,
May I ask where we could get historical implied volatility data?
Do you trade volatility?
The only place I know of that sells implied volatility data is Optionmetrics.
I used to trade volatility, but don't do that currently.
Ernie
Hi Ernie,
Options traded in US exchange are usually American options (exercised at any time before the expiration date)?
My partner Roger told me that all US exchange traded options are American-style.
It is discussed here: https://www.cboe.com/LearnCenter/pdf/understanding.pdf
Ernie
Hi Ernie,
Thank you for the response.
I am just a little bit confused that in the book of volatility trading by Euan Sinclair, they still use the Black-Scholes formula for option pricing.
I mean that if all option quotes are American-style prices, they may need to use other numerical pricing methods to recover the volatility surface, such as trees, or finite difference methods.
Maybe they use Black-Scholes European option price to approximate American option price??
Thanks.
Yes, BS is a decent approximation for American options prices. But for details, you should ask Euan!
Ernie
Hi Ernie,
Did you trade options on IB TWS?
Or have you ever heard any comments about it?
How is TWS option modeler and its option data quality?
Thanks.
Yes, I have traded options on IB. TWS has all the options Greeks and implied volatility to aid in your trading. The live market data is the standard consolidated feed, with IB's customary 250ms snapshots.
Ernie
Hi Ernie,
May I ask what was your volatility trading strategy?
Did you long/short volatility with discrete delta-neutral hedging?
We traded various options strategies. One of them is modeled on this: http://epchan.blogspot.ca/2012/07/extracting-roll-returns-from-futures.html
Ernie
Hi Ernie,
Have you done delta hedging before?
Which volatility do you use for computing delta?
We use implied volatility for delta hedging. This can be obtained either from IB, or using Matlab's Financial Instruments toolbox implied vol calculator.
Ernie
Hi Ernie,
Thank you for response.
Is it the implied volatility you long/short in the very beginning?
Then you use this "constant" implied volatility to compute delta for hedging until you close your option position?
No. Implied vol as well as delta will change continuously. So at the very least you need to delta hedge daily, using the latest numbers.
Ernie
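For readers without access to IB's Greeks or a Matlab toolbox, the daily recomputation described above can be sketched with the plain Black-Scholes formula. This is a simplified illustration and not how we compute our hedges: it treats the option as a European call on a non-dividend-paying stock (the approximation for American options mentioned earlier in this thread), backs out implied volatility by bisection, and uses made-up inputs.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def _d1(spot, strike, t, rate, vol):
    return (math.log(spot / strike) + (rate + 0.5 * vol * vol) * t) / (vol * math.sqrt(t))


def bs_call_price(spot, strike, t, rate, vol):
    d1 = _d1(spot, strike, t, rate, vol)
    d2 = d1 - vol * math.sqrt(t)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * t) * norm_cdf(d2)


def bs_call_delta(spot, strike, t, rate, vol):
    return norm_cdf(_d1(spot, strike, t, rate, vol))


def implied_vol(price, spot, strike, t, rate, lo=1e-4, hi=5.0, tol=1e-10):
    """Bisection works because the call price increases with volatility."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call_price(spot, strike, t, rate, mid) > price:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Stand-in for today's quoted option price (hypothetical inputs).
market_price = bs_call_price(100.0, 100.0, 0.25, 0.01, 0.20)
iv = implied_vol(market_price, 100.0, 100.0, 0.25, 0.01)
delta = bs_call_delta(100.0, 100.0, 0.25, 0.01, iv)
# Shares to short per long call contract, assuming a multiplier of 100.
print(iv, delta, delta * 100)
```

Rerunning the last three steps each day with the latest option price, underlying price and time to expiry gives the updated hedge.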
Hi Ernie,
For volatility trading with delta hedging, usually what "time to expiry" options do you choose to trade? such as 1 month, 2 months, 3 months, etc..
Thanks.
The choice of the options expiration date is highly dependent on your specific strategy. But generally speaking, I would select one that has at least 1 month left.
Ernie
Hi Ernie,
Is there optionable stocks list?
What criteria do you use when judging the liquidity of a option?
One list of optionable stocks is at finviz.com/screener.ashx. One of the filter choices under Option/Short is Optionable. There is an export button at the bottom right, but this may no longer be free.
Yahoo! has a list of most active options at finance.yahoo.com/options/lists.
You can also use Yahoo! finance to check option volumes. Generally, smaller spreads indicate higher liquidity.
Hi Roger,
Thank you for help. The information is very useful.
Moreover, there are three hedging methods: Hedging at Regular Intervals,
Hedging to a Delta Band, Hedging Based on Underlying Price Changes.
If "time to expiration" of your option is less than 2 months, which method will you use? Or you use other even more complicated hedging methods, such as UTILITY-BASED METHODS.
Thanks.
We hedge at regular intervals, but such choices should always be backtested and optimized according to your specific strategy.
Ernie
Hi Ernie,
Do you know any Japanese stocks screener?
No, but you can look up IB's product list for Japanese stock market.
Ernie
Hi Ernie,
Assume our base currency is US dollars.
If we do intraday trading in Japan stocks market first, and then intraday trading in US stocks market on the same day, in practice, do we need to do currency conversion every day?
Since it is intraday trading (not holding position overnight), there is no physical settlement, right?
Maybe we only have P&L left each day in local currency (Yen)?
Do we need to pay any borrowing interest, but it is intraday?
Thanks.
Indeed, for intraday trading, you don't need to worry about their differential interest rates or settlement. Trading USDJPY will leave you with P&L in JPY, and it is a good practice to convert that to USD before the end of day so that your P&L matches your backtest exactly.
Ernie
Hi Ernie,
I had a look at the available historical data periods for Quantgo and they don't seem to be superlong. For example, starts in 2007 at best for US equities and 2009 for US futures. Are you basing your backtests on these short periods?
I am of the opinion that there is no point in backtesting high frequency strategies with data prior to 2007. The market structure was completely different then.
Ernie
Hi Ernie,
SPY dropped significantly yesterday.
Do we need to avoid trading stocks pairs on such day in US market?
Thanks.
Just because the market index drops doesn't mean stock pairs should be avoided. Sometimes this makes them more profitable due to the higher volatility, but other times less profitable due to the increased correlation ("tail dependence") as a result of the market crash.
Ernie
Hi Ernie
If a retail FX broker specifies a maximum of 50 lots per click, and I want to send a pending order of 100 lots, does it mean that I can split the 100 lots into 2 pending orders of 50 lots each at the same pending price to get my 100 lots filled at the same price?
Thanks
Ron
Hi Ron,
We never had such restrictions on IB, so I don't know the correct answer for your broker. But my guess is that yes, you can do it.
Ernie
Hi Ernie
You mean IB won't limit the trade size per order?
IB does limit the size of an order to something like $5M.
I don't think you should go beyond that anyway due to market impact.
Ernie
Hi Ernie
You said "IB does limit the size of an order to something like $5M.
I don't think you should go beyond that anyway due to market impact."
I understand that your managed FX fund has leverage of at most 10x. If your fund size is a few million, for example, and you are using 10x leverage, you can easily have a trade size above the $5M mark, am I right? How do you minimize the impact on the trade cost?
There is no need to deploy the maximum leverage with 1 order, just like a mutual fund does not buy $100M MSFT shares with 1 MKT order.
Ernie
Roger and Ernie:
Thanks for talking about QuantGo in your post. Because of you, we checked it out ourselves and posted our own positive reviews at:
http://robusttechhouse.com/quantgo-review/
Ernie,
This is the very first time I read about QuantGo and that's very impressive!
Ernie,
Are you still using QuantGo as your data provider? I'm trying to explore some semi HFT equities trading and notice they have full depth of book going back 10 years at millisecond precision. The price is very reasonable and just wondering if you found them still the best game in town for your research.
Thanks,
Derek
Derek,
Yes, we are still subscribing to QuantGo. As you said, there isn't anybody else selling book data at that price.
Ernie