Wednesday, January 02, 2013

The Pseudo-science of Hypothesis Testing

Backtesting trading strategies necessarily involves a very limited amount of historical data. For example, I seldom test strategies with data older than 2007. Gathering longer history may not improve predictive accuracy since the market structure may have changed substantially. Given such scant data, it is reasonable to question whether the good backtest results (e.g. a high annualized return R) we may have obtained are just due to luck. Many academic researchers try to address this issue by running their published strategies through standard statistical hypothesis testing.

You know the drill: the researchers first come up with a supposedly excellent strategy. In a display of false modesty, they then suggest that perhaps a null hypothesis can produce the same good return R. The null hypothesis may be constructed by running the original strategy through some random simulated historical data, or by randomizing the trade entry dates. The researchers then proceed to show that such random constructions are highly unlikely to generate a return equal to or better than R. Thus the null hypothesis is rejected, thereby impressing upon you that the strategy is somehow sound.
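For concreteness, the "randomize the trade entry dates" construction can be sketched as follows. This is only a minimal illustration under stated assumptions: the function names, the long-only fixed-holding-period trades, and the synthetic data are all hypothetical and not taken from any particular paper. Note what the final number is: an estimate of the probability of a return at least as good as R under the null, and nothing more.

```python
import random

def total_return(daily_returns, entry_days, holding_period):
    """Sum of daily returns over each trade's holding window (long only, no compounding)."""
    return sum(sum(daily_returns[d:d + holding_period]) for d in entry_days)

def randomized_entry_p_value(daily_returns, entry_days, holding_period,
                             n_sims=2000, seed=0):
    """Estimate P(R' >= R | H0), where H0 = 'entry dates are picked at random'."""
    rng = random.Random(seed)
    observed = total_return(daily_returns, entry_days, holding_period)
    last_valid_entry = len(daily_returns) - holding_period
    hits = 0
    for _ in range(n_sims):
        # Same number of trades as the strategy, but on random entry days.
        random_entries = rng.sample(range(last_valid_entry + 1), len(entry_days))
        if total_return(daily_returns, random_entries, holding_period) >= observed:
            hits += 1
    return observed, hits / n_sims

if __name__ == "__main__":
    rng = random.Random(42)
    synthetic_returns = [rng.gauss(0.0, 0.01) for _ in range(1000)]  # made-up data
    entries = list(range(0, 990, 30))                                # hypothetical entry days
    observed, p = randomized_entry_p_value(synthetic_returns, entries, holding_period=3)
    print(f"observed return {observed:.4f}, estimated P(R' >= R | H0) = {p:.3f}")
```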

As statistical practitioners in fields outside of finance will tell you, this whole procedure is quite meaningless and often misleading.

The probabilistic syllogism of hypothesis testing has the same structure as the following simple example (devised by Jeff Gill in his paper "The Insignificance of Null Hypothesis Significance Testing"):

1) If a person is an American then it is highly unlikely she is a member of Congress.
2) The person is a member of Congress.
3) Therefore it is highly unlikely she is an American.

The absurdity of hypothesis testing should be clear. In mathematical terms, the probability we are really interested in is the conditional probability that the null hypothesis is true given an observed high return R: P(H0|R). But instead, the hypothesis test merely gives us the conditional probability of a return R given that the null hypothesis is true: P(R|H0). These two conditional probabilities are seldom equal.
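To see how far apart the two conditional probabilities can be, here is a rough numerical sketch of the Congress example using Bayes' rule. The population figures are round-number assumptions of mine, not numbers from Gill's paper.

```python
# Rough, illustrative figures (assumptions, not from the article or from Gill's paper).
world_population = 7_000_000_000
us_population = 315_000_000
congress_members = 535  # 435 House + 100 Senate voting members, all American

p_american = us_population / world_population
p_congress_given_american = congress_members / us_population
p_congress = congress_members / world_population  # only Americans can be members

# Bayes' rule: P(American | Congress) = P(Congress | American) * P(American) / P(Congress)
p_american_given_congress = p_congress_given_american * p_american / p_congress

print(f"P(Congress | American) = {p_congress_given_american:.2e}")
print(f"P(American | Congress) = {p_american_given_congress:.2f}")
```

With these inputs, P(Congress | American) is on the order of one in a million, while P(American | Congress) is essentially 1. A tiny P(R|H0) likewise says nothing about P(H0|R) unless the prior probabilities are also known.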

But even if we can somehow compute P(H0|R), it is still of very little use, since there are an infinite number of potential H0. Knocking down one particular straw man doesn't say much about your original strategy.

If hypothesis testing is both meaningless and misleading, why do financial researchers continue to peddle it? Mainly because this is de rigueur to get published. But it does serve one useful purpose for our own private trading research. Even though a rejection of the null hypothesis in no way shows that the strategy is sound, a failure to reject the null hypothesis will be far more interesting.

(For other references on criticism of hypothesis testing, read Nate Silver's bestseller "The Signal and The Noise". Silver is of course the statistician who correctly predicted the winner of all 50 states + D.C. in the 2012 US presidential election. The book is highly relevant to anyone who makes a living predicting the future. In particular, it tells the story of one Bob Voulgaris who makes $1-4M per annum betting on NBA outcomes. It makes me wonder whether I should quit making bets on financial markets and move on to sports.)

90 comments:

  1. I agree about not going back too many years for data. I've based this year's systems on 2009-2012, with greater weight to the last 2 years. Which brings me to my question: given a backtest containing >5000 trades over this period, selected from a universe of ~1200 stocks (such that each stock is only likely to be traded a few times, i.e., the setup is rare), with an average holding period of 3 days, and given that no randomization of any part of the system can be found that doesn't significantly decrease the return, what is the correct way to think about the statistical relevance of these results as a predictor of future returns?

  2. mhp326,
    Certainly, the larger the number of trades, the more significant your results are. However, whether the strategy is truly predictive depends on whether
    1) The rules are complicated, resulting in overfitting.
    2) The fundamental reasons for the success of the strategy have changed.

    As I explained in the article, I don't think the rejection of the null hypothesis (as you have reported) tells us much.
    Ernie

  3. Agreed, feel okay about both those points, and understand about the null hypothesis. What I'm less certain of is to what extent (if any) the size of the universe of stocks, i.e., the infrequency of trades per stock, might negate the statistical benefit of the large total trade count.

  4. Thanks for this great post and the paper by Gill. It makes me re-think the many subtleties in statistical inference.

    On a more practical side, do you recommend any other rigorous/statistical procedure to assess whether a strategy is sound?

  5. We should care only about the total number of trades, not the number of trades per stock, to assess statistical significance.
    Ernie

  6. ezbentley,
    The paper recommends Bayesian analysis, though I think that is too tedious for backtesting.

    I don't believe statistical tests by themselves can assure us of the predictive power of a strategy. However, Monte Carlo simulations can give us a sense of the vulnerability of the strategy to outliers. So risk management is the more achievable objective.

    Ernie
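As an illustration of the kind of Monte Carlo check mentioned in this reply, the sketch below resamples a list of per-trade returns with replacement and looks at the distribution of maximum drawdowns. It is only one possible approach, and it assumes trade returns are independent and identically distributed, which real trades often are not. The function names and sample numbers are hypothetical.

```python
import random

def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve (positive fraction)."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst

def simulate_drawdowns(trade_returns, n_sims=1000, seed=0):
    """Bootstrap the trade sequence and return the sorted max drawdown of each simulated path."""
    rng = random.Random(seed)
    n = len(trade_returns)
    return sorted(max_drawdown(rng.choices(trade_returns, k=n)) for _ in range(n_sims))

if __name__ == "__main__":
    # Made-up trade returns with one large outlier loss.
    trades = [0.01, 0.02, -0.01, 0.015, -0.005, 0.01, -0.12, 0.02, 0.01, -0.01] * 10
    drawdowns = simulate_drawdowns(trades)
    print("backtest max drawdown:", round(max_drawdown(trades), 3))
    print("simulated 95th percentile:", round(drawdowns[int(0.95 * len(drawdowns))], 3))
```

Comparing the single backtest drawdown with the simulated percentiles gives a sense of how much the result depends on where the outliers happened to fall.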

  7. Hi Ernie,

    I wish you a Happy and Prosperous New Year.

    "1) If a person is an American then it is highly unlikely she is a member of Congress.
    2) The person is a member of Congress.
    3) Therefore it is highly unlikely she is an American."

    Modus Tollens can be applied if the rule (1) involves a universal quantifier AFAIK. Otherwise, this is not a valid deduction and not a good example as a result. However, the point you are making is valid. The problem is in the hypothesis tested. In backtesting IMO the correct hypothesis is one that tests whether the system has a high probability of failing in the future, not of performing well in the future. Of course there will always be false rejects, i.e. cases when a good system is rejected, but the process gains some value as a result, in the sense that if a system has performed terribly in the past, then the probability of a future failure is also high: http://tinyurl.com/a4bw4oe

  8. Hi Michael,
    Happy New Year to you too.

    The example I quoted is designed to be invalid, but it illustrates the deduction used in usual hypothesis testing.

    I agree with you that if the backtest results are bad, we can with good confidence say that the future results will be bad too. On the other hand, if the backtest results are good, then the future results may be good or bad with equal probability.

    Ernie

  9. Hi Ernie,

    Is bid-ask spread usually $0.01 for liquid US stocks, such as AAPL, IBM?

    But I find their bid-ask spreads are higher than 0.01 on 2 Jan 2013.

    Is this because this is just after New Year's Day?

    Or is it because the US market increased a lot? I find the trading volume is as big as before.

    Thanks.

  10. Hi Anon,
    The bid-ask to some extent depends on the share price. For a high price stock such as GOOG, it can be higher than 1 cent momentarily.
    Ernie

  11. Ernie,

    Your deduction of the "deduction of hypothesis testing" using that example is incorrect. Math can be easily broken when put in English like you did. Saying 2.999 is almost 3 seems reasonable, but using this further in a mathematical proof won't work. The "highly likely/unlikely" term is just vague, not math. In your example you need Bayes' rule, which includes two other probabilities: the probability of a person being American and the probability of a person being in Congress.

    Anyways, hypothesis testing is not all that bad as you project. It does have its merits when properly understood. The rejection/acceptance of null hypothesis depends on another key parameter - the threshold which you ignored in your argument. This threshold determines the Probability of detection and probability of false alarm which are more important.

    It seems like the holiday shopping has put a dent in your bank balance and possibly your head (joke). You know very well that the absolute amounts, though in millions, are irrelevant. Ask Bob Voulgaris what his Sharpe ratio is ;). My suggestion: don't quit financial markets. Happy 2013.

    -Naresh

  12. Naresh,
    I believe you misunderstood the point of that example. The deduction is supposed to be faulty, but it is identical to the deduction used in common hypothesis testing.

    "Unlikely" means 1% probability, as in usual hypothesis testing.

    The thresholds used in hypothesis testing are arbitrary, and don't improve the validity of the whole paradigm.

    For further elucidation, please read the original paper by Jeff Gill.

    (Bob Voulgaris's Sharpe ratio is likely to be very high, since Silver wrote that he had no losing years since he started his business.)

    Ernie

  13. Hi Ernie,

    Did you read Silver's book? I was looking at Amazon reviews (FWIW) and even the higher 4 ratings and lower ones had a common theme of disappointment and meh. Was curious of your viewpoint, if you have previously read his book.

    Best,
    Ken

  14. Hi Ken,
    I am almost finished with the book (reading the chapter on stock market).

    There is a lot in the book that I agree with, particularly in relation to my own outlook on quantitative trading. You will find lots of parallels with trading even in the chapter on poker. (As you may know, many quant traders are excellent poker players, such as Peter Muller.)

    Ernie

  15. Hi Ernie,

    You mentioned that you don't backtest farther than 2007. How do you feel about strategies that have poorer Sharpe ratios the farther back you go? I have backtested a fairly simple strategy that trades daily and has a back-tested Sharpe ratio of > 2 for 2012 and 2011. It performs poorly in prior years however and with such a limited dataset, I think I've just stumbled on a statistical anomaly.

  16. Hi Duke,
    I would certainly backtest the strategy in 2007-2009 to see if it survives the financial crisis.
    2007-2012 is 5 years: sufficient statistical significance for a strategy that trades daily or at least weekly.
    Ernie

  17. Good afternoon Ernie,

    Would you mind commenting on the trading strategy described in this post? http://www.elitetrader.com/vb/showthread.php?threadid=253195

    It involves shorting both bull and bear leveraged ETFs.

    Regards,

  18. Hi Nedzad,
    Inverse ETFs are often hard-to-borrow. The borrow fees that your broker charges often overwhelm any profitability. Otherwise, everybody would be engaging in such a riskless cash generator.
    Ernie

  19. Thanks for the feedback on the book, Ernie. Sounds encouraging. I'll add it to my to-read list!

    And, I hope all is proceeding well with your new book.

    Regards,
    Ken

  20. When I traded AAPL several months ago, the spread was rarely if ever one cent. It was typically $.10 to $.20.

    You definitely need to check bid/ask spreads when determining profitability of a strategy...

  21. You said, "I agree with you that if the backtest results are bad, we can with good confidence say that the future results will be bad too. On the other hand, if the backtest results are good, then the future results may be good or bad with equal probability."

    Ernie
    ------------

    If this were true, shorting a poorly backtested strategy would be more profitable than going long with a well backtested strategy.

    Thanks for your book and discussions.

    Eugene

  22. Eugene,

    A "bad" backtest result may be one that has zero returns, so it won't help to short it either.

    Ernie

  23. In my experience, I find a pattern of some sort. That pattern may provide a statistically verifiable edge, may go on for years, and may involve hundreds or even thousands of trades… But as soon as I start trading that pattern, the same tools that found the pattern confirm that it is now gone, often very abruptly and distinctly.

    I don’t see how any statistical tool is going to tell you what is going to happen in the future. At best, it will tell you that the pattern that you found in the past is really there, and not just the product of overfitting or dumb luck.

    It seems to me we are all in a race.

    1. Find the pattern as quickly as possible
    2. Take it on faith that the pattern will continue, at least for a little while, and put money at risk, even before you can “prove” that the pattern is valid.
    3. Recognize when the pattern breaks or degrades, either because too many other people have found the pattern or because some underlying market dislocation has smoothed itself out, and then stop trading it.
    4. Find a new pattern as quickly as possible…..

    My main point is that by the time you have enough data to “prove” a pattern exists with high confidence, it is almost certainly too late.

    Now add to this that there are people spoofing patterns to suck you in.

    My hat's off to those who can make a living at this game. I think it must take a special emotional makeup, not just the ability to do the math.

  24. R.D.
    I agree with everything you said. Finance does not have stationary statistics, which means statistical methods have limited use.

    But as Nate Silver said in his book about statistical predictions in general, the combination of statistics and fundamental laws/understanding of the system in question can often provide reasonably robust predictions. Weather forecasting is given as one such example.

    Ernie

  25. Ernie,

    With weather prediction I take it as a given that the laws of physics are a solid unchanging foundation. I also take it as a given that there are no weather gods scheming to get me, except perhaps in my anthropomorphic imaginings.

    There may be some immutable laws of behavioral economics, but I can't say that I know them, or trust them to make anything more than a guess. Perhaps someday we will understand the Bernoulli principle or Boyle's Law equivalents in behavioral economics, but for now, we, or at least I, am just finding patterns. Sometimes I have theories about their economic underpinnings, but they are just theories.

    So no, I don't think weather prediction is a good analogy for stock prediction, or trading.

  26. Hi Ernie,

    I am about to start managing $20m of a $100m hedge fund and will negotiate my terms in a couple of weeks' time. It is a multi-strategy fund.

    What are typical terms for a portfolio manager in terms of percentage on returns etc?

    I appreciate any advice that can help me negotiate a good deal.

    Thanks for your advice

  27. Hi Anon,
    Typically profit share goes from 10-16% within a hedge fund.
    Ernie

  28. Hi Ernie & Anon,

    Usually, how much unlevered return do hedge funds require from their portfolio managers at this "$20m" level?

    About "$20m of a $100m hedge", is this amount before or after leverage?

    Usually, how much leverage could they get on both sides, long and short?

    Thanks a lot

  29. Anon,
    Correct me if this isn't true at your fund ... but I believe when people say $20M, it means unlevered.

    Of course, minimum return and maximum leverage vary for different funds.
    Ernie

  30. Hmmm, the above comments on fund size are interesting. I'd be interested in hearing Ernie's and other posters' thoughts on what return could realistically be achieved by quant funds of the non-HFT type (i.e. trading at, say, > 5 min intervals, or even mostly end-of-day strategies). I'm aware of private investors with low 8-figure funds who trade purely as value investors/macro funds and can achieve > 50% returns; whether they can carry this into the future I'm not sure. I know, Ernie, that you're a huge fan of Sharpe ratios, but from a return perspective, what do you see the private trader doing trades through IB being able to achieve? I know there are a lot of variables, but it would be interesting to examine whether the upkeep costs for a systematic trader are greater or less than those required by a concentrated fundamental trader.

  31. Andrew,
    Returns depend on the specific strategy and leverage. I don't believe there is a "maximum" that a low frequency trader at IB can achieve. If you use high enough leverage and are lucky enough, over 100% return is certainly possible ... the only question is what kind of risk one is enduring in return.
    Ernie

  32. hi Ernie,

    That's a good point. Generally I personally would be comfortable with a 20%-25% max drawdown. Though the aim of my question was to ask whether the upkeep costs of getting a system going are worth it. I think one of the main attractions of quant trading is this ideal that you can just tweak your models and let them trade; however, more and more it seems that even fully automated systems need constant babysitting. Do you find this is the case? With value/macro investors, they can put on a trade and go away to do research.

  33. Hi Andrew,
    A quantitative strategy can certainly die due to fundamental changes in the market. So one has to be cognizant of such changes and see if it is time to kill the strategy, especially when its drawdown is deeper and longer than expected.

    In that sense, it is no different from a value strategy when one has to constantly check to see if the value is still there.

    On the other hand, I do not find it necessary to monitor a strategy on a minute-to-minute basis as long as your alarm system is set up correctly.

    Ernie

  34. Hi Ernie

    Do you know of a cheap source for intraday futures data, other than tickdata, which is expensive?

    Kibot and pitrading seem to only have single stock data.

    Thanks

  35. Hi Anon,
    cgqdatafactory.com offers futures tick data which are cheaper than tickdata.
    Ernie

  36. Hi Ernie,

    This is off topic to the current thread; however, I was wondering what your opinion was on applying Statistical Arbitrage techniques to pairs on the TSX.

    Do you believe that there is greater potential for one to take advantage of some of the mispricings due to fewer market participants than on exchanges like NYSE and NASDAQ?

  37. Hi Anon,
    I am not sure that TSX has fewer arbitrageurs. Only a backtest can decide this!
    Ernie

  38. @Andrew
    On achievable performance: over 26 months of trading, I am at 100%/year return, 30% vol/year and 30% max DD (turnover between 3 days and 3 months). It's too early to know if it's sustainable, but I feel more and more confident on the matter. I have around 5 strategies running at the same time. I have 3 big kinds of strategies, which are all working quite well. I feel that I could further improve my performance with more R&D time.
    I also trade with very low equity, which makes things much easier.
    I think that my main risk could be an LTCM-style misunderstanding, i.e. some big players are doing the same trades, and when something goes wrong everyone is deleveraging and you're dead! (http://www.amazon.com/When-Genius-Failed-Long-Term-Management/dp/0375758259 is a must-read.)
    But I don't feel that there are big players doing the same thing in my 3 kinds of strategies.

    Any comments from traders who have had unexpectedly bad results are welcome!

    @Backtesting
    I agree that a good backtest is not so much about statistical significance, but it should follow these rules:
    - Don't try too many regressors
    - Understand why the strategy fundamentally works
    - Only out-of-sample results are important
    - Discard the strategy when the fundamentals are no longer valid (when I say fundamentals, it's not about the state of the market but more about the impact of the actors)

    Any comments from traders who have other ways of doing backtests are welcome!

  39. @Model regressors

    I feel much more confident (and it worked better) when my regressors are
    randomly signed (1 1 -1 1 -1 1 -1 1 1 -1 1) than strongly autocorrelated (1 1 1 1 1 -1 -1 -1 -1 1 1 1 1 1 1).
    I feel that statistical significance makes much more sense for randomly signed regressors than for strongly autocorrelated regressors.
    Do you agree with that?
    Any more formal explanations?

  40. Ernie,

    I know you have touched upon this in your previous posts on ETF arb and XLE. You basically state how one must develop a synthetic basket in order to create new arbitrage opportunities.

    Hypothetically, suppose one were to backtest and keep track of the spread between an ETF's value and its components, e.g. for FAS or TNA. Is it possible to simply mimic/front-run the positions that the ETF has to rebalance and actually make a profit in this day and age, or is this simply a fool's task, as the opportunity is arbed away so fast due to HFT?

  41. Anon,
    It is certainly possible to front-run ETF rebalancing, but I think it is likely to be indistinguishable from a daily mean-reversion trade, as an ETF is likely to buy a stock that dropped in value that day to maintain its percentage in the fund.

    Perhaps other readers here know of some articles/academic research on this rebalancing trade?

    Ernie

  42. Hi Ernie,

    In Interactive Brokers (IB), I find that IB stores historical bars for LSE stocks incorrectly.

    The minimum increment is .0005.
    IB stores them to .000 only.
    One decimal place disappears.
    You could check LLOY and BARC tickers in LSE in IB.

    Thanks

  43. Hi Anon,
    That's good to know. I haven't traded LSE stocks before, maybe other readers can comment on this?
    Ernie

  44. Hi Ernie,

    Thank you for the quick response.

    If you can see LSE data in IB, it is easy to check.

    You just load LLOY 2 days/ 2 mins chart in TWS.

    Compare data of yesterday and Today.

    There are many level off bars yesterday because one decimal has been cut off.

    LLOY has low price so it is easy to see.

    For BARC, you can just download historical one min bars. Then you can see.

    May I ask whether, besides IB, you use any other broker that provides an API for systematic trading?

    Thank you.

  45. Ernie,

    how do you feel about the approach of using monte carlo simulations varying position sizing to maxdd in order to leverage or deleverage a trading system continuously?

    assuming this trading system of yours is profitable and was tested only with OOS data, being traded nowadays under real market conditions.

    thank you

    eduardo

  46. Thank you Ernie for answering my question pertaining to front running etfs!

  47. Hi Anon,
    I used to use Lime Brokerage's API for trading. That is in many ways far superior to IB's.

    Ernie

  48. Eduardo,
    I use the Kelly formula to determine the leverage of my system. I don't know exactly how you determine leverage using Monte Carlo simulation.

    Please explain what OOS data is.

    Ernie
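For readers unfamiliar with the Kelly formula mentioned in this reply: one common form, which assumes approximately Gaussian returns, sets leverage to the mean excess return divided by the variance of returns. The sketch below illustrates that form only; it is not necessarily the exact computation used here, and the function name and sample returns are hypothetical.

```python
from statistics import mean, variance

def kelly_leverage(excess_returns):
    """Kelly leverage f = m / s^2 under a Gaussian approximation.

    m is the mean excess return per period and s^2 the sample variance per period.
    The ratio does not depend on the period length, since both scale with time.
    """
    return mean(excess_returns) / variance(excess_returns)

if __name__ == "__main__":
    sample = [0.02, -0.01, 0.03, 0.0]  # made-up per-period excess returns
    f = kelly_leverage(sample)
    print(f"full Kelly leverage: {f:.1f}, half Kelly: {f / 2:.1f}")
```

The estimate is very sensitive to errors in the mean, which is one reason practitioners often trade at a fraction of the computed value.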

  49. This comment has been removed by the author.

  50. im sorry, i did not explain it well enough in my first post:

    when i said monte carlo simulations, i was referring to simulating different position sizings / $ allocation in the trading system, in order to find an "optimal" mean allocation size for that strategy, while having an acceptable max drawdown.

    by OOS i mean "out of sample". i was just trying to say that the system was tested in a non-biased way.

    do you think this kind of approach , "in a Bayesian-like manner" (citing dr. howard bandy) is acceptable ? or do you prefer the kelly / half-kelly method?

    thanks

  51. Hi Eduardo,
    Yes, finding the optimal leverage via Monte Carlo simulations can be a good way. I have found, however, that for some of my trading strategies, this method results in leverage quite similar to Kelly's.
    Ernie
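The Monte Carlo approach discussed in this exchange could look roughly like the sketch below: bootstrap the strategy's historical returns, apply each candidate leverage to the same simulated paths, and keep the leverage with the highest median growth whose 95th-percentile max drawdown stays within a limit. All names, the percentile choice, and the i.i.d. resampling are assumptions made for illustration.

```python
import math
import random

def path_stats(returns, leverage):
    """Return (log growth, max drawdown) of one compounded path at the given leverage."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + leverage * r
        if equity <= 0.0:
            return float("-inf"), 1.0  # account wiped out
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return math.log(equity), worst

def best_leverage(returns, leverages, max_dd_limit, n_sims=500, seed=0):
    """Pick the leverage with the best median log growth subject to a drawdown cap.

    Returns (leverage, median_log_growth, dd_95) or None if no candidate qualifies.
    """
    rng = random.Random(seed)
    n = len(returns)
    paths = [rng.choices(returns, k=n) for _ in range(n_sims)]  # shared across leverages
    best = None
    for lev in leverages:
        stats = [path_stats(p, lev) for p in paths]
        growths = sorted(g for g, _ in stats)
        dds = sorted(d for _, d in stats)
        median_growth = growths[n_sims // 2]
        dd_95 = dds[min(n_sims - 1, int(0.95 * n_sims))]
        if dd_95 <= max_dd_limit and (best is None or median_growth > best[1]):
            best = (lev, median_growth, dd_95)
    return best

if __name__ == "__main__":
    history = [0.004, -0.003, 0.006, -0.005, 0.002, 0.001, -0.002, 0.005] * 30  # made-up data
    print(best_leverage(history, [0.5, 1, 2, 4, 8], max_dd_limit=0.25))
```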

  52. Dr. Chan,

    If a trading system can hypothetically only lose 100-200 per trade; how could one quantify entry points to accurately assign probabilities to these points to determine which prices give the trader the highest probability of a successful move in whichever way he/she chooses to bet?

  53. Anon,
    Why not just try different entry points in backtest and find out which gives the best Sharpe ratio?
    Ernie

  54. hey Ken,

    Thanks for replying. Like you, I'm also quite paranoid about strategies that perform well over a long time only to bite the dust from 'black swans'. It's a constant danger of systematic trading. Also, great to hear your strategies are performing really well. Are you size limited in terms of your strategies? I'd definitely consider taking on outside allocations if it doesn't degrade performance.

  55. Dear Chan,

    Good Evening! I was wondering what your opinion is on intraday trading with the use of an underlying's option volatility to help one predict optimal areas to enter a trade and ultimately help one with the direction of the trade... I have read a couple of papers on this concept pertaining to trading earnings and other events, but I was wondering if you had an opinion/any resources to help one trade (and further research) a basket of securities on an intraday basis.

  56. hi Ernie,

    Today, the Interactive Brokers system has a serious problem with Forex trading, especially for paper accounts.

    Do you know that?

    In the beginning, I designed my API to trade Forex about 23 hours a day.

    Now I do not dare to try. Maybe I only trade during Europe and US trading hours.

    Btw, have you heard about Ninja trader or Metatrader 4 or Multi-Chart? How about their performance?

    Thank you

  57. Hi Ernie,

    Thanks for your willingness to share your knowledge. I'm looking forward to attending one of your courses in the future when I am less busy and can travel.

    I saw a message by Michael Harris posted here, and I would like to ask what you think about his blog post about getting fooled by randomness: http://tinyurl.com/bwxgc73

    Do you agree with the notion that small exit targets help reveal predictive capacity of an algo?

    Peter

  58. This comment has been removed by the author.

  59. Hello Ernie!

    I would like to get historical intraday (1-minute) data for European stocks. Do you know a reliable database?
    I would like to do a study of European stocks, but I find that such data is very expensive and hard to find.

    Thanks

  60. Anon,
    I have not used options volatility to predict optimal trade entries before. If you come across any relevant papers, please share them with us and I will comment on them.
    Ernie

  61. Hi Anon,
    IB had a FX data outage from 5:15pm-7:00pm yesterday. That affected both production and paper trading accounts. However, I hadn't experienced that ever before, and would not judge a broker based on a one-time outage.
    Ernie

  62. Hi Anon,
    Re: Michael Harris' notion that using profit targets and stop losses can reveal whether the backtest is purely luck. In general, a good strategy should be insensitive to different small changes to its logic. Adding profit caps and stop losses are certainly some of the ways to perturb the logic and see if the response is sufficiently small. However, there are other ways as well, such as introducing slight delays in the entry or exit times.
    Ernie
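The "slight delay" perturbation described in this reply can be sketched as follows: rerun the same trades with the entry (and therefore the exit) shifted by a few bars and compare total returns. This is a toy illustration with hypothetical names and made-up prices, assuming long-only trades with a fixed holding period.

```python
def strategy_return(prices, entry_bars, holding_bars, delay=0):
    """Total return of long trades entered `delay` bars after each signal.

    Each trade is held for `holding_bars`, so the exit shifts by the same delay.
    Trades that would run past the end of the data are skipped.
    """
    total = 0.0
    for bar in entry_bars:
        start = bar + delay
        end = start + holding_bars
        if end < len(prices):
            total += prices[end] / prices[start] - 1.0
    return total

def delay_sensitivity(prices, entry_bars, holding_bars, delays=(0, 1, 2, 3)):
    """Map each delay (in bars) to the strategy's total return under that delay."""
    return {d: strategy_return(prices, entry_bars, holding_bars, d) for d in delays}

if __name__ == "__main__":
    prices = [100, 101, 100.5, 102, 103, 102.5, 104, 105, 104, 106]  # made-up bars
    signals = [0, 4]                                                 # hypothetical entry bars
    for delay, ret in delay_sensitivity(prices, signals, holding_bars=3).items():
        print(f"delay {delay} bars: total return {ret:.4f}")
```

If the returns collapse when the trades are shifted by a bar or two, that is a hint the backtest result depends on exact timing that may not be achievable live.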

  63. Hi Anon,
    I haven't tried to gather European intraday stock data, but if you have an Interactive Brokers account, you can download at least half a year of such data free.
    Try also esignal.
    Ernie

  64. Hi Ernie,

    Thanks for your reply. I like the idea about introducing slight delays in the entry or exit times. Is there a rule of thumb about the duration of the delay in relation to bars used for the signal?

    Peter

  65. Hi Peter,
    The delay should not be more than a few minutes from the original, otherwise it would be a totally different strategy!
    Ernie

  66. @ Andrew

    I have three big kinds of strategies. I feel that I can have between 5 and 100 times more capital depending on the strategies.

    I plan to begin taking on outside allocations soon, but I don't want to rush it because it's a big responsibility. Even if I tell my seeders that I can lose money, I know that they dream of my past results. So I want to have more track record and experience to give them the maximum security.
    As I am young and without a long and classic background, it's also quite difficult to find seed money!

    @ Ernie & All

    I have begun to look at seeding solutions, but it seems that everything is designed for big players around $50 million.
    How can you seed smaller amounts? (outside family and friends)


    Thanks

  67. Ken,
    If it is an FX strategy, check out websites such as currensee.com. I am sure similar sites for equities strategies exist.
    Ernie

  68. Hi Dr. Chan,

    My apologies about "Dear Chan" in my previous message; it was meant to say "Dr. Chan".

    In regards to some of the material that I have found so far pertaining to using options to help one choose an entry point:

    This is an old article from trading markets but it is a primer on the concept

    http://www.tradingmarkets.com/.site/stocks/education/strategies/01042000-3274.cfm

    and these are the other ones

    http://web.ics.purdue.edu/~zhang654/jfqa_option.pdf

    This one was my favourite so far

    http://www.ruf.rice.edu/~yxing/option-skew-FINAL.pdf

    The majority of the research done so far is applied towards event trading (i.e. earnings releases); however, as a prop trader who trades intraday and needs to earn a return daily and monthly... I was wondering what your opinion might be on some of the implications towards intraday trading.

  69. Ernie,

    I am trading futures on commodities. I didn't find any website close to currensee for commodities.

    Can you tell us how you managed to seed your own fund?

    Thanks

  70. Anon,
    Thanks for the references. I will study them and perhaps post my opinions in the next blog post.

    Ken: One of my consulting clients signed up to be the seeder of my first fund. Generally investors approach me out-of-the-blue because they know me through my blog, book, and workshops.

    Ernie

  71. @anon,
    The last 2 papers that you referenced are the same: is that intentional?
    Thanks,
    Ernie

  72. You should read "The Cult of Statistical Significance" by Ziliak and McCloskey.

  73. Hi Ernie,

    In your book, you mention that AUD/CAD is relatively stationary.

    Do you know any other currency pairs which are relatively stationary, like AUD/CAD?

    Thanks a lot.

  74. Ernie,

    cqgdatafactory

    CQG offers historical tick data, but only at 60 second time-stamps.

    Morningstar Quotes (formerly known as Tenfore) seems to offer full historical tick data at even the millisecond and sub-millisecond level.

  75. DTN/Nanex nxcore is also said to be another decent historical tick data provider.

    Both morningstar and DTN/nanex also offer real-time data.

  76. Anon,
    Yes, the 1-minute time stamps of CQG are annoying. Thanks for the tip about Morningstar and DTN.

    Ernie

  77. Morningstar, I know, offers bulk downloads of tick data at the end of the day via FTP. So you can start using that in MATLAB or Python right away.

    With DTN, I believe it is API-access only, so unless you know C or C++ (I think DTN also has the API accessible via VB), you're going to have to hire someone to wrap the C/C++ code for you, via SWIG for Python or MEX files for MATLAB.

  78. Peter H. claims that a job like that should cost around $600.

    http://epchan.blogspot.com/2009/05/matlab-as-automated-execution-system.html

  79. Hi Ernie,

    You said it is very expensive to borrow inverse ETFs. How about shorting common stocks? I think in Asian markets it is also costly and difficult for individual investors. Does that mean pairs trading and mean-reverting strategies (which normally require both short and long positions) are not an option for an individual?

    Winfred



  80. Hi Winfred,
    The cost of shorting common stocks depends on the stock; in particular, it depends on whether the stock is hard-to-borrow (HTB).

    Pair trading can still work if you pick the stocks that are not HTB.
    Ernie

  81. Hi Ernie,

    I see. To filter out the HTB stocks, where could we get the HTB stock list? I think it is easy for a hedge fund to get it from the stock loan desk of a brokerage firm. But it may be quite difficult for an individual to use that channel?

    Thanks,
    Winfred

  82. Hi Winfred,
    HTB stock list is easy to get on a daily basis, even from Interactive Brokers' website. But it is hard to find historical records of it. So you have to save them yourself going forward.
    Ernie

  83. Hi all,

    There seems to be much confusion here regarding what a hypothesis test is and the conclusions we should draw from the test.

    First, the example

    "1) If a person is an American then it is highly unlikely she is a member of Congress.
    2) The person is a member of Congress.
    3) Therefore it is highly unlikely she is an American."

    in no way, shape, or form imitates the logic of a hypothesis test. In terms of probabilities, what the example is saying is

    1) P(C|A) is low
    2) & 3) P(C|A) is low ==> P(A|C) is low, which is absurd

    It makes little sense to talk about probabilities of the null hypothesis, P(H0). A null hypothesis is usually a distributional statement, not a random variable to which we can assign probabilities. The null hypothesis is either true or it's not (without getting too philosophical).

    Hypothesis testing uses the statistical analogue of a proof technique in mathematics called proof by contradiction. For our hypothesis test, we first say okay, let's assume that our null hypothesis H0 is true.

    1) Now we are in a world where H0 is true, or that some distributional statement holds. This is the truth in our world now.
    1)* stock returns are normally distributed

    2) If H0 is true, then the probability of this event happening is extremely low, or P(A|H0) = extremely low
    2)* If stock returns follow a normal distribution, then we should rarely see eight-sigma events in a 250-trading-day year, if at all in our lifetime

    3) Therefore, we have statistically convincing but not definitive evidence that H0 is not true
    3)* We observe (clueless guess) multiple eight-sigma events a year/decade etc.; therefore it's reasonable to think that stock returns are indeed not normally distributed

    In short, all a hypothesis test is saying is this: suppose someone wins the Powerball lottery eleven times. Wouldn't you suspect that it is due not to random chance but to cheating, or that the lottery is not as hard to win as you thought, etc.?

    But I completely agree researchers abuse hypothesis tests when they have no idea how to properly use them....

  84. Hi RM,
    Let me rephrase your H0 about stock returns distributions.

    1) If a returns distribution is normal, then it is highly unlikely we will have a 6-sigma return.
    2) Our return is 6-sigma.
    3) Therefore it is highly unlikely the returns distribution is normal.

    Do you agree this is the logic?

    If you substitute "returns distribution is normal" with "a person is an American", and "a 6-sigma return" with "is a member of Congress", then we are back to the probabilistic syllogism which you have regarded as absurd.

    Ernie

  85. Hi Dr. Chan,

    I think there is a difference between the two examples.

    Example 1:
    1) If a person is an American then it is highly unlikely she is a member of Congress.
    2) The person is a member of Congress.
    3) Therefore it is highly unlikely she is an American.


    Example 2:
    1) If a returns distribution is normal, then it is highly unlikely we will have a 6-sigma return.
    2) Our return is 6-sigma.
    3) Therefore it is highly unlikely the returns distribution is normal.

    In the first example, Gill states 2), but when he refers to a member of Congress, he is referring only to members of the American Congress. The congressmen he uses for 2) are a strict subset of Americans, the group in 1) (he does not allow the group "Americans" to vary in any way). The universe for congressmen is too restrictive by being only American.

    In your example, example 2, a "6-sigma return" in 2) is not a strict subset of the normal distribution, the group in 1). In other words, there are 6-sigma returns for the Cauchy distribution, exponential distribution, uniform, etc.

    In my opinion, we should make the analogy:

    Example 1:
    1) If a person is an American then it is highly unlikely she is a member of Congress.
    2) The person is a member of Congress, including Congresses from various countries.
    3) Therefore it is highly unlikely she is an American.

    and Example 2:
    1) If a returns distribution is normal, then it is highly unlikely we will have a 6-sigma return.
    2) Our return is 6-sigma, including 6-sigma returns from various statistical distributions.
    3) Therefore it is highly unlikely the returns distribution is normal.
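
    The role of the alternatives can be made concrete with Bayes' rule, if one is willing to assign prior weights to candidate distributions (an assumption that goes beyond a classical hypothesis test). The sketch below is only an illustration: it compares a unit-variance normal against a single hypothetical alternative, a unit-variance Laplace distribution, and uses the tail probability of a move of at least k sigma as the likelihood. The 50/50 prior and the function names are made up.

```python
import math


def normal_tail(k: float) -> float:
    """P(|X| >= k) for a normal distribution with unit variance."""
    return math.erfc(k / math.sqrt(2.0))


def laplace_tail(k: float) -> float:
    """P(|X| >= k) for a Laplace distribution with unit variance.

    A Laplace distribution with scale b has variance 2*b**2, so unit
    variance means b = 1/sqrt(2), and the two-sided tail is exp(-k/b).
    """
    return math.exp(-k * math.sqrt(2.0))


def posterior_normal(k: float, prior_normal: float = 0.5) -> float:
    """P(normal | move of at least k sigma) with Laplace as the only alternative."""
    weight_normal = normal_tail(k) * prior_normal
    weight_laplace = laplace_tail(k) * (1.0 - prior_normal)
    return weight_normal / (weight_normal + weight_laplace)


if __name__ == "__main__":
    print(f"P(6-sigma | normal)  = {normal_tail(6.0):.3e}")
    print(f"P(6-sigma | Laplace) = {laplace_tail(6.0):.3e}")
    print(f"P(normal | 6-sigma)  = {posterior_normal(6.0):.3e}")
```

    With these assumed inputs the posterior weight on the normal distribution is tiny, but only because the alternative makes a 6-sigma move far more likely. Change the prior or the set of alternatives and the answer changes, which is the point of contention in this thread.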

    I agree with your sentiment that rejection of the null hypothesis is clearly not enough for backtesting strategies (I am planning to comment again tomorrow to ask for your ideas on what else we can do), but I have to disagree with Jeff Gill's opinions regarding hypothesis testing. On the other hand, some push-back is definitely needed against the number of researchers who blindly run regressions until their p-value drops below that magical arbitrary threshold of 0.05 so that they can tell a story to fill in the rest.
    There have also been papers recently that studied the number of academic papers reporting various p-values, which show a very obvious game being played; I'll try to remember the name.

  86. Hi RM,
    Thanks for the clarification.

    In my opinion, hypothesis testing can still be useful in backtesting as a way to reject weak strategies, though it can't positively affirm that a good strategy's results are not due to luck alone.

    Also, as my forthcoming book will show, sometimes the failure to reject a null hypothesis leads to interesting new insights about what drives the profits of a strategy!

    Ernie

  87. Hi Ernie, just curious - if the null hypothesis is simply the detrended strategy return (as recommended in "Evidence Based Technical Analysis"), would this not be sufficient to support a good hypothesis test? The purpose is simply to create a reference sample distribution which can be used to test whether the algo returns are likely significant. This feels like a straightforward and reasonably valuable test to me.
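
    A minimal sketch of what such a test could look like, assuming the strategy's per-period returns are available as a plain list of floats. It subtracts the mean so the reference sample has zero mean, resamples with replacement, and counts how often a resampled mean matches or beats the observed one. This is an illustration only and not necessarily the exact procedure in the book; the function name and defaults are made up, and resampling single periods independently ignores any serial correlation in the returns.

```python
import random
import statistics


def detrended_bootstrap_pvalue(returns, n_resamples=5000, seed=0):
    """One-sided bootstrap p-value for H0: the true mean return is zero.

    returns     : sequence of per-period strategy returns
    n_resamples : number of bootstrap resamples (illustrative default)
    seed        : seed for reproducibility
    """
    rng = random.Random(seed)
    n = len(returns)
    observed_mean = statistics.fmean(returns)

    # Detrend: shift the sample so that its mean is exactly zero under H0.
    detrended = [r - observed_mean for r in returns]

    hits = 0
    for _ in range(n_resamples):
        resample = rng.choices(detrended, k=n)  # sampling with replacement
        if statistics.fmean(resample) >= observed_mean:
            hits += 1

    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


if __name__ == "__main__":
    # Made-up daily returns, purely for illustration.
    demo_rng = random.Random(42)
    demo_returns = [demo_rng.gauss(0.0005, 0.01) for _ in range(250)]
    print(f"p-value = {detrended_bootstrap_pvalue(demo_returns):.4f}")
```

    A small p-value here only says that a zero-mean version of the same returns rarely produces a mean this high; it is still P(R|H0), not P(H0|R).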

  88. Hi Anon,
    If the null hypothesis is the mean return of a strategy being zero (or some generalization of that), and you rejected it, that doesn't mean that your mean return is really non-zero. However, if you can't reject it, you can be pretty sure the strategy is very weak or is just random.
    Ernie
