One of my mentoring students sent me a really great question today which I thought was such a useful lesson that I wanted to share it with everyone in my community…
This is important because getting this wrong risks deluding yourself about how good your trading system is (at best)… At worst it risks you losing huge amounts of money before you realise something is wrong.
Here is the question…
“When I backtest my trading system on the list of US Stocks that are Optionable, I get a significant performance improvement… should I add that as a filter to my rules?”
Now before you say “yeah yeah, I know all about survivorship bias”, please read on because the problem and the solution might be more subtle than you think…
The problem is that when you backtest with certain lists of stocks, you are actually cheating by giving your system advance knowledge that certain stocks will do well.
Why is this and which stock lists are a problem?
The problem is called Survivorship Bias, and the most obvious trap people fall into is backtesting their system on the current list of S&P500 stocks.
If you backtest your long side system over the last 30 years on the current list of S&P500 stocks, then you have a real problem because the stocks that make up the index today were not necessarily in the index 30 years ago, and the stocks that were in the index 30 years ago may not be in it today.
The impact of this is HUGE in backtesting because if you test your rules over the current list of S&P500 stocks then you are ONLY testing on stocks which SURVIVED and QUALIFIED to be in the index in the present day… but 30 years ago when your backtest ‘started’ you could not have possibly known which stocks would grow and become the 500 biggest stocks on the market and which ones would have failed…
Do you remember the movie ‘Back to the Future’?
Maybe you are a geek like me and loved that movie… just in case you haven’t seen it *gasp* (or maybe I am just getting old… it was released in 1985) let me explain.
In the movie the main character Marty (Michael J Fox) uses a time machine called the DeLorean to travel back and forward in time. He ends up causing all sorts of unintended consequences, has emotional highs and has to deal with disasters of his own making.
Anyway, the key lessons are
- If you haven’t watched the movie, you should… and
- Time travel has unintended consequences which are not good
When you backtest using a list of stocks that is subject to survivorship bias, you are effectively giving your backtest the gift of time travel.
I usually test from 1990 or 1992 depending on which historical stock data I am using. So 30 years ago when your backtest starts Facebook was not part of the S&P500 yet… in fact Facebook didn’t even exist back then… but by running your backtest on the current day index constituents of the S&P500 you are giving your backtest advance knowledge that this stock is going to be a winner! In fact Facebook only listed on the stock market in May 2012 and didn’t join the S&P500 until December 2013.
The distorting effect of giving your trading system this advance knowledge may not cause the collapse of social order and alter history like Marty did in Back To The Future, but it will cause you to wrongly think you have an AMAZING trading system on your hands.
AND THAT COULD BE FINANCIALLY DEVASTATING!
After all, how could your system NOT make money if you knew in advance which stocks would grow to become the biggest stocks in the market 30 years from now!
With that type of time travel you would be guaranteed to make a fortune, and that is exactly what your flawed backtest will show if you backtest your system using the current list of S&P500 stocks.
Survivorship bias is a trading mistake caused by backtesting on a list of stocks that you could not have chosen at the time each trade was taken, given the information that was available to you as a trader on that date.
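To make the size of the effect concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: the 2,000 stocks are simulated random walks with no edge at all, and “today’s index” is simply the 500 stocks with the highest final price. None of the names or numbers come from real market data.

```python
import random

def simulate_universe(n_stocks=2000, n_years=30, seed=42):
    """Return one yearly price path per hypothetical stock."""
    rng = random.Random(seed)
    paths = []
    for _ in range(n_stocks):
        price = 10.0
        path = [price]
        for _ in range(n_years):
            # Assumption: yearly returns are random with zero average edge.
            price *= 1.0 + rng.gauss(0.0, 0.30)
            price = max(price, 0.0)  # a stock cannot trade below zero
            path.append(price)
        paths.append(path)
    return paths

def average_total_return(paths):
    """Equal-weighted buy-and-hold return from the first bar to the last."""
    returns = [p[-1] / p[0] - 1.0 for p in paths]
    return sum(returns) / len(returns)

paths = simulate_universe()

# "Today's index": the 500 stocks that ended up with the highest price.
survivors = sorted(paths, key=lambda p: p[-1], reverse=True)[:500]

full_universe_return = average_total_return(paths)
survivor_return = average_total_return(survivors)

print(f"All 2000 stocks: {full_universe_return:+.1%}")
print(f"Today's top 500: {survivor_return:+.1%}")
```

Even though no stock in this toy universe has any edge, buying and holding “today’s top 500” from the start looks far better than buying the whole universe, purely because the list was chosen with hindsight.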
OK – Now we all know what survivorship bias is…
When is it a problem and what can you do about it?
Let’s tackle the ‘when’ first
When is Survivorship Bias a risk in Trading System Development?
Survivorship bias is potentially a problem any time you backtest on a list of stocks that can change over time due to the size and liquidity of the companies on the list. For example:
- Any list of companies currently in any stock index: S&P500, Dow30, Nasdaq 100, ASX200 etc. These stocks were not always in the index; at one time many of them were just small caps with growth aspirations.
- Stocks that are currently Optionable: Investopedia says that “An optionable stock is one where the stock has the necessary liquidity such that a market maker, like a bank or an accredited financial institution, lists that stock’s options for trading.” Clearly a new company that has just been listed is unlikely to be optionable, so backtesting with the list of currently optionable stocks gives rise to survivorship bias and unachievable results.
- Stocks that are currently Shortable: For a stock to be shortable it must be borrow-able… for a stock to be borrow-able it must be held in fairly large quantities by mutual funds and large investors, which means it is probably a fairly popular stock. Just because a stock is shortable today does not mean it was big enough to be shortable in the past.
- Stocks that migrate along the value chain: If you backtest your system on a current list of Gold Producers then you WILL have a problem, because a company that is a large gold producer today might have been a microcap gold explorer 30 years ago… 10 years ago they may have found gold and gone into production and changed from an explorer to a producer. As a result the share price would have risen dramatically. So preselecting producers would cause you a survivorship bias problem too!
Backtesting with watchlists like the above will give you unrealistically good results because the system has advance knowledge of what will happen with the stocks before it was possible to know that information.
But not all types of watchlists cause this problem. It is usually only watchlists that stocks have to somehow ‘qualify’ to be part of, where that qualification can change over time depending on the share price and liquidity of the company.
Other watchlists like Industry / Sector watchlists do not really have this problem. If you just backtest your trading system on Gold stocks for instance you won’t have this problem because a gold stock is a gold stock…
As a general rule industry and sector lists don’t cause a problem, but any list that changes over time due to company size, liquidity, profitability or share price will cause a problem.
What is the solution to Survivorship Bias In Backtesting?
The easiest solution is to not get into the problem in the first place… This means designing and backtesting your stock trading system using lists of stocks that are not prone to survivorship bias.
For this solution you can backtest over either:
- The entire market
- All stocks on a certain exchange (all Nasdaq stocks for example)
- Stocks in a certain industry classification (provided you are mindful of the gold example above)
How to backtest your trading system on stocks in a stock index
If you really want to backtest your system on watchlists whose constituents change over time (like the stock indices) then you need to purchase data which has historically accurate index constituents.
There is only one provider that I know of and recommend that has this data – Norgate Data.
Norgate’s data service is excellent:
- It is easy to install
- You can set it to perform updates automatically
- It manages stock splits and corporate actions automatically
- It provides historically accurate index constituents for ASX and US Stocks
(Sadly no other exchanges are currently covered)
- They provide full delisted stock history back to the early 1990s
- Their data integrates seamlessly with Amibroker for backtesting and managing your trading systems
With just a few additional lines of code in your Amibroker trading system, using Norgate data you can backtest your trading system on historically accurate index constituents AND delisted stocks.
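The idea behind historically accurate constituents can be sketched in a few lines. The example below is in Python rather than Amibroker code, and it is not Norgate’s API: the membership table, the symbols and the dates are all made up for illustration. The only point is the rule that a signal is accepted when the stock was in the index on the signal date, and rejected otherwise.

```python
from datetime import date

# Hypothetical membership records: (symbol, date added, date removed or None).
# These symbols and dates are invented; they are NOT real index history.
MEMBERSHIP = [
    ("AAA", date(1990, 1, 2), None),
    ("BBB", date(1990, 1, 2), date(2001, 6, 29)),
    ("CCC", date(2013, 12, 23), None),
]

def was_member(symbol, on_date, membership=MEMBERSHIP):
    """True if `symbol` was in the index on `on_date`."""
    for sym, added, removed in membership:
        in_window = added <= on_date and (removed is None or on_date < removed)
        if sym == symbol and in_window:
            return True
    return False

def filter_signals(signals, membership=MEMBERSHIP):
    """Keep only buy signals for stocks that were members on the signal date."""
    return [(sym, d) for sym, d in signals if was_member(sym, d, membership)]

signals = [
    ("AAA", date(1995, 3, 1)),  # member since 1990, so accepted
    ("BBB", date(2005, 3, 1)),  # removed in 2001, so rejected
    ("CCC", date(2010, 3, 1)),  # not yet a member, so rejected
    ("CCC", date(2015, 3, 1)),  # member by then, so accepted
]

tradable = filter_signals(signals)
for sym, d in tradable:
    print(sym, d.isoformat())
```

Notice that the same stock (CCC here) can be rejected early in the backtest and accepted later, which is exactly what a current-day constituent list cannot do.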
If you want to backtest your stock trading system on historically accurate index constituents, check out Norgate’s offering – you won’t be disappointed.
Adrian, it seems that one can backtest with ALL stocks initially to derive parameters, and then decide to trade using Optionable stocks when using it in practice. You won’t get the same results as the master backtest, but you’ll be using statistical preferences that came from the full body of all stocks.
What do you say to this idea?
Also, for my own custom backtester (not AmiBroker), I have been taking a full market snapshot every day, and I have a lengthy history of folders, showing “how the market looked on that day” in history. So the backtester works its way forward using the snapshots, not just a price history for each symbol. Is Norgate providing something similar here?
Great question – thank you for asking!
The potential risk is that you are trading your system on a different subset of stocks than you backtested it on. While these Optionable stocks were in the backtest, they were not the whole universe the system was backtested on, so there is a real risk that they perform completely differently from the dataset as a whole. It may be much better or much worse and you just don’t know!
The best guidance I can give here is “Trade what you test AND Test what you trade”. If you are trading something different than you have explicitly and specifically tested then you are putting yourself at risk unnecessarily. Backtesting is important because it gives you an indication about how profitable your system might be, but if you backtest one thing and trade another (even if there is an overlap between the two populations) then you actually don’t know whether or how profitable the trading system really is.
Norgate data works by having a full history of every single stock on the market and also a day by day list of which stocks were in which Index list. That way you can basically say on any day in the backtest, if this stock was in the S&P500 Stock Index on the day the system generated the buy signal in the backtest then take the trade. This sounds similar though probably not identical to what you are describing. There is a lot of work to maintain the list of what stocks were in every index on every day over the entire backtest period. I certainly would not try to do that work on my own (But I am not an avid coder… just a trader who uses code to serve my purposes).
I hope that helps – Thanks again for the great comment and trading question Andrew!
Thank you Adrian! I will definitely look at the Norgate data. It was recommended to me by other sources also.
As an additional comment, I think the more advanced way to think about all of this is with a proper Monte Carlo simulation. A backtest can’t know what it didn’t see. You can test it forward on new data and verify that the statistics are consistent between a distant past and a less distant past. BUT ALSO the market didn’t have to play out the way that it did in the new forward data. So you’d need to play out alternate market scenarios to see how results might vary if things went wildly differently. The best backtest then would be one whose parameters still hold up after some wildly different (Monte Carlo) market outcomes.
All of this, to me, would be in addition to avoiding the survivorship bias that you’ve been (very correctly) cautioning about.
Thank you for making us aware of it, and reminding us of solid statistical principles!
Absolutely agree – Monte Carlo simulation is a very useful tool and can be used in quite a few different ways to get deeper insights into the performance of a trading system. I will actually be running an advanced trading system development program in the near future which will cover various applications of Monte Carlo simulation amongst many other interesting techniques which help in the trading system development process… so keep an eye out for that one 😉
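For anyone who has not seen Monte Carlo simulation applied to a trading system, here is a minimal sketch of one simple variant: resampling the list of trade results with replacement to see how the maximum drawdown might vary. It is not the full “alternate market scenarios” approach described above, and the trade returns are hypothetical numbers made up for illustration.

```python
import random

def max_drawdown(trade_returns, start_equity=1.0):
    """Largest peak-to-trough fall in equity, as a fraction of the peak."""
    equity = peak = start_equity
    worst = 0.0
    for r in trade_returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, 1.0 - equity / peak)
    return worst

def monte_carlo_drawdowns(trade_returns, n_runs=1000, seed=1):
    """Resample the trades with replacement; return the sorted max drawdowns."""
    rng = random.Random(seed)
    n = len(trade_returns)
    results = []
    for _ in range(n_runs):
        sample = [rng.choice(trade_returns) for _ in range(n)]
        results.append(max_drawdown(sample))
    return sorted(results)

# Hypothetical per-trade returns (fraction of equity), invented for this example.
trades = [0.04, -0.02, 0.07, -0.03, -0.01, 0.05, -0.04, 0.02, -0.02, 0.06]

drawdowns = monte_carlo_drawdowns(trades)
median_dd = drawdowns[len(drawdowns) // 2]
p95_dd = drawdowns[int(0.95 * len(drawdowns))]

print(f"Median max drawdown:          {median_dd:.1%}")
print(f"95th percentile max drawdown: {p95_dd:.1%}")
```

The single drawdown from your backtest is just one draw from this distribution, so the 95th percentile figure is usually a more sobering number to plan around.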
Excellent article Adrian!
While many understand the importance and potential value of back testing, as you note here, having a clear and full understanding of your sampling is critical.
Good that you teach such things along with developing cogent strategies for back testing so that it is meaningful and useful without getting you into potential trouble.
Thanks for publishing this one!
Thank you so much for your great comment Brian! I appreciate you taking the time to write.
In case anyone reading has not seen my trading debate video with Brian you can check it out here: https://enlightenedstocktrading.com/the-great-trading-debate-what-is-more-important-your-trading-psychology-or-your-trading-system/
Brian let’s collaborate on something again soon!!
Great content, Adrian. I’ve read quite a bit about survivorship bias already. But you approach this interesting topic from many more angles. Thank you
Thank you for your great comment, I am really glad this article on survivorship bias has helped you deepen your understanding of the issue and how it relates to your stock trading system development!