So today we are talking about going beyond the backtest: better optimization and robustness testing. And look, I don’t know about you, but I think it’s pretty easy to put a bunch of indicators and rules into some backtesting software and mash the backtest button.
And then it’s also pretty easy to throw a whole bunch of other indicators and filters and rules against it and get a pretty nice looking equity curve. But if you’ve been trading systematically or algorithmically for more than about three minutes, you’ll probably realize that doing that doesn’t actually generate much in the way of profit; it generates a whole bunch of pain.
So what I want to do tonight is share with you a few techniques that I use to help you go beyond the typical backtest and more deeply understand, and get a feel for, whether the system is likely to be profitable in the future with real time trading.
Before we go any further, I just want to remind you, as always, that the content of tonight’s presentation is for general educational purposes only. It certainly should not be considered to be personal financial advice. If you have any doubts or uncertainties about that, please consult with your financial advisor before making any decisions.
Would you like to improve your trading systems and build greater confidence in those systems? If so, you’re in the right place. I’ll give you a couple of techniques to help you do exactly that.
First lesson: this is not a technical lesson. This is an engagement lesson. Learning requires energy. What that means is I’m going to put a bunch of energy into this presentation. I’m going to be giving you some nice ideas and sharing some examples with you. But in order for you to really get the lessons, you need to put some energy back in as well.
So that means not just sitting there treating Adrian like Netflix. I know I’m super entertaining, but it’s far more powerful if you actually write notes in your notebook. So make sure you’ve got a notebook. Write down questions in the chat. Give me a head nod if you get it. Give me a “what?” if that makes no sense.
Just engage in the whole process as if we’re sitting in the same room as each other and I promise you, it’ll be far more useful for you at the end of the hour. Good? This is the moment where you go, “Yes. Cool. Okay. I’m engaged.” Alrighty. So here are my goals for you today.
I want to share some common problems in systematic trading system development. We’re going to go through the most powerful technique that I know of to harness your emotions and improve your results. And unless you’re one of my students, and I can see a few of my students in the room,
this is probably going to be a little unexpected, but that’s great because this is a super useful, super easy and powerful technique. I’m also going to share with you a clever way to make better use of your historical data and get more from the same amount of data. I’m going to share with you a simple way to evaluate system robustness
and make sure that when you run an optimization, you’ve done it well and effectively. And then hopefully we’ve got some time for open discussion to share some additional ideas. Now, I also want to emphasize that while I’ve been doing this a long time (I’ve been trading well over 20 years now,
and I’ve been systematic for the vast majority of that time), I can see some people in the room who I know have a lot of experience. So when we get to the general discussion time, I’d love to hear from you guys about what you think, your other ideas and what else you do. And we’re also going to talk right now about some of the challenges.
And for those of you that have been doing it for a while, you will know some of the challenges, so please share because I want the knowledge in the room to be shared around as well. That’s one of the great things about having a good community like this. Good? Great. Okay, cool. Alrighty, let’s jump into the common problems in trading system development.
First, emotions get in the way and can lead to self-sabotage. The second one is insufficient data to correctly develop and optimize our systems.
And then finally, overfitting systems to past data.
So how many of you have followed a system and found yourself not comfortable with the signal that it generated? Raise your hand. Anyone? Okay, great. How many of you have traded a system and you’ve ended up in a drawdown or a losing streak or something where you thought, oh, that’s not really what I signed up for?
Raise your hand. Okay, cool. And how many of you have thought, should I turn this system off? It seems to be broken. I don’t know what I should do. Maybe I will. Oh yeah. No, maybe I’ll just scale it down or yeah. Anyone? Okay, good. So that is emotions getting in the way. Now we’re supposed to be systematic algorithmic traders, and so therefore emotions shouldn’t be coming into our trading, but they do.
And we are emotional beasts, we can’t avoid it. Humanity will have emotions, it’s part of us, we can’t avoid it. And what’s interesting is, as traders, when we discover systematic trading, the kind of internal excitement, for me at least, was, “Oh my god, finally I’ve got a way to get the emotions out of the way.”
But then I fairly quickly learned that it didn’t get all of the emotions out of the way. It eliminated a lot of the day to day emotion because I didn’t have to look at every single chart and make every single trading decision anymore, thank God, because that was taking hours and hours every single night.
Instead I could just press the backtest button, generate my signals for the next day and place the trades, which takes just a few minutes. So that was great, except I very quickly found that I would get uncomfortable or excited about different trades or different market environments. And this is a real challenge if you don’t know how to harness it.
And so here’s what happens. Let’s talk about how to harness your emotions and improve your results. There’s a bunch of causes of the emotional reactions that you have when you’re following your systems. Now, if you’re 100 percent automated, your reactions are probably not going to be driven by individual trades, because you’re probably not looking at individual trades on a trade by trade basis.
Instead, your emotional reactions are probably driven by the equity curve. But at least at the beginning of your journey, I think it’s highly advisable to look at every single trade and understand exactly what’s happening. And there’s a bunch of things that cause us to have emotional reactions.
Things like, this doesn’t look like the move that my system is supposed to catch. I’m getting a signal, the indicators all line up, and I know that’s giving me the buy signal, but the behavior of the market isn’t what I expected would get me in. Or, my system’s not giving me enough signals.
I’m just sitting here waiting and I want to trade more often. Why aren’t I getting more signals? Or I’m getting too many signals. This system is going into overdrive. It’s taking too many trades. I’m not comfortable with that. Or maybe, you’re looking at a chart for a trade that you’ve generated.
And last time you traded on a chart that looked like that, you lost money, and that’s a concern. Or the chart doesn’t look like a good entry. Or the stock is just going sideways and it just happened to trigger a buy but it doesn’t really look like it’s moving in the right direction. Or maybe you traded this same stock or the same instrument before and the last couple of times you traded it, those trades were losses, and so you’re resisting placing that trade.
Or maybe you’re in a deep drawdown and you’re worried that maybe you should stop trading that system. Or the economy is weak and you think maybe I should just turn the system off and wait till times are better. These sorts of things come up even though we’re algorithmic traders, and even if we’re fully automated.
If you’re watching the markets and looking at what trades you’re getting in and out of, you’re going to have emotional reactions to it. And what happens when we have emotional reactions? Sorry, here’s all of those things in case you couldn’t see them. What happens when you have emotional reactions?
It manifests in different ways for different traders and different parts of the strategy. It could manifest on the entry: maybe you skip entry signals, or maybe you enter before you get a signal, or you get a signal and you enter a little bit late. On the exit, maybe you exit early because you want to get out of that trade, or maybe you hold after the exit signal. Again, if you’re not fully automated, these things can happen.
On position sizing, you could trade at reduced size if you’re fearful of the current conditions, or you could trade at increased size if you’re overly confident about the current market conditions. On risk management, maybe you take too few positions or too many positions. Maybe you move your stops closer than you should because you want to protect some of your profits, or you are worried that the market is going the wrong way.
Maybe you move stops further away because you feel like the stock is definitely going to turn around. It can manifest in system management, where you turn a system off because all of a sudden you’re not comfortable with it anymore. Or you increase or decrease your allocation to a certain system
because of how you feel about it. Now there’s something that all of these things have in common, and that is that basically they’re all a mistake. Almost all actions we take where we manually intervene with our trading, once we’ve developed our system and properly tested it, are a mistake, and almost all of them will destroy value.
Every now and then you might do something that adds value. But the trouble is it’s pretty random and just like gambling, you get an infrequent random payoff from doing some action and it makes you money and you get addicted to that. And so it becomes very hard to avoid manually intervening and tweaking and fiddling with your systems.
Make sense? So what do we do? First of all, we need to understand the process that we’re actually going through. Now remember, I’m approaching this first from the perspective of a non-automated trader, someone executing manually, but the same applies to automated trading. It’s just that when you are looking at your system and how it’s performing, you’re having the emotional reaction to what’s going on after the event, because typically the automation has already placed the trades.
Yeah, so you’ve still got emotions, you’ve just got to observe them. So this is what happens typically for the manual trader. There’s a process we go through and it only takes seconds or milliseconds. First of all, we receive the trading signal. It’s either a buy, a sell, a hold, or a wait, or basically something that we have to do or not do in the markets.
Okay. Then we have a subconscious evaluation of that signal or action that we’re supposed to take. And that subconscious evaluation typically comes from us looking at chart patterns, news, social media, past losses, past gains, recent results, what we saw on Twitter, how we’re feeling that day, whether we’re at equity highs, or whether we’re in drawdown. All sorts of things come into this subconscious evaluation.
Now this is not rational. This just happens because we’re emotional beasts, right? Once we’ve undergone that subconscious evaluation of this action that we’re supposed to take in the market, then we have this internal psychological battle that happens. So the internal struggle is between confidence and discipline on one hand, and fear and greed on the other hand.
And typically, one of those will win out. It depends on how confident and certain you are about the quality of your strategy. If you’ve got a lot of uncertainty, a lot of fear, a lot of psychological scarring from previous trading, then that balance can be tipped the wrong way.
And then very soon, we’re going to come to a decision and we’re going to make one of three decisions. We’re going to either follow the system, we’re going to bend the system, or we’re going to break the system. And the great thing about fully automated trading is that you’re much less likely to bend or break the system because the computer is placing the trades.
However, you’ve got to be really careful with assuming that you’re not bending and breaking the system because even though the computer is placing the trades for you, who has the control to turn the algo off? We do. Who has the control to adjust the capital allocation that we give the algo or the system?
We do. Who has the ability to change the position size or the level of leverage? We do. And so even though the computer might be placing the trades, we can still intervene, and that intervention can still be driven by this subconscious process and the level of emotional impact that we feel as a result of that strategy.
So the trouble is, obviously, bending and breaking a system is very high risk. We know that consciously and rationally, but it doesn’t stop us feeling the psychological pressure. So what do we do about that? We’ve got to look at this process and try and come up with a better process that we can consciously put in place to harness the subconscious evaluation rather than be a victim of it.
Make sense? So here’s what we want to do. I want to change this process to reinforce our confidence and improve our systems rather than undermine and break them. So here’s a new process that I want to propose, and this is what I use. I’ve done this hundreds and hundreds of times, and it really helps.
So the first thing you want to do is we’re going to receive a trading signal or look at some trading stimulus. Again, if we’re automated, you might be looking at your trades just after they’ve been placed or as they’re being placed. Or you might be debriefing at the end of the day. Whatever it is, you’re going to look at the signals that you’re getting.
Yeah, so you receive the signals. And then you’re going to do a conscious evaluation of that signal. Okay. You’re going to actually look at it and think about it and feel about it. How do I feel about this? What am I seeing in the markets? After that, you’re going to list all of your sources of uncertainty, right?
So what makes you uncomfortable? What makes you unhappy? What makes you stressed, worried, or excited, or keen to place that trade? What are the things you’re seeing that give you some sort of emotional reaction, either positive or negative about that. Okay, now I know as traders we’re supposed to avoid and eliminate the emotions, but bear with me because this is actually a far better way to trade, right?
So you’ve got the trade, you’ve looked at it consciously, and you’ve listed your sources of uncertainty. Oh, I don’t like that. Oh, that makes me nervous. Oh, the market’s a bit down. Oh, Trump nearly got assassinated. Oh, this happened, or whatever it is. Alright, you want to list all of the things that make you uncomfortable or overly positive about that action you’re about to take or you just took in the market.
Make sense? Once you’ve listed all of those sources of uncertainty, what you want to do is try as much as you can to find or create rules to isolate and/or exclude those sources of uncertainty. So what that means is, let’s say you look at the chart and you go, “Oh, that stock suddenly looks really volatile.
I’m not comfortable buying that or I don’t like the look of that. Then the rule to isolate that would be something to do with the volatility of that instrument now compared to the volatility in the past or something to do with the absolute level of volatility of that instrument. Does that make sense?
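As an illustration of turning that kind of discomfort into a testable rule, here is a minimal sketch of a relative volatility filter. Everything in it is an assumption for illustration only: the function names, the 20-day and 100-day windows, and the 1.5 threshold are hypothetical and are not the speaker’s actual rule.

```python
import statistics

def volatility_ratio(closes, recent=20, baseline=100):
    """Recent volatility of daily returns divided by longer-run volatility."""
    returns = [closes[i] / closes[i - 1] - 1 for i in range(1, len(closes))]
    if len(returns) < baseline:
        raise ValueError("need at least baseline + 1 closing prices")
    recent_vol = statistics.pstdev(returns[-recent:])
    baseline_vol = statistics.pstdev(returns[-baseline:])
    if baseline_vol == 0:
        return 1.0  # perfectly flat series: treat volatility as unchanged
    return recent_vol / baseline_vol

def passes_volatility_rule(closes, max_ratio=1.5, recent=20, baseline=100):
    """True if the instrument is NOT unusually volatile compared with its own past.

    max_ratio is a hypothetical threshold; it would itself need to be tested.
    """
    return volatility_ratio(closes, recent, baseline) <= max_ratio
```

A rule like this only captures the “volatility now compared to the past” idea. The “absolute level of volatility” variant would compare `recent_vol` against a fixed number instead.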
So you’ve listed your sources of uncertainty, or you’ve thought about the emotional reactions you’ve had to those trades, or that trading activity during the day, and then you’ve created a rule to isolate that behavior. For one trade, you might have three or four or more things that cause you to be either uncertain or overly positive about that particular trade.
And that’s okay. What you want to do is create a rule to isolate each one of those things individually. Maybe it’s gappy and you create a rule about the amount of gaps to eliminate stocks like that. Maybe it’s a really nice smooth trend and you’re feeling like, oh, I really can’t wait to buy this stock.
So you create a rule to isolate stocks that really look like they’re trending smoothly like that. Whatever the response is, you’re creating a rule to either exclude or isolate that behavior. Once you’ve done that, you’re going to codify those rules. So first of all you do it conceptually, then you codify it so you’ve got an extra rule to add into your system.
And there’s going to be one rule for each of the sources of uncertainty that you identified. Yeah? So we list the sources of uncertainty, we conceptually create some rule to exclude or isolate that, and then we codify it. Once we’ve codified it, then we’re going to backtest those rules objectively.
So each of our emotional reactions, either positive or negative, will lead to several rules. We’re going to add those rules to the system one at a time. We’re going to add the first one. Backtest it, see how it performs, then remove it, then add the second one, backtest it, see how it performs, and remove it.
And we’re going to do that every single time we have an emotional reaction or some sort of concern about a trade or a situation that the system generates. Once we have that, what we’re going to do is discard the worthless rules and incorporate the promising rules. And then finally, we get to the point where we have tested our emotions, so that we know whether or not those emotions have a profitable impact or a detrimental impact on the P&L of the system or the equity curve of the system.
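The add-one-rule, test, remove loop can be sketched roughly as follows. This is a simplified stand-in under stated assumptions: `evaluate` just averages profit per trade over a filtered trade list instead of rerunning a full backtest, and the sample trades, feature names and rule names are all hypothetical.

```python
from statistics import mean

# Hypothetical closed trades from a baseline backtest, with the features
# that the candidate rules look at.
SAMPLE_TRADES = [
    {"pnl_pct": 10.0, "gappy": False, "vol_ratio": 1.0},
    {"pnl_pct": -5.0, "gappy": True, "vol_ratio": 2.0},
    {"pnl_pct": 20.0, "gappy": False, "vol_ratio": 2.5},
    {"pnl_pct": -3.0, "gappy": False, "vol_ratio": 0.8},
]

# One candidate rule per source of uncertainty. A rule returns True to keep a trade.
CANDIDATE_RULES = {
    "exclude_gappy": lambda t: not t["gappy"],
    "exclude_high_volatility": lambda t: t["vol_ratio"] <= 1.5,
}

def evaluate(trades):
    """Stand-in for a real backtest metric: average profit per trade, in percent."""
    return mean(t["pnl_pct"] for t in trades) if trades else 0.0

def screen_rules(trades, rules):
    """Apply each rule on its own to the baseline and compare against the baseline."""
    baseline = evaluate(trades)
    report = {}
    for name, rule in rules.items():
        kept = [t for t in trades if rule(t)]  # this rule only, then "remove" it
        score = evaluate(kept)
        report[name] = {"score": score, "trades": len(kept), "keep": score > baseline}
    return baseline, report

baseline, report = screen_rules(SAMPLE_TRADES, CANDIDATE_RULES)
for name, result in report.items():
    print(name, result)
```

Four trades is far too few to conclude anything, and a real test would rerun the complete system with the rule added. The point of the sketch is only the shape of the loop: each rule is judged alone against the same baseline.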
Yeah. So rather than fighting and trying to bury the emotions, we’re actually trying to observe the emotions more than we naturally would. Because those reactions to the trades we’re generating are stimulus, and the stimulus gives us ideas to test, and the ideas are either, oh, get rid of those sorts of trades, or get more of those sorts of trades, and create a new rule to achieve that, and then we put that into the system and test it.
What you’ll find is if you do this every time you trade a system, then over time you’re going to have fewer and fewer emotional reactions, because you’re building more and more confidence in the rules. Because every time you have an emotional reaction, you create a rule and you test it.
What do you think is going to happen to the performance of the system when you add that rule generated from your emotional reaction? Is the system going to get better, generally, or is it going to get worse? What do you think? It’s generally going to get worse. Yeah. In fact, probably 98 percent of the time it gets worse.
That, I reckon, would be about my hit rate, because I’m just a normal person. I have plenty of emotional reactions and they don’t help in the vast majority of cases. But every now and then, if you do this often enough, and if you observe yourself deeply and observe the charts and see what’s happening and note your reactions to it and test them, every now and then you will find something that improves the performance of the system.
And even if that’s only 2 percent of the time, what’s the impact of that? Greater profit forever. Chucking away 98 percent of the things you test to find 2 percent of reactions that create something positive can actually be hugely profitable and worthwhile. Because even if you improve the performance of the system just a little bit over 5 or 10 years, compounding, that’s humongous.
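To put rough numbers on how a small edge compounds, here is a short worked example. The starting capital, the 20 percent baseline return and the 2 point improvement are all hypothetical inputs chosen for illustration; they are not results from the presentation.

```python
def final_equity(start, annual_return, years):
    """Equity after compounding a constant annual return."""
    return start * (1 + annual_return) ** years

START = 100_000  # hypothetical starting capital
YEARS = 10

base = final_equity(START, 0.20, YEARS)      # hypothetical baseline: 20% a year
improved = final_equity(START, 0.22, YEARS)  # hypothetical: one good rule adds 2 points

print(f"baseline:   {base:,.0f}")
print(f"improved:   {improved:,.0f}")
print(f"difference: {improved - base:,.0f}")
```

With these made-up inputs, the gap after ten years is a little over 110,000 on a 100,000 start, which is larger than the original account. Real returns are not constant from year to year, so treat this purely as arithmetic.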
Does that make sense? So we actually have a wealth of insight, most of which is emotional garbage, but maybe 2 percent of which will generate some extra profits for us. And if all we do is bury or suppress our emotions and just blindly follow the system, we don’t get to harness that. Yeah? But if we follow this process instead (every time we get a signal, note the emotional impact, list the sources of uncertainty, convert them to rules, code them, test them individually, discard the worthless ones),
we build greater confidence and discipline. So only two things can happen. We either increase our confidence to follow the system as it is because yet again our emotional reaction didn’t help, or we improve the system. So we convert our emotions to positive things in our trading: either increased confidence or increased profitability.
There is no downside. Whereas when we’re trading and we’re just trying to suppress the emotions or fight the emotions, then every now and then we make a mistake, we let it creep through. There is plenty of downside, right? And I’m sure every one of you trading has made an emotional mistake because we all do, right?
So instead of suppressing them, let’s harness them. And certainly you’ll find that you very quickly build confidence, but you’ll also find things that make you a much better trader, with much more profitable strategies. So how many of you have felt, once you’ve gone live with a system, “Oh, I didn’t really think that I’d get a trade that looked like that”?
Anyone thought that? Yeah. Yeah. How many of you thought, “Oh, that chart looks a bit weird, or gappy, or volatile, or hasn’t really gone anywhere, I don’t really want to buy that,” or, “Oh, that’s a perfect chart, I wish every signal looked like that!” These are all emotional reactions. And if you harness those, then again, you either increase confidence, or you improve system profitability.
And so this is the process that I would suggest that all of us as systematic traders do, and I still do this, though less and less over time. Every time I have a new strategy, I will do this a lot at the beginning of that strategy, as I’m really building the confidence, the insight, and trying to improve it.
But gradually, the amount of emotional reactions I have to individual trades or situations on the equity curve or trading days or market conditions, gradually, those reduce because I’ve tested them all and realized that, I can let that go. I don’t need to worry about that. At the very beginning, there’s a lot of this, and then it tapers off, and then I develop a new system.
There’s a lot of it, and then it tapers off. No matter what you find, your confidence in the system improves, and maybe, just maybe, you find something that improves your profitability. And look, I have. I can’t say I’ve found dozens and dozens of things that came out of my emotions that improved profitability, but I do have a handful of things that I’ve found from doing this that have made me hundreds and hundreds of thousands of dollars.
Just because they gave me a slightly bigger edge and compounded over time, it made a lot of money. And if you could test some emotional reactions and five years later that was worth a couple hundred thousand dollars to you, would that be worth it? Yes or no? Yeah, absolutely. So don’t fight it, use it.
That’s the trick. Make sense?
Okay, I want to talk about how to make better use of your historical data. Now again, there are a few challenges for optimization. And this section is in the context of portfolio systems.
Now in case the language is not familiar to you, there are single instrument systems and there are portfolio systems. In a single instrument system, one set of rules trades one particular ticker, and it just buys and sells that ticker. In a portfolio system, one set of rules trades a whole portfolio of stocks, like a trend following strategy on stocks. Maybe it holds 15, 20, 30 individual trades, and the system rules manage that portfolio.
Okay, so I’m talking about portfolio systems, I’m not talking about single instrument systems. Just so I’m clear, because this next strategy only works on the portfolio side. So there’s a couple of challenges for optimization of portfolio systems. The first one is, there’s often just insufficient data.
Even though we’ve got 20 or 30 years of market history on daily bars for many markets, there’s often just not enough data and not enough trades to really properly develop the system with a high degree of confidence. And if you use shorter bar sizes or shorter timeframes, then that can definitely be a problem as well.
Another problem is outlier trades can skew your optimization results. So maybe there’s a couple of big trades, and it’s pretty hard, if you’ve got some big positive outliers, for them not to skew the optimization decision that you make. Sometimes there’s also too few trades in the backtest. So you run your backtest, you run your optimization, and there’s only a relatively small number of trades at any one time.
And you end up overfitting your system rules to those particular trades. Another problem is the lumpy parameter space. When you run your optimization, the performance is quite lumpy. So it might be that it’s quite sensitive.
You adjust a parameter just a little bit and the performance improves a lot. And then you adjust a little bit more and the performance drops. So the parameter space is typically not a nice smooth surface. If you imagine a three-dimensional surface, where you want to pick the best performance right at the peak of that nice big mound of good performance, it’s typically not like that.
It’s typically lumpy and bumpy as hell. So it’s quite hard to find the best or the most stable sort of parameter set. Okay. Another problem is brute force optimization, which is where you take all of your variables, all your parameters, and optimize them all simultaneously. Typically, brute force optimization results in overfitting, and that’s back to the problem we raised at the beginning.
When you develop your system and optimize it and it looks great in the backtest, and then you start trading in real time and it disappoints, that’s because you’ve overfit to the past data. In real time trading the conditions are different, and because it’s overfit, the performance suffers.
And brute force optimization, when you optimize everything simultaneously, almost always results in overfitting. Then two other things. If you optimize with compounding, then again with a portfolio system, what happens is you end up ignoring a large number of possible signals that the system could have taken.
Because when you’ve got a compounding portfolio, let’s say it’s a trend following system, maybe it holds 20 trades at any one time, but there could have been 30 signals or 50 signals. But you ran out of capital, so you only take the first 20 of them, and then the portfolio grows and you get that nice exponential growth. But there’s lots of data that doesn’t make it into your optimization, so there’s trades that you really should consider that you haven’t considered.
And so that can be a real problem. And then if you want to avoid that, you can optimize without compounding. But there’s trouble if you optimize on an all trades test. An all trades test is a test where you run a backtest with a huge amount of capital and you make every trade a small consistent size.
So, say, 10 million dollars of starting capital in your backtest and every trade is 10 thousand dollars, so that you get every single trade into the backtest for your optimization. The problem with that is that the resulting systems don’t compound very well. You end up developing a system where the average profit per trade looks great, but it doesn’t compound nicely because you haven’t optimized on a compounding basis.
So these are some problems for developing or optimizing portfolio systems, and I’ve got a neat way to help with that. But first I want to talk a little more about this idea of the compounding backtest versus the all trades, or unconstrained capital, backtest. This is an equity curve for a trend following system.
It’s on daily bars. The actual model doesn’t matter; the rules are not important for this discussion. But this system has a position sizing model of 0.75 percent risk per trade. This backtest took 1,215 trades. It generated an annual return of 20% with a max drawdown of 23%, and the average profit per trade was 14.2%. It’s a reasonable trend following system. It certainly does a lot better than buy and hold in the market. It’s pretty easy to trade. Thank you. The trouble is, in order to generate this backtest, the backtest will reject a whole lot of signals because of the capital constraints.
Now these signals that are rejected are lost information. This is an opportunity to inform us about something that could happen to us in real time trading, but we missed it because the backtest couldn’t take that trade because it didn’t have enough capital. Make sense? So that’s the downside. But the upside is the compounding is realistic.
As in, assuming you take into account slippage and commissions, it shows you how the system would compound over time and what the drawdown would be. Therefore, it helps you understand whether or not this system is likely to meet your objectives. So compounding backtests give you some good advantages when it comes to meeting your objectives, but a big disadvantage in that they leave a lot of potentially useful information on the table.
So that’s a bit of a problem. And the key thing here is that the trade is only included in the backtest if there is sufficient cash at the time the signal is generated. Otherwise the backtest will ignore the trade. And that’s just like what you would do in real time trading. Yeah, if you’ve got a spare slot for that system, a spare bit of cash for that system, and you get a signal, you take the trade.
But if your system is fully allocated and you get a new signal, you ignore the trade, right? However, the nature of that trade that you’re ignoring is important. It could give you useful information when you’re optimizing the system, but we’re missing it, right? So when we optimize the system, we miss all of that useful information.
Compounding backtests allow us to optimise to meet our return and drawdown objectives, but they miss a large amount of trade level information.
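The way a capital-constrained backtest drops signals can be sketched with a toy simulation. This is an assumption-heavy illustration: the equal-slice position sizing, the two-position limit and the four sample signals are hypothetical, there is no slippage, commission or signal ranking, and it is not the 0.75 percent risk model described in the talk.

```python
def compounding_backtest(signals, start_equity=100_000.0, max_positions=2):
    """Toy capital-constrained backtest.

    signals: list of (entry_day, exit_day, return_fraction), sorted by entry_day.
    Each new trade gets an equal slice of current equity, with open trades
    valued at cost. Signals arriving when all slots are full are rejected.
    """
    cash = start_equity
    open_trades = []  # (exit_day, cost, return_fraction)
    taken, rejected = [], []
    for signal in signals:
        entry_day, exit_day, ret = signal
        # Release cash from trades that closed on or before this entry day.
        still_open = []
        for x_day, cost, x_ret in open_trades:
            if x_day <= entry_day:
                cash += cost * (1 + x_ret)
            else:
                still_open.append((x_day, cost, x_ret))
        open_trades = still_open

        if len(open_trades) >= max_positions or cash <= 0:
            rejected.append(signal)  # lost information for the optimizer
            continue
        equity = cash + sum(cost for _, cost, _ in open_trades)
        size = min(equity / max_positions, cash)
        cash -= size
        open_trades.append((exit_day, size, ret))
        taken.append(signal)

    final_equity = cash + sum(cost * (1 + r) for _, cost, r in open_trades)
    return final_equity, taken, rejected

# Hypothetical signals: (entry_day, exit_day, return_fraction)
SIGNALS = [(1, 10, 0.10), (2, 5, -0.05), (3, 8, 0.30), (6, 12, 0.20)]
final_equity, taken, rejected = compounding_backtest(SIGNALS)
print(f"final equity: {final_equity:,.0f}")
print(f"taken: {len(taken)}, rejected: {rejected}")
```

In this made-up example the rejected signal happens to be the largest winner, so an optimizer looking only at the compounding run would never see it. The equity figure is still meaningful, because it reflects what the account could actually have held.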
. So the Unconstrained Capital Backtest or the All Trades Backtest. This is where you give, you tell the backtesting software, have 10 million to start and every trade is going to be small.
Every trade is a constant $1, 000 or a constant $10, 000. And so this is the same rules as what we saw previously, except when we run an All Trades test, it’s got, this generates 6, 104 trades. So there’s a lot more signals. Now this is great because it tells us every possible signal that we could have got with this strategy.
You should have a ranking formula to tell you which is the best signal, which is the second best signal on each given day. But you don’t miss any information. But the trouble is, because you’re giving it so much capital and you’re making every trade a consistent, a constant size over the life of the backtest, you don’t get compounding.
So you, you get good trade level stats. You can see here the average profit per trade is 13. 5%, but you don’t get the annual return and the drawdown stats calculating sensibly because you’ve got a large amount of capital and small constant trade size. There’s no compounding. So you don’t know if you run your optimization on this basis, you don’t know whether or not you’re going to meet your objectives of what annual return you want and what max drawdown you want.
So optimizing on an unconstrained capital basis, or an all trades basis, is problematic as well, but for different reasons. Yeah, so no compounding, but we do get every signal. We don’t lose any information. So we know if there’s any big ugly outliers waiting there to come get us. Or we know if there’s any huge positive winners that we really want to make sure we do capture, but we don’t know whether or not those rules will compound well.
And we don’t know whether the drawdown is going to kill us or not. So that’s a real problem. Neither of these two methods in isolation is perfect, right? The unconstrained capital backtest gives us all trading signals, but it doesn’t give us the ability to optimize for return and drawdown objectives, which is what we really need because we’re constructing a portfolio to meet our financial objectives.
So when we’re doing our optimization, I believe we shouldn’t do it on an all trades basis like this, because we don’t know whether or not it’s going to meet our objectives. And typically, if you optimize on an all trades basis, you might get an equity curve that’s nice and smooth and stable and you might get good average profit per trade, but it typically won’t end up compounding well.
And therefore it won’t contribute well to your portfolio. Because you haven’t optimized on a compounding basis, so you end up with trades that look good, but maybe they take too long to play out and they just don’t give you a good return on capital. So both have issues. We either give up a bunch of information or we give up the ability to see how compounding and drawdown impacts us.
So what do we do about this? What we want to do is capture the benefit of both of these approaches and avoid the downsides. And here’s a real simple way to do it. You run your optimization on a compounding basis. So just like the one on the left here, except instead of just testing the rules, you add an extra rule, which is a random miss, okay?
So every single trade has a random number assigned to it, and if that random number is less than, say, .7, then it gets into the backtest. But if it’s more than .7, it doesn’t. So this would reject 30 percent of trades randomly. And you also add an extra optimization parameter to the system. So this is just an optimization counter.
And this is, this just counts from 1 to 20 in steps of 1. And so when you press the optimize button, in whatever software you’re using, this is going to create 20 iterations of each parameter set. And each iteration is going to be slightly different because 30 percent of trades will be randomly skipped.
And so you’ll get compounding equity curves with different drawdown profiles and different return profiles. But they’re all different because 30 percent of trades are randomly skipped. And what that does is it allows you to get more information into the compounding backtest for your portfolio system.
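A minimal Python sketch of the random miss idea, not the presenter's actual backtesting code. It assumes a toy model in which each day offers a few ranked signals, only the top surviving signal gets a slot, and every trade opens and closes on the same day. The function name, the signal data, and the sizing values are all hypothetical.

```python
import random

def run_with_random_miss(days, keep_probability, seed,
                         slots_per_day=1, fraction_per_trade=0.10,
                         start_equity=100_000.0):
    """Toy compounding backtest with a random miss.

    `days` is a list of days; each day is a list of trade returns already
    sorted by the ranking formula (best signal first). Hypothetical data.
    """
    rng = random.Random(seed)  # one seed per optimization counter value
    equity = peak = start_equity
    max_drawdown = 0.0
    trades = 0
    for ranked_signals in days:
        # Random miss: each signal survives with probability keep_probability.
        survivors = [r for r in ranked_signals if rng.random() < keep_probability]
        # Capital constraint: only the top-ranked survivors get a slot.
        for trade_return in survivors[:slots_per_day]:
            equity *= 1.0 + fraction_per_trade * trade_return
            trades += 1
        peak = max(peak, equity)
        max_drawdown = max(max_drawdown, (peak - equity) / peak)
    return {"final_equity": equity, "max_drawdown": max_drawdown, "trades": trades}

# Hypothetical signals: 250 days, three ranked signals per day.
data_rng = random.Random(0)
days = [[data_rng.gauss(0.01, 0.08) for _ in range(3)] for _ in range(250)]

# Optimization counter from 1 to 20: same rules, different random skips.
runs = [run_with_random_miss(days, keep_probability=0.7, seed=counter)
        for counter in range(1, 21)]

# Same rules with no random miss, for comparison.
baseline = run_with_random_miss(days, keep_probability=1.0, seed=1)
```

Because the top-ranked signal is sometimes skipped, lower-ranked signals that a plain compounding backtest would never see get a chance to enter the equity curve, while position sizing still compounds.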
Okay. So think about this in the context of an optimization that involves several other parameters. So we’ve got short term moving average bars, long term moving average bars, and breakout bars. These are just three parameters, whatever, it doesn’t matter what they are. And let’s say we optimize these and we’re going to optimize each of them over a different range.
Here we’ve got from 20 to 60 in steps of 20, from 100 to 400 in steps of 100, and from 100 to 400 in steps of 100. Take every combination, so the first combination would be 20, 100, 100. And it would run once. And then the optimization would run again with 20, 100, 100. But a different random number will generate for every single trade.
And it’ll do that 10 times before stepping to the next combination, which would be 20, 100, 200. And then it’ll run that combination 10 times. So what we’re getting is 10 times as many equity curves, if you like, or results, than what we were getting before from the standard compounding equity curve. So here’s what it looks like.
We can see short term moving average, long term moving average, and breakout bars. We’ve got, 20, 20, 20, 20, 20, 100, 100, 100, 100, 100, and 100, 100, all the way down. But the optimization counter has gone from 1 to 10 in steps of 1. So it’s run them 10 times, and you can see that the compound return, each one is a little bit different.
The max system drawdown, each one is a little bit different. The CAR/MDD, each one is a little bit different. Number of trades, all a little bit different. So we’re getting slightly different portfolios each time it runs. So we run the optimization, we get way more information because we randomly reject different trades and let other trades into the portfolio.
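As a rough illustration of how the optimization counter multiplies the number of runs, here is a small Python sketch. It uses the ranges quoted in the talk and assumes a counter from 1 to 10. The variable names are hypothetical, and no real backtesting software is involved.

```python
from itertools import product

short_ma_bars = range(20, 61, 20)      # 20, 40, 60
long_ma_bars = range(100, 401, 100)    # 100, 200, 300, 400
breakout_bars = range(100, 401, 100)   # 100, 200, 300, 400
run_counter = range(1, 11)             # dummy parameter: 1 to 10 in steps of 1

# product() varies the last argument fastest, so each parameter combination
# is repeated for every counter value before moving to the next combination.
runs = list(product(short_ma_bars, long_ma_bars, breakout_bars, run_counter))

parameter_sets = len(runs) // len(run_counter)
print(parameter_sets, "parameter sets,", len(runs), "backtests in total")
```

The counter does not change any trading rule. Its only job is to force a fresh set of random skips for the same parameter combination.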
And then what we can do is average out the performance stats, or look at the range of those performance stats across the different optimizations. Now, depending on how much computing power you’ve got, you might run five of these or 10 of these or 100 of these. Yeah, it just depends on how long you’ve got to wait, how fast your computers are.
But running at least a few of them will give you way more information to allow you to make a good optimization decision. Because if you just optimize based on a single run and you don’t randomly reject any trades, what happens is you’ve got a high chance of overfitting to the trades that happened to make it into the portfolio on that particular sequence of conditions in the backtest. But when you run this random trade skip and you do it many times, your chance of overfitting to the particular trades that made it into the backtest falls dramatically, because you’ve got a lot more data, a lot more information coming into your optimization.
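The averaging step could be done in a spreadsheet or in a few lines of Python. The sketch below assumes the optimization results have been exported as rows with hypothetical column names (`short`, `long`, `breakout`, `counter`, `compound_return`, `max_drawdown`). The numbers are made up purely to show the grouping.

```python
from collections import defaultdict
from statistics import mean

def summarise_runs(rows):
    """Group random-miss runs by parameter set and summarise their stats."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row["short"], row["long"], row["breakout"])].append(row)
    summary = {}
    for params, group in grouped.items():
        returns = [r["compound_return"] for r in group]
        drawdowns = [r["max_drawdown"] for r in group]
        summary[params] = {
            "runs": len(group),
            "mean_return": mean(returns),
            "min_return": min(returns),
            "max_return": max(returns),
            "mean_drawdown": mean(drawdowns),
            "worst_drawdown": max(drawdowns),
        }
    return summary

# Made-up results: percentages, one row per (parameter set, counter) run.
rows = [
    {"short": 20, "long": 100, "breakout": 100, "counter": 1, "compound_return": 10.0, "max_drawdown": 20.0},
    {"short": 20, "long": 100, "breakout": 100, "counter": 2, "compound_return": 12.0, "max_drawdown": 24.0},
    {"short": 20, "long": 100, "breakout": 100, "counter": 3, "compound_return": 14.0, "max_drawdown": 22.0},
    {"short": 20, "long": 100, "breakout": 200, "counter": 1, "compound_return": 9.0, "max_drawdown": 30.0},
    {"short": 20, "long": 100, "breakout": 200, "counter": 2, "compound_return": 11.0, "max_drawdown": 26.0},
]
summary = summarise_runs(rows)
```

Comparing parameter sets on the mean and the range of their runs, rather than on one run each, is what reduces the dependence on which trades happened to get in.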
So we can now average multiple runs of each parameter combination and incorporate more trades while still making our optimization decisions on a compounding basis. And if we come back to this, the lower we make this number, or the more trades we randomly skip, the more variability we’re going to get into our portfolio.
So the more different trades we’re going to take into account. The trick or the trap is that the closer this number is to one, as in the fewer trades you reject, the closer the compound return and the drawdown are to the reality that you would expect.
But if you reject more trades, there’s going to be situations where you randomly reject trades and your exposure on the system drops, so the return on the system will drop a little bit. So when you run this sort of optimization, your compound return will be lower, typically, than what it would be if you didn’t have the random miss.
But that’s okay, because all we’re doing is trying to make the best equity curve possible with the least amount of overfitting. And then once we’ve done that, then we can run the backtest again without the random miss and see what the returns would really have been. And see what the drawdown would really have been.
And we’ll do that with a much higher degree of confidence. When you do the random miss, don’t be surprised that the return drops a little. That’s okay, because you are rejecting a bunch of trades. You might have one signal on one particular day and that signal happens to get randomly rejected.
Therefore, obviously, you’re not going to make the return from that, so the compound annual return will drop. But there’s still huge value in the technique because you’re getting more of the information and you’re still allowing yourself to make your optimization decisions on a compounding basis.
Good? Okay. Alrighty, so we’ve run this. Now, this one has got 10 runs. So I’ll typically do 10 or maybe 20 different runs. That means your optimization is going to take 10 or 20 times longer to run. Okay, so you want to get everything set up nicely, you want to make sure you’ve got everything set perfectly, and then I just run it overnight and come back in the morning and then I’ll do the analysis in the spreadsheet afterwards.
So it’s going to take a lot more computing time, but computing time is cheap. Actual dollars that we lose in our account because our system was overfit, that’s expensive, right? Take a little longer to develop the system and save some money in real time trading. Does that make sense?
So use a random miss and optimise so you’re getting more trades into your compounding equity curve. When you’re optimising, you get a lot more data. You can make decisions with a much higher degree of confidence, with a much lower chance of overfitting to the particular trades that made it into the equity curve.
That’s the message. Good? Okay, I’m going to go on to the next one. A simple way to evaluate system robustness. Now, everyone is going to develop systems. Typically, most people will run some sort of optimization. However you do it is your business. And at the end of the day, you’re going to choose a combination of parameters to run with your system.
Now the question is, how robust are those parameters? How robust is that system? Is it likely to hold up in real time trading in the future? There’s a really simple way to determine that. So first of all, what is a robust system? In my view, a robust system is a system that is likely to work over a wide variety of future unseen market conditions.
If a system is fragile, if it’s overfit to past data, then on real time data that it hasn’t seen before, it’s likely to break, lose money, or at the very least, not perform so well. Yeah, a robust system should test well in backtesting and then continue to work well in real time trading. That’s what I would consider to be robust.
How do we do this? Yeah, covered that. Okay, so here’s what we do. No matter how you optimise, you’re going to have a combination of parameter values that you’ve selected to run with. And they’re going to be your magic parameters that are supposedly the best. The ones that you think are going to give you the highest return for the least amount of drawdown, typically.
Rather than just blindly accepting that we optimized and we didn’t overfit, I want you to add one more step to the process. And this step I call it a robustness test. What you want to do is set your system up to optimize again, but every single parameter you’re going to vary plus or minus some small amount.
Let’s say 20 percent. I typically do 20 to 30 percent depending on the parameter. And so each parameter will end up with three values. Here we’ve got rule one, it’s an exponential moving average of the closing price over 50 bars, and I’m going to vary the 50 because that was my optimized value. 50 was the value I chose as a result of my optimization.
I’m now going to vary that plus or minus 20 percent. So I’m going to go 40, 50, 60 in the optimization. And for the long term moving average here, it’s an EMA of 200 bars. Let’s vary that plus or minus the same 20 percent. So that would have three values, 160, 200, 240. And now I’ve got a second rule here, which is a breakout.
It’s a 50 bar breakout. I chose a value of 50. So I’m going to optimize again, varying it plus or minus 20%: 40, 50, 60. So if I’ve got three parameters in this hypothetical system here, each of the three parameters has three possible values. Three times three times three is 27 combinations, right? So I’m going to re-optimize with three steps for every single parameter value.
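To make the plus or minus 20 percent grid concrete, here is a short Python sketch that builds the 27 combinations from the three chosen values in the example. The parameter names and the helper function are hypothetical, and rounding to whole bars is an assumption.

```python
from itertools import product

def neighbourhood(value, pct=0.20):
    """Return the chosen value with its minus-pct and plus-pct neighbours."""
    return [round(value * (1 - pct)), value, round(value * (1 + pct))]

# Chosen (already optimized) values from the example.
chosen = {"short_ema_bars": 50, "long_ema_bars": 200, "breakout_bars": 50}

grid = {name: neighbourhood(value) for name, value in chosen.items()}
combos = [dict(zip(grid, values)) for values in product(*grid.values())]

print(grid)         # three values per parameter
print(len(combos))  # 3 * 3 * 3 combinations
```

Each entry in `combos` is one backtest to run in the robustness test. The chosen set itself is one of the 27.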
Now I’m not re optimizing to try and make the system better. This is a robustness test. Okay, we’ve already optimized, we already chose our parameters that we want to trade with. What we’re doing is checking that we didn’t screw it up. We’re checking that we have a combination of parameter values that is robust, and we’re not sitting at the edge of a performance cliff somewhere in the multi dimensional space, right?
This is the real problem, because when we optimize a system with three, four, five, seven, ten rules, how do you know if you just vary one of them slightly whether or not the system falls off a cliff? You don’t, because you can’t see it. As soon as you’ve got more than about three rules in your system, it’s almost impossible to properly visualize how that parameter space looks.
So we need a neat way to say, are my chosen parameter values robust, stable, or are there performance cliffs that I haven’t noticed, because of all of the interaction effects of all of those different parameters. This is how you do it. So we’re going to vary all of them, plus or minus 20%, and we’re going to run every combination.
So how does this help us? Actually, our systems are typically going to have more than three parameters. I’ve got a trend following system, it’s got seven parameters. So if I varied every one of those, plus or minus 20%, I’ve got three to the power of seven, or 2,187 combinations. Every single parameter that I chose, plus or minus 20%.
So three values for each, three to the power of seven, roughly 2,000 combinations. If all of those 2,187 combinations are quite profitable and tradable, then it’s clear that the parameter set that I’ve chosen and that system are robust and stable. But if half of them don’t make money, or if the performance really falls apart on a large number of those combinations, then you know that your system is not very stable.
You know that somewhere in your parameter space there’s some performance cliffs or some decay that is going to put you at risk in your real time trading. So we run our chosen parameter set, plus or minus say 20%, so we get thousands of extra, slightly different parameter combinations, and we want as many as possible to be profitable and tradable, preferably all of them.
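A sketch of how the seven-parameter robustness table might be summarised in Python. The `placeholder_backtest` function is a made-up stand-in for real backtesting software and its numbers mean nothing. Only the counting and summarising logic is the point.

```python
from itertools import product
from statistics import median

MULTIPLIERS = (0.8, 1.0, 1.2)  # chosen value minus 20%, unchanged, plus 20%
N_PARAMETERS = 7

def placeholder_backtest(multipliers):
    """Stand-in for a real backtest; returns a made-up annual return in percent.

    It simply penalises distance from the chosen values so the summary has
    something to work on. Replace with a call to real backtesting software.
    """
    return 12.0 - 25.0 * sum(abs(m - 1.0) for m in multipliers)

def summarise_robustness(returns, min_acceptable=0.0):
    """Count how many neighbouring combinations clear the acceptable return."""
    profitable = sum(1 for r in returns if r > min_acceptable)
    return {
        "combinations": len(returns),
        "profitable": profitable,
        "fraction_profitable": profitable / len(returns),
        "worst": min(returns),
        "median": median(returns),
    }

combos = list(product(MULTIPLIERS, repeat=N_PARAMETERS))  # 3 ** 7 rows
summary = summarise_robustness([placeholder_backtest(c) for c in combos])
print(summary)
```

With this deliberately steep made-up surface only a small share of the combinations stay profitable, which is the pattern described above as a fragile system. A robust system would show a fraction close to 1.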
So what this is going to do is highlight when your chosen parameters are close to some performance cliff that you can’t see because it’s a multi dimensional space and we can only really visualize in three dimensions. And that performance cliff would make your system fragile and likely to fail.
And the cool thing about this is when you run this, you just get a single table of results. There’s thousands and thousands of rows in the table, and every single row is a slightly different combination of parameters. The first one minus 20%, the second one plus 20%, the third one is plus 20 percent and the value you chose, and then another one is minus 20%, they’re all slightly different combinations.
And you can just look at them and you can filter that table and say, okay, what’s the best one, what’s the worst one, where is my chosen parameter combination in that performance space, in that sort of output. Cool. And here’s another way to tell whether or not you’ve really overfit your model during optimization.
When you do this, let’s say you’ve got the 2,100, 2,200 combinations that I was just talking about for that seven rule system. If you run that plus or minus 20%, so you get all of those rows, and then you find that your chosen parameter set is right up the top in the top couple of results of those 2,000, what does that tell you?
Tells you 100 percent that you’ve overfit your system to past data. 100%. Or you’re on the cliff. Yeah, and you’re probably on the cliff. Yeah. So that’s a real problem. So when you do this, you don’t want to see your chosen parameter set right up the top. It should be near the top, like maybe in the top 25 percent or so.
But if it’s way up the top, then you know you’ve just overfit during your optimization, because you should never choose the absolute best performing combination of parameters. Why not? Because it’s always overfit. If you run a brute force optimization of every combination of parameters and you choose the best one, in real time trading, that absolutely won’t perform well.
Because all you’ve done, you’ve picked all the performance peaks, you’re right next to all of the cliffs, it’s going to be terrible. So we don’t want our chosen parameter combination to be right at the top of the robustness test results. But we do want it to be towards the top, maybe in the top 25 percent or so.
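One way to check where the chosen parameter set sits in the robustness table is sketched below in Python. The results are made up, the function names are hypothetical, and the cut-offs (top two rows as suspicious, top 25 percent as healthy) are one reading of the rule of thumb above rather than fixed thresholds.

```python
def rank_of_chosen(results, chosen):
    """Return (rank, fraction from the top) of the chosen parameter set.

    `results` maps a parameter tuple to a performance metric, higher is better.
    """
    ordered = sorted(results, key=results.get, reverse=True)
    rank = ordered.index(chosen) + 1
    return rank, rank / len(ordered)

def interpret(rank, top_fraction, suspicious_rank=2, healthy_fraction=0.25):
    """Label the position using assumed thresholds; tune these to taste."""
    if rank <= suspicious_rank:
        return "suspicious: chosen set is at the very top, likely overfit"
    if top_fraction <= healthy_fraction:
        return "healthy: near the top but not the peak"
    return "outside the top quarter: worth a closer look"

# Made-up metric values for 12 hypothetical parameter combinations.
results = {
    (40, 160): 0.90, (40, 200): 1.40, (40, 240): 1.10, (40, 280): 0.95,
    (50, 160): 1.20, (50, 200): 1.30, (50, 240): 1.35, (50, 280): 1.00,
    (60, 160): 0.80, (60, 200): 1.25, (60, 240): 0.70, (60, 280): 0.85,
}

rank, top_fraction = rank_of_chosen(results, chosen=(50, 200))
print(rank, top_fraction, interpret(rank, top_fraction))
```

On a real table with thousands of rows, a fixed row count such as the top two is a very strict test, so a small percentage cut-off may be a more practical choice.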
Is this making sense? So if you’re not sure whether you’ve over optimized or over fit, this is a really great way to find out. And if you’re not sure whether this system is likely to hurt you in the future, this is a really great way to find out.
So it’ll show you whether or not the system and the parameter combination you’ve chosen is robust, even though you can’t visualize stability in a multi dimensional space. And this is the problem. When you optimize one parameter at a time, you can see, oh, the performance is nice and smooth, and then there’s a big spike, so I don’t want to choose the parameter value that gives me the big spike, because that’s unstable, but I want to choose the parameter value right in the middle of that big stable area, because that’s more likely to perform well in the future.
But as soon as you do three, four, five, six, seven variables simultaneously, you can’t see that anymore. Because we can’t visualize in more than three dimensions. Make sense? But this allows us to basically shortcut that process, just by varying every single one, plus or minus a little bit. We can automatically tell whether we’ve overfitted, or whether or not there’s performance cliffs.
So there’s a good question in the chat. Graham says, do you test for robustness using in-sample or out-of-sample data? I think both is probably the best answer. Thanks a lot. I absolutely do it on in-sample because I want to know if I made a good decision in selecting my parameters or whether I overfit.
So I want to know where in that list my chosen combination is. But if you do it on the out-of-sample data, you’ll also get some good insight into, in the future, is it likely to hold up? Yeah. Good question. Alrighty, let’s see. I think something that a lot of people miss, and I know a lot of you have been doing this for a while, so this is probably not new to you, but the stability of your chosen parameter values relative to their surrounding values is far more important than the absolute performance.
If you choose a set of parameters and your chosen parameter set performs up here, but the adjacent parameter sets perform down here, then in real time trading you’re not going to get this performance, you’re more likely to get that performance. But if you choose a parameter set that performs here and the surrounding parameter sets are all about the same, or just slightly less, then that gives you a much higher degree of confidence that it’s stable, and that in the future this strategy is more likely to perform as well as in the backtest.
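The "up here versus down here" comparison can be put into a single number. The Python sketch below uses the ratio of the median neighbouring result to the chosen result. This particular ratio is an assumption for illustration, not a metric named in the talk, and the figures are made up.

```python
from statistics import median

def stability_ratio(chosen_score, neighbour_scores):
    """Median neighbour performance as a fraction of the chosen set's performance."""
    if chosen_score <= 0:
        raise ValueError("chosen_score must be positive for the ratio to be meaningful")
    return median(neighbour_scores) / chosen_score

# Isolated peak: chosen set looks great, neighbours are much worse.
peak = stability_ratio(20.0, [8.0, 7.0, 9.0, 8.0])

# Plateau: chosen set is slightly lower, neighbours perform about the same.
plateau = stability_ratio(15.0, [14.0, 14.5, 13.5, 14.0])

print(round(peak, 2), round(plateau, 2))
```

A ratio near 1 points to a plateau, where live results are more likely to resemble the backtest. A low ratio points to an isolated peak, even if its headline number is higher.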
So this robustness test is a really great way of achieving multiple things. It tells you about the stability or the robustness of the system in terms of coping with unseen data. It tells us how well you’re doing the optimization. It tells us whether or not there’s some performance cliffs and so on. And it allows you to also do an out of sample test in a much more rigorous way by testing all of the surrounding parameters on your out of sample data as well.
That’s useful. Alrighty, so that’s all I wanted to share formally. I just want to give you something for free that hopefully might help.
If you scan that QR code or go to bit.ly forward slash est acceleration, you’ll get a couple of free courses, two eBooks, and there’s actually four cheat sheets that you’ll get as well, and a trading quiz that will help you improve your trading based on how you’re making your decisions right now. So scan the QR code, go to that link, enter your name and email, and get some free stuff, which will help.