For anyone taking a serious approach to trading financial markets, backtesting plays a major role in the workflow and in the overall reliability of the resulting system. Once you come up with a trading strategy, it should be rigorously tested on historical data before it is deployed to a live trading environment and real money is put at risk.
Regardless of the programming language you’re using, you’ll have to choose between using a third-party library and creating your own engine. In other words: should you create your own backtester?
The short answer is yes, it is worth building your own backtester. In addition to greatly improving your programming (or Excel) skills, you will gain a deeper understanding of the inner workings of a backtesting engine. Moreover, creating a backtesting engine will force you to decide on the assumptions behind it, which in turn will make you more aware of the limitations of backtesting.
In the following sections, I will cover both the advantages and disadvantages of creating your own backtesting engine from scratch. Additionally, I will compare the vectorized and event-driven approaches, and discuss a few additional relevant topics related to backtesting.
Advantages of building your own backtester
Understand the limitations of backtesting
Every model is a simplification of the phenomena that it tries to replicate, and backtesting is no exception. Those simplifications are more or less realistic depending on the type of asset being traded or the strategy that you’re trying to test. For example, it would be unrealistic and useless to evaluate the performance of a market making algorithm without accounting for slippage. Creating your own backtester will force you to actively make trade-offs between simplicity and realism.
Also, depending on the logic of your strategy, the backtester might fail to generate all the events that would have happened in a real-life trading scenario. If you’re using 1-hour data, your backtester will be limited to generating at most one order per asset per bar. But if you’re trading a volatile asset, the live strategy might have bought and sold it multiple times within that bar.
Improve programming skills
Even if you’re a seasoned programmer, creating your own backtesting engine will test your current skills. Your data science chops will greatly benefit from this exercise, and solving the computational bottlenecks that arise when dealing with large time series will further improve your algorithmic thinking.
Gain a deeper understanding of algorithmic trading
Creating a backtesting engine will force you to learn quite a few market microstructure aspects that you would otherwise ignore. Although it might seem like a trivial exercise, it is strongly recommended to create a backtesting engine and compare the results with an existing one.
Even if you end up not using the backtester you developed, you will have gained a deeper understanding of the assumptions made by the creators of the engine you choose to use.
Introduce your own assumptions
One clear advantage of creating a backtesting engine is that it allows you to introduce tailor-made assumptions that are especially relevant to your trading purposes. For example, based on your own observations, you could incorporate custom logic for calculating slippage. Another scenario that requires extensive customization is backtesting a market making algorithm with tick-level data.
Most backtesters fail to incorporate some features that could be relevant in certain situations. For example, most engines don’t account for margin calls and short-selling fees. Additionally, slippage is commonly calculated as if it were a regular trading fee.
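To illustrate the difference, the sketch below contrasts a flat, fee-like slippage with a size-dependent one. Everything in it is an assumption made for the example: the function names, the square-root impact form, and the parameter values are hypothetical and would need to be calibrated against your own fills.

```python
import math


def flat_slippage(price: float, rate: float = 0.0005) -> float:
    """Slippage treated like a regular fee: a fixed fraction of the price."""
    return price * rate


def size_dependent_slippage(
    price: float,
    order_size: float,
    bar_volume: float,
    impact_coef: float = 0.1,
) -> float:
    """Hypothetical model: slippage grows with the share of bar volume taken.

    The square-root form and impact_coef are illustrative assumptions.
    """
    if bar_volume <= 0:
        raise ValueError("bar_volume must be positive")
    participation = order_size / bar_volume
    return price * impact_coef * math.sqrt(participation)


def buy_fill_price(price: float, slippage: float) -> float:
    """A buy order fills above the quoted price by the slippage amount."""
    return price + slippage


# The same quoted price, but very different order sizes relative to volume.
small = size_dependent_slippage(100.0, order_size=100, bar_volume=1_000_000)
large = size_dependent_slippage(100.0, order_size=100_000, bar_volume=1_000_000)
flat = flat_slippage(100.0)
```

With the flat model, both orders pay the same slippage per unit. With the size-dependent model, the larger order pays considerably more, which is the kind of behavior a fee-like calculation cannot capture.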
Tailor-made for your own workflow
Most proprietary firms have a solid and fine-tuned workflow that enables them to go from testing to implementation as soon as possible. Due to the fact that most strategies have no value and end up not being used, it is essential to reduce the duration of each iteration cycle.
Having a tailor-made backtesting engine allows proprietary firms to deploy a suitable strategy to their live-trading platform with as few modifications as possible. Event-driven backtesting engines are especially well-suited for this purpose, since their structure closely resembles the one that will ultimately be used in production.
Disadvantages of building your own backtester
Reinventing the wheel
Assigning resources and our time to develop a backtesting engine instead of using a third-party backtester has high opportunity costs since we could alternatively be researching and testing new trading ideas.
Additionally, building a professional-grade backtester requires a specific set of skills: solid programming ability and previous exposure to financial markets and their microstructure. Developers with that profile tend to command good salaries, which adds to the cost.
Unless you have a specific requirement that existing tools cannot meet, or you are doing it for the sake of learning, creating yet another fully-fledged backtesting engine should be avoided.
No community of users
One of the main benefits of using a third-party backtesting platform is that most of them have a thriving community of users. As a consequence, you’ll easily find tutorials that will help you get started quickly, and the platforms themselves often provide excellent documentation.
Some backtesting platforms have their own forums, Slack, or Discord channels. Using the same backtester is analogous to sharing a common language, since it allows the community to quickly replicate issues and solve each other’s problems at a rate that would be impossible if each member used their own system.
No other developers will implement features
When creating your own backtesting engine, you’ll have to develop each and every single feature that you would like to add. If you want your backtesting engine to be at least of comparable quality to the ones available for free, you will have to incorporate take profits, stop losses, trailing-stop losses, short-selling, an optimization module, and a reporting dashboard to conveniently analyze the results.
Also, if you’re just getting started in backtesting trading strategies, most features you’ll probably want have already been incorporated by other contributors. You’ll also find convenient functions for resampling data, doing walk-forward analysis, and plotting the output in various convenient ways.
Increased probability of bugs
A third-party library with a thriving community and a long history of usage is likely to be more reliable than an engine you have written yourself. As backtesters become more complex and incorporate more features, it becomes more difficult to thoroughly test the accuracy of the results. Thus, having many users running the engine is an excellent way to find and fix bugs.
Vectorized vs Event-Driven Backtesting
If you do decide to create your own backtesting engine, you’ll be faced with having to choose between the vectorized and event-driven paradigms. As expected, they both have their own advantages and disadvantages, which we will briefly discuss in this section.
In short, vectorized backtesting offers faster processing speed and is suitable for quick prototyping, whereas event-driven backtesting is more realistic and makes it easy to implement recursive features, like trailing stop losses.
The main advantage of going the vectorized backtesting route is the sheer processing speed that this approach allows. This paradigm is also commonly known as array programming and refers to an approach where operations on scalars are generalized to vectors, matrices, or even tensors.
If properly implemented, vectorized calculations lead to faster execution times because each operation is applied to whole arrays at once in optimized, compiled code, whereas an equivalent interpreted for-loop would process one element at a time.
If you’re fairly comfortable with Python and its most popular scientific computing libraries (Numpy and Pandas), you will also find this approach extremely convenient and efficient for quickly prototyping and testing trading ideas. In fact, in the following video, I show how to create a quick backtest for a simple moving average crossover strategy using only pandas.
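The snippet below is a minimal sketch of that kind of pandas-only backtest, and not necessarily the code used in the video. The function name `sma_crossover_backtest`, the window lengths, and the synthetic random-walk prices are assumptions made so that the example runs on its own. It also assumes a long/flat strategy with no fees or slippage.

```python
import numpy as np
import pandas as pd


def sma_crossover_backtest(
    close: pd.Series, fast: int = 10, slow: int = 30
) -> pd.DataFrame:
    """Vectorized long/flat SMA crossover backtest on closing prices."""
    if fast >= slow:
        raise ValueError("fast must be smaller than slow")
    df = pd.DataFrame({"close": close})
    df["fast"] = df["close"].rolling(fast).mean()
    df["slow"] = df["close"].rolling(slow).mean()
    # 1 when the fast SMA is above the slow SMA, else 0 (also 0 during warm-up,
    # because comparisons against NaN evaluate to False).
    df["signal"] = (df["fast"] > df["slow"]).astype(int)
    # Shift by one bar so each bar's position uses the previous bar's signal.
    df["position"] = df["signal"].shift(1).fillna(0)
    # Simple close-to-close returns; the first bar has no previous close.
    df["asset_return"] = (df["close"] / df["close"].shift(1) - 1.0).fillna(0.0)
    df["strategy_return"] = df["position"] * df["asset_return"]
    df["equity"] = (1.0 + df["strategy_return"]).cumprod()
    return df


# Synthetic random-walk prices, used only to make the example self-contained.
rng = np.random.default_rng(42)
prices = pd.Series(100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, 500))))
result = sma_crossover_backtest(prices)
```

Note that there is no explicit loop over bars: every step is a column-wise operation. The `shift(1)` is the important detail, since without it the strategy would trade on information it could not have had at the time.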
Despite its clear advantages, implementing a vectorized backtester also has its drawbacks. Because vectorized operations act on entire arrays at once, it becomes challenging to implement features that require complex conditionals based on previously computed values.
Although completely possible, implementing take profits and stop losses is not as straightforward as one would expect. Going a step further, adding trailing stop losses to a vectorized backtester is quite an accomplishment. In other words, adding features that are recursive is difficult under this approach, and this is exactly where event-driven backtesting shines.
Pros of vectorized backtesting:
- Faster computation
- Easy prototyping
Cons of vectorized backtesting:
- Less realistic
Event-driven backtesting is a more intuitive and realistic approach in which each event represents new data becoming available for decision-making, in chronological order. It is most commonly implemented with a for-loop, and each event could be the closing of a bar or some news becoming public. In simple terms, think of it as iterating over a time series and applying conditional logic at each period.
Since this approach goes over each period and evaluates if a trading signal should be triggered, it is less prone to some common backtesting pitfalls, the most important one being lookahead bias. It is common practice to generate signals based on the closing price, but execute an order at the opening of the next bar.
This approach allows practitioners to implement more refined, complex, and realistic features that would otherwise be difficult to implement via vectorized backtesting. Take-Profits, Stop-Losses, Trailing-Stop-Losses, and other recursive or path-dependent features can be easily implemented with an event-driven approach.
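The loop described above can be sketched as follows. This is a minimal illustration, not how any specific engine works: the `Bar` class, `run_event_driven`, `entry_signal`, and the 5% trailing distance are hypothetical names and values chosen for the example. It assumes a long-only strategy, fills entries at the next bar’s open, and checks the stop against the bar’s low before raising it with the bar’s high, since the order of intrabar moves is unknown with bar data.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float


def run_event_driven(
    bars: List[Bar],
    entry_signal: Callable[[List[Bar]], bool],
    trail_pct: float = 0.05,
) -> List[float]:
    """Long-only loop: signal on close, fill at next open, trailing-stop exit.

    Returns the list of per-trade returns.
    """
    trades: List[float] = []
    entry_price: Optional[float] = None
    stop = 0.0
    pending_entry = False

    for i, bar in enumerate(bars):
        # 1) Fill an entry that was decided on the previous bar's close.
        if pending_entry and entry_price is None:
            entry_price = bar.open
            stop = entry_price * (1.0 - trail_pct)
        pending_entry = False

        if entry_price is not None:
            # 2) Check the stop level carried over from earlier bars.
            if bar.low <= stop:
                # If the bar gapped below the stop, the fill is at the open.
                exit_price = min(bar.open, stop)
                trades.append(exit_price / entry_price - 1.0)
                entry_price = None
            else:
                # 3) Path-dependent step: the stop only ever moves up.
                stop = max(stop, bar.high * (1.0 - trail_pct))

        # 4) Decide on this bar's close using only data seen so far.
        if entry_price is None and i < len(bars) - 1:
            pending_entry = entry_signal(bars[: i + 1])

    return trades


# Hypothetical data: enter after the first bar, ride the trend, get stopped out.
bars = [
    Bar(100, 101, 99, 100),
    Bar(100, 102, 99.5, 101),
    Bar(101, 110, 100, 109),
    Bar(109, 120, 108, 119),
    Bar(119, 119, 110, 111),
    Bar(111, 112, 105, 106),
]
trades = run_event_driven(bars, entry_signal=lambda history: len(history) == 1)
```

Because the stop level on each bar depends on the level from the previous bar, this logic is awkward to express as a single array operation, while in a loop it is only a couple of lines.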
Last but not least, because event-driven backtesting more closely resembles real life, it is usually easier to migrate a trading strategy from the testing environment to live trading.
Pros of event-driven backtesting:
- More intuitive approach
- Easier to translate into a live-trading algorithm
- More realistic approach
- Easier to implement recursive features (Take-Profit, Stop-Loss, path-dependent conditions)
Cons of event-driven backtesting:
- Slower computation
Best programming language for Backtesting
Unless extreme computational efficiency is required, Python is the go-to programming language for backtesting trading strategies. In addition to having numerous active backtesting libraries freely available, it gives access to the most convenient libraries for scientific computing and to nearly everything else that backtesting a strategy could require.
Thanks to its ease of use and friendly learning curve, it is the most popular programming language in the algorithmic trading community. As a consequence, you’ll find plenty of resources, tutorials, examples, and GitHub repositories that will help you in getting started effortlessly.
Having said that, if you’re familiar with another programming language, you’ll most probably also be able to find a community of users around it. You won’t have any problems finding up-to-date resources for Java, C#, C, and C++.
Best backtesters in Python
If you want to quickly test a research idea without going through the hassle of creating your own backtesting engine, you should definitely use Backtesting.py or VectorBT. The choice depends on your specific needs, and if you don’t mind using a less active but mature project, you can also choose Backtrader.
Backtesting.py is by far the fastest event-driven backtesting engine freely available for Python. Its syntax closely resembles Backtrader’s, but it has the advantage of still being an active project. Its documentation is excellent, and it also has some interesting tutorials.
If you want to get started using backtesting.py, take a look at the following video I made.
This library has two main disadvantages. Being an event-driven backtester, it runs much slower than a vectorized engine, which could become a relevant issue when executing thousands of iterations.
Also, it does not allow for backtesting multiple assets at once. As a consequence, it is impossible to test rebalancing strategies. Having said that, you can still do pairs trading strategies by expressing one asset in terms of the other one.
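One way to express one asset in terms of the other is to divide their closing prices and treat the resulting ratio as a single synthetic asset. The sketch below shows only that step. The function `ratio_series` and the sample prices are hypothetical, and how the synthetic series is then fed into a given engine depends on the input format that engine expects.

```python
import pandas as pd


def ratio_series(close_a: pd.Series, close_b: pd.Series) -> pd.Series:
    """Price of asset A expressed in units of asset B.

    Only timestamps present in both series are kept, so the ratio is
    never computed from prices observed at different times.
    """
    aligned = pd.concat({"a": close_a, "b": close_b}, axis=1).dropna()
    return aligned["a"] / aligned["b"]


# Hypothetical closing prices with partially overlapping indexes.
asset_a = pd.Series([10.0, 20.0, 30.0], index=[0, 1, 2])
asset_b = pd.Series([5.0, 10.0], index=[1, 2])
ratio = ratio_series(asset_a, asset_b)
```

Going long the ratio then corresponds to being long asset A and short asset B, with the usual caveat that a closing-price ratio says nothing about the intrabar highs and lows of the pair.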
As the name suggests, VectorBT uses the vectorized approach to achieve very fast computations. It goes a step further and uses Numba under the hood, a JIT compiler that translates numerical Python and NumPy code into fast machine code.
The main drawback of this library is that its syntax is pretty opinionated and has a steep learning curve. You’ll be able to easily follow the tutorials, but you’ll initially have a harder time implementing your strategy.
As you might have guessed, creating a backtester and using an existing one are not mutually exclusive. Creating your own engine will greatly benefit you in terms of programming skills and overall understanding of the assumptions and inner workings of backtesting as a tool for trading.
As a learning experience, it is not necessary to implement every single aspect of a mature engine. The exercise of replicating the results of a simple strategy tested on another backtester will set you apart from other developers.