
In the last post, we learned the basics of the pair trading strategy and how to use cointegration to identify potentially tradable stock pairs. The theory and the math look promising and convincing enough to make us believe this is a profitable and stable trading strategy. But is it? To check the profitability and effectiveness of the strategy, we need to backtest it against simulated real-world conditions.


Previous research

【Pair Trading】 Cointegration Test - A Key to Find High Probability Trading Strategy

Recap

Let’s pick up where we left off.

In the last post, we explained the basic concepts of the cointegration pair trading strategy: what cointegration means, what stationarity means, and how we profit from these concepts. We chose the Engle-Granger 2-step approach to measure the level of cointegration between two time series. Once we find the stock pairs that have a higher probability of staying cointegrated, we can monitor the co-movement of their prices and trade when a pair temporarily deviates from its cointegrated relationship.

We’ve also revealed the preliminary trading rules of the cointegration pair trading strategy. We:

Open a long position if the current spread is smaller than $\mu - \text{threshold} \cdot \sigma$
Close a long position if the current spread is bigger than the mean of the spread, $\mu$
Open a short position if the current spread is bigger than $\mu + \text{threshold} \cdot \sigma$
Close a short position if the current spread is smaller than the mean of the spread, $\mu$

Where

$\mu$ is the mean of the residuals, used as the benchmark line in our residual observation
$\sigma$ is the standard deviation of the residuals, used to calculate the trigger lines in our residual observation
The threshold is 2.32, which corresponds to a one-sided 99% confidence level under a normal distribution
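The four rules above can be sketched as a small decision function. This is only a minimal illustration: the function name `next_action`, the position encoding (0, 1, -1), and the returned labels are assumptions made for this example, not part of the original strategy code.

```python
def next_action(spread, mu, sigma, position, threshold=2.32):
    """Decide what to do for one observation of the spread.

    position: 0 = no open trade, 1 = long the spread, -1 = short the spread.
    The labels returned here are hypothetical names for illustration.
    """
    upper = mu + threshold * sigma  # trigger line for opening a short
    lower = mu - threshold * sigma  # trigger line for opening a long

    if position == 0:
        if spread < lower:
            return "open_long"
        if spread > upper:
            return "open_short"
    elif position == 1 and spread > mu:
        return "close_long"
    elif position == -1 and spread < mu:
        return "close_short"
    return "hold"


# Example with mu = 0 and sigma = 1, so the bands sit at -2.32 and +2.32
print(next_action(-2.5, 0.0, 1.0, position=0))
print(next_action(0.1, 0.0, 1.0, position=1))
```

With a mean of 0 and a standard deviation of 1, a spread of -2.5 is below the lower band and opens a long position, and a spread of 0.1 closes that long position because it has crossed back above the mean.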

Even after all the research on the pair trading method, there are factors we can’t avoid in real-life settings, and this is where backtesting comes into play. A well-conducted backtest goes a long way toward simulating what will happen when you throw your trading strategy into the wild and unpredictable stock market.

Build our trading rules

To conduct a backtest, we first need to lay the groundwork for this pair trading strategy. Here are the trading rules that I have put together.

Trading pair formation

Select around 500 stocks based on their financial fundamental data to find companies that are stable and relatively financially healthy.
Obtain the daily close price from the past two years for every stock that we picked.
Use scipy.stats.pearsonr(Series_A, Series_B) to calculate the Pearson correlation, and choose the stock pairs whose correlation is bigger than 0.9 and whose p-value is smaller than 0.05.
Use the Engle-Granger 2-step approach to examine every remaining pair:
1. Use sm.OLS(Series_B as y, Series_A as x).fit() to get the beta, intercept, and residual of each pair.
2. Use statsmodels.tsa.stattools.adfuller(model.resid, autolag = 'BIC') to evaluate the stationarity of the residual of this pair.
3. Eliminate the stock pairs whose cointegration p-value is bigger than 0.05, since a larger p-value means weaker evidence that the residual is stationary.
Lastly, sort these pairs by their cointegration p-value, the smaller the better.
Repeat this process every month to closely follow the cointegration status of the potentially tradable stock pairs.

Monitoring and trading

Since the pairs have been filtered and sorted by their level of correlation and cointegration, we treat them as our stock pair universe. Following the methodology from the previous post, we use the data of the past two years to build the upper and lower bands. A few more details need to be defined to complete our trading strategy.

The upper and lower bands built from the residual of the OLS results

Observing frequency

Every 15 minutes, we examine the status of the pairs in our observation universe and of the pairs that we have already traded. Since we use the daily close price to form the trading pairs and to trade accordingly, I presume that 15 minutes is a suitable interval for inspecting the status of the pairs.

Enter and exit signals

The basic idea of signal generation has been stated in the Recap section above, so I won’t waste your time by repeating it here.

Stop loss and stop gain

In the post 【Momentum Trading】Use machine learning to boost your day trading skill - meta-labeling, we learned how the Triple Barrier Method (TBM) can be used to control the gain/loss ratio. In this pair trading strategy, I’m using 2:1 as our gain/loss ratio, which means the maximum loss we tolerate per trade is 50% of our expected gain. For example, under our trading rules we close a pair when its residual returns to the mean ($\mu$), which sits at the level of 0. Since we enter the trade at the level of 2.32, the stop gain is 2.32. We then define our stop loss at the level of $2.32 + 2 (2.32 - 0) / 2 = 4.64$ and exit the trade there to prevent a greater loss if the two stock prices deviate further.

As the ground rules have been set up to implement our trading strategy, I’m going to run the first round of backtest to see how it performs and to check the profitability of this strategy.

First-round backtest of the pair trading strategy

Wow! At first glance, the performance is quite impressive. However, a closer look reveals some logical mistakes hiding behind this backtest result. In the bottom chart, HeldPositions, the number of long positions and the number of short positions should be equal at all times, since every pair trade consists of one long stock and one short stock. In the red circles, however, you can see that the count on one side decreased while the other side did not. This leaves our portfolio exposed to risk because the positions are not properly hedged, which increases the odds of losing money.

Red circles mark the anomalies in our trading activities

Several scenarios we need to consider and address

After looking into the log messages of this backtest, I found several loopholes in my trading strategy:

Incorrect trading activities that need to be taken care of properly

Hazard 1: Margin call

According to the trading log, the first red circle was caused by a margin call being executed to restore the margin in the account. To trade stock options or short-sell assets, brokers require you to keep your equity above a certain percentage to demonstrate that you can cover the potential loss on your current investments. Once such a loss occurs and your equity falls below this percentage, you receive a margin call from the broker, which requires you to sell part of your investments and turn them into cash to raise the equity percentage.

In our backtest, the stock price of TSLA declined drastically, which pushed our equity percentage below the required level, so we were forced to sell investments and turn them into cash. The point is that we don’t want to sell just ANY asset in our portfolio. Instead, we need to make sure we liquidate both legs of a pair to maintain our market neutrality. We also need to decide which pair to liquidate so that the impact on our trading strategy is minimal. In the paper An Anatomy of Pairs Trading: The Role of Idiosyncratic News, Common Information and Liquidity, the experiments demonstrate that the profitability of a pair increases day by day after the deviation is detected and then starts to decline after peaking on day six. Therefore, I assume that the longer a pair has been held, the lower the probability that it will be profitable. So when a margin call occurs, we sell the pair that we have held the longest.

The profitability dwindles as the days pass
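The liquidation rule for margin calls, selling the pair that has been held the longest, can be sketched like this. The function name `pair_to_liquidate`, the dictionary layout, and the tickers are hypothetical and only serve the example.

```python
from datetime import date


def pair_to_liquidate(open_pairs):
    """Pick the pair with the earliest entry date.

    open_pairs maps (long_symbol, short_symbol) -> entry date.
    Returns None when no pair is open.
    """
    if not open_pairs:
        return None
    return min(open_pairs, key=open_pairs.get)


# Hypothetical holdings; the tickers are placeholders
open_pairs = {
    ("AAA", "BBB"): date(2021, 3, 1),
    ("CCC", "DDD"): date(2021, 1, 15),
    ("EEE", "FFF"): date(2021, 2, 10),
}
oldest = pair_to_liquidate(open_pairs)
print(oldest)  # ('CCC', 'DDD')
```

Both legs of the returned pair would then be closed together, so the long and short counts stay equal.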

Hazard 2: Failing to place two orders (one long and one short) simultaneously

The other two incidents in our HeldPositions plot were caused by Insufficient Buying Power. Buying power is a concept that is quite easy to understand but relatively complex to calculate. According to Understanding IB Margin Webinar Notes, buying power is defined as follows:

Buying Power

Is the value of securities you can purchase without depositing additional funds. In cash accounts this is the settled cash. In a margin account, buying power is increased through the use of leverage using cash and the value of held stock as collateral. The amount of leverage depends upon whether you have a Reg. T Margin or Portfolio Margin account. Active traders can take advantage of reduced intraday margin for securities – generally 25% of the long stock value. But keep in mind this requirement reverts to the Reg T 50% of stock value to hold overnight.
$\text{Cash Account Buying Power} = \min(\text{Equity with Loan Value}, \text{Previous Day Equity with Loan Value}) - \text{Initial Margin}$
$\text{Margin Account Buying Power} = \text{Cash Account Buying Power} \times \text{Leverage Ratio}$

Formulas to calculate the Buying Power

Because buying power limits the orders we can place with a margin account, we can end up in a situation where one order of a pair is executed successfully but the other order fails because the buying power of the day is insufficient. Unfortunately, QuantConnect doesn’t provide a function to check the current buying power in real time. Therefore, we also need to come up with a solution to remedy this problem.
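To make the two formulas concrete, here is a small sketch that evaluates them and uses the result as a pre-trade check. The check itself is only one possible remedy and an assumption of this example, not necessarily the fix applied in the backtest. It also simplifies by treating the full value of both orders as the required buying power. All function names and numbers are hypothetical.

```python
def cash_account_buying_power(equity_with_loan, prev_day_equity_with_loan, initial_margin):
    """Cash account buying power, following the formula quoted above."""
    return min(equity_with_loan, prev_day_equity_with_loan) - initial_margin


def margin_account_buying_power(cash_buying_power, leverage_ratio):
    """Margin account buying power, following the formula quoted above."""
    return cash_buying_power * leverage_ratio


def can_open_pair(buying_power, long_order_value, short_order_value):
    """Assumed rule: submit the pair only if both legs fit into the buying power."""
    return long_order_value + short_order_value <= buying_power


# Made-up account figures for illustration
cash_bp = cash_account_buying_power(100_000, 98_000, 40_000)    # 58000
margin_bp = margin_account_buying_power(cash_bp, 2)             # 116000

print(can_open_pair(margin_bp, 50_000, 50_000))  # True
print(can_open_pair(margin_bp, 60_000, 60_000))  # False
```

Checking both legs before sending either order avoids the case where only one side of the pair gets filled.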

Backtest result after fixing the loopholes found

We have spotted two scenarios that might endanger the profitability of our trading strategy, and we have addressed both of them with theoretically grounded fixes. Looking back at the HeldPositions plot, the number of long positions and the number of short positions are now the same at all times. Now we can move on to the last part of the backtest.

Backtest

Even though we have replicated a pair trading technique that has been profitable over the past two years, a single backtest cannot guarantee that the strategy will perform similarly in the real world. However, we can use the backtest as a research tool to determine which parameters could improve the win rate and Sharpe Ratio. This is known as hyperparameter optimization. In this section, I’m going to run backtests against several scenarios to answer the following questions:

Should we use close price or log(close price) when calculating the cointegration parameters and epsilon?
Is the profitability affected by the holding period, as stated in An Anatomy of Pairs Trading: The Role of Idiosyncratic News, Common Information and Liquidity?
Therefore, I’m going to create eight scenarios that combine price/log(price) with closing the pair trade after 10/22/132/264 trading days, that is, after 10 days, one month, six months, and one year.
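The eight scenarios can be enumerated with a small parameter grid. This is a sketch under assumptions: the scenario names, the dictionary layout, and the `residual` helper are made up for illustration. The helper follows the regression of stock B on stock A used in the pair formation step.

```python
import math
from itertools import product

# Two ways of feeding prices into the regression and the residual (epsilon)
PRICE_TRANSFORMS = {"price": lambda p: p, "log_price": math.log}
# Maximum holding periods in trading days: 10 days, 1 month, 6 months, 1 year
HOLDING_DAYS = [10, 22, 132, 264]

scenarios = [
    {"name": f"{kind}_{days}", "transform": PRICE_TRANSFORMS[kind], "max_holding_days": days}
    for kind, days in product(PRICE_TRANSFORMS, HOLDING_DAYS)
]


def residual(price_a, price_b, beta, intercept, transform):
    """Epsilon of the pair: transformed B minus its fitted value from transformed A."""
    return transform(price_b) - beta * transform(price_a) - intercept


for scenario in scenarios:
    print(scenario["name"], scenario["max_holding_days"])
```

Keep in mind that beta and the intercept have to be re-estimated on log prices for the log scenarios, since coefficients fitted on raw prices do not carry over.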

Furthermore, aside from standard KPIs such as Total Return, Sharpe Ratio, and MaxDrawDown, KPIs such as Win Rate will be misleading because they are computed per stock rather than per pair and therefore always come out around 50%. Hence I wrote a script to process the order records into pair-wise order records and visualize the pair-wise performance.
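The idea of a pair-wise order record can be sketched as follows. This is not the original script: the record layout `(pair_id, leg_pnl)` and the numbers are assumptions made up for the example. The profit and loss of the two legs are summed per pair before the win rate is computed.

```python
from collections import defaultdict


def pairwise_win_rate(leg_records):
    """Aggregate per-leg PnL into per-pair PnL and compute the pair-wise win rate.

    leg_records: iterable of (pair_id, leg_pnl), one entry per stock leg.
    """
    pair_pnl = defaultdict(float)
    for pair_id, leg_pnl in leg_records:
        pair_pnl[pair_id] += leg_pnl
    if not pair_pnl:
        return {}, 0.0
    wins = sum(1 for pnl in pair_pnl.values() if pnl > 0)
    return dict(pair_pnl), wins / len(pair_pnl)


# Made-up records: three pair trades, two legs each
legs = [
    ("pair_1", 120.0), ("pair_1", -50.0),
    ("pair_2", -80.0), ("pair_2", 30.0),
    ("pair_3", 200.0), ("pair_3", -150.0),
]
pair_pnl, win_rate = pairwise_win_rate(legs)
leg_win_rate = sum(1 for _, pnl in legs if pnl > 0) / len(legs)
print(pair_pnl, win_rate, leg_win_rate)
```

In this toy data, exactly half of the individual legs are profitable, while two of the three pairs end up with a positive total, which is the figure we actually care about.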

Platform

QuantConnect

Backtest Periods

2020/12/27 - 2023/03/03

Backtest Universe

As stated in the section Trading pair formation

Backtest benchmark

SPDR S&P 500 ETF Trust (SPY)

Backtest Results

Strategy-wise performance

	Using close price	Using log(close price)
Close after 10 days
Close after one month
Close after six months
Close after one year

Strategy-wise performance of every scenario

Once I put the backtest results together, it’s easy to see that the performance of the pair trading strategy broadly conforms to the curve we saw above, although it doesn’t follow the exact days in the plot. The portfolio returns of the scenarios that close after six months and one year are comparatively lower than those of the scenarios that close after 10 days and one month. The scenarios with the highest portfolio returns both close the trade after one month, whether they use close price or log(close price).

On the other hand, the scenarios using close price don’t seem to have an obvious edge over the scenarios using log(close price). I guess we need to involve more parameters and run more backtests to find out whether a difference exists.

Pair-wise performance

Before we shift our focus to this part, there are a few labels that I need to explain:

Normal Close: These are the orders that received the sell signal before hitting the holding period limit, which plays the role of the vertical time bar in the Triple Barrier Method (TBM) (see here for more details). Ideally, the more Normal Close orders the better, as they follow the central idea of pair trading: produce a profit by buying when two stock prices deviate and selling when they converge again.
Early Close: These are the orders that have not yet received the sell signal but have hit the vertical time bar. We consider their momentum/energy/tendency to converge to be weaker than it used to be, so we close them before they converge in exchange for other pairs that have a higher chance to converge. They can be profitable or unprofitable when we close them, and they carry higher uncertainty than Normal Close orders. We use + to label a pair trade that generated a positive profit and - to label a trade whose profit is negative.
Stop Loss Close: These orders are closed because their epsilons are getting too big or too small, indicating that the prices of the stocks in the pair are diverging further. To avoid a huge loss, we close this type of order before we hit any other bands. We use + to label a pair trade that generated a positive profit and - to label a trade whose profit is negative.
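The three labels can be assigned with a simple classifier like the one below. This is a sketch under assumptions: the function name, its arguments, and the order in which the conditions are checked (stop loss first, then the sell signal, then the time limit) are choices made for this example.

```python
def classify_close(exit_signal, stop_loss_hit, days_held, max_holding_days, pnl):
    """Label a pair trade by the reason it was closed.

    Returns None if no closing condition is met, i.e. the trade stays open.
    """
    sign = "+" if pnl > 0 else "-"
    if stop_loss_hit:
        return f"Stop Loss Close ({sign})"
    if exit_signal:
        return "Normal Close"
    if days_held >= max_holding_days:
        return f"Early Close ({sign})"
    return None


# Made-up trades checked against a 22-day holding limit
print(classify_close(True, False, 8, 22, 150.0))     # Normal Close
print(classify_close(False, False, 22, 22, 40.0))    # Early Close (+)
print(classify_close(False, True, 5, 22, -300.0))    # Stop Loss Close (-)
```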

	Using close price	Using log(close price)
Close after 10 days
Close after one month
Close after six months
Close after one year

Pair-wise performance of every scenario

From the charts above, you can tell that the volume of Stop Loss orders increases as the holding period increases. That somewhat corroborates the inference in the paper An Anatomy of Pairs Trading: The Role of Idiosyncratic News, Common Information and Liquidity that the profits of pair trading decrease with a longer holding period. Meanwhile, the volume of the Normal Close and Early Close (+) orders remains fairly stable. The full distribution charts above show that a cointegrated pair has a lower probability to converge once a specific amount of time has passed. The win rate statistics in the table below further back up this conclusion.

Scenario	Early Close Win Rate	Total Win Rate
price_10_dist	60.83%	50.43%
price_22_dist	66.67%	44.13%
price_132_dist	50.00%	38.31%
price_1264_dist	N/A	38.02%
log_price_10_dist	61.31%	51.22%
log_price_22_dist	73.63%	54.95%
log_price_132_dist	75.00%	40.56%
log_price_1264_dist	100.00%	35.33%

Pair-wise performance of every scenario II

Take away

In this post, I have shown you step by step how to implement the cointegration pair trading strategy. We detected two potential defects that would hurt the strategy once the trading script is released to a live environment, and we applied theoretically grounded solutions to mitigate their consequences. Lastly, we used the complete backtest to confirm that the holding period does make a difference to the profitability of each trading pair, so limiting it helps safeguard your strategy from being sabotaged by time volatility. I hope you enjoyed reading this post, and let me know if there is any parameter whose impact on the performance of the cointegration pair trading strategy you would like to see tested.

Michael's blog

【Pair Trading】 Complete Guide to Backtest Cointegration Pair Trading Strategy