Some changes to the text
IanLKaplan committed Sep 14, 2022
1 parent b17add8 commit 364cb3a
Showing 503 changed files with 7,554 additions and 34 deletions.
64 changes: 34 additions & 30 deletions pairs_trading.ipynb
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@
"<p>\n",
"Pairs trading is an approach that takes advantage of the\n",
"mispricing between two (or more) co-moving assets, by\n",
"taking a long position in one(many) and shorting the\n",
"taking a long position in one (many) and shorting the\n",
"other(s), betting that the relationship will hold and that\n",
"prices will converge back to an equilibrium level.\n",
"</p>\n",
Expand Down Expand Up @@ -49,20 +49,20 @@
"</p>\n",
"<p>\n",
"Markets tend toward efficiency and many quantitative approaches fade over time as they are adopted by hedge funds. Pairs trading\n",
"goes back to the mid-1980s. The approach still seems to be profitable. The reason for this could be that there are a vast\n",
"number of possible pairs and the pairs portfolio's tend to be fairly small (5 to 20 pairs, in most cases). This may always\n",
"goes back to the mid-1980s. Surprisingly, the approach still seems to be profitable. The reason for this could be that there are a vast\n",
"number of possible pairs and the pairs portfolio's tend to be fairly small (5 to 20 pairs, in most cases). This could\n",
"leave unexploited pairs in the market. Pairs trading may also be difficult to scale to a level that would be attractive to hedge\n",
"funds, so it has not been arbitraged out of the market.\n",
"</p>\n",
"<p>\n",
"This Python notebook investigates pairs trading algorithms in an attempt reproduce the reported results.\n",
"This Python notebook investigates pairs trading algorithms.\n",
"</p>\n",
"<h3>\n",
"Overview\n",
"</h3>\n",
"<p>\n",
"Pairs trading algorithms attempt to identify pairs of stocks that have mean reverting behavior. Mean reversion takes place\n",
"around a common mean between the two stocks. The strategy is based on the idea that when one of the pairs is above or below\n",
"Pairs trading algorithms attempt to identify pairs of stocks that have mean reverting behavior. The strategy is based on the\n",
"premise that when one of the pairs is above or below\n",
"the mean, it will tend to revert back to the mean.\n",
"</p>\n",
"<p>\n",
Expand Down Expand Up @@ -97,8 +97,8 @@
"S&P 500 Industry Sectors\n",
"</h3>\n",
"<p>\n",
"Even with modern computing power, it would be difficult to test all possible stock pairs since the number of pairs\n",
"grows exponentially with N, the number of stocks.\n",
"Even with modern computing power, it would be difficult to test all possible stock pairs traded on the US exchanges,\n",
"since the number of pairs grows exponentially with N, the number of stocks.\n",
"</p>\n",
"\n",
"\\$ number \\space of \\space pairs = \\frac{N^2 - N}{2} $\n",
Expand All @@ -113,16 +113,11 @@
"The factors that are used to build factor models can generally be classified as macroeconomic and microeconomic factors.\n",
"</p>\n",
"<p>\n",
"Macroeconomic factors include factors like the unemployment rate, a company's interest rate exposure and\n",
"economic cycles exposure (e.g., recession, economic expansion). These factors may be reasonable choices for long\n",
"term investment portfolios, but are not appropriate for a pairs trading portfolio which trades on a shorter horizon.\n",
"</p>\n",
"<p>\n",
"Microeconomic factors include features like momentum, price earnings ratio, gross profits to assets, EBITDA and other corporate factors.\n",
"</p>\n",
"<p>\n",
"Factor model studies suggest that one of the strongest factors is the market sector, which is used here as an initial filter\n",
"for pairs.\n",
"Factor model studies suggest that one of the strongest factors is the market sector. The S&P 500 market sectors are used here as\n",
"an initial filter for pairs.\n",
"</p>\n",
"<p>\n",
"The S&P stocks are used for pairs selection since these stocks are heavily traded, with a small bid-ask spread. These stocks\n",
Expand Down Expand Up @@ -176,22 +171,22 @@
"<p>\n",
"To model pairs trading this notebook requires data for all of the S&P 500 stocks from the start date to yesterday (e.g., one day\n",
"in the past). In other models (see Stock Market Cash Trigger and ETF Rotation) the stock data was downloaded the first time the\n",
"notebook was run and stored in temporary files. The first notebook run incurred the initial overhead of downloading\n",
"notebook was run and stored in temporary files. In these notebooks, the first notebook run incurred the initial overhead of downloading\n",
"the data, but subsequent runs could read the data from local files.\n",
"</p>\n",
"<p>\n",
"Downloading stock data every day would have a high overhead for the S&P 500 stocks. To avoid this, the\n",
"data is downloaded once and stored in local files. When the notebook is run at later times, only the data between the\n",
"end of the previous data and the present date will be downloaded.\n",
"end of the previous date and the present date will be downloaded.\n",
"</p>\n",
"<p>\n",
"There are stocks in the S&P 500 list that were listed on the stock exchange later than the start date. These\n",
"stocks are filtered out, so the final stock set does not include all of the S&P 500 stocks.\n",
"</p>\n",
"<p>\n",
"Filtering stocks in this way creates a survivorship bias. This should not be a problem for back testing through the\n",
"historical time period, since the purpose of this backtest is to understand the pairs trading behavior. The actual\n",
"lookback period is less than the historical period.\n",
"Filtering stocks in this way creates a survivorship bias. This should not be a problem for back testing pairs trading\n",
"algorithms through the historical time period. The purpose of this backtest is to understand the pairs trading behavior.\n",
"The results do not depend on the stock universe, only on the pairs selected.\n",
"</p>\n"
]
},
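The incremental update described in the cell above could look roughly like the sketch below. It assumes the `Date,Close` CSV layout seen in the `s_and_p_data` files in this commit. The function name `append_new_rows` and the `fetched` DataFrame (standing in for whatever the data provider returns) are assumptions made for illustration, not code from the repository.

```python
import io

import pandas as pd


def append_new_rows(existing_csv: str, fetched: pd.DataFrame) -> pd.DataFrame:
    """Append only the rows of `fetched` that are newer than the stored data.

    `existing_csv` is the text of a stored file with Date and Close columns.
    `fetched` is assumed to be indexed by date with a single Close column.
    """
    existing = pd.read_csv(
        io.StringIO(existing_csv), parse_dates=["Date"], index_col="Date"
    )
    last_date = existing.index.max()
    # Keep only rows dated after the last stored row, so overlapping
    # downloads do not create duplicate dates.
    new_rows = fetched[fetched.index > last_date]
    return pd.concat([existing, new_rows])


# Stored data ends on 2022-08-22 (AAPL values taken from the diff below).
existing_csv = "Date,Close\n2022-08-19,171.52\n2022-08-22,167.57\n"

# Hypothetical download that overlaps the stored data by one day.
fetched = pd.DataFrame(
    {"Close": [167.57, 167.23, 167.53]},
    index=pd.DatetimeIndex(["2022-08-22", "2022-08-23", "2022-08-24"], name="Date"),
)

updated = append_new_rows(existing_csv, fetched)
```

Filtering on the last stored date, rather than trusting the download window, keeps the file free of duplicate rows even if the requested range overlaps what is already on disk.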
Expand Down Expand Up @@ -408,8 +403,8 @@
},
"source": [
"<p>\n",
"The table below shows the number of unique pairs for each sector and the total number of pairs. By drawing pairs from sectors, rather than\n",
"the whole S&P 500 set of stocks, the number of possible pairs is reduced from 124,750.\n",
"The table below shows the number of unique pairs for each S&P 500 sector and the total number of pairs. By drawing pairs from sectors,\n",
"rather than the whole S&P 500 set of stocks, the number of possible pairs is reduced from 124,750.\n",
"</p>"
]
},
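As a quick check on the 124,750 figure in the cell above: it is the number of unordered pairs that can be drawn from 500 stocks, (N^2 - N) / 2 with N = 500. A minimal sketch follows. The sector sizes are hypothetical values chosen for illustration, not the notebook's actual sector counts.

```python
from math import comb


def number_of_pairs(n: int) -> int:
    """Number of unordered pairs that can be drawn from n stocks."""
    return (n * n - n) // 2


# The closed form agrees with the binomial coefficient "n choose 2".
all_pairs = number_of_pairs(500)
assert all_pairs == comb(500, 2)

# Hypothetical sector sizes (illustrative only).
sector_sizes = {"energies": 20, "information-technology": 70, "health-care": 60}
pairs_by_sector = {name: number_of_pairs(size) for name, size in sector_sizes.items()}
total_sector_pairs = sum(pairs_by_sector.values())
```

Because pairs are only formed within a sector, the total is the sum of the per-sector counts, which is much smaller than the count for the combined universe.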
Expand Down Expand Up @@ -473,8 +468,8 @@
"</h3>\n",
"<p>\n",
"A set of pairs are selected for trading using a lookback period. The longer the lookback period (with more data points) the\n",
"more accurate the statistics will be, assuming that the data is stable (e.g., constant mean and standard deviation).\n",
"Stock price time series are not stable over time, however. The mean and the\n",
"less error there will be in the selection statistics, assuming that the data is stable\n",
"(e.g., constant mean and standard deviation). Stock price time series are not stable over time, however. The mean and the\n",
"standard deviation changes, as do other statistics like correlation.\n",
"</p>\n",
"<p>\n",
Expand All @@ -492,12 +487,13 @@
"Correlation\n",
"</h3>\n",
"<p>\n",
"After selecting stocks based on their industry sector, the next filter used is the pair correlation of the\n",
"natural log of the close prices.\n",
"</p>\n",
"<p>\n",
"In selecting pairs for trading, an attempt is made to find the pairs that have similar market price behavior and mean reversion.\n",
"One measure of similar market price behavior is correlation. This section examines the correlation distribution for the\n",
"S&P 500 sector pairs.\n",
"</p>\n",
"<p>\n",
"The correlation is performed on the natural log of the close prices.\n",
"</p>\n"
],
"outputs": []
Expand All @@ -514,7 +510,9 @@
"\n",
" :param sector_info: A dictionary containing the sector info. For example:\n",
" energies': ['APA', 'BKR', 'COP', ...]\n",
" Here 'energies' is the dictionary key for the list of S&P 500 stocks in that sector.\n",
" Here 'energies' is the dictionary key for the S&P 500 sector. The dictionary value is the\n",
" list of stocks in the sector.\n",
"\n",
" :return: A list of Tuples, where each tuple contains the symbols for the stock pair and the sector.\n",
" For example:\n",
" [('AAPL', 'ACN', 'information-technology'),\n",
Expand Down Expand Up @@ -612,7 +610,7 @@
"cell_type": "markdown",
"source": [
"<p>\n",
"The histogram below shows the distribution of the yearly correlation between the pairs.\n",
"The histogram below shows the distribution of the correlation between the pairs over a half year period.\n",
"</p>"
],
"metadata": {
Expand Down Expand Up @@ -735,12 +733,18 @@
"</h2>\n",
"<ol>\n",
"<li>\n",
"<i>Pairs Trading: Quantitative Method and Analysis</i> by Ganapathy Vidyamurthy, 2004, John Wiley and Sons\n",
"<i>Pairs Trading: Quantitative Method and Analysis</i> by Ganapathy Vidyamurthy, 2004, Wiley Publishing\n",
"</li>\n",
"<li>\n",
"Algorithmic Trading: Winning Strategies and Their Rationale by Ernie Chan, 2013, Wiley Publishing\n",
"</li>\n",
"<li>\n",
"<a href=\"https://www.researchgate.net/publication/5217081_Pairs_Trading_Performance_of_a_Relative_Value_Arbitrage_Rule\">Pairs Trading: Performance of a Relative Value Arbitrage Rule</a>, February 2006\n",
"</li>\n",
"<li>\n",
"<a href=\"http://jonathankinlay.com/2019/02/pairs-trading-part-2-practical-considerations/\">Pairs Trading – Part 2: Practical Considerations</a> by Jonathan Kinlay\n",
"</li>\n",
"<li>\n",
"<a href=\"https://www.quantconnect.com/tutorials/strategy-library/intraday-dynamic-pairs-trading-using-correlation-and-cointegration-approach\">Intraday Dynamic Pairs Trading using Correlation and Cointegration</a>\n",
"</li>\n",
"<li>\n",
Expand Down
9 changes: 5 additions & 4 deletions pairs_trading.py
Expand Up @@ -189,11 +189,13 @@ def get_close_data(self, stock_list: list) -> pd.DataFrame:

def get_pairs(sector_info: dict) -> List[Tuple]:
"""
Return all the stock pairs, where the pairs are selected from the S&P 500 sector.
Return the sector stock pairs, where the pairs are selected from the S&P 500 sector.
:param sector_info: A dictionary containing the sector info. For example:
'energies': ['APA', 'BKR', 'COP', ...]
Here 'energies' is the dictionary key for the list of S&P 500 stocks in that sector.
Here 'energies' is the dictionary key for the S&P 500 sector. The dictionary value is the
list of stocks in the sector.
:return: A list of Tuples, where each tuple contains the symbols for the stock pair and the sector.
For example:
[('AAPL', 'ACN', 'information-technology'),
Expand Down Expand Up @@ -422,8 +424,7 @@ def select_pairs(self, start_ix: int, end_ix: int, pairs_list: List[Tuple], thre
pairs_selection = PairsSelection(close_prices=close_prices_df, correlation_cutoff=correlation_cutoff)
stats_l = pairs_selection.select_pairs(start_ix=0, end_ix=int(trading_days / 2), pairs_list=pairs_list, threshold='1%')

print(
f'Number of candidate pairs: {len(pairs_list)}, number of pairs after selection: {len(stats_l)}: {round((len(stats_l) / len(pairs_list)) * 100, 2)} percent')
print(f'Number of candidate pairs: {len(pairs_list)}, number of pairs after selection: {len(stats_l)}: {round((len(stats_l) / len(pairs_list)) * 100, 2)} percent')

cor_l = [stat.cor_v for stat in stats_l]
cor_a = np.array(cor_l)
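The body of `get_pairs` is not visible in this diff. Going only by its docstring, one way to build the sector pairs is with `itertools.combinations`, as in the sketch below. The name `get_pairs_sketch` is hypothetical, and the actual implementation in `pairs_trading.py` may differ.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def get_pairs_sketch(sector_info: Dict[str, List[str]]) -> List[Tuple[str, str, str]]:
    """Return (stock_a, stock_b, sector) tuples for every unique pair within each sector."""
    pairs: List[Tuple[str, str, str]] = []
    for sector, stocks in sector_info.items():
        # sorted(set(...)) removes duplicate symbols and gives a stable order,
        # so each unordered pair appears exactly once.
        for stock_a, stock_b in combinations(sorted(set(stocks)), 2):
            pairs.append((stock_a, stock_b, sector))
    return pairs


# Small illustrative input using symbols that appear in this commit.
sector_info = {
    "energies": ["APA", "BKR", "COP"],
    "information-technology": ["AAPL", "ACN", "ADBE"],
}
pairs_list = get_pairs_sketch(sector_info)
```

Pairs are only formed inside a sector, so a stock in `energies` is never paired with one in `information-technology`.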
Expand Down
15 changes: 15 additions & 0 deletions s_and_p_data/A.csv
Expand Up @@ -3936,3 +3936,18 @@ Date,Close
2022-08-18,139.97
2022-08-19,137.62
2022-08-22,133.94
2022-08-23,132.64
2022-08-24,133.67
2022-08-25,136.01
2022-08-26,129.87
2022-08-29,128.11
2022-08-30,128.28
2022-08-31,128.25
2022-09-01,128.93
2022-09-02,128.01
2022-09-06,129.3
2022-09-07,131.43
2022-09-08,135.19
2022-09-09,137.63
2022-09-12,139.87
2022-09-13,133.54
15 changes: 15 additions & 0 deletions s_and_p_data/AAL.csv
Expand Up @@ -3936,3 +3936,18 @@ Date,Close
2022-08-18,14.84
2022-08-19,14.17
2022-08-22,13.71
2022-08-23,13.82
2022-08-24,13.99
2022-08-25,14.42
2022-08-26,13.74
2022-08-29,13.51
2022-08-30,13.33
2022-08-31,12.99
2022-09-01,12.93
2022-09-02,12.99
2022-09-06,13.22
2022-09-07,13.89
2022-09-08,13.96
2022-09-09,14.19
2022-09-12,14.47
2022-09-13,13.68
15 changes: 15 additions & 0 deletions s_and_p_data/AAP.csv
Expand Up @@ -3936,3 +3936,18 @@ Date,Close
2022-08-18,207.49
2022-08-19,207.02
2022-08-22,198.48
2022-08-23,199.05
2022-08-24,179.91
2022-08-25,180.99
2022-08-26,173.59
2022-08-29,170.54
2022-08-30,170.47
2022-08-31,168.64
2022-09-01,172.15
2022-09-02,171.73
2022-09-06,173.34
2022-09-07,178.3
2022-09-08,180.29
2022-09-09,180.65
2022-09-12,181.49
2022-09-13,172.83
15 changes: 15 additions & 0 deletions s_and_p_data/AAPL.csv
Expand Up @@ -3936,3 +3936,18 @@ Date,Close
2022-08-18,174.15
2022-08-19,171.52
2022-08-22,167.57
2022-08-23,167.23
2022-08-24,167.53
2022-08-25,170.03
2022-08-26,163.62
2022-08-29,161.38
2022-08-30,158.91
2022-08-31,157.22
2022-09-01,157.96
2022-09-02,155.81
2022-09-06,154.53
2022-09-07,155.96
2022-09-08,154.46
2022-09-09,157.37
2022-09-12,163.43
2022-09-13,153.84
15 changes: 15 additions & 0 deletions s_and_p_data/ABBV.csv
Expand Up @@ -2426,3 +2426,18 @@ Date,Close
2022-08-18,141.29
2022-08-19,141.85
2022-08-22,140.34
2022-08-23,139.02
2022-08-24,137.91
2022-08-25,139.33
2022-08-26,136.35
2022-08-29,135.71
2022-08-30,135.55
2022-08-31,134.46
2022-09-01,138.45
2022-09-02,136.28
2022-09-06,137.59
2022-09-07,138.71
2022-09-08,140.52
2022-09-09,141.42
2022-09-12,142.24
2022-09-13,138.53
15 changes: 15 additions & 0 deletions s_and_p_data/ABC.csv
Expand Up @@ -3936,3 +3936,18 @@ Date,Close
2022-08-18,149.91
2022-08-19,152.03
2022-08-22,148.99
2022-08-23,148.02
2022-08-24,148.23
2022-08-25,150.42
2022-08-26,146.19
2022-08-29,147.49
2022-08-30,145.43
2022-08-31,146.56
2022-09-01,147.61
2022-09-02,147.71
2022-09-06,146.02
2022-09-07,147.84
2022-09-08,147.54
2022-09-09,147.27
2022-09-12,147.71
2022-09-13,140.98
15 changes: 15 additions & 0 deletions s_and_p_data/ABMD.csv
Expand Up @@ -3936,3 +3936,18 @@ Date,Close
2022-08-18,278.18
2022-08-19,270.68
2022-08-22,261.35
2022-08-23,262.23
2022-08-24,268.04
2022-08-25,274.45
2022-08-26,258.12
2022-08-29,256.98
2022-08-30,258.65
2022-08-31,259.28
2022-09-01,259.53
2022-09-02,261.24
2022-09-06,263.89
2022-09-07,268.18
2022-09-08,275.57
2022-09-09,282.28
2022-09-12,279.16
2022-09-13,266.39
15 changes: 15 additions & 0 deletions s_and_p_data/ABT.csv
Expand Up @@ -3936,3 +3936,18 @@ Date,Close
2022-08-18,109.96
2022-08-19,110.06
2022-08-22,107.45
2022-08-23,106.01
2022-08-24,105.44
2022-08-25,105.89
2022-08-26,101.9
2022-08-29,101.84
2022-08-30,102.2
2022-08-31,102.65
2022-09-01,104.84
2022-09-02,102.5
2022-09-06,102.71
2022-09-07,104.7
2022-09-08,106.99
2022-09-09,108.48
2022-09-12,109.29
2022-09-13,105.84
15 changes: 15 additions & 0 deletions s_and_p_data/ACN.csv
Expand Up @@ -3936,3 +3936,18 @@ Date,Close
2022-08-18,319.46
2022-08-19,315.29
2022-08-22,310.0
2022-08-23,306.7
2022-08-24,306.26
2022-08-25,309.77
2022-08-26,298.13
2022-08-29,295.14
2022-08-30,292.5
2022-08-31,288.46
2022-09-01,288.79
2022-09-02,284.07
2022-09-06,283.46
2022-09-07,286.76
2022-09-08,287.96
2022-09-09,290.55
2022-09-12,295.26
2022-09-13,281.52
15 changes: 15 additions & 0 deletions s_and_p_data/ADBE.csv
Expand Up @@ -3936,3 +3936,18 @@ Date,Close
2022-08-18,439.03
2022-08-19,425.06
2022-08-22,411.35
2022-08-23,410.41
2022-08-24,405.65
2022-08-25,403.93
2022-08-26,381.02
2022-08-29,375.26
2022-08-30,375.07
2022-08-31,373.44
2022-09-01,370.53
2022-09-02,368.14
2022-09-06,368.3
2022-09-07,379.72
2022-09-08,383.63
2022-09-09,394.78
2022-09-12,396.36
2022-09-13,368.39