fixed a few typos in binomial tree pricing

10sun · Oct 9, 2021 · 34885a6
1 parent 80d2ccc
commit 34885a6
Showing 1 changed file with 11 additions and 11 deletions.
diff --git a/chapter8/chapter8.md b/chapter8/chapter8.md
@@ -656,34 +656,34 @@ So to summarize, we are in good shape to price/hedge in a multi-period and conti
 
 ### Optimal Exercise of American Options cast as a Finite MDP {#sec:binomial-pricing-model}
 
-The original Binomial Options Pricing Model was developed to price (and hedge) options (including American Options) on an underlying whose price evolves according to a lognormal stochastic process, with the stochastic process approximated in the form of a simple discrete-time, discrete-states process that enables enormous computational tractability. The lognormal stochastic process is basically of the same form as the stochastic process of the underlying price in the Black-Scholes model (covered in Appendix [-@sec:black-scholes-appendix]). However, the underlying price process in the Black-Scholes model is specified in the real-world probability measure whereas here we specify the underlying price process in the risk-neutral probability measure. This is because here we will employ the pricing method of riskless rate-discounted expectation (under the risk-neutral probability measure) of the option payoff. Recall that in the single-period setting, the underlying asset price's expected rate of growth is calibrated to be equal to the riskless rate $r$, under the risk-probability probability measure. This calibration applies even in the multi-period and continuous-time settings. For a continuous-time lognormal stochastic process, the lognormal drift will hence be equal to $r$ in the risk-neutral probability measure (rather than $\mu$ in the real-world probability measure, as per the Black-Scholes model). Precisely, the stochastic process $S$ for the underlying price in the risk-neutral probability measure is:
+The original Binomial Options Pricing Model was developed to price (and hedge) options (including American Options) on an underlying whose price evolves according to a lognormal stochastic process, with the stochastic process approximated in the form of a simple discrete-time, finite-horizon, finite-states process that enables enormous computational tractability. The lognormal stochastic process is basically of the same form as the stochastic process of the underlying price in the Black-Scholes model (covered in Appendix [-@sec:black-scholes-appendix]). However, the underlying price process in the Black-Scholes model is specified in the real-world probability measure whereas here we specify the underlying price process in the risk-neutral probability measure. This is because here we will employ the pricing method of riskless rate-discounted expectation (under the risk-neutral probability measure) of the option payoff. Recall that in the single-period setting, the underlying asset price's expected rate of growth is calibrated to be equal to the riskless rate $r$, under the risk-neutral probability measure. This calibration applies even in the multi-period and continuous-time settings. For a continuous-time lognormal stochastic process, the lognormal drift will hence be equal to $r$ in the risk-neutral probability measure (rather than $\mu$ in the real-world probability measure, as per the Black-Scholes model). Precisely, the stochastic process $S$ for the underlying price in the risk-neutral probability measure is:
 
 $$dS_t = r \cdot S_t \cdot dt + \sigma \cdot S_t \cdot dz_t$$
 
 where $\sigma$ is the lognormal dispersion (often referred to as "lognormal volatility" - we will simply call it volatility for the rest of this section). If you want to develop a thorough understanding of the broader topic of change of probability measures and how it affects the drift term (beyond the scope of this book, but an important topic in continuous-time financial pricing theory), we refer you to the technical material on [Radon-Nikodym Derivative](https://en.wikipedia.org/wiki/Radon%E2%80%93Nikodym_theorem) and [Girsanov Theorem](https://en.wikipedia.org/wiki/Girsanov_theorem).
 
-The Binomial Options Pricing Model serves as a discrete-time, finite-states approximation to this continuous-time process, and is essentially an extension to the single-period model we had covered earlier for the case of a single fundamental risky asset. We've learnt previously that in the single-period case for a single fundamental risky asset, in order to be a complete market, we need to have exactly two random outcomes. We basically extend this "two random outcomes" pattern to each outcome at each time step, by essentially growing out a "binary tree". But there is a caveat - with a binary tree, we end up with an exponential ($2^i$) number of outcomes after $i$ time steps. To contain the exponential growth, we construct a "recombining tree", meaning an "up move" followed by a "down move" ends up in the same underlying price outcome as a "down move" followed by an "up move" (as illustrated in Figure \ref{fig:binomial-tree}). Thus, we have $i+1$ price outcomes after $i$ time steps in this "recombining tree". We conceptualize the ascending-sorted sequence of $i+1$ price outcomes as the (time step $=i$) states $\mathcal{S}_i = \{0, 1, \ldots, i\}$ (since the price movements form a discrete-time, finite-states Markov Chain). Since we are modeling a lognormal process, we model the discrete-time price moves as multiplicative to the price. We denote $S_{i,j}$ as the price after $i$ time steps in state $j$ (for any $i \in \mathbb{Z}_{\geq 0}$ and for any $0 \leq j \leq i$). So the two random prices resulting from $S_{i,j}$ are $S_{i+1,j+1} = S_{i,j} \cdot u$ and $S_{i+1,j} = \frac {S_{i,j}} u$ for some constant $u$ (that we will calibrate). The important point is that $u$ remains a constant across time steps $i$ and across states $j$ at each time step $i$. Since the "up move" is a multiplicative factor of $u$ and the "down move" is a multiplicative factor of $\frac 1 u$, we ensure the "recombining tree" feature. 
+The Binomial Options Pricing Model serves as a discrete-time, finite-horizon, finite-states approximation to this continuous-time process, and is essentially an extension to the single-period model we had covered earlier for the case of a single fundamental risky asset. We've learnt previously that in the single-period case for a single fundamental risky asset, in order to be a complete market, we need to have exactly two random outcomes. We basically extend this "two random outcomes" pattern to each outcome at each time step, by essentially growing out a "binary tree". But there is a caveat - with a binary tree, we end up with an exponential ($2^i$) number of outcomes after $i$ time steps. To contain the exponential growth, we construct a "recombining tree", meaning an "up move" followed by a "down move" ends up in the same underlying price outcome as a "down move" followed by an "up move" (as illustrated in Figure \ref{fig:binomial-tree}). Thus, we have $i+1$ price outcomes after $i$ time steps in this "recombining tree". We conceptualize the ascending-sorted sequence of $i+1$ price outcomes as the (time step $=i$) states $\mathcal{S}_i = \{0, 1, \ldots, i\}$ (since the price movements form a discrete-time, finite-states Markov Chain). Since we are modeling a lognormal process, we model the discrete-time price moves as multiplicative to the price. We denote $S_{i,j}$ as the price after $i$ time steps in state $j$ (for any $i \in \mathbb{Z}_{\geq 0}$ and for any $0 \leq j \leq i$). So the two random prices resulting from $S_{i,j}$ are $S_{i+1,j+1} = S_{i,j} \cdot u$ and $S_{i+1,j} = \frac {S_{i,j}} u$ for some constant $u$ (that we will calibrate). The important point is that $u$ remains a constant across time steps $i$ and across states $j$ at each time step $i$. Since the "up move" is a multiplicative factor of $u$ and the "down move" is a multiplicative factor of $\frac 1 u$, we ensure the "recombining tree" feature. 
 
 ![Binomial Option Pricing Model (Binomial Tree) \label{fig:binomial-tree}](./chapter8/binomial_tree.png "Binomial Options Pricing Model (Binomial Tree)")
 
-Let $q$ be the probability of the "up move" (typically, we use $p$ to denote real-world probability and $q$ to denote the risk-neutral probability) so that $1-q$ is the probability of the "down move", Our goal is to calibrate $q$ and $u$ so that the probability distribution of log-price-ratios $\{\log(\frac {S_{n,0}} {S_{0,0}}), \log(\frac {S_{n,1}} {S_{0,0}}), \ldots, \log(\frac {S_{n_n}} {S_{0,0}})\}$ after $n \in \mathbb{Z}_{\geq 0}$ time steps (with each time step of interval $\frac T n$ for a given $T\in \mathbb{R}^+$) serves as a good approximation to $\mathcal{N}(rT, \sigma^2T)$ (that we know to be the distribution of $\log(\frac {S_T} {S_0})$, as derived in Section [-@sec:lognormal-process-section] in Appendix [-@sec:stochasticcalculus-appendix]). Note that the starting price $S_{0,0}$ of this discrete-time approximation process is equal to the starting price $S_0$ of the continuous-time process.
+Let $q$ be the probability of the "up move" (typically, we use $p$ to denote real-world probability and $q$ to denote the risk-neutral probability) so that $1-q$ is the probability of the "down move". Our goal is to calibrate $q$ and $u$ so that the probability distribution of log-price-ratios $\{\log(\frac {S_{n,0}} {S_{0,0}}), \log(\frac {S_{n,1}} {S_{0,0}}), \ldots, \log(\frac {S_{n,n}} {S_{0,0}})\}$ after $n \in \mathbb{Z}_{\geq 0}$ time steps (with each time step of interval $\frac T n$ for a given $T\in \mathbb{R}^+$) serves as a good approximation to $\mathcal{N}((r - \frac {\sigma^2} {2})T, \sigma^2T)$ (that we know to be the distribution of $\log(\frac {S_T} {S_0})$, as derived in Section [-@sec:lognormal-process-section] in Appendix [-@sec:stochasticcalculus-appendix]). Note that the starting price $S_{0,0}$ of this discrete-time approximation process is equal to the starting price $S_0$ of the continuous-time process.
 We shall calibrate $q$ and $u$ in two steps:
 
-* In the first step, we pretend that $q=0.5$ and calibrate $u$ such that the variance of the two random outcomes $\log(\frac {S_{i+1, j+1}} {S_{i, j}}) = \log(u)$ and $\log(\frac {S_{i+1, j}} {S_{i, j}}) = -\log(u)$ is equal to the variance $\frac {\sigma^2 T} n$ of the normally-distributed (stationary) process $\log(\frac {S_{t + \frac T n}} {S_t})$ for any $i \in \mathbb{Z}_{\geq 0}$ for all $0 \leq j \leq i$. This yields:
-$$\log^2(u) = \frac {\sigma^2 T} n \Rightarrow u = e^{\frac {\sigma \sqrt{T}} n}$$
-This ensures that the variance of the symmetric binomial distribution after $n$ time steps matches the variance $\sigma^2 T$ of the normal distribution of $\log(\frac {S_T} {S_0})$.
-* In the second step, we adjust the probability $q$ so that the mean of the two random outcomes $\frac {S_{i+1, j+1}} {S_{i, j}} = u$ and $\frac {S_{i+1,j}} {S_{i,j}} = \frac 1 u$ is equal to the mean $e^{\frac {rT} n}$ of the lognormally-distributed (stationary) process $\frac {S_{t + \frac T n}} {S_t}$ for any $i \in \mathbb{Z}_{\geq 0}$ for all $0 \leq j \leq i$. This yields:
-$$q u + \frac {1-q} u = e^{\frac {rT} n} \Rightarrow q = \frac {u \cdot e^{\frac {rT} n} - 1} {u^2 - 1} = \frac {e^{\frac {rT + \sigma \sqrt{T}} n} - 1} {e^{\frac {2\sigma \sqrt{T}} n} - 1}$$
+* In the first step, we pretend that $q=0.5$ and calibrate $u$ such that for any $i \in \mathbb{Z}_{\geq 0}$, for any $0 \leq j \leq i$, the variance of the two equal-probability random outcomes $\log(\frac {S_{i+1, j+1}} {S_{i, j}}) = \log(u)$ and $\log(\frac {S_{i+1, j}} {S_{i, j}}) = -\log(u)$ is equal to the variance $\frac {\sigma^2 T} n$ of the normally-distributed random variable $\log(\frac {S_{t + \frac T n}} {S_t})$ for any $t \geq 0$. This yields:
+$$\log^2(u) = \frac {\sigma^2 T} n \Rightarrow u = e^{\sigma \sqrt{\frac T n}}$$
+This ensures that the variance of the symmetric binomial distribution after $n$ time steps matches the variance $\sigma^2 T$ of the normally-distributed random variable $\log(\frac {S_T} {S_0})$.
+* In the second step, we adjust the probability $q$ so that for any $i \in \mathbb{Z}_{\geq 0}$, for any $0 \leq j \leq i$, the mean of the two random outcomes $\frac {S_{i+1, j+1}} {S_{i, j}} = u$ and $\frac {S_{i+1,j}} {S_{i,j}} = \frac 1 u$ is equal to the mean $e^{\frac {rT} n}$ of the lognormally-distributed random variable $\frac {S_{t + \frac T n}} {S_t}$ for any $t \geq 0$. This yields:
+$$q u + \frac {1-q} u = e^{\frac {rT} n} \Rightarrow q = \frac {u \cdot e^{\frac {rT} n} - 1} {u^2 - 1} = \frac {e^{\frac {rT} n + \sigma \sqrt{\frac T n}} - 1} {e^{2\sigma \sqrt{\frac T n}} - 1}$$
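
The two calibration steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the book's implementation: the function name `calibrate_binomial` and the sample parameter values are hypothetical and chosen for this example.

```python
import math
from typing import Tuple


def calibrate_binomial(
    rate: float, vol: float, expiry: float, num_steps: int
) -> Tuple[float, float]:
    """Return (u, q) for a recombining tree (hypothetical helper)."""
    dt = expiry / num_steps
    # Step 1: match the one-step variance of the log-price-ratio,
    # log^2(u) = sigma^2 * T / n.
    u = math.exp(vol * math.sqrt(dt))
    # Step 2: match the one-step mean of the price ratio,
    # q * u + (1 - q) / u = exp(r * T / n).
    q = (u * math.exp(rate * dt) - 1.0) / (u * u - 1.0)
    return u, q


# Illustrative inputs (assumed values, not taken from the text).
u, q = calibrate_binomial(rate=0.05, vol=0.25, expiry=1.0, num_steps=100)

# The calibrated pair should reproduce the target one-step mean.
one_step_mean = q * u + (1.0 - q) / u
```

Note that $q$ is a valid probability only when $\frac 1 u < e^{\frac {rT} n} < u$, which holds for sufficiently large $n$ whenever $\sigma > 0$.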
 
 Thus, we have the parameters $u$ and $q$ that fully specify the Binomial Options Pricing Model. Now we get to the application of this model. We are interested in using this model for optimal exercise (and hence, pricing) of American Options. This is in contrast to the Black-Scholes Partial Differential Equation which only enabled us to price options with a fixed payoff at a fixed point in time (e.g., European Call and Put Options). Of course, a special case of American Options is indeed European Options. It's important to note that here we are tackling the much harder problem of the ideal timing of exercise of an American Option - the Binomial Options Pricing Model is well suited for this.
 
-As mentioned earlier, we want to model the problem of Optimal Exercise of American Options as a discrete-time finite-horizon MDP. To fit into our framework for discrete-time finite-horizon MDPs, we need to set the terminal time to be $t=T+1$, meaning all the states at time $T+1$ are terminal states. Here we will utilize the states and state transitions (probabilistic price movements of the underlying) given by the Binomial Options Pricing Model as the states and state transitions in the MDP. The MDP actions in each state will be binary - either exercise the option (and immediately move to a terminal state) or don't exercise the option (i.e., continue on to the next time step's random state, as given by the Binomial Options Pricing Model). If the exercise action is chosen, the MDP reward is the option payoff. If the continue action is chosen, the reward is 0. The discount factor $\gamma$ is $e^{-\frac {rT} n}$ since (as we've learnt in the single-period case), the price (which translates here to the Optimal Value Function) is defined as the riskless rate-discounted expectation (under the risk-neutral probability measure) of the option payoff. In the multi-period setting, the overall discounting amounts to composition (multiplication) of each time step's discounting (which is equal to $\gamma$) and the overall risk-neutral probability measure amounts to the composition of each time step's risk-neutral probability measure (which is specified by the calibrated value $q$).
+As mentioned earlier, we want to model the problem of Optimal Exercise of American Options as a discrete-time, finite-horizon, finite-states MDP. We set the terminal time to be $t=T+1$, meaning all the states at time $T+1$ are terminal states. Here we will utilize the states and state transitions (probabilistic price movements of the underlying) given by the Binomial Options Pricing Model as the states and state transitions in the MDP. The MDP actions in each state will be binary - either exercise the option (and immediately move to a terminal state) or don't exercise the option (i.e., continue on to the next time step's random state, as given by the Binomial Options Pricing Model). If the exercise action is chosen, the MDP reward is the option payoff. If the continue action is chosen, the reward is 0. The discount factor $\gamma$ is $e^{-\frac {rT} n}$ since (as we've learnt in the single-period case), the price (which translates here to the Optimal Value Function) is defined as the riskless rate-discounted expectation (under the risk-neutral probability measure) of the option payoff. In the multi-period setting, the overall discounting amounts to composition (multiplication) of each time step's discounting (which is equal to $\gamma$) and the overall risk-neutral probability measure amounts to the composition of each time step's risk-neutral probability measure (which is specified by the calibrated value $q$).
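
To make the exercise-versus-continue decision concrete, here is a minimal standalone sketch of backward induction on the recombining tree. It does not use the book's `rl` library. The function name `american_price_bin_tree` and the sample inputs are assumptions made for this illustration. At each node the value is the larger of the exercise payoff and the $\gamma$-discounted, $q$-weighted expectation of the next step's values.

```python
import math
from typing import Callable, List


def american_price_bin_tree(
    spot_price: float,
    payoff: Callable[[float, float], float],  # (time, price) -> payoff
    expiry: float,
    rate: float,
    vol: float,
    num_steps: int,
) -> float:
    """Time-0 optimal value via backward induction (hypothetical helper)."""
    dt = expiry / num_steps
    u = math.exp(vol * math.sqrt(dt))
    q = (u * math.exp(rate * dt) - 1.0) / (u * u - 1.0)
    gamma = math.exp(-rate * dt)

    def price(i: int, j: int) -> float:
        # j up moves and (i - j) down moves from the spot price.
        return spot_price * u ** (2 * j - i)

    # Last time step: exercise, or continue to a terminal state worth 0.
    values: List[float] = [
        max(payoff(expiry, price(num_steps, j)), 0.0)
        for j in range(num_steps + 1)
    ]
    # Earlier steps: exercise now versus discounted expected continuation.
    for i in range(num_steps - 1, -1, -1):
        values = [
            max(
                payoff(i * dt, price(i, j)),
                gamma * (q * values[j + 1] + (1.0 - q) * values[j]),
            )
            for j in range(i + 1)
        ]
    return values[0]


# Illustrative American put struck at 100 (assumed values).
strike = 100.0
put_price = american_price_bin_tree(
    spot_price=100.0,
    payoff=lambda _, s: max(strike - s, 0.0),
    expiry=1.0, rate=0.05, vol=0.25, num_steps=100,
)
# Deep in the money, immediate exercise should be optimal.
deep_itm_price = american_price_bin_tree(
    spot_price=50.0,
    payoff=lambda _, s: max(strike - s, 0.0),
    expiry=1.0, rate=0.05, vol=0.25, num_steps=100,
)
```

The sketch keeps only one time step's values in memory at a time, so it returns the time-0 price but not the full exercise policy.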
 
-Now let's write some code to determine the Optimal Exercise of American Options (and hence, the price of American Options) by modeling this problem as a discrete-time finite-horizon MDP. We create a `dataclass OptimalExerciseBinTree` whose attributes are `spot_price` (specifying the current, i.e., time=0 price of the underlying), `payoff` (specifying the option payoff, when exercised), `expiry` (specifying the time $T$ to expiration of the American Option), `rate` (specifying the riskless rate $r$), `vol` (specifying the lognormal volatility $\sigma$), and `num_steps` (specifying the number $n$ of time steps in the binomial tree). Note that each time step is of interval $\frac T n$ (which is implemented below in the method `dt`). Note also that the `payoff` function is fairly generic taking two arguments - the first argument is the time at which the option is exercised, and the second argument is the underlying price at the time the option is exercised. Note that for a typical American Call or Put Option, the payoff does not depend on time and the dependency on the underlying price is the standard "hockey-stick" payoff that we are now fairly familiar with (however, we designed the interface to allow for more general option payoff functions). 
+Now let's write some code to determine the Optimal Exercise of American Options (and hence, the price of American Options) by modeling this problem as a discrete-time, finite-horizon, finite-states MDP. We create a `dataclass OptimalExerciseBinTree` whose attributes are `spot_price` (specifying the current, i.e., time=0 price of the underlying), `payoff` (specifying the option payoff, when exercised), `expiry` (specifying the time $T$ to expiration of the American Option), `rate` (specifying the riskless rate $r$), `vol` (specifying the lognormal volatility $\sigma$), and `num_steps` (specifying the number $n$ of time steps in the binomial tree). Note that each time step is of interval $\frac T n$ (which is implemented below in the method `dt`). Note also that the `payoff` function is fairly generic taking two arguments - the first argument is the time at which the option is exercised, and the second argument is the underlying price at the time the option is exercised. Note that for a typical American Call or Put Option, the payoff does not depend on time and the dependency on the underlying price is the standard "hockey-stick" payoff that we are now fairly familiar with (however, we designed the interface to allow for more general option payoff functions). 
 
 The set of states $\mathcal{S}_i$ at time step $i$ (for all $0 \leq i \leq T+1$) is: $\{0, 1, \ldots, i\}$ and the method `state_price` below calculates the price in state $j$ at time step $i$ as:
 
-$$S_{i,j} = S_{0,0} \cdot e^{\frac {(2j - i)\sigma T} n}$$
+$$S_{i,j} = S_{0,0} \cdot e^{(2j - i)\sigma \sqrt{\frac T n}}$$
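
The formula follows because state $j$ at time step $i$ is reached by $j$ up moves and $i - j$ down moves, giving a net factor of $u^{2j - i}$. A small sketch makes this explicit. The standalone function below and its argument names are assumptions for illustration, not the `state_price` method of the dataclass itself.

```python
import math


def state_price_sketch(
    spot_price: float, vol: float, expiry: float, num_steps: int,
    i: int, j: int
) -> float:
    """Price in state j at time step i (hypothetical standalone helper)."""
    dt = expiry / num_steps
    return spot_price * math.exp((2 * j - i) * vol * math.sqrt(dt))


# Illustrative inputs (assumed values).
args = dict(spot_price=100.0, vol=0.25, expiry=1.0, num_steps=100)

# Recombining check: one up move and one down move return to the spot price.
up_then_down = state_price_sketch(i=2, j=1, **args)

# Ratio between an up-move child and its parent node should equal u.
ratio = state_price_sketch(i=4, j=3, **args) / state_price_sketch(i=3, j=2, **args)
```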
 
 Finally, the method `get_opt_vf_and_policy` calculates $u$ (`up_factor`) and $q$ (`up_prob`), prepares the requisite state-reward transitions (conditional on current state and action) to move from one time step to the next, and passes along the constructed time-sequenced transitions to `rl.finite_horizon.get_opt_vf_and_policy` (which we had written in Chapter [-@sec:dp-chapter]) to perform the requisite backward induction and return an `Iterator` on pairs of `V[int]` and `FiniteDeterministicPolicy[int, bool]`. Note that the states at any time-step $i$ are the integers from $0$ to $i$ and hence, represented as `int`, and the actions are represented as `bool` (`True` for exercise and `False` for continue). Note that we represent an early terminal state (in case of option exercise before expiration of the option) as -1.