-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Started vignette on techniques for computation and numerical accuracy…
… in the package.
- Loading branch information
Ian Taylor
committed
Feb 28, 2021
1 parent
92d27e8
commit fb9940c
Showing
1 changed file
with
61 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,61 @@ | ||
--- | ||
title: "Computational Considerations for Accuracy" | ||
output: | ||
rmarkdown::html_vignette: | ||
toc: true | ||
number_sections: true | ||
vignette: > | ||
%\VignetteIndexEntry{Computational Considerations for Accuracy} | ||
%\VignetteEngine{knitr::rmarkdown} | ||
%\VignetteEncoding{UTF-8} | ||
--- | ||
|
||
```{r, include = FALSE} | ||
knitr::opts_chunk$set( | ||
collapse = TRUE, | ||
comment = "#>" | ||
) | ||
``` | ||
|
||
```{r setup} | ||
library(gammacount) | ||
``` | ||
|
||
|
||
# General Guidelines | ||
|
||
In this package, the guiding principles for ensuring numerical accuracy of computations are | ||
1. Offload as much as possible to base `R` functions such as `pgamma`. | ||
2. Work with probabilities and densities in the log scale, and exponentiate as a last step if required. | ||
3. When given the choice, work with the complement of a probability if the complement is smaller. | ||
|
||
This vignette describes how these principles are applied to some specific computations in this package. | ||
|
||
# Log-scale helper functions | ||
|
||
This package contains two helper functions (not exported), `logsumexp` and `logdiffexp` which can perform operations without leaving the log scale: | ||
\begin{align*} | ||
\mathrm{logsumexp}(a, b) &= \log(e^a + e^b) \\ | ||
\mathrm{logdiffexp}(a, b) &= \log(e^a - e^b) | ||
\end{align*} | ||
|
||
Each works by factoring out the largest value ($\max(a, b)$ in the `logsumexp` case, and $a$ in the `logdiffexp` case). | ||
\begin{align*} | ||
\mathrm{logsumexp}(a, b) &= \max(a,b) + \log(1 + e^{-|a-b|}) \\ | ||
\mathrm{logdiffexp}(a, b) &= a + \log(1 - e^{b - a}) | ||
\end{align*} | ||
|
||
In this form, `logsumexp` can take advantage of `R`'s `log1p` function, and `logdiffexp` uses either `log1p(-exp(b-a))` or `log(-expm1(b-a))` depending on the magnitude of $b-a$. Doing this minimizes the loss of accuracy relative to a naive implementation of this operation. | ||
|
||
# Arrival Time Densities | ||
|
||
Recall that the density of the $n^{th}$ arrival time after a random start time has the density | ||
$$f_{\tau_n}(t) = Q(n\alpha, \alpha t) - Q((n-1)\alpha, \alpha t).$$ | ||
For $n \geq 2$, $\lim_{t\to 0} f_{\tau_n}(t) = \lim_{t\to \infty} f_{\tau_n}(t) = 0$. On the left tail this is because both $Q$ functions approach one, while on the right tail this is because both $Q$ functions approach zero. | ||
|
||
On the log scale, finding the difference between two probabilities $p$ and $p+\epsilon$ is subject to less loss of accuracy when $p$ is small than large. So the density as written will be more accurate for large $t$ than for small $t$. To improve accuracy for small $t$, we notice that the density can also be written in terms of regularized lower incomplete gamma functions | ||
$$f_{\tau_n}(t) = P((n-1)\alpha, \alpha t) - P(n\alpha, \alpha t).$$ | ||
This form of the density has the opposite behavior for small $t$, as both $P$ functions approach zero. | ||
|
||
We choose the distribution's mode as a convenient cutoff point to switch between using each representation and use the $P$ form for $t$ less than the mode, and use the $Q$ form for $t$ greater than the mode. | ||
|