Commit: integrate rmd pages
szcf-weiya committed Feb 26, 2019
1 parent e7affcf commit d08ea63
Showing 11 changed files with 354 additions and 9 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -13,4 +13,5 @@ other
 unimelb_training.csv
 CreateGrantData.R
 grantData.RData
-.vscode/
+.vscode/
+_site/
26 changes: 18 additions & 8 deletions .travis.yml
@@ -1,11 +1,21 @@
-language: python
-install:
-  - pip install mkdocs==0.17.2
-  - pip install mkdocs-material==2.5.2
-  - pip install pymdown-extensions
-  - pip install python-markdown-math
-script:
-  - mkdocs build
+matrix:
+  include:
+    - language: python
+      install:
+        - pip install mkdocs==0.17.2
+        - pip install mkdocs-material==2.5.2
+        - pip install pymdown-extensions
+        - pip install python-markdown-math
+      script:
+        - mkdocs build
+
+    - language: r
+      r_packages:
+        - rmarkdown
+      script:
+        - cd rmds && Rscript -e "rmarkdown::render_site()"
+        - mv _site/ ../site/rmds
+
 deploy:
   provider: pages
   skip-cleanup: true
9 changes: 9 additions & 0 deletions rmds/_mathjax.Rmd
@@ -0,0 +1,9 @@
$$
\newcommand\E{\mathbb{E}}
\newcommand\y{\mathbf{y}}
\newcommand\1{\boldsymbol{1}}
\newcommand\0{\boldsymbol{0}}
\newcommand\bmu{\boldsymbol\mu}
\newcommand\Dev{\mathrm{Dev}}
\newcommand\null{\mathrm{null}}
$$
22 changes: 22 additions & 0 deletions rmds/_site.yml
@@ -0,0 +1,22 @@
name: "rmd-collections"
navbar:
  title: "Rmd Gallery"
  left:
    - text: "Home"
      href: index.html
    - text: "About"
      href: https://hohoweiya.xyz
    - text: "ESL CN"
      href: https://esl.hohoweiya.xyz
    - text: "Work Yard"
      href: https://stats.hohoweiya.xyz
    - text: "Tech Note"
      href: https://tech.hohoweiya.xyz
output:
  html_document:
    theme: cosmo
    highlight: textmate
    include:
      in_header: header.html
      after_body: footer.html
    css: style.css
1 change: 1 addition & 0 deletions rmds/footer.html
@@ -0,0 +1 @@
<p>Copyright &copy; 2016-2019 weiya</p>
116 changes: 116 additions & 0 deletions rmds/glmnet.Rmd
@@ -0,0 +1,116 @@
---
title: "Overview of `glmnet`"
author: "weiya"
date: "February 26, 2019"
output: html_document
---

The full signature of the `glmnet` function is:

```{r, eval = FALSE}
glmnet(x, y,
family=c("gaussian","binomial","poisson","multinomial","cox","mgaussian"),
weights, offset=NULL, alpha = 1, nlambda = 100,
lambda.min.ratio = ifelse(nobs<nvars,0.01,0.0001), lambda=NULL,
standardize = TRUE, intercept=TRUE, thresh = 1e-07, dfmax = nvars + 1,
pmax = min(dfmax * 2+20, nvars), exclude, penalty.factor = rep(1, nvars),
lower.limits=-Inf, upper.limits=Inf, maxit=100000,
type.gaussian=ifelse(nvars<500,"covariance","naive"),
type.logistic=c("Newton","modified.Newton"),
standardize.response=FALSE, type.multinomial=c("ungrouped","grouped"))
```

## Family

- `gaussian`
- `binomial`
- `multinomial`
- `poisson`
- `cox`
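
Each family changes the log-likelihood $\ell$ but not the call: switching the `family` argument is enough. A minimal sketch for a lasso-penalized logistic regression (the simulated data and variable names below are ours, for illustration only):

```{r}
library(glmnet)
set.seed(42)
x <- matrix(rnorm(100 * 5), nrow = 100)
prob <- plogis(x[, 1] - x[, 2])            # only the first two features carry signal
y01 <- rbinom(100, size = 1, prob = prob)  # 0/1 responses
fit_bin <- glmnet(x, y01, family = "binomial")
```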

### Deviance Measure

- $\hat\bmu_\lambda$: the $N$-vector of fitted mean values when the parameter is $\lambda$
- $\tilde\bmu$: the unrestricted or [**saturated** fit](https://stats.stackexchange.com/questions/283/what-is-a-saturated-model) (with $\hat y_i=y_i$).

Define

$$
\Dev_\lambda \doteq 2[\ell(\y, \tilde \bmu)-\ell(\y,\hat\bmu_\lambda)]\,,
$$
where $\ell(\y,\bmu)$ is the log-likelihood of the model $\bmu$, a sum of $N$ terms.

#### Null deviance

$$
\Dev_\null = \Dev_\infty\,.
$$
Typically, $\hat\bmu_\infty=\bar y\1$, or $\hat\bmu_\infty=\0$ in the `cox` family.

`glmnet` reports the **fraction of deviance explained**

$$
D^2_\lambda = \frac{\Dev_\null-\Dev_\lambda}{\Dev_\null}\,.
$$

The name $D^2$ is by analogy with $R^2$, the fraction of variance explained in regression.
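
The fitted object exposes these quantities directly: `dev.ratio` holds $D^2_\lambda$ along the $\lambda$ path and `nulldev` the null deviance (up to `glmnet`'s internal scaling). A quick check on simulated data (variable names are ours):

```{r}
library(glmnet)
set.seed(1)
x <- matrix(rnorm(200 * 10), nrow = 200)
y <- x[, 1] + rnorm(200)
fit <- glmnet(x, y)
head(fit$dev.ratio)  # fraction of deviance explained, growing as lambda decreases
fit$nulldev          # deviance of the intercept-only (null) fit
```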

## Penalties

For all models, the `glmnet` algorithm admits a family of elastic-net penalties, ranging from the $\ell_2$ (ridge) penalty to the $\ell_1$ (lasso) penalty. The general form of the penalized optimization problem is

$$
\min_{\beta_0,\beta}\Big\{-\frac 1N\ell(\y;\beta_0,\beta)+\lambda\sum_{j=1}^p\gamma_j\{(1-\alpha)\beta_j^2+\alpha\vert \beta_j\vert\}\Big\}\,.
$$

- $\lambda$ determines the overall complexity of the model
- the elastic-net parameter $\alpha\in[0,1]$ mixes ridge regression ($\alpha=0$) and the lasso ($\alpha=1$); see the short sketch below
- $\gamma_j\ge 0$ is a penalty modifier.
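
For example, varying only `alpha` moves between these extremes (a sketch on simulated data; the variable names are ours):

```{r}
library(glmnet)
set.seed(2)
x <- matrix(rnorm(100 * 8), nrow = 100)
y <- x %*% c(2, -1, rep(0, 6)) + rnorm(100)
fit_ridge <- glmnet(x, y, alpha = 0)    # pure ridge (l2) penalty
fit_enet  <- glmnet(x, y, alpha = 0.5)  # elastic-net mixture
fit_lasso <- glmnet(x, y, alpha = 1)    # pure lasso (l1), the default
```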

### Example

As [kjytay](http://kjytay.github.io/) discusses in his post [A deep dive into glmnet: penalty.factor](https://statisticaloddsandends.wordpress.com/2018/11/13/a-deep-dive-into-glmnet-penalty-factor/), the penalty modifiers are rescaled internally, so only their relative sizes matter.

First generate some data,

```{r}
n = 100; p = 5; p.true = 2
set.seed(1234)
X = matrix(rnorm(n * p), nrow = n)
beta = matrix(c(rep(1, p.true), rep(0, p - p.true)), ncol = 1)
y = X %*% beta + 3 * rnorm(n)
```

We fit two models: one uses the default options, the other uses `penalty.factor = rep(2, 5)`.

```{r}
library(glmnet)
fit = glmnet(X, y)
fit2 = glmnet(X, y, penalty.factor = rep(2, 5))
```

The two models have exactly the same `lambda` sequence and produce the same `beta` coefficients:

```{r}
sum(fit$lambda != fit2$lambda)  # 0: identical lambda sequences
sum(fit$beta != fit2$beta)      # 0: identical coefficient estimates
```

## Offset

All the models allow for an **offset** term: a real-valued number $o_i$ for each observation that is added to the linear predictor and is not associated with any parameter:

$$
\eta(x_i) = o_i + \beta_0 + \beta^Tx_i\,.
$$

For Poisson models the offset allows us to model rates rather than mean counts when the observation period differs across observations. Suppose we observe a count $Y$ over a period $t$; then $\E[Y\mid T=t,X=x]=t\mu(x)$, where $\mu(x)$ is the rate per unit time. Using the log link, we would supply $o_i=\log t_i$ for each observation.
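
A sketch of that recipe, with exposure times and counts simulated purely for illustration:

```{r}
library(glmnet)
set.seed(3)
n <- 200
x <- matrix(rnorm(n * 4), nrow = n)
t_obs <- runif(n, 0.5, 5)            # observation period for each subject
mu <- exp(0.5 + x[, 1])              # rate per unit time
y <- rpois(n, lambda = t_obs * mu)   # counts accumulated over differing periods
fit_pois <- glmnet(x, y, family = "poisson", offset = log(t_obs))
```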

## Standardize

Standardizing features before model fitting is common practice in statistical learning: when features are on vastly different scales, those with larger scales tend to dominate the fit.
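
`glmnet` handles this through its `standardize` argument (`TRUE` by default), which standardizes the columns of `x` internally and returns the coefficients on the original scale. A small sketch with deliberately mismatched scales (simulated data, names ours):

```{r}
library(glmnet)
set.seed(4)
x <- cbind(rnorm(100),          # feature on unit scale
           1000 * rnorm(100))   # feature on a much larger scale
y <- x[, 1] + x[, 2] / 1000 + rnorm(100)
fit_std <- glmnet(x, y)                        # standardize = TRUE by default
fit_raw <- glmnet(x, y, standardize = FALSE)   # penalty applied on the raw scale
```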

## References

Hastie, T., Tibshirani, R., & Wainwright, M. (2015). *Statistical Learning with Sparsity: The Lasso and Generalizations*. Chapman & Hall/CRC.

1 change: 1 addition & 0 deletions rmds/header.html
@@ -0,0 +1 @@
<script src="mathjax.js"></script>
7 changes: 7 additions & 0 deletions rmds/index.Rmd
@@ -0,0 +1,7 @@
---
title: "Rmd Gallery"
---

Here are my rmarkdown notes:

- [Overview of `glmnet`](glmnet.html)
112 changes: 112 additions & 0 deletions rmds/mathjax.js
@@ -0,0 +1,112 @@
window.MathJax = {
showMathMenu: false,
tex2jax: {
inlineMath: [['$','$'], ['\\(','\\)']],
processEscapes:true
},
TeX: {
Macros: {
A: "{\\mathbf{A}}",
B: "{\\mathbf{B}}",
C: "{\\mathbf{C}}",
D: "{\\mathbf{D}}",
H: "{\\mathbf{H}}",
K: "{\\mathbf{K}}",
L: "{\\mathbf{L}}",
M: "{\\mathbf{M}}",
N: "{\\mathbf{N}}",
R: "{\\mathbf{R}}",
IR: "{\\mathrm{I\\!R}}",
S: "{\\mathbf{S}}",
I: "{\\mathbf{I}}",
J: "{\\mathbf{J}}",
X: "{\\mathbf{X}}",
Y: "{\\mathbf{Y}}",
Z: "{\\mathbf{Z}}",
U: "{\\mathbf{U}}",
V: "{\\mathbf{V}}",
W: "{\\mathbf{W}}",

y: "{\\mathbf{y}}",
f: "{\\mathbf{f}}",
x: "{\\mathbf{x}}",
z: "{\\mathbf{z}}",
u: "{\\mathbf{u}}",
v: "{\\mathbf{v}}",
b: "{\\mathbf{b}}",
p: "{\\mathbf{p}}",
k: "{\\mathbf{k}}",

bbZ: "{\\mathbb{Z}}",
bbE: "{\\mathbb{E}}",
bbR: "{\\mathbb{R}}",

calC: "{{\\cal{C}}}",
calS: "{{\\cal{S}}}",
calI: "{{\\cal{I}}}",
cC: "{{\\cal{C}}}",
cD: "{{\\cal{D}}}",
cS: "{\\cal{S}}",
cF: "{\\cal{F}}",
cM: "{\\cal{M}}",
cK: "{\\cal{K}}",

LOG: "{\\mathrm{log}}",
log: "{\\mathrm{log}}",
EPE: "{\\mathrm{EPE}}",
MSE: "{\\mathrm{MSE}}",
E: "{\\mathrm{E}}",
1: "{\\boldsymbol 1}",
0: "{\\boldsymbol 0}",
Cov: "{\\mathrm{Cov}}",
cov: "{\\mathrm{cov}}",
CV: "{\\mathrm{CV}}",
Var: "{\\mathrm{Var}}",
Bias: "{\\mathrm{Bias}}",
bias: "{\\mathrm{bias}}",
se: "{\\mathrm{se}}",
det: "{\\mathrm{det}\\;}",
cosh: "{\\mathrm{cosh}\\;}",
tanh: "{\\mathrm{tanh}}",
arg: "{\\mathrm{arg}\\;}",
RSS: "{\\mathrm{RSS}}",
PRSS: "{\\mathrm{PRSS}}",
argmin: "{\\mathrm{argmin}}",
Ave: "{\\mathrm{Ave}}",
median: "{\\mathrm{median}}",
card: "{\\mathrm{card}}",
Dev: "{\\mathrm{Dev}}",
null: "{\\mathrm{null}}",

inf: "{\\mathrm{inf}}",
sign: "{\\mathrm{sign}}",
df: "{\\mathrm{df}}",
tr: "{\\mathrm{tr}}",
Err: "{\\mathrm{Err}}",
err: "{\\mathrm{err}}",
logit: "{\\mathrm{logit}}",
loglik: "{\\mathrm{loglik}}",
probit: "{\\mathrm{probit}}",
trace: "{\\mathrm{trace}}",
diag: "{\\mathrm{diag}}",
st: "{\\mathrm{subject\\; to}\\;}",
pr: "{\\mathrm{pr}}",
Pr: "{\\mathrm{Pr}}",

pa: "{\\mathrm{pa}}",

bsigma: "{\\boldsymbol\\Sigma}",
bomega: "{\\boldsymbol\\Omega}",
balpha: "{\\boldsymbol\\alpha}",
bbeta: "{\\boldsymbol\\beta}",
bmu: "{\\boldsymbol\\mu}",


def: "{\\;\\overset{\\mathrm{def}}{=}\\;}",

ind: "{\\perp \\!\\!\\! \\perp}"
},
extensions: ["color.js"],
equationNumbers: { autoNumber: "AMS" }
}
};
15 changes: 15 additions & 0 deletions rmds/rmds.Rproj
@@ -0,0 +1,15 @@
Version: 1.0

RestoreWorkspace: Default
SaveWorkspace: Default
AlwaysSaveHistory: Default

EnableCodeIndexing: Yes
UseSpacesForTab: Yes
NumSpacesForTab: 2
Encoding: UTF-8

RnwWeave: Sweave
LaTeX: XeLaTeX

BuildType: Website
51 changes: 51 additions & 0 deletions rmds/style.css
@@ -0,0 +1,51 @@
body {
font-family: "Palatino Linotype", "Book Antiqua", Palatino, 'EB Garamond', serif;
font-style: normal;
font-weight: 400;
font-size: 1.5rem;
color: #444;
}

h1, h2, h3, h4, h5, h6, .h1, .h2, .h3, .h4, .h5, .h6 {
font-family: "Palatino Linotype", "Book Antiqua", Palatino, 'EB Garamond', serif;
}

h1 {
font-size: 2.0em;
font-weight: 400;
color: #aaa;
}

h2 {
font-size: 1.6em;
color: #444;
margin-top: 1em;
margin-bottom: 0.2em;
font-weight: bold;
}

h3 {
font-size: 1.2em;
color: #444;
margin-top: 1.2em;
margin-bottom: 0.2em;
font-weight: 500;
}

h4 {
font-weight: 300;
color: #aaa;
margin-top: 0.5em;
font-style: italic;
font-weight: 400;
}

h5, h6 {
font-size: 0.8em;
color: rgb(80, 70, 70);
margin-top: 0.5em;
margin-bottom: 0.5em;
font-weight: 100;
}
