forked from szcf-weiya/ESL-CN
1 parent e7affcf, commit d08ea63
Showing 11 changed files with 354 additions and 9 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,4 +13,5 @@ other | |
unimelb_training.csv | ||
CreateGrantData.R | ||
grantData.RData | ||
.vscode/ | ||
.vscode/ | ||
_site/ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
$$ | ||
\newcommand\E{\mathbb{E}} | ||
\newcommand\y{\mathbf{y}} | ||
\newcommand\1{\boldsymbol{1}} | ||
\newcommand\0{\boldsymbol{0}} | ||
\newcommand\bmu{\boldsymbol\mu} | ||
\newcommand\Dev{\mathrm{Dev}} | ||
\newcommand\null{\mathrm{null}} | ||
$$ |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
name: "rmd-collections" | ||
navbar: | ||
  title: "Rmd Gallery" | ||
  left: | ||
    - text: "Home" | ||
      href: index.html | ||
    - text: "About" | ||
      href: https://hohoweiya.xyz | ||
    - text: "ESL CN" | ||
      href: https://esl.hohoweiya.xyz | ||
    - text: "Work Yard" | ||
      href: https://stats.hohoweiya.xyz | ||
    - text: "Tech Note" | ||
      href: https://tech.hohoweiya.xyz | ||
output: | ||
  html_document: | ||
    theme: cosmo | ||
    highlight: textmate | ||
    includes: | ||
      in_header: header.html | ||
      after_body: footer.html | ||
    css: style.css |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<p>Copyright © 2016-2019 weiya</p> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,116 @@ | ||
--- | ||
title: "Overview of `glmnet`" | ||
author: "weiya" | ||
date: "February 26, 2019" | ||
output: html_document | ||
--- | ||
|
||
The full signature of the `glmnet` function is: | ||
|
||
```{r, eval = FALSE} | ||
glmnet(x, y, | ||
  family = c("gaussian", "binomial", "poisson", "multinomial", "cox", "mgaussian"), | ||
  weights, offset = NULL, alpha = 1, nlambda = 100, | ||
  lambda.min.ratio = ifelse(nobs < nvars, 0.01, 0.0001), lambda = NULL, | ||
  standardize = TRUE, intercept = TRUE, thresh = 1e-07, dfmax = nvars + 1, | ||
  pmax = min(dfmax * 2 + 20, nvars), exclude, penalty.factor = rep(1, nvars), | ||
  lower.limits = -Inf, upper.limits = Inf, maxit = 100000, | ||
  type.gaussian = ifelse(nvars < 500, "covariance", "naive"), | ||
  type.logistic = c("Newton", "modified.Newton"), | ||
  standardize.response = FALSE, type.multinomial = c("ungrouped", "grouped")) | ||
``` | ||
|
||
## Family | ||
|
||
- `gaussian` | ||
- `binomial` | ||
- `multinomial` | ||
- `poisson` | ||
- `cox` | ||
|
||
### Deviance Measure | ||
|
||
- $\hat\bmu_\lambda$: the $N$-vector of fitted mean values when the parameter is $\lambda$ | ||
- $\tilde\bmu$: the unrestricted or [**saturated** fit](https://stats.stackexchange.com/questions/283/what-is-a-saturated-model) (having $\hat y_i=y_i$). | ||
|
||
Define | ||
|
||
$$ | ||
\Dev_\lambda \doteq 2[\ell(\y, \tilde \bmu)-\ell(\y,\hat\bmu_\lambda)]\,, | ||
$$ | ||
where $\ell(\y,\bmu)$ is the log-likelihood of the model $\bmu$, a sum of $N$ terms. | ||
|
||
#### Null deviance | ||
|
||
$$ | ||
\Dev_\null = \Dev_\infty\,. | ||
$$ | ||
Typically, $\hat\bmu_\infty=\bar y\1$, or $\hat\bmu_\infty=\0$ in the `cox` family. | ||
|
||
`glmnet` reports the **fraction of deviance explained** | ||
|
||
$$ | ||
D^2_\lambda = \frac{\Dev_\null-\Dev_\lambda}{\Dev_\null}\,. | ||
$$ | ||
|
||
The name $D^2$ is by analogy with $R^2$, the fraction of variance explained in regression. | ||
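As a sanity check on these definitions, the sketch below recomputes $D^2_\lambda$ by hand for the Gaussian family and compares it with the `dev.ratio` component of the fitted object. The simulated data and the names `mu_hat`, `dev_lambda`, `dev_null` and `d2` are illustrative assumptions. For the Gaussian family with unit variance, the saturated fit has zero residuals, so the deviance reduces to the residual sum of squares.

```{r}
library(glmnet)

set.seed(1)
n = 100; p = 5
X = matrix(rnorm(n * p), nrow = n)
y = as.numeric(X[, 1] + X[, 2] + rnorm(n))

fit = glmnet(X, y)                      # gaussian family by default

# Fitted means for every lambda on the path: an n x length(lambda) matrix
mu_hat = predict(fit, newx = X)

# Gaussian deviance is the residual sum of squares (up to a scale factor
# that cancels in the ratio); the null model predicts mean(y) everywhere
dev_lambda = colSums((y - mu_hat)^2)
dev_null = sum((y - mean(y))^2)
d2 = (dev_null - dev_lambda) / dev_null

# Largest discrepancy from what glmnet reports
max(abs(d2 - fit$dev.ratio))
```

If the definitions are applied as above, the reported discrepancy should be negligible, which is also why $D^2$ coincides with the training $R^2$ in the Gaussian case.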
|
||
## Penalties | ||
|
||
For all models, the `glmnet` algorithm admits a range of elastic-net penalties ranging from $\ell_2$ to $\ell_1$. The general form of the penalized optimization problem is | ||
|
||
$$ | ||
\min_{\beta_0,\beta}\Big\{-\frac 1N\ell(\y;\beta_0,\beta)+\lambda\sum_{j=1}^p\gamma_j\{(1-\alpha)\beta_j^2+\alpha\vert \beta_j\vert\}\Big\}\,. | ||
$$ | ||
|
||
- $\lambda$ determines the overall complexity of the model | ||
- the elastic-net parameter $\alpha\in[0,1]$ provides a mix between ridge regression and the lasso | ||
- $\gamma_j\ge 0$ is a penalty modifier. | ||
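To make the roles of $\lambda$, $\alpha$ and $\gamma_j$ concrete, the following sketch evaluates the penalty term exactly as written in the formula above. The helper `enet_penalty` is hypothetical (it is not a `glmnet` function) and the coefficient vector is made up for illustration. Note that the `glmnet` package documentation writes the ridge part with an extra factor of $1/2$; the code here follows the formula in this note.

```{r}
# Hypothetical helper: evaluates
#   lambda * sum_j gamma_j * ((1 - alpha) * beta_j^2 + alpha * |beta_j|)
enet_penalty = function(beta, lambda, alpha, gamma = rep(1, length(beta))) {
  stopifnot(alpha >= 0, alpha <= 1, all(gamma >= 0),
            length(gamma) == length(beta))
  lambda * sum(gamma * ((1 - alpha) * beta^2 + alpha * abs(beta)))
}

beta = c(1, -2, 0)
enet_penalty(beta, lambda = 0.5, alpha = 1)     # lasso: 0.5 * (1 + 2 + 0) = 1.5
enet_penalty(beta, lambda = 0.5, alpha = 0)     # ridge: 0.5 * (1 + 4 + 0) = 2.5
enet_penalty(beta, lambda = 0.5, alpha = 0.5)   # mix:   0.5 * (0.5 * 5 + 0.5 * 3) = 2
# gamma_1 = 0 leaves the first coefficient unpenalized: 0.5 * (0 + 2 + 0) = 1
enet_penalty(beta, lambda = 0.5, alpha = 1, gamma = c(0, 1, 1))
```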
|
||
### Example | ||
|
||
As [kjytay](http://kjytay.github.io/) discusses in the post [A deep dive into glmnet: penalty.factor](https://statisticaloddsandends.wordpress.com/2018/11/13/a-deep-dive-into-glmnet-penalty-factor/), `glmnet` internally rescales the penalty modifiers so that they sum to `nvars`, the number of predictors. | ||
|
||
First generate some data, | ||
|
||
```{r} | ||
n = 100; p = 5; p.true = 2 | ||
set.seed(1234) | ||
X = matrix(rnorm(n * p), nrow = n) | ||
beta = matrix(c(rep(1, p.true), rep(0, p - p.true)), ncol = 1) | ||
y = X %*% beta + 3 * rnorm(n) | ||
``` | ||
|
||
We fit two models: one uses the default options, the other uses `penalty.factor = rep(2, 5)`. | ||
|
||
```{r} | ||
library(glmnet) | ||
fit = glmnet(X, y) | ||
fit2 = glmnet(X, y, penalty.factor = rep(2, 5)) | ||
``` | ||
|
||
The two models have exactly the same `lambda` sequence and produce the same `beta` coefficients: | ||
|
||
```{r} | ||
sum(fit$lambda != fit2$lambda) | ||
sum(fit$beta != fit2$beta) | ||
``` | ||
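This agreement is what one would expect if the penalty factors are rescaled before fitting. The `glmnet` documentation notes that they are internally rescaled to sum to `nvars`. The helper `rescale_pf` below is a hypothetical stand-in that mimics that rescaling in plain R; it is not the package's actual code.

```{r}
# Hypothetical helper (not part of glmnet): rescale penalty factors
# so that they sum to the number of variables
rescale_pf = function(pf) pf * length(pf) / sum(pf)

rescale_pf(rep(2, 5))          # 1 1 1 1 1, the same as the default rep(1, 5)
rescale_pf(c(0, 1, 1, 1, 1))   # 0.00 1.25 1.25 1.25 1.25
```

Under this assumption, only the relative sizes of the penalty factors matter, so `rep(2, 5)` and the default `rep(1, 5)` describe the same problem.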
|
||
## Offset | ||
|
||
All the models allow for an **offset** term. This is a real-valued number $o_i$ for each observation that gets added to the linear predictor and is not associated with any parameter: | ||
|
||
$$ | ||
\eta(x_i) = o_i + \beta_0 + \beta^Tx_i\,. | ||
$$ | ||
|
||
For Poisson models the offset allows us to model rates rather than mean counts, if the observation period differs for each observation. Suppose we observe a count $Y$ over a period $t$; then $\E[Y\mid T=t,X=x]=t\mu(x)$, where $\mu(x)$ is the rate per unit time. Using the log link, $\log\E[Y\mid T=t,X=x]=\log t+\log\mu(x)$, so we would supply $o_i=\log t_i$ for each observation. | ||
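A minimal sketch of this use of the offset, on simulated data, might look as follows. The exposure times `t_obs`, the true rate and the choice `s = 0.01` are all assumptions made for illustration. Because the model was fitted with an offset, `predict` needs a `newoffset` as well.

```{r}
library(glmnet)

set.seed(42)
n = 200
X = matrix(rnorm(n * 3), nrow = n)
t_obs = runif(n, 0.5, 5)                  # observation period for each unit
rate = exp(0.3 + 0.5 * X[, 1])            # true rate per unit time (assumed)
y = rpois(n, lambda = t_obs * rate)       # counts scale with exposure

# Supply o_i = log(t_i) so that the linear predictor models the log rate
fit = glmnet(X, y, family = "poisson", offset = log(t_obs))

# Fitted mean counts at one value of lambda; the offset must be given again
mu_hat = predict(fit, newx = X, newoffset = log(t_obs),
                 type = "response", s = 0.01)

# Dividing by the exposure gives the fitted rate per unit time
rate_hat = mu_hat / t_obs
head(cbind(true = rate, fitted = as.numeric(rate_hat)))
```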
|
||
## Standardize | ||
|
||
Standardizing the features before model fitting is common practice in statistical learning. When the features are on vastly different scales, those with larger scales tend to dominate the action. | ||
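The sketch below illustrates what the default `standardize = TRUE` buys us: changing the unit of one feature should only rescale its coefficient, since `glmnet` returns coefficients on the original scale. The data, the factor of 1000 and the value `s = 0.1` are arbitrary choices for illustration.

```{r}
library(glmnet)

set.seed(1234)
n = 100; p = 5
X = matrix(rnorm(n * p), nrow = n)
y = as.numeric(X[, 1] + X[, 2] + 3 * rnorm(n))

X_big = X
X_big[, 1] = 1000 * X[, 1]        # same feature, expressed in a different unit

fit_a = glmnet(X, y)              # standardize = TRUE by default
fit_b = glmnet(X_big, y)

# Coefficients (intercept first) at the same value of lambda
b_a = as.numeric(coef(fit_a, s = 0.1))
b_b = as.numeric(coef(fit_b, s = 0.1))
cbind(original = b_a, rescaled = b_b)
```

With standardization, the first slope in the `rescaled` column is expected to be the `original` one divided by 1000, while the other entries agree up to numerical tolerance. Repeating the experiment with `standardize = FALSE` would be one way to see how the scale of a feature changes the fit.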
|
||
## References | ||
|
||
Hastie, T., Tibshirani, R., & Wainwright, M. (2015). Statistical Learning with Sparsity: The Lasso and Generalizations. CRC Press. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<script src="mathjax.js"></script> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
--- | ||
title: "Rmd Gallery" | ||
--- | ||
|
||
Here are my rmarkdown notes: | ||
|
||
- [Overview of `glmnet`](glmnet.html) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,112 @@ | ||
window.MathJax = { | ||
showMathMenu: false, | ||
tex2jax: { | ||
inlineMath: [['$','$'], ['\\(','\\)']], | ||
processEscapes:true | ||
}, | ||
TeX: { | ||
Macros: { | ||
A: "{\\mathbf{A}}", | ||
B: "{\\mathbf{B}}", | ||
C: "{\\mathbf{C}}", | ||
D: "{\\mathbf{D}}", | ||
H: "{\\mathbf{H}}", | ||
K: "{\\mathbf{K}}", | ||
L: "{\\mathbf{L}}", | ||
M: "{\\mathbf{M}}", | ||
N: "{\\mathbf{N}}", | ||
R: "{\\mathbf{R}}", | ||
IR: "{\\mathrm{I\\!R}}", | ||
S: "{\\mathbf{S}}", | ||
I: "{\\mathbf{I}}", | ||
J: "{\\mathbf{J}}", | ||
X: "{\\mathbf{X}}", | ||
Y: "{\\mathbf{Y}}", | ||
Z: "{\\mathbf{Z}}", | ||
U: "{\\mathbf{U}}", | ||
V: "{\\mathbf{V}}", | ||
W: "{\\mathbf{W}}", | ||
|
||
y: "{\\mathbf{y}}", | ||
f: "{\\mathbf{f}}", | ||
x: "{\\mathbf{x}}", | ||
z: "{\\mathbf{z}}", | ||
u: "{\\mathbf{u}}", | ||
v: "{\\mathbf{v}}", | ||
b: "{\\mathbf{b}}", | ||
p: "{\\mathbf{p}}", | ||
k: "{\\mathbf{k}}", | ||
|
||
bbZ: "{\\mathbb{Z}}", | ||
bbE: "{\\mathbb{E}}", | ||
bbR: "{\\mathbb{R}}", | ||
|
||
calC: "{{\\cal{C}}}", | ||
calS: "{{\\cal{S}}}", | ||
calI: "{{\\cal{I}}}", | ||
cC: "{{\\cal{C}}}", | ||
cD: "{{\\cal{D}}}", | ||
cS: "{\\cal{S}}", | ||
cF: "{\\cal{F}}", | ||
cM: "{\\cal{M}}", | ||
cK: "{\\cal{K}}", | ||
|
||
LOG: "{\\mathrm{log}}", | ||
log: "{\\mathrm{log}}", | ||
EPE: "{\\mathrm{EPE}}", | ||
MSE: "{\\mathrm{MSE}}", | ||
E: "{\\mathrm{E}}", | ||
1: "{\\boldsymbol 1}", | ||
0: "{\\boldsymbol 0}", | ||
Cov: "{\\mathrm{Cov}}", | ||
cov: "{\\mathrm{cov}}", | ||
CV: "{\\mathrm{CV}}", | ||
Var: "{\\mathrm{Var}}", | ||
Bias: "{\\mathrm{Bias}}", | ||
bias: "{\\mathrm{bias}}", | ||
se: "{\\mathrm{se}}", | ||
det: "{\\mathrm{det}\\;}", | ||
cosh: "{\\mathrm{cosh}\\;}", | ||
tanh: "{\\mathrm{tanh}}", | ||
arg: "{\\mathrm{arg}\\;}", | ||
RSS: "{\\mathrm{RSS}}", | ||
PRSS: "{\\mathrm{PRSS}}", | ||
argmin: "{\\mathrm{argmin}}", | ||
Ave: "{\\mathrm{Ave}}", | ||
median: "{\\mathrm{median}}", | ||
card: "{\\mathrm{card}}", | ||
Dev: "{\\mathrm{Dev}}", | ||
null: "{\\mathrm{null}}", | ||
|
||
inf: "{\\mathrm{inf}}", | ||
sign: "{\\mathrm{sign}}", | ||
df: "{\\mathrm{df}}", | ||
tr: "{\\mathrm{tr}}", | ||
Err: "{\\mathrm{Err}}", | ||
err: "{\\mathrm{err}}", | ||
logit: "{\\mathrm{logit}}", | ||
loglik: "{\\mathrm{loglik}}", | ||
probit: "{\\mathrm{probit}}", | ||
trace: "{\\mathrm{trace}}", | ||
diag: "{\\mathrm{diag}}", | ||
st: "{\\mathrm{subject\\; to}\\;}", | ||
pr: "{\\mathrm{pr}}", | ||
Pr: "{\\mathrm{Pr}}", | ||
|
||
pa: "{\\mathrm{pa}}", | ||
|
||
bsigma: "{\\boldsymbol\\Sigma}", | ||
bomega: "{\\boldsymbol\\Omega}", | ||
balpha: "{\\boldsymbol\\alpha}", | ||
bbeta: "{\\boldsymbol\\beta}", | ||
bmu: "{\\boldsymbol\\mu}", | ||
|
||
|
||
def: "{\\;\\overset{\\mathrm{def}}{=}\\;}", | ||
|
||
ind: "{\\perp \\!\\!\\! \\perp}" | ||
}, | ||
extensions: ["color.js"], | ||
equationNumbers: { autoNumber: "AMS" } | ||
} | ||
}; |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,15 @@ | ||
Version: 1.0 | ||
|
||
RestoreWorkspace: Default | ||
SaveWorkspace: Default | ||
AlwaysSaveHistory: Default | ||
|
||
EnableCodeIndexing: Yes | ||
UseSpacesForTab: Yes | ||
NumSpacesForTab: 2 | ||
Encoding: UTF-8 | ||
|
||
RnwWeave: Sweave | ||
LaTeX: XeLaTeX | ||
|
||
BuildType: Website |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,51 @@ | ||
body { | ||
font-family: "Palatino Linotype", "Book Antiqua", Palatino, 'EB Garamond', serif; | ||
font-style: normal; | ||
font-weight: 400; | ||
font-size: 1.5rem; | ||
color: #444; | ||
} | ||
|
||
h1, h2, h3, h4, h5, h6, .h1, .h2, .h3, .h4, .h5, .h6 { | ||
font-family: "Palatino Linotype", "Book Antiqua", Palatino, 'EB Garamond', serif; | ||
} | ||
|
||
h1 { | ||
font-size: 2.0em; | ||
font-weight: 400; | ||
color: #aaa; | ||
} | ||
|
||
h2 { | ||
font-size: 1.6em; | ||
color: #444; | ||
margin-top: 1em; | ||
margin-bottom: 0.2em; | ||
font-weight: bold; | ||
} | ||
|
||
h3 { | ||
font-size: 1.2em; | ||
color: #444; | ||
margin-top: 1.2em; | ||
margin-bottom: 0.2em; | ||
font-style: normal; | ||
font-weight: 500; | ||
} | ||
|
||
h4 { | ||
font-weight: 300; | ||
color: #aaa; | ||
margin-top: 0.5em; | ||
font-style: italic; | ||
font-weight: 400; | ||
} | ||
|
||
h5, h6 { | ||
font-size: 0.8em; | ||
color: rgb(80, 70, 70); | ||
margin-top: 0.5em; | ||
margin-bottom: 0.5em; | ||
font-style: normal; | ||
font-weight: 100; | ||
} |