Commit: hierarchical model
jrnold committed May 8, 2018
1 parent 419b4f8 commit 089a606
Showing 2 changed files with 24 additions and 21 deletions.
27 changes: 13 additions & 14 deletions hierarchical.Rmd
@@ -1,4 +1,12 @@
# Shrinkage and Hierarchical Models

```{r setup,message=FALSE}
library("tidyverse")
library("rstan")
library("loo")
```

## Hierarchical Models

- *Hierarchical models:* often groups of parameters, $\{\theta_1, \dots, \theta_J\}$, are related.
- E.g. countries, states, counties, years, etc. Even the regression coefficients, $\beta_1, \dots, \beta_k$, seen in the [Shrinkage and Regularization] chapter.
@@ -10,15 +18,7 @@
- parameters $(\theta_1, \dots, \theta_J)$ are *exchangeable* if $p(\theta_1, \dots, \theta_J)$ does not depend on the indices.
- i.i.d. models are a special case of exchangeability.
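This structure can be made concrete for grouped binomial data such as the batting data below (a standard formulation; the chapter's Stan models may parameterize it differently, e.g. on the logit scale with a Student-$t$ prior):

$$
\begin{aligned}
y_j &\sim \mathsf{Binomial}(n_j, \theta_j), \qquad j = 1, \dots, J \\
\theta_j &\sim \mathsf{Beta}(\alpha, \beta) \\
(\alpha, \beta) &\sim p(\alpha, \beta)
\end{aligned}
$$

The shared prior makes the $\theta_j$ exchangeable but not independent: data from one group update the hyperparameters $(\alpha, \beta)$, which in turn shift the estimates for every other group.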


## Baseball Hits

@EfronMorris1975a analyzed data from 18 players in the 1970 season.
The goal was to use each player's batting average over their first 45 at-bats to predict their batting average for the remainder of the 1970 season.
@@ -108,7 +108,7 @@ models[["pool"]]
models[["partial"]] <- stan_model("stan/binomial-partial-pooling-t.stan")
```
```{r}
models[["partial"]]
```

Sample from all three models.
@@ -177,8 +177,7 @@ map2_df(names(fits), fits,

To see why this is the case, plot the average errors for each observation in- and out-of-sample.
In-sample error for the no-pooling model is zero, but out of sample it over-estimates (under-estimates) the players with the highest (lowest) batting averages in their first 45 at-bats---this is regression to the mean.
In sample, the partial-pooling model shrinks the estimates toward the mean, reducing error.
Out of sample, the errors of the partially pooled model are not much different than the no-pooling model, except that the extreme observations have lower errors.
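The shrinkage at work here can be sketched by hand (an illustration only, not the chapter's Stan model: the hit counts below are the commonly reported Efron-Morris first-45-at-bat values, and the shrinkage weight `lambda` is fixed rather than estimated from the data):

```r
# Hit counts from each player's first 45 at-bats (Efron-Morris data,
# as commonly reported).
hits <- c(18, 17, 16, 15, 14, 14, 13, 12, 11, 11, 10, 10, 10, 10, 10, 9, 8, 7)
n <- 45
raw <- hits / n                   # no-pooling estimates
mu <- mean(raw)                   # complete-pooling estimate
lambda <- 0.8                     # fixed shrinkage weight (illustrative)
partial <- lambda * mu + (1 - lambda) * raw  # partially pooled estimates
# The partially pooled estimates are compressed toward the grand mean:
diff(range(raw))
diff(range(partial))
```

A real partial-pooling model determines the amount of shrinkage from the data, so groups with fewer observations or a more diffuse likelihood are shrunk more.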
```{r}
select(bball1970,
@@ -211,4 +210,4 @@
- Jim Albert. [Revisiting Efron and Morris’s Baseball Study](https://baseballwithr.wordpress.com/2016/02/15/revisiting-efron-and-morriss-baseball-study/). February 15, 2016.
- Bob Carpenter. [Hierarchical Bayesian Batting Ability, with Multiple Comparisons](https://lingpipe-blog.com/2009/11/04/hierarchicalbayesian-batting-ability-with-multiple-comparisons/). November 4, 2009.
- John Kruschke. [Shrinkage in multi-level hierarchical models](http://doingbayesiandataanalysis.blogspot.com/2012/11/shrinkage-in-multi-level-hierarchical.html). November 27, 2012.
- See @JensenMcShaneWyner2009a for an updated hierarchical model of baseball hitting.
18 changes: 11 additions & 7 deletions shrinkage2.Rmd → shrinkage.Rmd
@@ -3,16 +3,15 @@
In the frequentist framework, shrinkage estimation purposefully increases the bias
of the estimator in order to reduce the variance.

## Bias-Variance Tradeoff

Under repeated sampling, the expected mean squared error of an estimator $\hat{\theta}$ is
$$
MSE = \E\left[\left(\hat{\theta} - \theta \right)^{2}\right] = \underbrace{\E\left[\left(\hat{\theta} - \E[\hat{\theta}]\right)^{2}\right]}_{\text{variance}} + \underbrace{\left(\E[\hat{\theta}] - \theta\right)^{2}}_{\text{bias}^{2}}
$$
The decomposition follows from adding and subtracting $\E[\hat{\theta}]$ inside the square: the cross term vanishes because $\E[\hat{\theta}] - \theta$ is a constant and $\E\left[\hat{\theta} - \E[\hat{\theta}]\right] = 0$.

See the ISLR chapter on the bias-variance tradeoff.
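A quick Monte Carlo (a sketch, not from the chapter) makes the tradeoff concrete: shrinking the sample mean toward zero introduces bias, but when the true $\theta$ is close to zero the variance reduction dominates and the overall MSE falls.

```r
set.seed(123)
theta <- 0.2                      # true value, near the shrinkage target of 0
errs <- replicate(10000, {
  x <- rnorm(10, mean = theta)    # sample of size 10, sd = 1
  c(unbiased = mean(x),           # sample mean: unbiased, variance 1/10
    shrunk   = 0.9 * mean(x))     # shrunk toward 0: biased, lower variance
})
mse <- rowMeans((errs - theta)^2) # empirical MSE of each estimator
mse                               # the shrunk estimator has lower MSE here
```

Had $\theta$ been far from zero, the squared-bias term would dominate and the unbiased estimator would win; shrinkage pays off only when the target is a good guess.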

<!--
James Stein 1961
## James-Stein Estimator
Suppose that $x$ is distributed
$$
@@ -133,3 +132,8 @@ gather(vals, estimator, error) %>%
summarise(mean = mean(error), sd = sd(error))
```
-->

## Bayesian Shrinkage

