forked from DataScienceSpecialization/courses
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge commit 'eaed53d96e44a7c16145d1ac9701827607ff819e'
* commit 'eaed53d96e44a7c16145d1ac9701827607ff819e': Added a fourth hw Added hw3
Showing 6 changed files with 1,070 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,207 @@ | ||
--- | ||
title : Homework 3 for Stat Inference | ||
subtitle : Extra problems for Stat Inference | ||
author : Brian Caffo | ||
job : Johns Hopkins Bloomberg School of Public Health | ||
framework : io2012 | ||
highlighter : highlight.js | ||
hitheme : tomorrow | ||
#url: | ||
# lib: ../../librariesNew #Remove new if using old slidify | ||
# assets: ../../assets | ||
widgets : [mathjax, quiz, bootstrap] | ||
mode : selfcontained # {standalone, draft} | ||
--- | ||
```{r setup, cache = F, echo = F, message = F, warning = F, tidy = F, results='hide'} | ||
# make this an external chunk that can be included in any file | ||
library(knitr) | ||
options(width = 100) | ||
opts_chunk$set(message = F, error = F, warning = F, comment = NA, fig.align = 'center', dpi = 100, tidy = F, cache.path = '.cache/', fig.path = 'fig/') | ||
options(xtable.type = 'html') | ||
knit_hooks$set(inline = function(x) { | ||
if(is.numeric(x)) { | ||
round(x, getOption('digits')) | ||
} else { | ||
paste(as.character(x), collapse = ', ') | ||
} | ||
}) | ||
knit_hooks$set(plot = knitr:::hook_plot_html) | ||
``` | ||
|
||
## About these slides | ||
- These are some practice problems for Statistical Inference Quiz 3 | ||
- They were created using slidify interactive, which you will learn about in | ||
Creating Data Products | ||
- Please help improve this with pull requests here | ||
(https://github.com/bcaffo/courses) | ||
|
||
|
||
|
||
--- &multitext | ||
Load the data set `mtcars` from the `datasets` R package. Calculate a | ||
95% confidence interval for the mean of the variable `mpg`, rounded to the nearest MPG. | ||
|
||
1. What is the lower endpoint of the interval? | ||
2. What is the upper endpoint of the interval? | ||
|
||
*** .hint | ||
Do `library(datasets)` and then `data(mtcars)` to get the data. | ||
Consider `t.test` for calculations. The `datasets` package ships | ||
with base R, so no installation should be needed. | ||
|
||
|
||
*** .explanation | ||
```{r} | ||
library(datasets); data(mtcars) | ||
round(t.test(mtcars$mpg)$conf.int) | ||
``` | ||
|
||
<span class="answer">`r round(min(t.test(mtcars$mpg)$conf.int))`</span> | ||
<span class="answer">`r round(max(t.test(mtcars$mpg)$conf.int))`</span> | ||
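To see what `t.test` is doing, the interval can also be built by hand as the sample mean plus or minus a t quantile times the estimated standard error. This is a minimal sketch; the names `manual` and `builtin` are illustrative.

```r
library(datasets)
data(mtcars)

x <- mtcars$mpg
n <- length(x)

# Two-sided 95% interval: mean +/- t quantile * estimated standard error
manual <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * sd(x) / sqrt(n)

# t.test() reports the same interval in its conf.int component
builtin <- as.vector(t.test(x)$conf.int)

round(manual)
all.equal(manual, builtin)
```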
|
||
--- &multitext | ||
Suppose that data of 9 paired differences has a standard deviation of $1$. What value would the average difference have to be for the lower endpoint of a 95% | ||
Student's t confidence interval to touch zero? | ||
|
||
1. Give the number here to two decimal places | ||
|
||
*** .hint | ||
The t interval is $\bar x \pm t_{.975, 8} s / \sqrt{n}$ | ||
|
||
*** .explanation | ||
<span class="answer">`r round(qt(.975, df = 8) * 1 / 3, 2)`</span> | ||
|
||
We want $\bar x = t_{.975, 8} s / \sqrt{n}$ | ||
```{r} | ||
round(qt(.975, df = 8) * 1 / 3, 2) | ||
``` | ||
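As a check on the reasoning, the sketch below assumes the value 1 is the standard deviation of the 9 differences (as the division by $\sqrt{n}$ implies) and uses the two-sided 95% quantile with $n - 1 = 8$ degrees of freedom. The variable names are illustrative only.

```r
n <- 9   # number of paired differences
s <- 1   # assumed here to be the standard deviation of the differences

# Half-width of the two-sided 95% t interval
margin <- qt(0.975, df = n - 1) * s / sqrt(n)

# Setting the mean equal to the half-width puts the lower endpoint at zero
xbar <- margin
lower <- xbar - margin

round(xbar, 2)
lower
```

If the mean difference were any smaller than `xbar`, the interval would extend below zero.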
|
||
|
||
--- &radio | ||
An independent group Student's T interval is used over | ||
a paired T interval when: | ||
|
||
1. The observations are paired between the groups. | ||
2. _The observations within the groups are naturally assumed to be statistically independent_ | ||
3. As long as you do it correctly, either is fine. | ||
4. More details are needed to answer this question | ||
|
||
*** .hint | ||
A paired interval is for paired observations. | ||
|
||
*** .explanation | ||
If the groups are independent, the independent group interval is the correct interval. | ||
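A small illustration of why the choice matters, using the `sleep` data from the `datasets` package (an example chosen here, not part of the original problem). Those measurements are genuinely paired, since the same 10 subjects appear in both groups, so treating the groups as independent ignores the pairing and gives a much wider interval.

```r
library(datasets)
data(sleep)

g1 <- sleep$extra[sleep$group == 1]
g2 <- sleep$extra[sleep$group == 2]

# Paired interval: based on the within-subject differences
paired_ci <- as.vector(t.test(g2, g1, paired = TRUE)$conf.int)

# Independent group interval: ignores the pairing
indep_ci <- as.vector(t.test(g2, g1, var.equal = TRUE)$conf.int)

diff(paired_ci)   # width of the paired interval
diff(indep_ci)    # width of the independent group interval
```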
|
||
|
||
--- &multitext | ||
Consider the `mtcars` dataset. Construct a 95% T interval for MPG comparing | ||
4 to 6 cylinder cars (subtracting in the order of 4 - 6), | ||
assuming a constant variance. | ||
|
||
1. What is the lower endpoint of the interval to 1 decimal place? | ||
2. What is the upper endpoint of the interval to 1 decimal place? | ||
|
||
*** .hint | ||
Use `t.test` with `var.equal=TRUE` | ||
|
||
*** .explanation | ||
|
||
```{r} | ||
m4 <- mtcars$mpg[mtcars$cyl == 4] | ||
m6 <- mtcars$mpg[mtcars$cyl == 6] | ||
#this does 4 - 6 | ||
confint <- as.vector(t.test(m4, m6, var.equal = TRUE)$conf.int) | ||
``` | ||
|
||
<span class="answer">`r round(min(confint), 1)`</span> | ||
<span class="answer">`r round(max(confint), 1)`</span> | ||
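The same interval can be reproduced by hand from the pooled variance, which may help connect `var.equal = TRUE` to the formula. This is a sketch with illustrative variable names.

```r
library(datasets)
data(mtcars)

m4 <- mtcars$mpg[mtcars$cyl == 4]
m6 <- mtcars$mpg[mtcars$cyl == 6]
n4 <- length(m4)
n6 <- length(m6)

# Pooled variance: group variances weighted by their degrees of freedom
sp2 <- ((n4 - 1) * var(m4) + (n6 - 1) * var(m6)) / (n4 + n6 - 2)

# Standard error of the difference in means under a common variance
se <- sqrt(sp2 * (1 / n4 + 1 / n6))

manual <- (mean(m4) - mean(m6)) + c(-1, 1) * qt(0.975, df = n4 + n6 - 2) * se
builtin <- as.vector(t.test(m4, m6, var.equal = TRUE)$conf.int)

round(manual, 1)
all.equal(manual, builtin)
```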
|
||
|
||
--- &radio | ||
If someone put a gun to your head and said "Your confidence interval | ||
must contain what it's estimating or I'll pull the trigger", what would | ||
be the smart thing to do? | ||
|
||
1. _Make your interval as wide as possible_ | ||
2. Make your interval as small as possible | ||
3. Call the authorities | ||
|
||
*** .hint | ||
C'mon. You don't need a hint | ||
|
||
*** .explanation | ||
This is just an example of what happens to confidence intervals as you | ||
increase the confidence level. You want to be quite sure of your interval (i.e. | ||
have a large confidence level), and so you would increase the interval's width. | ||
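The trade-off can be seen numerically by recomputing one interval at several confidence levels. The sketch below reuses `mtcars$mpg`; the particular levels are arbitrary choices for illustration.

```r
library(datasets)
data(mtcars)

conf_levels <- c(0.80, 0.90, 0.95, 0.99, 0.999)

# Width of the t interval for mean mpg at each confidence level
widths <- sapply(conf_levels, function(lv) {
  ci <- t.test(mtcars$mpg, conf.level = lv)$conf.int
  diff(as.vector(ci))
})

data.frame(conf.level = conf_levels, width = round(widths, 2))
```

The widths grow with the confidence level, and an interval with 100% confidence would have to cover the whole real line.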
|
||
--- &radio | ||
|
||
Refer back to comparing MPG for 4 versus 6 cylinders. What do you conclude? | ||
|
||
1. The interval is above zero, suggesting 6 is better than 4 in terms of MPG | ||
2. _The interval is above zero, suggesting 4 is better than 6 in terms of MPG_ | ||
3. The interval does not tell you anything about the hypothesis test; you have to do the test. | ||
4. The interval contains 0 suggesting no difference. | ||
|
||
*** .hint | ||
Refer back to the problem, consider the implications of the interval being | ||
larger than 0, double check the order in which things were subtracted and | ||
make sure the results make sense in the context of the problem. | ||
|
||
*** .explanation | ||
The interval was constructed subtracting 4 - 6 and was entirely above zero. | ||
|
||
--- &multitext | ||
Suppose that 18 obese subjects were randomized, 9 each, to a new diet pill and a placebo. Subjects' body mass indices (BMIs) were measured at a baseline and again after having received the treatment or placebo for four weeks. The average difference from follow-up to the baseline (followup - baseline) was -3 kg/m2 for the treated group and 1 kg/m2 for the placebo group. The corresponding standard deviations of the differences were 1.5 kg/m2 for the treatment group and 1.8 kg/m2 for the placebo group. Does the change in BMI over the four week period appear to differ between the treated and placebo groups? | ||
|
||
1. Calculate the pooled variance estimate to 2 decimal places | ||
|
||
|
||
*** .hint | ||
The sample sizes are equal, so the pooled variance is the average of the | ||
individual variances | ||
|
||
|
||
*** .explanation | ||
```{r} | ||
n1 <- n2 <- 9 | ||
x1 <- -3 ##treated | ||
x2 <- 1 ##placebo | ||
s1 <- 1.5 ##treated | ||
s2 <- 1.8 ##placebo | ||
spsq <- ( (n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2) | ||
``` | ||
<span class="answer">`r round(spsq, 2)`</span> | ||
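The pooled variance is the ingredient needed to address the question posed in the problem. As a hedged sketch, assuming a common variance across the two groups, a 95% interval for the treated minus placebo difference could be computed as follows (`se` and `interval` are names introduced here).

```r
n1 <- n2 <- 9
x1 <- -3    # treated: mean change in BMI
x2 <- 1     # placebo: mean change in BMI
s1 <- 1.5   # treated: standard deviation of the changes
s2 <- 1.8   # placebo: standard deviation of the changes

# Pooled variance; with equal sample sizes this is the plain average
spsq <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)

# 95% interval for treated minus placebo, assuming a common variance
se <- sqrt(spsq * (1 / n1 + 1 / n2))
interval <- (x1 - x2) + c(-1, 1) * qt(0.975, df = n1 + n2 - 2) * se

round(interval, 2)
```

The interval lies entirely below zero, which suggests the change in BMI does differ between the groups under these assumptions.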
|
||
|
||
--- &radio | ||
|
||
For Binomial data the maximum likelihood estimate for the probability of | ||
a success is | ||
|
||
1. _The proportion of successes_ | ||
2. The proportion of failures | ||
3. A shrunken version of the proportion of successes | ||
4. A shrunken version of the proportion of failures | ||
|
||
*** .hint | ||
Look back at the notes about likelihood. | ||
|
||
*** .explanation | ||
The MLE for binomial data is always the proportion of successes. | ||
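This can be checked numerically by maximizing the binomial log-likelihood. The counts below are hypothetical values chosen for illustration, not data from the course.

```r
n <- 20   # hypothetical number of trials
x <- 7    # hypothetical number of successes

# Binomial log-likelihood as a function of the success probability p
loglik <- function(p) dbinom(x, size = n, prob = p, log = TRUE)

# Numerically maximize over the interval (0, 1)
fit <- optimize(loglik, interval = c(0, 1), maximum = TRUE)

fit$maximum   # numerical maximizer
x / n         # sample proportion of successes
```

The two values agree up to the tolerance of `optimize`.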
|
||
--- &radio | ||
|
||
Bayesian inference requires | ||
|
||
1. A type I error rate | ||
2. Setting your confidence level | ||
3. _Assigning a prior probability distribution_ | ||
4. Evaluating frequency error rates | ||
|
||
*** .explanation | ||
All of the other answers discuss frequentist concepts. All Bayesian analyses require setting a prior. | ||
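As a minimal sketch of what assigning a prior looks like in practice, the example below places a Beta prior on a binomial success probability and applies the standard conjugate update. The counts and the uniform Beta(1, 1) prior are assumptions made here for illustration.

```r
n <- 20   # hypothetical number of trials
x <- 7    # hypothetical number of successes

# Prior: Beta(a, b); a = b = 1 is the uniform prior on (0, 1)
a <- 1
b <- 1

# Conjugate update: the posterior is Beta(a + x, b + n - x)
post_a <- a + x
post_b <- b + n - x

post_mean <- post_a / (post_a + post_b)
cred_int <- qbeta(c(0.025, 0.975), post_a, post_b)

post_mean
round(cred_int, 2)
```

Changing `a` and `b` changes the posterior, which is why the prior has to be stated before any Bayesian inference can be made.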
|
||
|