minor conflict (extra space)
Merge remote-tracking branch 'origin/statinf' into statinf

Conflicts:
	Regression_Models/Least_Squares_Estimation/initLesson.R
WilCrofter committed Nov 3, 2014
2 parents c6fbe18 + 56c4d77 commit a41091f
Showing 93 changed files with 6,865 additions and 213 deletions.
@@ -40,13 +40,13 @@
Hint: Use library(dplyr) to load the dplyr package.

- Class: cmd_question
-Output: It's important that you have dplyr version 0.2 or later. To confirm this, type packageVersion("dplyr").
+Output: It's important that you have dplyr version 0.3 or later. To confirm this, type packageVersion("dplyr").
CorrectAnswer: packageVersion("dplyr")
AnswerTests: omnitest(correctExpr='packageVersion("dplyr")')
Hint: Check what version of dplyr you have with packageVersion("dplyr").

- Class: text
-Output: If your dplyr version is not at least 0.2, then you should hit the Esc key now, reinstall dplyr, then resume this lesson where you left off.
+Output: If your dplyr version is not at least 0.3, then you should hit the Esc key now, reinstall dplyr, then resume this lesson where you left off.

- Class: cmd_question
Output: "The first step of working with data in dplyr is to load the data into what the package authors call a 'data frame tbl' or 'tbl_df'. Use the following code to create a new tbl_df called cran: \n\ncran <- tbl_df(mydf)."
@@ -76,10 +76,13 @@
Output: 'According to the "Introduction to dplyr" vignette written by the package authors, "The dplyr philosophy is to have small functions that each do one thing well." Specifically, dplyr supplies five ''verbs'' that cover all fundamental data manipulation tasks: select(), filter(), arrange(), mutate(), and summarize().'

- Class: cmd_question
-Output: Use ?manip to pull up the documentation for these core functions.
-CorrectAnswer: ?manip
-AnswerTests: omnitest(correctExpr='?manip')
-Hint: ?manip will display the documentation for dplyr's five core data manipulation functions.
+Output: Use ?select to pull up the documentation for the first of these core functions.
+CorrectAnswer: ?select
+AnswerTests: omnitest(correctExpr='?select')
+Hint: ?select will display the documentation for dplyr's select() function.
+
+- Class: text
+Output: Help files for the other functions are accessible in the same way.

- Class: cmd_question
Output: As may often be the case, particularly with larger datasets, we are only interested in some of the variables. Use select(cran, ip_id, package, country) to select only the ip_id, package, and country variables from the cran dataset.
348 changes: 348 additions & 0 deletions R_Programming/Logic/lesson.yaml

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions R_Programming/MANIFEST
@@ -4,6 +4,7 @@ Vectors
Missing_Values
Subsetting_Vectors
Matrices_and_Data_Frames
+Logic
lapply_and_sapply
vapply_and_tapply
Looking_at_Data
2 changes: 1 addition & 1 deletion R_Programming/Sequences_of_Numbers/lesson.yaml
@@ -80,7 +80,7 @@
- Class: cmd_question
Output: Or maybe we don't care what the increment is and we just want a sequence
of 30 numbers between 5 and 10. seq(5, 10, length=30) does the trick. Give it
-shot now and store the result in a new variable called my_seq.
+a shot now and store the result in a new variable called my_seq.
CorrectAnswer: my_seq <- seq(5, 10, length=30)
AnswerTests: omnitest(correctExpr='my_seq <- seq(5, 10, length=30)')
Hint: 'You''re using the same function here, but changing its arguments for different
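The seq() call in this question can be checked in a plain R session. The sketch below uses only base R. It spells the argument as length.out, which is the full name that length = 30 resolves to through partial argument matching; the helper names n_values and step are illustrative only.

```r
# 30 evenly spaced numbers from 5 to 10 (same result as seq(5, 10, length=30)).
my_seq <- seq(5, 10, length.out = 30)

n_values <- length(my_seq)   # number of elements in the sequence
step <- diff(my_seq)         # differences between neighbouring elements

print(n_values)
print(range(my_seq))
print(unique(round(step, 10)))  # a single value: the increment R chose
```

Because the endpoints and the length are fixed, R works out the increment itself; here it is (10 - 5) / 29.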
2 changes: 1 addition & 1 deletion R_Programming/Simulation/lesson.yaml
@@ -94,7 +94,7 @@
Hint: Call rbinom() with n = 1, size = 100, and prob = 0.7.

- Class: cmd_question
-Output: Equivilently, if we want to see all of the 0s and 1s, we can request 100 observations, each of size 1, with success probability of 0.7. Give it a try, assigning the result to a new variable called flips2.
+Output: Equivalently, if we want to see all of the 0s and 1s, we can request 100 observations, each of size 1, with success probability of 0.7. Give it a try, assigning the result to a new variable called flips2.
CorrectAnswer: flips2 <- rbinom(100, size = 1, prob = 0.7)
AnswerTests: match_call('flips2 <- rbinom(100, size = 1, prob = 0.7)')
Hint: Call rbinom() with n = 100, size = 1, and prob = 0.7 and assign the result to flips2.
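The two rbinom() calls in this part of the lesson describe the same experiment in two ways. The sketch below, with an arbitrary seed chosen only for reproducibility, shows the difference in the shape of the result. The variable name flips for the first call is an assumption, since the diff only names flips2.

```r
set.seed(42)  # arbitrary seed, only so the run is repeatable

# One observation: the number of successes in 100 trials.
flips <- rbinom(1, size = 100, prob = 0.7)

# 100 observations: each is a single 0/1 trial.
flips2 <- rbinom(100, size = 1, prob = 0.7)

print(flips)        # a single count between 0 and 100
print(flips2)       # a vector of 100 zeros and ones
print(sum(flips2))  # summing the 0/1 outcomes gives a count comparable to flips
```

The two draws are independent, so flips and sum(flips2) will generally differ, but both follow the same binomial distribution.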
2 changes: 1 addition & 1 deletion R_Programming/vapply_and_tapply/lesson.yaml
@@ -97,7 +97,7 @@
Hint: You can see a summary of populations for countries with and without the color red on their flag with tapply(flags$population, flags$red, summary).

- Class: mult_question
-Output: What is the median population (in millions) for counties *without* the color red on their flag?
+Output: What is the median population (in millions) for countries *without* the color red on their flag?
AnswerChoices: 9.0; 4.0; 27.6; 3.0; 22.1; 0.0
CorrectAnswer: 3.0
AnswerTests: omnitest(correctVal= '3.0')
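The tapply() pattern behind this question can be tried on a small made-up data frame. The toy_flags data below is invented for illustration and is not the flags dataset used by the lesson, so its numbers say nothing about the quiz answer.

```r
# Hypothetical data: population in millions and a 0/1 indicator for red.
toy_flags <- data.frame(
  population = c(2, 4, 6, 10, 20, 40),
  red        = c(0, 0, 0, 1, 1, 1)
)

# Split population by the red indicator and take the median of each group.
medians <- tapply(toy_flags$population, toy_flags$red, median)
print(medians)

# The group without red is the element named "0".
print(medians[["0"]])
```

Passing summary instead of median, as the hint does, returns the full five-number summary plus the mean for each group.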
348 changes: 348 additions & 0 deletions R_Programming_Alt/Logic/lesson.yaml

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions R_Programming_Alt/MANIFEST
@@ -4,6 +4,7 @@ Vectors
Missing_Values
Subsetting_Vectors
Matrices_and_Data_Frames
+Logic
lapply_and_sapply
vapply_and_tapply
Looking_at_Data
2 changes: 1 addition & 1 deletion R_Programming_Alt/Simulation/lesson.yaml
@@ -94,7 +94,7 @@
Hint: Call rbinom() with n = 1, size = 100, and prob = 0.7.

- Class: cmd_question
-Output: Equivilently, if we want to see all of the 0s and 1s, we can request 100 observations, each of size 1, with success probability of 0.7. Give it a try, assigning the result to a new variable called flips2.
+Output: Equivalently, if we want to see all of the 0s and 1s, we can request 100 observations, each of size 1, with success probability of 0.7. Give it a try, assigning the result to a new variable called flips2.
CorrectAnswer: flips2 <- rbinom(100, size = 1, prob = 0.7)
AnswerTests: match_call('flips2 <- rbinom(100, size = 1, prob = 0.7)')
Hint: Call rbinom() with n = 100, size = 1, and prob = 0.7 and assign the result to flips2.
2 changes: 1 addition & 1 deletion R_Programming_Alt/vapply_and_tapply/lesson.yaml
@@ -97,7 +97,7 @@
Hint: You can see a summary of populations for countries with and without the color red on their flag with tapply(flags$population, flags$red, summary).

- Class: mult_question
-Output: What is the median population (in millions) for counties *without* the color red on their flag?
+Output: What is the median population (in millions) for countries *without* the color red on their flag?
AnswerChoices: 9.0; 4.0; 27.6; 3.0; 22.1; 0.0
CorrectAnswer: 3.0
AnswerTests: omnitest(correctVal= '3.0')
4 changes: 2 additions & 2 deletions Regression_Models/Count_Outcomes/lesson.yaml
@@ -34,7 +34,7 @@
FigureType: new

- Class: figure
-Output: "In a Poisson regression, the log of lambda is assumed to be a linear function of the predictors. Since we will try to model the growth of visits to a web site, the log of lambda will be a linear function the date: log(lambda) = b0 + b1*date. This implies that the average number of hits per day, lambda, is exponential in the date: lambda = exp(b0)*exp(b1)^date. Exponential growth is also suggested by the smooth, black curve drawn through the data. Thus exp(b1) would represent the percentage by which visits grow per day."
+Output: "In a Poisson regression, the log of lambda is assumed to be a linear function of the predictors. Since we will try to model the growth of visits to a web site, the log of lambda will be a linear function of the date: log(lambda) = b0 + b1*date. This implies that the average number of hits per day, lambda, is exponential in the date: lambda = exp(b0)*exp(b1)^date. Exponential growth is also suggested by the smooth, black curve drawn through the data. Thus exp(b1) would represent the percentage by which visits grow per day."
Figure: hits.R
FigureType: new

@@ -103,7 +103,7 @@
FigureType: new

- Class: cmd_question
-Output: "In the figure, the maximum number of visits occurred in late 2012. Visits from the Simply Statistics blog were also at their maximum that day. To find the exact date we can use which.max(hits[,'visits']. Do this now."
+Output: "In the figure, the maximum number of visits occurred in late 2012. Visits from the Simply Statistics blog were also at their maximum that day. To find the exact date we can use which.max(hits[,'visits']). Do this now."
CorrectAnswer: which.max(hits[,'visits'])
AnswerTests: omnitest("which.max(hits[,'visits'])", 704)
Hint: Type which.max(hits[,'visits']) or something equivalent.
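The relation log(lambda) = b0 + b1*date from this lesson can be illustrated on simulated counts. The sketch below does not use the lesson's hits data: the variables day and visits and the true values of b0 and b1 are assumptions made up for the example. It fits a Poisson regression with glm() and reads the growth off the slope. Strictly, exp(b1) is the multiplicative factor per day, and 100 * (exp(b1) - 1) is the corresponding percentage growth.

```r
set.seed(1)  # arbitrary seed for a repeatable simulation

# Simulated daily counts whose mean grows exponentially with the day index.
day <- 0:199
b0 <- 1      # assumed true intercept on the log scale
b1 <- 0.01   # assumed true slope on the log scale
visits <- rpois(length(day), lambda = exp(b0 + b1 * day))

# Poisson regression: the log of the mean count is linear in day.
fit <- glm(visits ~ day, family = poisson)

b1_hat <- coef(fit)[["day"]]
growth_factor <- exp(b1_hat)            # multiplicative change per day
growth_pct <- 100 * (growth_factor - 1) # percentage change per day

print(b1_hat)
print(growth_factor)
print(growth_pct)
```

With a slope near 0.01 the fitted factor is close to 1.01, which corresponds to roughly one percent growth per day.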
11 changes: 11 additions & 0 deletions Statistical_Inference/Asymptotics/ACComp.R
@@ -0,0 +1,11 @@
ACCompar <- function(n){
  num <- 1:n
  den <- n
  nn <- num+2
  nd <- den+4
  nf <- nn/nd
  of <- num/den
  scor <- nf<of
  print(scor)
  sum(scor)
}
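What ACCompar() counts can be reproduced without the helper. For each success count x from 1 to n, the sketch compares the adjusted proportion (x + 2) / (n + 4) with the raw proportion x / n. The names x, adjusted and raw are illustrative and do not appear in the committed file.

```r
n <- 20
x <- 1:n

adjusted <- (x + 2) / (n + 4)  # add two successes and two failures
raw <- x / n                   # plain sample proportion

smaller <- adjusted < raw      # TRUE where the adjustment lowers the estimate
print(which(smaller))
print(sum(smaller))            # 10 for n = 20
```

Algebraically, (x + 2) / (n + 4) < x / n exactly when x > n / 2, so the adjustment pulls every estimate toward one half: it lowers proportions above 0.5 and raises those below.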
9 changes: 9 additions & 0 deletions Statistical_Inference/Asymptotics/ACDemo.R
@@ -0,0 +1,9 @@
n <- 20; pvals <- seq(.1, .9, by = .05); nosim <- 1000
coverage <- sapply(pvals, function(p){
  phats <- (rbinom(nosim, prob = p, size = n) + 2) / (n + 4)
  ll <- phats - qnorm(.975) * sqrt(phats * (1 - phats) / n)
  ul <- phats + qnorm(.975) * sqrt(phats * (1 - phats) / n)
  mean(ll < p & ul > p)
})
g <- ggplot(data.frame(pvals, coverage), aes(x = pvals, y = coverage)) + geom_line(size = 2) + geom_hline(yintercept = 0.95) + ylim(.75, 1.0)
print(g)
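The effect of the adjustment in this script can be seen without ggplot2 by comparing coverage at a single value of p. The sketch below is a simplified, base-R variant rather than the committed script: it fixes p = 0.1, uses more simulations, and follows the script in keeping n (not n + 4) in the standard error. The helper covers() is hypothetical.

```r
set.seed(2014)  # arbitrary seed for repeatability
n <- 20; p <- 0.1; nosim <- 10000

x <- rbinom(nosim, size = n, prob = p)  # simulated success counts

# Fraction of simulated intervals phat +/- z * se that contain p.
covers <- function(phat, n, p) {
  half <- qnorm(0.975) * sqrt(phat * (1 - phat) / n)
  mean(phat - half < p & phat + half > p)
}

wald_cov <- covers(x / n, n, p)              # plain Wald interval
ac_cov <- covers((x + 2) / (n + 4), n, p)    # add-two-successes-and-failures version

print(wald_cov)
print(ac_cov)
```

At this small n and extreme p, the plain Wald interval collapses to a single point whenever x is 0, which is the main reason its coverage falls well short of the nominal 95% while the adjusted interval does not.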
10 changes: 10 additions & 0 deletions Statistical_Inference/Asymptotics/PoisDemo.R
@@ -0,0 +1,10 @@
lambdavals <- seq(0.005, 0.10, by = .01); nosim <- 1000
t <- 100
coverage <- sapply(lambdavals, function(lambda){
  lhats <- rpois(nosim, lambda = lambda * t) / t
  ll <- lhats - qnorm(.975) * sqrt(lhats / t)
  ul <- lhats + qnorm(.975) * sqrt(lhats / t)
  mean(ll < lambda & ul > lambda)
})
g <- ggplot(data.frame(lambdavals, coverage), aes(x = lambdavals, y = coverage)) + geom_line(size = 2) + geom_hline(yintercept = 0.95)+ylim(0, 1.0)
print(g)
10 changes: 10 additions & 0 deletions Statistical_Inference/Asymptotics/PoisDemoImpr.R
@@ -0,0 +1,10 @@
lambdavals <- seq(0.005, 0.10, by = .01); nosim <- 1000
t <- 1000
coverage <- sapply(lambdavals, function(lambda){
  lhats <- rpois(nosim, lambda = lambda * t) / t
  ll <- lhats - qnorm(.975) * sqrt(lhats / t)
  ul <- lhats + qnorm(.975) * sqrt(lhats / t)
  mean(ll < lambda & ul > lambda)
})
g <- ggplot(data.frame(lambdavals, coverage), aes(x = lambdavals, y = coverage)) + geom_line(size = 2) + geom_hline(yintercept = 0.95)+ylim(0, 1.0)
print(g)
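PoisDemo.R and PoisDemoImpr.R differ only in the monitoring time t (100 versus 1000). The sketch below isolates that difference at the smallest rate in the grid, lambda = 0.005, using base R only. The function pois_coverage() and the larger simulation count are assumptions for illustration, not part of the commit.

```r
set.seed(2014)  # arbitrary seed for repeatability

# Coverage of the interval lhat +/- z * sqrt(lhat / t) for a Poisson rate.
pois_coverage <- function(lambda, t, nosim = 10000) {
  lhats <- rpois(nosim, lambda = lambda * t) / t
  half <- qnorm(0.975) * sqrt(lhats / t)
  mean(lhats - half < lambda & lhats + half > lambda)
}

cov_t100 <- pois_coverage(0.005, t = 100)    # expected count is 0.5
cov_t1000 <- pois_coverage(0.005, t = 1000)  # expected count is 5

print(cov_t100)
print(cov_t1000)
```

With t = 100 the expected count is only 0.5, so most simulated counts are zero and the interval degenerates to the single point 0. Increasing t raises the expected count and moves coverage closer to the nominal level, although it can still fall short at small rates.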
14 changes: 14 additions & 0 deletions Statistical_Inference/Asymptotics/WaldDemo.R
@@ -0,0 +1,14 @@
n <- 20
nosim <- 30
mywald <- function(p){
  phats <- rbinom(nosim, prob = p, size = n) / n
  ll <- phats - qnorm(.975) * sqrt(phats * (1 - phats) / n)
  ul <- phats + qnorm(.975) * sqrt(phats * (1 - phats) / n)
  print("Here are the p\' values")
  print(phats)
  print("Here are the lower")
  print(ll)
  print("Here are the upper")
  print(ul)
  mean(ll < p & ul > p)
}
9 changes: 9 additions & 0 deletions Statistical_Inference/Asymptotics/WaldFail.R
@@ -0,0 +1,9 @@
n <- 20; pvals <- seq(.1, .9, by = .05); nosim <- 1000
coverage <- sapply(pvals, function(p){
  phats <- rbinom(nosim, prob = p, size = n) / n
  ll <- phats - qnorm(.975) * sqrt(phats * (1 - phats) / n)
  ul <- phats + qnorm(.975) * sqrt(phats * (1 - phats) / n)
  mean(ll < p & ul > p)
})
g <- ggplot(data.frame(pvals, coverage), aes(x = pvals, y = coverage)) + geom_line(size = 2) + geom_hline(yintercept = 0.95) + ylim(.75, 1.0)
print(g)
9 changes: 9 additions & 0 deletions Statistical_Inference/Asymptotics/WaldPass.R
@@ -0,0 +1,9 @@
n <- 100; pvals <- seq(.1, .9, by = .05); nosim <- 1000
coverage <- sapply(pvals, function(p){
  phats <- rbinom(nosim, prob = p, size = n) / n
  ll <- phats - qnorm(.975) * sqrt(phats * (1 - phats) / n)
  ul <- phats + qnorm(.975) * sqrt(phats * (1 - phats) / n)
  mean(ll < p & ul > p)
})
g <- ggplot(data.frame(pvals, coverage), aes(x = pvals, y = coverage)) + geom_line(size = 2) + geom_hline(yintercept = 0.95) + ylim(.75, 1.0)
print(g)
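WaldFail.R and WaldPass.R are the same simulation run at n = 20 and n = 100. The sketch below is a base-R variant that wraps the shared logic in a function and evaluates it at one value of p, so the effect of the sample size can be read as two numbers. The function wald_coverage(), the choice p = 0.1, and the simulation count are assumptions for illustration.

```r
set.seed(2014)  # arbitrary seed for repeatability

# Simulated coverage of the Wald interval phat +/- z * sqrt(phat * (1 - phat) / n).
wald_coverage <- function(n, p, nosim = 10000) {
  phats <- rbinom(nosim, size = n, prob = p) / n
  half <- qnorm(0.975) * sqrt(phats * (1 - phats) / n)
  mean(phats - half < p & phats + half > p)
}

cov_n20 <- wald_coverage(20, 0.1)    # sample size used in WaldFail.R
cov_n100 <- wald_coverage(100, 0.1)  # sample size used in WaldPass.R

print(cov_n20)
print(cov_n100)
```

Coverage is not strictly monotone in n for every p, because the binomial is discrete, but at p = 0.1 the larger sample brings the interval noticeably closer to the nominal 95%.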
3 changes: 2 additions & 1 deletion Statistical_Inference/Asymptotics/cltDice.R
@@ -11,4 +11,5 @@ dat <- data.frame(
size = factor(rep(c(10, 20, 30), rep(nosim, 3))))
g <- ggplot(dat, aes(x = x, fill = size)) + geom_histogram(alpha = .20, binwidth=.3, colour = "black", aes(y = ..density..))
g <- g + stat_function(fun = dnorm, size = 2)
-g + facet_grid(. ~ size)
+g <- g + facet_grid(. ~ size)
+print(g)
3 changes: 2 additions & 1 deletion Statistical_Inference/Asymptotics/cltFairCoin.R
@@ -11,4 +11,5 @@ dat <- data.frame(
size = factor(rep(c(10, 20, 30), rep(nosim, 3))))
g <- ggplot(dat, aes(x = x, fill = size)) + geom_histogram(binwidth=.3, colour = "black", aes(y = ..density..))
g <- g + stat_function(fun = dnorm, size = 2)
-g + facet_grid(. ~ size)
+g <- g + facet_grid(. ~ size)
+print(g)
3 changes: 2 additions & 1 deletion Statistical_Inference/Asymptotics/cltUnfairCoin.R
@@ -11,4 +11,5 @@ dat <- data.frame(
size = factor(rep(c(10, 20, 30), rep(nosim, 3))))
g <- ggplot(dat, aes(x = x, fill = size)) + geom_histogram(binwidth=.3, colour = "black", aes(y = ..density..))
g <- g + stat_function(fun = dnorm, size = 2)
-g + facet_grid(. ~ size)
+g <- g + facet_grid(. ~ size)
+print(g)
1 change: 1 addition & 0 deletions Statistical_Inference/Asymptotics/dependson.txt
@@ -0,0 +1 @@
ggplot2
1 change: 1 addition & 0 deletions Statistical_Inference/Asymptotics/initLesson.R
@@ -1,3 +1,4 @@
+library(ggplot2)
# Put initialization code in this file.
coinPlot <- function(n){
means <- cumsum(sample(0 : 1, n , replace = TRUE)) / (1 : n)