forked from mca91/EconometricsWithR
-
Notifications
You must be signed in to change notification settings - Fork 0
/
index.Rmd
executable file
·176 lines (123 loc) · 18.9 KB
/
index.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
---
title: "Introduction to Econometrics with R"
cover-image: "images/cover.png"
author: "Christoph Hanck, Martin Arnold, Alexander Gerber and Martin Schmelzer"
date: "`r Sys.Date()`"
site: bookdown::bookdown_site
output:
bookdown::gitbook:
config:
toc:
collapse: subsection
scroll_highlight: yes
fontsettings:
theme: white
family: serif
size: 2
split_by: section+number
highlight: tango
includes:
in_header: [header_include.html]
before_body: open_review_block.html
always_allow_html: yes
documentclass: book
bibliography: [book.bib, packages.bib]
biblio-style: apalike
biblatexoptions:
- sortcites
link-citations: yes
github-repo: "mca91/EconometricsWithR"
description: "Beginners with little background in statistics and econometrics often have a hard time understanding the benefits of having programming skills for learning and applying Econometrics. 'Introduction to Econometrics with R' is an interactive companion to the well-received textbook 'Introduction to Econometrics' by James H. Stock and Mark W. Watson (2015). It gives a gentle introduction to the essentials of R programming and guides students in implementing the empirical applications presented throughout the textbook using the newly aquired skills. This is supported by interactive programming exercises generated with DataCamp Light and integration of interactive visualizations of central concepts which are based on the flexible JavaScript library D3.js."
url: 'https\://www.econometrics-with-r.org/'
tags: [Tutorial, Models, Econometrics, Causal Analysis, R Programming, Textbook]
---
# Preface {-}
```{r, child="_setup.Rmd"}
```
```{r, eval=my_output == "html", results='asis', echo=FALSE}
cat('<hr style="background-color:#3C6690;height:2px">')
```
<center><img src='images/cover.png' style = 'width:60%;'></center>
<br> Chair of Econometrics <img style='float: right; margin: 0px 100px 0px 0px; width:35%' src='images/logo_claim_en_rgb.png'/> <br> Department of Business Administration and Economics <br> University of Duisburg-Essen <br> Essen, Germany <br> <a href=\"mailto:[email protected]?subject=Econometrics%20with%20R\">[email protected]</a> <br><br> `r sf <- lubridate::stamp_date('Last updated on Tuesday, September 4, 2018'); sf(Sys.Date())`
<br>
<br>
<iframe src="https://www.facebook.com/plugins/like.php?href=https%3A%2F%2Fwww.facebook.com%2FEconometricsWithR%2F&width=450&layout=standard&action=like&size=small&show_faces=true&share=true&height=80&appId" width="400" height="70" style="border:none;overflow:hidden" scrolling="no" frameborder="0" allowTransparency="true" allow="encrypted-media" align="left"></iframe>
<br>
<br>
<br>
```{r, eval=knitr::opts_knit$get("rmarkdown.pandoc.to") == "html", results='asis', echo=FALSE}
cat('<hr style="background-color:#3C6690;height:2px">')
```
Over the recent years, the statistical programming language R has become an integral part of the curricula of econometrics classes we teach at the University of Duisburg-Essen. We regularly found that a large share of the students, especially in our introductory undergraduate econometrics courses, have not been exposed to any programming language before and thus have difficulties to engage with learning R on their own. With little background in statistics and econometrics, it is natural for beginners to have a hard time understanding the benefits of having R skills for learning and applying econometrics. These particularly include the ability to conduct, document and communicate empirical studies and having the facilities to program simulation studies which is helpful for, e.g., comprehending and validating theorems which usually are not easily grasped by mere brooding over formulas. Being applied economists and econometricians, all of the latter are capabilities we value and wish to share with our students.
Instead of confronting students with pure coding exercises and complementary classic literature like the book by @venables2010, we figured it would be better to provide interactive learning material that blends R code with the contents of the well-received textbook *Introduction to Econometrics* by @stock2015 which serves as a basis for the lecture. This material is gathered in the present book *Introduction to Econometrics with R*, an empirical companion to @stock2015. It is an interactive script in the style of a reproducible research report and enables students not only to learn how results of case studies can be replicated with R but also strengthens their ability in using the newly acquired skills in other empirical applications.
#### Conventions Used in this Book {-}
+ *Italic* text indicates new terms, names, buttons and alike.
+ `r ttcode("Constant width text")` is generally used in paragraphs to refer to `r ttcode("R")` code. This includes commands, variables, functions, data types, databases and file names.
+ <code>Constant width text on gray background</code> indicates `r ttcode("R")` code that can be typed literally by you. It may appear in paragraphs for better distinguishability among executable and non-executable code statements but it will mostly be encountered in shape of large blocks of `r ttcode("R")` code. These blocks are referred to as code chunks.
#### Acknowledgement {-}
We thank the *Stifterverband für die Deutsche Wissenschaft e.V.* and the *Ministry of Science and Research North Rhine-Westphalia* for their financial support. Also, we are grateful to Alexander Blasberg for proofreading and his effort in helping with programming the exercises.
A special thanks goes to Achim Zeileis (University of Innsbruck) and Christian Kleiber (University of Basel) for their advice and constructive criticism. Another thanks goes to Rebecca Arnold from the Münster University of Applied Sciences for several suggestions regarding the website design and for providing us with her nice designs for the book cover, logos and icons. We are also indebted to all past students of our introductory econometrics courses at the University of Duisburg-Essen for their feedback.
```{r, eval=knitr::opts_knit$get("rmarkdown.pandoc.to") == "html", results='asis', echo=FALSE}
cat('<br>
![Creative Commons License](https://mirrors.creativecommons.org/presskit/buttons/88x31/svg/by-nc-sa.eu.svg)
This book is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-nc-sa/4.0/).')
```
# Introduction
```{r, eval=knitr::opts_knit$get("rmarkdown.pandoc.to") == "html", results='asis', echo=FALSE}
cat('<hr style="background-color:#3C6690;height:2px">')
```
The interest in the freely available statistical programming language and software environment `r ttcode("R")` [@R-base] is soaring. By the time we wrote first drafts for this project, more than 11000 add-ons (many of them providing cutting-edge methods) were made available on the Comprehensive `r ttcode("R")` Archive Network ([CRAN](https://cran.r-project.org/)), an extensive network of FTP servers around the world that store identical and up-to-date versions of `r ttcode("R")` code and its documentation. `r ttcode("R")` dominates other (commercial) software for statistical computing in most fields of research in applied statistics. The benefits of it being freely available, open source and having a large and constantly growing community of users that contribute to CRAN render `r ttcode("R")` more and more appealing for empirical economists and econometricians alike.
A striking advantage of using `r ttcode("R")` in econometrics is that it enables students to explicitly document their analysis step-by-step such that it is easy to update and to expand. This allows to re-use code for similar applications with different data. Furthermore, `r ttcode("R")` programs are fully reproducible, which makes it straightforward for others to comprehend and validate results.
Over the recent years, `r ttcode("R")` has thus become an integral part of the curricula of econometrics classes we teach at the University of Duisburg-Essen. In some sense, learning to code is comparable to learning a foreign language and continuous practice is essential for the learning success. Needless to say, presenting bare `r ttcode("R")` code on slides does not encourage the students to engage with hands-on experience on their own. This is why `r ttcode("R")` is crucial. As for accompanying literature, there are some excellent books that deal with `r ttcode("R")` and its applications to econometrics, e.g., @kleiber2008. However, such sources may be somewhat beyond the scope of undergraduate students in economics having little understanding of econometric methods and barely any experience in programming at all. Consequently, we started to compile a collection of reproducible reports for use in class. These reports provide guidance on how to implement selected applications from the textbook *Introduction to Econometrics* [@stock2015] which serves as a basis for the lecture and the accompanying tutorials. This process was facilitated considerably by `r ttcode("knitr")` [@R-knitr] and `r ttcode("R markdown")` [@R-rmarkdown]. In conjunction, both `r ttcode("R")` packages provide powerful functionalities for dynamic report generation which allow to seamlessly combine pure text, LaTeX, `r ttcode("R")` code and its output in a variety of formats, including PDF and HTML. Moreover, writing and distributing reproducible reports for use in academia has been enriched tremendously by the `r ttcode("bookdown")` package [@R-bookdown] which has become our main tool for this project. `r ttcode("bookdown")` builds on top of `r ttcode("R markdown")` and allows to create appealing HTML pages like this one, among other things.
Being inspired by *Using R for Introductory Econometrics* [@heiss2016]^[@heiss2016 builds on the popular *Introductory Econometrics* [@wooldridge2016] and demonstrates how to replicate the applications discussed therein using `r ttcode("R")`.] and with this powerful toolkit at hand we wrote up our own empirical companion to @stock2015. The result, which you started to look at, is *Introduction to Econometrics with R*.
Similarly to the book by @heiss2016, this project is neither a comprehensive econometrics textbook nor is it intended to be a general introduction to `r ttcode("R")`. We feel that Stock and Watson do a great job at explaining the intuition and theory of econometrics, and at any rate better than we could in yet another introductory textbook! *Introduction to Econometrics with R* is best described as an interactive script in the style of a reproducible research report which aims to provide students with a platform-independent e-learning arrangement by seamlessly intertwining theoretical core knowledge and empirical skills in undergraduate econometrics. Of course, the focus is on empirical applications with `r ttcode("R")`. We leave out derivations and proofs wherever we can. Our goal is to enable students not only to learn how results of case studies can be replicated with `r ttcode("R")` but we also intend to strengthen their ability in using the newly acquired skills in other empirical applications --- immediately within *Introduction to Econometrics with R*.
To realize this, each chapter contains interactive `r ttcode("R")` programming exercises. These exercises are used as supplements to code chunks that display how previously discussed techniques can be implemented within `r ttcode("R")`. They are generated using the [DataCamp light widget](https://github.com/datacamp/datacamp-light) and are backed by an `r ttcode("R")` session which is maintained on [DataCamp](https://www.datacamp.com/home)'s servers. You may play around with the example exercise presented below.
<iframe src="DCL/intro_1.html" frameborder="0" scrolling="no" style="width:100%;height:360px"></iframe>
As you can see above, the widget consists of two tabs. `r ttcode("script.R")` mimics an `r ttcode(".R")`-file, a file format that is commonly used for storing `r ttcode("R")` code. Lines starting with a # are commented out, that is, they are not recognized as code. Furthermore, `r ttcode("script.R")` works like an exercise sheet where you may write down the solution you come up with. If you hit the button *Run*, the code will be executed, submission correctness tests are run and you will be notified whether your approach is correct. If it is not correct, you will receive feedback suggesting improvements or hints. The other tab, `r ttcode("R Console")`, is a fully functional `r ttcode("R")` console that can be used for trying out solutions to exercises before submitting them. Of course you may submit (almost any) `r ttcode("R")` code and use the console to play around and explore. Simply type a command and hit the *Enter* key on your keyboard.
As an example, consider the following line of code presented in chunk below. It tells `r ttcode("R")` to compute the number of packages available on `r ttcode("CRAN")`. The code chunk is followed by the output produced.
```{r}
# check the number of R packages available on CRAN
nrow(available.packages(repos = "http://cran.us.r-project.org"))
```
Each code chunk is equipped with a button on the outer right hand side which copies the code to your clipboard. This makes it convenient to work with larger code segments in your version of <tt>R</tt>/*RStudio* or in the widgets presented throughout the book. In the widget above, you may click on `r ttcode("R Console")` and type `nrow(available.packages(repos = "http://cran.us.r-project.org"))` (the command from the code chunk above) and execute it by hitting *Enter* on your keyboard.^[The `r ttcode("R")` session is initialized by clicking into the widget. This might take a few seconds. Just wait for the indicator next to the button *Run* to turn green.]
Note that some lines in the widget are out-commented which ask you to assign a numeric value to a variable and then to print the variable's content to the console. You may enter your solution approach to `r ttcode("script.R")` and hit the button *Run* in order to get the feedback described further above. In case you do not know how to solve this sample exercise (don't panic, that is probably why you are reading this), a click on *Hint* will provide you with some advice. If you still can't find a solution, a click on *Solution* will provide you with another tab, `r ttcode("Solution.R")` which contains sample solution code. It will often be the case that exercises can be solved in many different ways and `r ttcode("Solution.R")` presents what we consider as comprehensible and idiomatic.
## A Very Short Introduction to `r ttcode("R")` and *RStudio*
```{r, fig.align='center', echo=FALSE, fig.cap="RStudio: the four panes"}
knitr::include_graphics('images/rstudio.jpg')
```
#### `r ttcode("R")` Basics {-}
As mentioned before, this book is not intended to be an introduction to `r ttcode("R")` but as a guide on how to use its capabilities for applications commonly encountered in undergraduate econometrics. Those having basic knowledge in `r ttcode("R")` programming will feel comfortable starting with Chapter \@ref(pt). This section, however, is meant for those who have not worked with `r ttcode("R")` or *RStudio* before. If you at least know how to create objects and call functions, you can skip it. If you would like to refresh your skills or get a feeling for how to work with *RStudio*, keep reading.
First of all start *RStudio* and open a new `r ttcode("R")` script by selecting *File*, *New File*, *R Script*. In the editor pane, type
```{r, eval = F}
1 + 1
```
and click on the button labeled *Run* in the top right corner of the editor. By doing so, your line of code is sent to the console and the result of this operation should be displayed right underneath it. As you can see, `r ttcode("R")` works just like a calculator. You can do all arithmetic calculations by using the corresponding operator (`r ttcode("+")`, `r ttcode("-")`, `r ttcode("*")`, `r ttcode("/")` or `r ttcode("^")`). If you are not sure what the last operator does, try it out and check the results.
#### Vectors {-}
`r ttcode("R")` is of course more sophisticated than that. We can work with variables or, more generally, objects. Objects are defined by using the assignment operator `r ttcode("<-")`. To create a variable named `r ttcode("x")` which contains the value `r ttcode("10")` type `x <- 10` and click the button *Run* yet again. The new variable should have appeared in the environment pane on the top right. The console however did not show any results, because our line of code did not contain any call that creates output. When you now type `x` in the console and hit return, you ask `r ttcode("R")` to show you the value of `r ttcode("x")` and the corresponding value should be printed in the console.
`r ttcode("x")` is a scalar, a vector of length $1$. You can easily create longer vectors by using the function `r ttcode("c()")` (*c* for "concatenate" or "combine"). To create a vector `r ttcode("y")` containing the numbers $1$ to $5$ and print it, do the following.
```{r, eval = T}
y <- c(1, 2, 3, 4, 5)
y
```
You can also create a vector of letters or words. For now just remember that characters have to be surrounded by quotes, else they will be parsed as object names.
```{r, eval = F}
hello <- c("Hello", "World")
```
Here we have created a vector of length 2 containing the words `r ttcode("Hello")` and `r ttcode("World")`.
Do not forget to save your script! To do so, select *File*, *Save*.
#### Functions {-}
You have seen the function `r ttcode("c()")` that can be used to combine objects. In general, all function calls look the same: a function name is always followed by round parentheses. Sometimes, the parentheses include arguments.
Here are two simple examples.
```{r, eval = T}
# generate the vector `z`
z <- seq(from = 1, to = 5, by = 1)
# compute the mean of the enries in `z`
mean(x = z)
```
In the first line we use a function called `r ttcode("seq()")` to create the exact same vector as we did in the previous section, calling it `r ttcode("z")`. The function takes on the arguments `r ttcode("from")`, `r ttcode("to")` and `r ttcode("by")` which should be self-explanatory.
The function `r ttcode("mean()")` computes the arithmetic mean of its argument `r ttcode("x")`. Since we pass the vector `r ttcode("z")` as the argument `r ttcode("x")`, the result is `r ttcode("3")`!
If you are not sure which arguments a function expects, you may consult the function's documentation. Let's say we are not sure how the arguments required for `r ttcode("seq()")` work. We then type `?seq` in the console. By hitting return the documentation page for that function pops up in the lower right pane of *RStudio*. In there, the section *Arguments* holds the information we seek. On the bottom of almost every help page you find examples on how to use the corresponding functions. This is very helpful for beginners and we recommend to look out for those.
Of course, all of the commands presented above also work in interactive widgets throughout the book. You may try them below.
```{r, echo=FALSE, results='asis'}
write_html(playground = T)
```