This page contains a collection of R tutorials, developed at the Vrije Universiteit Amsterdam for Communication Science courses that use R.
The goal is to organize the material into modular components that can be designed and maintained efficiently, reused across courses, and accessed by students during and after their studies.
Below we list the relevant handouts/tutorials. Each links to the md file; see the Rmd file with the same name for the source code of the tutorial.
This is a set of tutorials designed to teach the tidyverse functions for data cleaning, reshaping, visualizing, etc. The chapter numbers refer to the relevant chapters of the excellent (and freely available) book "R for Data Science" (R4DS).
Handout | Video Tutorial | R4DS ch. | Core packages / functions |
---|---|---|---|
Fun with R | Fun with R | 3 | tidyverse, ggplot2, igraph |
R Basics | Intro to R | 4 | (base R functions) |
Transforming Data | Importing and Cleaning | 5 | dplyr: filter, select, arrange, mutate |
Summarizing Data | Grouping and Summarizing | 5 | dplyr: group_by, summarize |
Visualizing Data | ggplot 1 | 7 | ggplot2 |
Reshaping data | Reshaping | 12 | tidyr: spread, gather |
Combining (merging) Data | Joining | 13 | dplyr: inner_join, left_join, etc. |
Basic string (text) handling | | 14 | stringr: str_detect, str_extract etc., iconv |
This is a set of tutorials designed to teach basic statistical modeling and analysis. The first tutorial includes examples of standard regression analysis as well as analysis of variance. Later tutorials illustrate more advanced statistical modeling approaches, including the generalized linear model and multilevel models.
Tutorial | Video tutorial | Core packages / functions |
---|---|---|
Basic statistics | Basic stats | stats: lm, aov, t.test |
Advanced statistics overview | see GLM and Multilevel | stats: glm, lme4: lmer, glmer |
Generalized linear models | GLM (on family argument) | stats: glm, family, sjPlot: tab_model, plot_model |
Multilevel Models | Multilevel | lme4: lmer, glmer, sjPlot: tab_model, plot_model |
The following tutorials can be used to teach the basics of test theory, in particular confirmatory and exploratory factor analysis.
Tutorial | Video tutorial | Core packages / functions |
---|---|---|
Test Theory and Confirmatory Factor Analysis | CFA in R | psych: describe, mardia; lavaan: cfa, fitMeasures, modindices; semTools: reliability |
Exploratory Factor Analysis | | psych: describe, mardia, fa.parallel, nfactors, fa, fa.diagram, omega |
For a general introduction to text analysis (in R), see these videos on preprocessing and different analysis approaches
Tutorial | Video tutorial | Core packages / functions |
---|---|---|
Text analysis | corpus stats | quanteda |
Lexical sentiment analysis | dictionaries | quanteda, corpustools |
LDA Topic Modeling | Video series, Tutorial demo | topicmodels, quanteda |
Structural Topic Modeling | Variants of Topic Models; Structural Topic Models | stm, quanteda |
NLP Preprocessing with Spacy(r) | | spacyr, quanteda (see also spacy itself) |
Supervised machine learning for text classification | Supervised Machine Learning | caret |
Creating a topic browser with LDA | | corpustools |
In general, most R packages can be installed without any issues. However, there are some exceptions that you need to know about.
For quanteda (which we use in the text analysis tutorials), your computer needs certain software that is not always installed, as mentioned on the quanteda website.
You can install this software yourself, but we recommend using R version 4.0.0 (or higher) instead, where this is no longer required.
To see your current R version, enter `version` in your R console.
To update, visit the R website (Windows, Mac).
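
The version check above can also be done in code. The following is a minimal sketch using only base R; the wording of the messages is our own, and the 4.0.0 threshold is simply the version recommended above.

```r
# Print the full version string of the running R installation
print(R.version.string)

# getRversion() returns a version object that can be compared to a string
if (getRversion() >= "4.0.0") {
  message("R is 4.0.0 or higher; the extra software for quanteda should not be needed.")
} else {
  message("R is older than 4.0.0; consider updating R before installing quanteda.")
}
```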
When running `install.packages()`, you sometimes get the message that "There is a binary version available but the source version is later" (we've mainly seen this on Mac).
You are then asked whether you want to "install from sources the package which needs compilation (Yes/no)".
To answer this question, you have to type "yes" or "no" in your R console.
Most often, you'll want to say no.
Simply put, R tells you that it has a new version of a package, but if you want to use it your computer will need to build it.
The problem is that this requires some development software that you might not have installed.
If you say no, you'll install an older version that has already been built for you.
In rare cases, installing from source is the only way, in which case you'll have to install the software that R refers to.
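
If you prefer not to answer this question every time, the choice can also be set in code. This is a sketch rather than a required step: it relies on the `install.packages.compile.from.source` option and the `type` argument documented in `?install.packages`, and it uses quanteda only as an example package. The install calls are commented out because they need an internet connection.

```r
# Tell R never to compile from source when an older binary is available.
# This corresponds to answering "no" and applies for the rest of the session.
options(install.packages.compile.from.source = "never")
print(getOption("install.packages.compile.from.source"))

# With the option set, a regular install call should no longer ask the question:
# install.packages("quanteda")

# Alternatively, request the pre-built binary for a single call.
# Assumption: you are on Windows or macOS, where CRAN provides binaries.
# install.packages("quanteda", type = "binary")
```

To go back to being asked each time, set the option to `"interactive"`.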