# The Grammar of Graphics
You can think of the grammar of graphics as a systematic approach for describing the components of a graph. It breaks down "classic" plots into individual components that we can recombine in novel ways to make more complex, nuanced, and informative graphics.
The grammar of graphics has seven components (the ones in bold must be specified
explicitly in **ggplot2**):

- **Data**
    - The data that you're feeding into a plot.
- **Aesthetic mappings**
    - How are variables (columns) from your data connected to visual dimensions?
    - Horizontal (x) positioning, vertical (y) positioning, size, color, shape, etc.
    - These visual dimensions are called "aesthetics".
- **Geometric objects**
    - What are the objects that are actually drawn on the plot?
    - A point, a line, a bar, a histogram, a density, etc.
- Scales
    - How is a variable mapped to its aesthetic?
    - Will it be mapped linearly? On a log scale? Something else?
    - This includes things like the color scale
        - e.g., `c("control", "treatment_1", "treatment_2")` -> `c("blue", "green", "red")`
- Statistical transformations
    - Whether and how the data are combined/transformed before being plotted
        - e.g., in a bar chart, data are transformed into their frequencies;
          in a box plot, data are transformed into a five-number summary.
- Coordinate system
    - This is a specification of how the position aesthetics (x and y) are depicted on the plot.
      For example, rectangular/Cartesian, or polar coordinates.
- Facet
    - This is a specification of data variables that partition the data into
      smaller "sub plots", or panels.
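As a rough illustration of how the seven components can line up with **ggplot2** code, the sketch below spells out each one explicitly for a single plot. It is only one possible arrangement: the choice of variables, the `alpha` value, and faceting by `continent` are assumptions made for illustration, and the non-bold components are written out here even though **ggplot2** would otherwise fill in defaults for them.

```{r}
library(ggplot2)
library(gapminder)  # assumed to be installed; provides the `gapminder` data frame

p <- ggplot(data = gapminder) +                  # 1. data
  aes(x = gdpPercap, y = lifeExp) +              # 2. aesthetic mappings
  geom_point(                                    # 3. geometric object
    stat = "identity",                           # 5. statistical transformation (identity = none)
    alpha = 0.3
  ) +
  scale_x_log10() +                              # 4. scale (log10 for x; y keeps the linear default)
  coord_cartesian() +                            # 6. coordinate system
  facet_wrap(vars(continent))                    # 7. facets (one panel per continent)

p
```

Dropping the `stat`, `scale_x_log10()`, `coord_cartesian()`, and `facet_wrap()` pieces still gives a valid plot, since only the bold components have to be supplied.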
## Example: Scatterplot grammar
Consider the following plot of the `gapminder` data set.
For now, don't focus on the code, just the graph itself.
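```{r}
library(ggplot2)    # plotting functions
library(gapminder)  # provides the `gapminder` data frame

ggplot(gapminder) +
  aes(x = gdpPercap, y = lifeExp) +
  geom_point(alpha = 0.1) +
  # Log-transform the x axis and label its ticks as dollar amounts
  scale_x_continuous(
    name = "GDP per capita",
    trans = "log10",
    labels = scales::dollar_format()
  ) +
  theme_bw() +
  scale_y_continuous("Life expectancy")
```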
This scatterplot has the following components of the grammar of graphics.
| Grammar Component | Specification |
|-----------------------|---------------|
| **data** | `gapminder` |
| **aesthetic mapping** | **x**: `gdpPercap`, **y**: `lifeExp` |
| **geometric object** | points |
| scale | x: log10, y: linear |
| statistical transform | none |
| coordinate system | rectangular |
| faceting | none |
Note that `x` and `y` aesthetics are required for scatterplots (or "point" geometric objects).
Each geometric object has its own required set of aesthetics.
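One way to check which aesthetics a geometric object requires is to look at the `required_aes` field of the corresponding **ggplot2** `Geom` object. This is a minimal sketch that assumes **ggplot2** is installed; `GeomPoint` and `GeomText` are chosen only as examples.

```{r}
library(ggplot2)

# Each Geom object records the aesthetics it cannot be drawn without.
GeomPoint$required_aes  # points need x and y
GeomText$required_aes   # text needs x, y, and a label
```

If a required aesthetic is missing from the mapping, **ggplot2** stops with an error when the plot is drawn rather than guessing a value.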
## Activity: Bar chart grammar
Consider the following plot.
Don't concern yourself with the code at this point.
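```{r, fig.width = 5, fig.height = 2}
library(dplyr)      # filter(), mutate()
library(forcats)    # fct_infreq()
library(ggplot2)
library(gapminder)

gapminder |>
  filter(year == 2007) |>
  mutate(continent = fct_infreq(continent)) |>  # order continents by frequency
  ggplot() +
  aes(x = continent, fill = continent) +
  geom_bar() +
  guides(fill = "none") +                       # hide the redundant fill legend
  theme_bw()
```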
Fill in the remaining grammar components for this plot.
| Grammar Component | Specification |
|-----------------------|---------------|
| **data** | `gapminder` (2007 only) |
| **aesthetic mapping** | FILL_THIS_IN |
| **geometric object** | FILL_THIS_IN |
| scale | FILL_THIS_IN |
| statistical transform | FILL_THIS_IN |
| coordinate system | FILL_THIS_IN |
| faceting | FILL_THIS_IN |