-
Notifications
You must be signed in to change notification settings - Fork 0
/
README.Rmd
117 lines (76 loc) · 5.51 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
---
output:
md_document:
variant: markdown_github
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "README-"
)
```
# resurrectionr <img src="man/figures/logo.png" align="right" />
[![Travis-CI Build Status](https://travis-ci.org/mdjeric/resurrectionr.svg?branch=master)](https://travis-ci.org/mdjeric/resurrectionr)
[![lifecycle](https://img.shields.io/badge/lifecycle-maturing-blue.svg)](https://www.tidyverse.org/lifecycle/#maturing)
The purpose of resurrectionr is `recode_religion()` which recodes a set of religious identification variables from the US General Social Survey, in a way that is analytically useful and easy to implement in data processing.
Bigger sociological and political problems in recoding of religious identification are elaborated and resolved with a recoding schema in the paper by Darren E. Sherkat and Derek Lehman,[^1] based on classification[^2] from 2001. This package provides solutions for main technical issues with recoding of religious identification from GSS in R, using said schema.
1. Variables contain large number of codes: 16 (`relig`), 30 (`denom`), and 201 (`other`).
2. Same pattern of variables describes religious identification for a large number of items (e.g. spouses', father's, mother's, etc.), and all of them have, naturally, different variable names.
3. Punch codes are not consequtive indexes, but are often skipping certain numbers, and, for certain items they tend to have only punches, not labels.
This package provides universal function `recode_religion()` that can recode all these varaibles and adds additional benefits in terms of cheking values, providing a recoding key, etc.
## Installation
```{r, eval = FALSE}
# Install development version from GitHub
# install.packages("devtools") # If you don't have it
devtools::install_github("mdjeric/resurrectionr")
```
## Example
This is a basic use of function:
```{r example}
library(resurrectionr)
gss <- gss14_f
gss$religion <- recode_religion(gss$relig, gss$denom, gss$other)
summary(gss$religion)
```
## Description
`recode_religion()` is a wrapper for basic function that does all the recoding. It confirms the variables are with proper codes, same length, transforms the type, and provides additional information, warning messages, errors, etc.
`fct_rec_relig()`, the simple function, uses data frames in the list [`rdo_cdbk`](reference/rdo_cdbk.html) from which three vectors are formed where index corresponds to punch number, and label is value (of `relig`, `denom`, and `other`).
For each *new* religious identification, separate vectors (for `relig`, `denom`, and/or `other` identification) are created that contain corresponding punch codes and labels, per paper and SPSS syntax. In addition, logical vectors are created for Sectarian Protestants with valid denomination codes for sorting between 'Sectarian Protestants', 'Christian - no group given', and 'other religions'.
Next, twelve logical vectors, for respondent's belonging to each group, are created by checking against appropriate vectors containing labels. Names are assigned, function prints frequency table, and returns vector that contains new variable.
## Additional behavior of the function
Function checks that the (1) vectors are of same length, and (2) that their values are valid answers, throwing errors like this:
```{r, eval = FALSE}
religion <- recode_religion("PROTESTANT", c(12, 13), "IAP")
#> Error: Vectors must be of the same lenght, currently they are:
#> Length of `relig`: 1.
#> Length of `denom`: 2.
#> Length of `other`: 1.
```
It is also possible to combine punches with labels, and get same results, or print the coding key that was used on specific data.
```{r example-combined-punch-label}
set.seed(999) # for reproducability
short <- sample(1:2358, 50, replace = FALSE) # chose random 50 cases
gshrt <- gss[short, ]
gshrt$religion <- recode_religion(gshrt$relig, gshrt$denom, gshrt$other,
add_missing_levels = TRUE, print_key = TRUE)
```
It also works well when some (or all) variables contain punches, either as characters or numbers.
```{r example-mixed}
gshrt$d_num <- gss14_n$denom[short]
gshrt$o_num_char <- as.character(gss14_n$other[short])
gshrt$religion2 <- recode_religion(gshrt$relig, gshrt$d_num, gshrt$o_num_char,
add_missing_levels = TRUE, frequencies = FALSE)
# We can see that there is no difference
FALSE %in% (gshrt$religion2 == gshrt$religion)
```
## Further work
Three more things are planned to be included in this package:
1. Recode to often more useful group of 7 religious identifications.
2. Function that just prints the results.
3. Function that returns key with all recoding tetrades, and all recoding tetrades in particular set, as data frame (it might be useful for someone).
## Code of conduct
Please note that this project is released with a [Contributor Code of Conduct](CONDUCT.md). By participating in this project you agree to abide by its terms.
[^1]: ["After The Resurrection: The Field of the Sociology of Religion in the United States"](https://iranianredneck.wordpress.com/2016/11/29/why-reltrad-sucks-contesting-the-measure-of-american-religion/){target="_blank"}
[^2]: Sherkat, Darren E. 2001. "Tracking the restructuring of American religion: Religious affiliation and patterns of religious mobility, 1973–1998." *Social Forces* 79(4), 1459-1493. [doi:10.1353/sof.2001.0052](https://doi.org/10.1353/sof.2001.0052)