-
Notifications
You must be signed in to change notification settings - Fork 21
/
Copy pathsimple-example.html
317 lines (292 loc) · 34.8 KB
/
simple-example.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
<!DOCTYPE html>
<!-- Generated by pkgdown: do not edit by hand --><html lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>A Simple Example • bcf2</title>
<!-- jquery --><script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js" integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=" crossorigin="anonymous"></script><!-- Bootstrap --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.7/css/bootstrap.min.css" integrity="sha256-916EbMg70RQy9LHiGkXzG8hSg9EdNy97GazNG/aiY1w=" crossorigin="anonymous">
<script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha256-U5ZEeKfGNOja007MMD3YBI0A3OSZOQbeG6z2f2Y0hu8=" crossorigin="anonymous"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.7.1/css/all.min.css" integrity="sha256-nAmazAk6vS34Xqo0BSrTb+abbtFlgsFK7NKSi6o7Y78=" crossorigin="anonymous">
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.7.1/css/v4-shims.min.css" integrity="sha256-6qHlizsOWFskGlwVOKuns+D1nB6ssZrHQrNj1wGplHc=" crossorigin="anonymous">
<!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.4/clipboard.min.js" integrity="sha256-FiZwavyI2V6+EXO1U+xzLG3IKldpiTFf3153ea9zikQ=" crossorigin="anonymous"></script><!-- headroom.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.9.4/headroom.min.js" integrity="sha256-DJFC1kqIhelURkuza0AvYal5RxMtpzLjFhsnVIeuk+U=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.9.4/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script><!-- pkgdown --><link href="../pkgdown.css" rel="stylesheet">
<script src="../pkgdown.js"></script><meta property="og:title" content="A Simple Example">
<meta property="og:description" content="">
<meta name="twitter:card" content="summary">
<!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
</head>
<body>
<div class="container template-article">
<header><div class="navbar navbar-default navbar-fixed-top" role="navigation">
<div class="container">
<div class="navbar-header">
<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false">
<span class="sr-only">Toggle navigation</span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
</button>
<span class="navbar-brand">
<a class="navbar-link" href="../index.html">bcf2</a>
<span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">2.0.0</span>
</span>
</div>
<div id="navbar" class="navbar-collapse collapse">
<ul class="nav navbar-nav">
<li>
<a href="../index.html">
<span class="fas fa fas fa-home fa-lg"></span>
</a>
</li>
<li>
<a href="../reference/index.html">Reference</a>
</li>
<li class="dropdown">
<a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">
Articles
<span class="caret"></span>
</a>
<ul class="dropdown-menu" role="menu">
<li>
<a href="../articles/predict-example.html">Prediction with BCF</a>
</li>
<li>
<a href="../articles/simple-example.html">A Simple Example</a>
</li>
</ul>
</li>
<li>
<a href="../news/index.html">Changelog</a>
</li>
</ul>
<ul class="nav navbar-nav navbar-right"></ul>
</div>
<!--/.nav-collapse -->
</div>
<!--/.container -->
</div>
<!--/.navbar -->
</header><div class="row">
<div class="col-md-9 contents">
<div class="page-header toc-ignore">
<h1>A Simple Example</h1>
<div class="hidden name"><code>simple-example.Rmd</code></div>
</div>
<p>In this vignette, we show how to use BCF to estimate treatment effects of a simulated intervention.</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb1-1" title="1"><span class="kw"><a href="https://rdrr.io/r/base/library.html">library</a></span>(bcf2)</a>
<a class="sourceLine" id="cb1-2" title="2"><span class="co">#> Warning: replacing previous import 'Rcpp::LdFlags' by 'RcppParallel::LdFlags'</span></a>
<a class="sourceLine" id="cb1-3" title="3"><span class="co">#> when loading 'bcf2'</span></a>
<a class="sourceLine" id="cb1-4" title="4"><span class="co">#> </span></a>
<a class="sourceLine" id="cb1-5" title="5"><span class="co">#> Attaching package: 'bcf2'</span></a>
<a class="sourceLine" id="cb1-6" title="6"><span class="co">#> The following object is masked from 'package:stats':</span></a>
<a class="sourceLine" id="cb1-7" title="7"><span class="co">#> </span></a>
<a class="sourceLine" id="cb1-8" title="8"><span class="co">#> predict</span></a>
<a class="sourceLine" id="cb1-9" title="9"><span class="kw"><a href="https://rdrr.io/r/base/library.html">library</a></span>(ggplot2)</a>
<a class="sourceLine" id="cb1-10" title="10"><span class="kw"><a href="https://rdrr.io/r/base/library.html">library</a></span>(latex2exp)</a>
<a class="sourceLine" id="cb1-11" title="11"><span class="kw"><a href="https://rdrr.io/r/base/library.html">library</a></span>(rpart)</a>
<a class="sourceLine" id="cb1-12" title="12"><span class="kw"><a href="https://rdrr.io/r/base/library.html">library</a></span>(rpart.plot)</a></code></pre></div>
<div id="simulate-data" class="section level2">
<h2 class="hasAnchor">
<a href="#simulate-data" class="anchor"></a>Simulate data</h2>
<p>First, we simulate some data for testing. This data set has three covariates <span class="math inline">\(X\)</span>, which we divide into two control variables (that is, two covariates related to the outcome, <span class="math inline">\(y\)</span>) and one effect moderator (that is, a covariates related to the treatment effect, <span class="math inline">\(\tau\)</span>).</p>
<p>We draw three random <span class="math inline">\(X\)</span>s for each unit and generate each unit’s expected outcome without treatment, <span class="math inline">\(\mu_i\)</span>, as a function of <span class="math inline">\(x^{(1)}_i\)</span> and <span class="math inline">\(x^{(2)}_i\)</span>. Each unit’s probability of joining the intervention, <span class="math inline">\(\pi_i\)</span>, is also a function of <span class="math inline">\(\mu_i\)</span>, so that units with larger responses are more likely to participate in the intervention. We then assign units to treatment (<span class="math inline">\(z_i = 1\)</span>) or comparison (<span class="math inline">\(z_i = 0\)</span>) as a function of <span class="math inline">\(\pi_i\)</span>.</p>
<p>Then we generate the true treatment effect for each unit, <span class="math inline">\(\tau_i\)</span>. As noted above, <span class="math inline">\(\tau_i\)</span> is a function of <span class="math inline">\(x^{(3)}_i\)</span>. The observed outcome, <span class="math inline">\(y_i\)</span>, is a function of <span class="math inline">\(\mu_i\)</span>, <span class="math inline">\(\tau_i\)</span>, a random error term with variance <span class="math inline">\(\sigma^{2}\)</span>, and weights <span class="math inline">\(w\)</span> if applicable.</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb2-1" title="1"><span class="kw"><a href="https://rdrr.io/r/base/Random.html">set.seed</a></span>(<span class="dv">1</span>)</a>
<a class="sourceLine" id="cb2-2" title="2"></a>
<a class="sourceLine" id="cb2-3" title="3">p <-<span class="st"> </span><span class="dv">3</span> <span class="co"># two control variables and one effect moderator</span></a>
<a class="sourceLine" id="cb2-4" title="4">n <-<span class="st"> </span><span class="dv">1000</span></a>
<a class="sourceLine" id="cb2-5" title="5">n_burn <-<span class="st"> </span><span class="dv">2000</span></a>
<a class="sourceLine" id="cb2-6" title="6">n_sim <-<span class="st"> </span><span class="dv">1000</span></a>
<a class="sourceLine" id="cb2-7" title="7"></a>
<a class="sourceLine" id="cb2-8" title="8">x <-<span class="st"> </span><span class="kw"><a href="https://rdrr.io/r/base/matrix.html">matrix</a></span>(<span class="kw"><a href="https://rdrr.io/r/stats/Normal.html">rnorm</a></span>(n<span class="op">*</span>p), <span class="dt">nrow=</span>n)</a>
<a class="sourceLine" id="cb2-9" title="9">weights <-<span class="st"> </span><span class="kw"><a href="https://rdrr.io/r/base/MathFun.html">abs</a></span>(<span class="kw"><a href="https://rdrr.io/r/stats/Normal.html">rnorm</a></span>(n))</a>
<a class="sourceLine" id="cb2-10" title="10"></a>
<a class="sourceLine" id="cb2-11" title="11"><span class="co"># create targeted selection, whereby a unit's likelihood of joining the intervention (pi) is related to its expected outcome (mu)</span></a>
<a class="sourceLine" id="cb2-12" title="12">q <-<span class="st"> </span><span class="dv">-1</span><span class="op">*</span>(x[,<span class="dv">1</span>]<span class="op">></span>(x[,<span class="dv">2</span>])) <span class="op">+</span><span class="st"> </span><span class="dv">1</span><span class="op">*</span>(x[,<span class="dv">1</span>]<span class="op"><</span>(x[,<span class="dv">2</span>])) <span class="fl">-0.1</span></a>
<a class="sourceLine" id="cb2-13" title="13"></a>
<a class="sourceLine" id="cb2-14" title="14"><span class="co"># generate treatment variable</span></a>
<a class="sourceLine" id="cb2-15" title="15">pi <-<span class="st"> </span><span class="kw"><a href="https://rdrr.io/r/stats/Normal.html">pnorm</a></span>(q)</a>
<a class="sourceLine" id="cb2-16" title="16">z <-<span class="st"> </span><span class="kw"><a href="https://rdrr.io/r/stats/Binomial.html">rbinom</a></span>(n,<span class="dv">1</span>,pi)</a>
<a class="sourceLine" id="cb2-17" title="17"></a>
<a class="sourceLine" id="cb2-18" title="18"><span class="co"># tau is the true treatment effect; it varies across units as a function of</span></a>
<a class="sourceLine" id="cb2-19" title="19"><span class="co"># X3, the effect moderator</span></a>
<a class="sourceLine" id="cb2-20" title="20">tau <-<span class="st"> </span><span class="dv">1</span><span class="op">/</span>(<span class="dv">1</span> <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://rdrr.io/r/base/Log.html">exp</a></span>(<span class="op">-</span>x[,<span class="dv">3</span>]))</a>
<a class="sourceLine" id="cb2-21" title="21"></a>
<a class="sourceLine" id="cb2-22" title="22"><span class="co"># generate the response using q, tau and z</span></a>
<a class="sourceLine" id="cb2-23" title="23">mu <-<span class="st"> </span>(q <span class="op">+</span><span class="st"> </span>tau<span class="op">*</span>z)</a>
<a class="sourceLine" id="cb2-24" title="24"></a>
<a class="sourceLine" id="cb2-25" title="25"><span class="co"># set the noise level relative to the expected mean function of Y</span></a>
<a class="sourceLine" id="cb2-26" title="26">sigma <-<span class="st"> </span><span class="kw"><a href="https://rdrr.io/r/base/diff.html">diff</a></span>(<span class="kw"><a href="https://rdrr.io/r/base/range.html">range</a></span>(q <span class="op">+</span><span class="st"> </span>tau<span class="op">*</span>pi))<span class="op">/</span><span class="dv">8</span></a>
<a class="sourceLine" id="cb2-27" title="27"></a>
<a class="sourceLine" id="cb2-28" title="28"><span class="co"># draw the response variable with additive error</span></a>
<a class="sourceLine" id="cb2-29" title="29">y <-<span class="st"> </span>mu <span class="op">+</span><span class="st"> </span>sigma<span class="op">*</span><span class="kw"><a href="https://rdrr.io/r/stats/Normal.html">rnorm</a></span>(n)<span class="op">/</span><span class="kw"><a href="https://rdrr.io/r/base/MathFun.html">sqrt</a></span>(weights)</a></code></pre></div>
</div>
<div id="fit-bcf-model" class="section level2">
<h2 class="hasAnchor">
<a href="#fit-bcf-model" class="anchor"></a>Fit BCF model</h2>
<p>In this data set we have observed <span class="math inline">\(y_i\)</span>, <span class="math inline">\(x_i\)</span>, and <span class="math inline">\(\pi_i\)</span> values to which we can fit our BCF model. We can then compare the <span class="math inline">\(\hat{\tau}_i\)</span> estimates from BCF to the true <span class="math inline">\(\tau_i\)</span> from the data-generating process. Note that we are using the <code>n_chains</code> and ‘n_chain_clusters’ arguments to <code><a href="../reference/bcf.html">bcf()</a></code>, which allow us to run several MCMC chains in parallel and assess whether they have converged to the posterior distribution.</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb3-1" title="1">bcf_out <-<span class="st"> </span>bcf2<span class="op">::</span><span class="kw"><a href="https://rdrr.io/pkg/bcf2/man/bcf.html">bcf</a></span>(<span class="dt">y =</span> y,</a>
<a class="sourceLine" id="cb3-2" title="2"> <span class="dt">z =</span> z,</a>
<a class="sourceLine" id="cb3-3" title="3"> <span class="dt">x_control =</span> x,</a>
<a class="sourceLine" id="cb3-4" title="4"> <span class="dt">x_moderate =</span> x,</a>
<a class="sourceLine" id="cb3-5" title="5"> <span class="dt">pihat =</span> pi,</a>
<a class="sourceLine" id="cb3-6" title="6"> <span class="dt">nburn =</span> n_burn,</a>
<a class="sourceLine" id="cb3-7" title="7"> <span class="dt">nsim =</span> n_sim,</a>
<a class="sourceLine" id="cb3-8" title="8"> <span class="dt">w =</span> weights,</a>
<a class="sourceLine" id="cb3-9" title="9"> <span class="dt">n_chains =</span> <span class="dv">4</span>,</a>
<a class="sourceLine" id="cb3-10" title="10"> <span class="dt">n_chain_clusters =</span> <span class="dv">2</span>,</a>
<a class="sourceLine" id="cb3-11" title="11"> <span class="dt">random_seed =</span> <span class="dv">1</span>,</a>
<a class="sourceLine" id="cb3-12" title="12"> <span class="dt">update_interval =</span> <span class="dv">1</span>)</a></code></pre></div>
</div>
<div id="check-mcmc-diagnostics" class="section level2">
<h2 class="hasAnchor">
<a href="#check-mcmc-diagnostics" class="anchor"></a>Check MCMC diagnostics</h2>
<p>We use the <code>summarise_bcf</code> function to obtain posterior summary stats and MCMC diagnostics. We uses those diagnostics to assess convergence of our run.</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb4-1" title="1">bcf2<span class="op">::</span><span class="kw"><a href="https://rdrr.io/pkg/bcf2/man/summarise_bcf.html">summarise_bcf</a></span>(bcf_out)</a>
<a class="sourceLine" id="cb4-2" title="2"><span class="co">#> Summary statistics for each Markov Chain Monte Carlo run</span></a>
<a class="sourceLine" id="cb4-3" title="3"><span class="co">#> </span></a>
<a class="sourceLine" id="cb4-4" title="4"><span class="co">#> Iterations = 1:1000</span></a>
<a class="sourceLine" id="cb4-5" title="5"><span class="co">#> Thinning interval = 1 </span></a>
<a class="sourceLine" id="cb4-6" title="6"><span class="co">#> Number of chains = 4 </span></a>
<a class="sourceLine" id="cb4-7" title="7"><span class="co">#> Sample size per chain = 1000 </span></a>
<a class="sourceLine" id="cb4-8" title="8"><span class="co">#> </span></a>
<a class="sourceLine" id="cb4-9" title="9"><span class="co">#> 1. Empirical mean and standard deviation for each variable,</span></a>
<a class="sourceLine" id="cb4-10" title="10"><span class="co">#> plus standard error of the mean:</span></a>
<a class="sourceLine" id="cb4-11" title="11"><span class="co">#> </span></a>
<a class="sourceLine" id="cb4-12" title="12"><span class="co">#> Mean SD Naive SE Time-series SE</span></a>
<a class="sourceLine" id="cb4-13" title="13"><span class="co">#> sigma 0.33332 0.007795 0.0001232 0.0001378</span></a>
<a class="sourceLine" id="cb4-14" title="14"><span class="co">#> tau_bar 0.46840 0.034902 0.0005519 0.0009617</span></a>
<a class="sourceLine" id="cb4-15" title="15"><span class="co">#> mu_bar -0.08748 0.020434 0.0003231 0.0006031</span></a>
<a class="sourceLine" id="cb4-16" title="16"><span class="co">#> yhat_bar 0.12988 0.011666 0.0001845 0.0001845</span></a>
<a class="sourceLine" id="cb4-17" title="17"><span class="co">#> </span></a>
<a class="sourceLine" id="cb4-18" title="18"><span class="co">#> 2. Quantiles for each variable:</span></a>
<a class="sourceLine" id="cb4-19" title="19"><span class="co">#> </span></a>
<a class="sourceLine" id="cb4-20" title="20"><span class="co">#> 2.5% 25% 50% 75% 97.5%</span></a>
<a class="sourceLine" id="cb4-21" title="21"><span class="co">#> sigma 0.3183 0.3279 0.33332 0.33858 0.34891</span></a>
<a class="sourceLine" id="cb4-22" title="22"><span class="co">#> tau_bar 0.4013 0.4446 0.46860 0.49219 0.53733</span></a>
<a class="sourceLine" id="cb4-23" title="23"><span class="co">#> mu_bar -0.1268 -0.1015 -0.08731 -0.07387 -0.04683</span></a>
<a class="sourceLine" id="cb4-24" title="24"><span class="co">#> yhat_bar 0.1062 0.1222 0.12987 0.13775 0.15254</span></a>
<a class="sourceLine" id="cb4-25" title="25"><span class="co">#> </span></a>
<a class="sourceLine" id="cb4-26" title="26"><span class="co">#> </span></a>
<a class="sourceLine" id="cb4-27" title="27"><span class="co">#> ----</span></a>
<a class="sourceLine" id="cb4-28" title="28"><span class="co">#> Effective sample size for each parameter</span></a>
<a class="sourceLine" id="cb4-29" title="29"><span class="co">#> sigma tau_bar mu_bar yhat_bar </span></a>
<a class="sourceLine" id="cb4-30" title="30"><span class="co">#> 3290.455 1323.737 1158.787 4000.000 </span></a>
<a class="sourceLine" id="cb4-31" title="31"><span class="co">#> </span></a>
<a class="sourceLine" id="cb4-32" title="32"><span class="co">#> ----</span></a>
<a class="sourceLine" id="cb4-33" title="33"><span class="co">#> Gelman and Rubin's convergence diagnostic for each parameter</span></a>
<a class="sourceLine" id="cb4-34" title="34"><span class="co">#> Potential scale reduction factors:</span></a>
<a class="sourceLine" id="cb4-35" title="35"><span class="co">#> </span></a>
<a class="sourceLine" id="cb4-36" title="36"><span class="co">#> Point est. Upper C.I.</span></a>
<a class="sourceLine" id="cb4-37" title="37"><span class="co">#> sigma 1.00 1.01</span></a>
<a class="sourceLine" id="cb4-38" title="38"><span class="co">#> tau_bar 1.00 1.00</span></a>
<a class="sourceLine" id="cb4-39" title="39"><span class="co">#> mu_bar 1.01 1.03</span></a>
<a class="sourceLine" id="cb4-40" title="40"><span class="co">#> yhat_bar 1.00 1.00</span></a>
<a class="sourceLine" id="cb4-41" title="41"><span class="co">#> </span></a>
<a class="sourceLine" id="cb4-42" title="42"><span class="co">#> Multivariate psrf</span></a>
<a class="sourceLine" id="cb4-43" title="43"><span class="co">#> </span></a>
<a class="sourceLine" id="cb4-44" title="44"><span class="co">#> 1.02</span></a>
<a class="sourceLine" id="cb4-45" title="45"><span class="co">#> </span></a>
<a class="sourceLine" id="cb4-46" title="46"><span class="co">#> ----</span></a></code></pre></div>
<p>Since our “<span class="math inline">\(\hat{R}\)</span>” values (Gelman and Rubin’s convergence statistics, which equal to 1 at convergence) are between 0.9 and 1.1, we know that our run successfully achieved convergence.</p>
</div>
<div id="explore-the-posterior" class="section level2">
<h2 class="hasAnchor">
<a href="#explore-the-posterior" class="anchor"></a>Explore the posterior</h2>
<p>Now that we’ve successfully fit our model, let’s expore the output. First, since this is a simulation, let’s compare the unit-speciifc treatment effect estimates <span class="math inline">\(\hat{\tau}_i\)</span> to the true unit-specific treatment effects <span class="math inline">\(\tau_i\)</span>.</p>
<div id="estimates-vs--true-treatment-effects" class="section level3">
<h3 class="hasAnchor">
<a href="#estimates-vs--true-treatment-effects" class="anchor"></a>Estimates vs. true treatment effects</h3>
<p>In the plot below, the red line is the identity line and the blue line is a regression line fitted through our points.</p>
<div class="sourceCode" id="cb5"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb5-1" title="1"><span class="kw">ggplot</span>(<span class="ot">NULL</span>, <span class="kw">aes</span>(<span class="dt">x =</span> tau, <span class="dt">y =</span> <span class="kw"><a href="https://rdrr.io/r/base/colSums.html">colMeans</a></span>(bcf_out<span class="op">$</span>tau))) <span class="op">+</span></a>
<a class="sourceLine" id="cb5-2" title="2"><span class="st"> </span><span class="kw">geom_smooth</span>(<span class="dt">method =</span> <span class="st">"lm"</span>, <span class="dt">se =</span> <span class="ot">FALSE</span>) <span class="op">+</span></a>
<a class="sourceLine" id="cb5-3" title="3"><span class="st"> </span><span class="kw">geom_abline</span>(<span class="dt">color =</span> <span class="st">"red"</span>) <span class="op">+</span></a>
<a class="sourceLine" id="cb5-4" title="4"><span class="st"> </span><span class="kw">geom_point</span>() <span class="op">+</span></a>
<a class="sourceLine" id="cb5-5" title="5"><span class="st"> </span><span class="kw">xlab</span>(<span class="kw">TeX</span>(<span class="st">"True unit-specific treatment effect ($</span><span class="ch">\\</span><span class="st">tau_i$)"</span>)) <span class="op">+</span></a>
<a class="sourceLine" id="cb5-6" title="6"><span class="st"> </span><span class="kw">ylab</span>(<span class="kw">TeX</span>(<span class="st">"Estimate (posterior mean) of unit-specific treatment effect ($</span><span class="ch">\\</span><span class="st">hat{</span><span class="ch">\\</span><span class="st">tau_i}$)"</span>)) <span class="op">+</span></a>
<a class="sourceLine" id="cb5-7" title="7"><span class="st"> </span><span class="kw"><a href="https://rdrr.io/r/graphics/plot.window.html">xlim</a></span>(<span class="dv">0</span>, <span class="dv">1</span>) <span class="op">+</span><span class="st"> </span></a>
<a class="sourceLine" id="cb5-8" title="8"><span class="st"> </span><span class="kw"><a href="https://rdrr.io/r/graphics/plot.window.html">ylim</a></span>(<span class="dv">0</span>, <span class="dv">1</span>)</a></code></pre></div>
<p><img src="simple-example_files/figure-html/unnamed-chunk-2-1.png" width="700"></p>
<p>BCF recovers the true unit-specific treatment effects with a high degree of accuracy in this simple example.</p>
</div>
<div id="distribution-of-treatment-effects" class="section level3">
<h3 class="hasAnchor">
<a href="#distribution-of-treatment-effects" class="anchor"></a>Distribution of treatment effects</h3>
<p>Now that we have confidence in BCF’s estimates of unit-specific treatment effects, we examine the results more closely. First we consider the distribution of unit-specific treatment effects, to assess treatment effect heterogeneity across units. This plot shows the impact estimate and 95 percent credible interval for each unit, arranged by the size of the impact.</p>
<div class="sourceCode" id="cb6"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb6-1" title="1">tau_ests <-<span class="st"> </span><span class="kw"><a href="https://rdrr.io/r/base/data.frame.html">data.frame</a></span>(<span class="dt">Mean =</span> <span class="kw"><a href="https://rdrr.io/r/base/colSums.html">colMeans</a></span>(bcf_out<span class="op">$</span>tau),</a>
<a class="sourceLine" id="cb6-2" title="2"> <span class="dt">Low95 =</span> <span class="kw"><a href="https://rdrr.io/r/base/apply.html">apply</a></span>(bcf_out<span class="op">$</span>tau, <span class="dv">2</span>, <span class="cf">function</span>(x) <span class="kw"><a href="https://rdrr.io/r/stats/quantile.html">quantile</a></span>(x, <span class="fl">0.025</span>)),</a>
<a class="sourceLine" id="cb6-3" title="3"> <span class="dt">Up95 =</span> <span class="kw"><a href="https://rdrr.io/r/base/apply.html">apply</a></span>(bcf_out<span class="op">$</span>tau, <span class="dv">2</span>, <span class="cf">function</span>(x) <span class="kw"><a href="https://rdrr.io/r/stats/quantile.html">quantile</a></span>(x, <span class="fl">0.975</span>)))</a>
<a class="sourceLine" id="cb6-4" title="4"></a>
<a class="sourceLine" id="cb6-5" title="5">tau_ests <-<span class="st"> </span>tau_ests[<span class="kw"><a href="https://rdrr.io/r/base/order.html">order</a></span>(tau_ests<span class="op">$</span>Mean), ]</a>
<a class="sourceLine" id="cb6-6" title="6">tau_ests[, <span class="st">"unit"</span>] <-<span class="st"> </span><span class="dv">1</span><span class="op">:</span><span class="kw"><a href="https://rdrr.io/r/base/nrow.html">ncol</a></span>(bcf_out<span class="op">$</span>tau)</a>
<a class="sourceLine" id="cb6-7" title="7"></a>
<a class="sourceLine" id="cb6-8" title="8"><span class="kw">ggplot</span>(tau_ests, <span class="kw">aes</span>(<span class="dt">x =</span> unit, <span class="dt">y =</span> Mean)) <span class="op">+</span></a>
<a class="sourceLine" id="cb6-9" title="9"><span class="st"> </span><span class="kw">theme_bw</span>() <span class="op">+</span></a>
<a class="sourceLine" id="cb6-10" title="10"><span class="st"> </span><span class="kw">geom_pointrange</span>(<span class="kw">aes</span>(<span class="dt">ymin =</span> Low95, <span class="dt">ymax =</span> Up95), <span class="dt">color =</span> <span class="st">"blue"</span>) <span class="op">+</span></a>
<a class="sourceLine" id="cb6-11" title="11"><span class="st"> </span><span class="kw">geom_vline</span>(<span class="dt">xintercept =</span> <span class="dv">0</span>, <span class="dt">color =</span> <span class="st">"black"</span>) <span class="op">+</span></a>
<a class="sourceLine" id="cb6-12" title="12"><span class="st"> </span><span class="kw">ylab</span>(<span class="st">"Estimate of unit-specific treatment effect (posterior mean and 95% CI)"</span>) <span class="op">+</span></a>
<a class="sourceLine" id="cb6-13" title="13"><span class="st"> </span><span class="kw">xlab</span>(<span class="st">"Unit"</span>) <span class="op">+</span></a>
<a class="sourceLine" id="cb6-14" title="14"><span class="st"> </span><span class="kw">coord_flip</span>()</a></code></pre></div>
<p><img src="simple-example_files/figure-html/unnamed-chunk-4-1.png" width="700"></p>
<p>In this example, although we do see variation in the point estimates across units, most uncertainty intervals overlap, except for units at the extremes.</p>
</div>
<div id="identifying-possible-subgroups" class="section level3">
<h3 class="hasAnchor">
<a href="#identifying-possible-subgroups" class="anchor"></a>Identifying possible subgroups</h3>
<p>The BCF model obtains more accurate treatment effect estimates than other approaches in part by allowing flexible relationships with many possible effect modifiers. These relationships are often of substantive interest; researchers want to know what covariates are associated with higher or lower impacts. However, the BCF model is difficult to interpret directly: we cannot simply look at the coefficients on the effect modifiers to determine which characteristics drive the impacts. Instead, we can look for patterns in the unit-specific treatment effects, linking each unit’s treatment effect estimate to its covariates.</p>
<p>Specifically, the BCF paper (<a href="https://arxiv.org/pdf/1706.09523.pdf" class="uri">https://arxiv.org/pdf/1706.09523.pdf</a>) recommends fitting a CART model where the response variable is the estimated unit-specific treatment effect from BCF and the predictor variables are the effect modifiers used to fit the BCF model. The resulting tree identifies covariates that best distinguish units with high vs. low impacts, which are strong candidate subgroup variables.</p>
<div class="sourceCode" id="cb7"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb7-1" title="1">tree <-<span class="st"> </span><span class="kw">rpart</span>(<span class="kw"><a href="https://rdrr.io/r/base/colSums.html">colMeans</a></span>(bcf_out<span class="op">$</span>tau) <span class="op">~</span><span class="st"> </span>x, <span class="dt">weights =</span> weights)</a>
<a class="sourceLine" id="cb7-2" title="2"><span class="kw">rpart.plot</span>(tree)</a></code></pre></div>
<p><img src="simple-example_files/figure-html/unnamed-chunk-5-1.png" width="700"></p>
<p>Our tree identifies <span class="math inline">\(x_3\)</span> as the strongest effect modifier, which is correct, because we specified it as such in the data generation step.</p>
<p>Because the CART fit does not account for the uncertainty in the treatment effect estimates, it is possible that its tree splits do not identify characteristics that create subgroups with meaningfully different impacts. To conduct inference on whether impacts differ meaningfully based on a characteristic, we can use the posterior draws from the fitted BCF model.</p>
<p>Here we show just a comparison of the average treatment effect for units in the top quartile of <span class="math inline">\(X_3\)</span> vs. the average treatment effect for units in the bottom quartile of <span class="math inline">\(X_3\)</span>.</p>
<div class="sourceCode" id="cb8"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb8-1" title="1">q1 <-<span class="st"> </span>x[,<span class="dv">3</span>] <span class="op"><</span><span class="st"> </span><span class="kw"><a href="https://rdrr.io/r/stats/quantile.html">quantile</a></span>(x[,<span class="dv">3</span>], <span class="fl">0.25</span>)</a>
<a class="sourceLine" id="cb8-2" title="2">q4 <-<span class="st"> </span>x[,<span class="dv">3</span>] <span class="op">></span><span class="st"> </span><span class="kw"><a href="https://rdrr.io/r/stats/quantile.html">quantile</a></span>(x[,<span class="dv">3</span>], <span class="fl">0.75</span>)</a>
<a class="sourceLine" id="cb8-3" title="3"></a>
<a class="sourceLine" id="cb8-4" title="4">q1Taus <-<span class="st"> </span>bcf_out<span class="op">$</span>tau[,q1]</a>
<a class="sourceLine" id="cb8-5" title="5">q4Taus <-<span class="st"> </span>bcf_out<span class="op">$</span>tau[,q4]</a>
<a class="sourceLine" id="cb8-6" title="6"></a>
<a class="sourceLine" id="cb8-7" title="7">wq1Taus <-<span class="st"> </span><span class="kw"><a href="https://rdrr.io/r/base/apply.html">apply</a></span>(q1Taus, <span class="dv">1</span>, weighted.mean, weights[q1])</a>
<a class="sourceLine" id="cb8-8" title="8">wq4Taus <-<span class="st"> </span><span class="kw"><a href="https://rdrr.io/r/base/apply.html">apply</a></span>(q4Taus, <span class="dv">1</span>, weighted.mean, weights[q4])</a>
<a class="sourceLine" id="cb8-9" title="9"></a>
<a class="sourceLine" id="cb8-10" title="10">groupTaus <-<span class="st"> </span><span class="kw"><a href="https://rdrr.io/r/base/data.frame.html">data.frame</a></span>(<span class="dt">taus =</span> <span class="kw"><a href="https://rdrr.io/r/base/c.html">c</a></span>(wq1Taus, wq4Taus),</a>
<a class="sourceLine" id="cb8-11" title="11"> <span class="dt">quartile =</span> <span class="kw"><a href="https://rdrr.io/r/base/c.html">c</a></span>(<span class="kw"><a href="https://rdrr.io/r/base/rep.html">rep</a></span>(<span class="st">"q1"</span>, <span class="dv">4</span><span class="op">*</span>n_sim), <span class="kw"><a href="https://rdrr.io/r/base/rep.html">rep</a></span>(<span class="st">"q4"</span>, <span class="dv">4</span><span class="op">*</span>n_sim)))</a>
<a class="sourceLine" id="cb8-12" title="12"></a>
<a class="sourceLine" id="cb8-13" title="13"><span class="kw">ggplot</span>(groupTaus, <span class="kw">aes</span>(taus, <span class="dt">fill =</span> quartile, <span class="dt">color =</span> quartile)) <span class="op">+</span></a>
<a class="sourceLine" id="cb8-14" title="14"><span class="st"> </span><span class="kw">geom_density</span>(<span class="dt">alpha =</span> <span class="fl">0.5</span>)</a></code></pre></div>
<p><img src="simple-example_files/figure-html/unnamed-chunk-6-1.png" width="700"></p>
<p>The posterior distributions for the average treatment effect of the top- and bottom-quartile units do not overlap.</p>
<p>We would therefore conclude that <span class="math inline">\(X_3\)</span> is an important subgroup variable. We can confirm this by computing the probability that the average treatment effect for the top-quartile units (q4) is greater than the average treatment effect for the bottom-quartile units (q1).</p>
<div class="sourceCode" id="cb9"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb9-1" title="1"><span class="kw"><a href="https://rdrr.io/r/base/mean.html">mean</a></span>(wq4Taus <span class="op">></span><span class="st"> </span>wq1Taus)</a>
<a class="sourceLine" id="cb9-2" title="2"><span class="co">#> [1] 1</span></a></code></pre></div>
<p>We found that there is a 100% probability that treatment effects for the top quartile (q4) are greater than those for the bottom quartile (q1).</p>
</div>
</div>
</div>
<div class="col-md-3 hidden-xs hidden-sm" id="sidebar">
<div id="tocnav">
<h2 class="hasAnchor">
<a href="#tocnav" class="anchor"></a>Contents</h2>
<ul class="nav nav-pills nav-stacked">
<li><a href="#simulate-data">Simulate data</a></li>
<li><a href="#fit-bcf-model">Fit BCF model</a></li>
<li><a href="#check-mcmc-diagnostics">Check MCMC diagnostics</a></li>
<li><a href="#explore-the-posterior">Explore the posterior</a></li>
</ul>
</div>
</div>
</div>
<footer><div class="copyright">
<p>Developed by Jared S. Murray, P. Richard Hahn, Carlos Carvalho.</p>
</div>
<div class="pkgdown">
<p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.4.1.</p>
</div>
</footer>
</div>
</body>
</html>