modify vignette & data
nociale committed Sep 27, 2021
1 parent 014d4c2 commit b19c66e
Showing 5 changed files with 40 additions and 29 deletions.
16 changes: 11 additions & 5 deletions R/data.R
@@ -1,17 +1,23 @@
#' Antidepressant trial data.
#'
#' A dataset containing the data from a public available antidepressant clinical trial of an active drug versus placebo.
#' A dataset containing data from a publicly available antidepressant clinical trial of an active drug versus placebo.
#' The dataset is available [here](https://www.lshtm.ac.uk/research/centres-projects-groups/missing-data#dia-missing-data).
#' The relevant endpoint is the Hamilton 17-item rating scale for depression (HAMD17) which was assessed at baseline and weeks 1, 2, 4, and 6.
#' Study drug discontinuation occurred in 24% (20/84) for the active drug and 26% (23/88) for placebo.
#' Study drug discontinuation occurred in 24% of subjects on the active drug and 26% of subjects on placebo.
#' All data after study drug discontinuation are missing and there is a single additional intermittent missing observation.
#'
#' @format A data frame with 608 rows and 11 variables:
#' - `PATIENT`: patient IDs.
#' - `HAMATOTL`: total score on the Hamilton Anxiety Rating Scale.
#' - `PGIIMP`: patient’s Global Impression of Improvement Rating Scale.
#' - `RELDAYS`: number of days between visit and baseline.
#' - `VISIT`: post-baseline visit. Has levels 4,5,6,7.
#' - `THERAPY`: the treatment group variable. It is equal to `PLACEBO` for observations
#' from the placebo arm, or `DRUG` for observations from the active arm.
#' - `basval`: baseline outcome value.
#' - `GENDER`: patient's sex.
#' - `POOLINV`: pooled investigator.
#' - `BASVAL`: baseline outcome value.
#' - `HAMDTL17`: Hamilton 17-item rating scale value.
#' - `change`: change from baseline in the Hamilton 17-item rating scale.
#' - ...
#' - `CHANGE`: change from baseline in the Hamilton 17-item rating scale.
#'
"antidepressant_data"
Binary file modified data/antidepressant_data.rda
Binary file not shown.
15 changes: 10 additions & 5 deletions man/antidepressant_data.Rd


2 changes: 1 addition & 1 deletion man/method.Rd


36 changes: 18 additions & 18 deletions vignettes/quickstart.Rmd
@@ -30,7 +30,7 @@ In particular the core functions are:

## The Data

In order to demonstrate the package we will use a publicly available example data set from an antidepressant clinical trial of an active drug versus placebo. The relevant endpoint is the Hamilton 17-item rating scale for depression (HAMD17) which was assessed at baseline and weeks 1, 2, 4, and 6. Study drug discontinuation occurred in 24% (20/84) for the active drug and 26% (23/88) for placebo. All data after study drug discontinuation are missing and there is a single additional intermittent missing observation.
To demonstrate the package we will use a publicly available example data set from an antidepressant clinical trial of an active drug versus placebo. The relevant endpoint is the Hamilton 17-item rating scale for depression (HAMD17), which was assessed at baseline and weeks 1, 2, 4, and 6. Study drug discontinuation occurred in 24% of subjects on the active drug and 26% of subjects on placebo. All data after study drug discontinuation are missing and there is a single additional intermittent missing observation.
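
As a quick arithmetic check, the rounded percentages can be reproduced from the counts given in the earlier wording of this paragraph (20/84 for the active drug, 23/88 for placebo). This is a minimal sketch in base R; the object names `discontinued` and `n_subjects` are illustrative and are not part of the package or the dataset.

```r
# Counts taken from the earlier wording of the paragraph above;
# the object names are hypothetical, chosen for this illustration only
discontinued <- c(DRUG = 20, PLACEBO = 23)
n_subjects   <- c(DRUG = 84, PLACEBO = 88)

# Percentage of subjects who discontinued study drug in each arm,
# rounded to the nearest whole percent
pct <- round(100 * discontinued / n_subjects)
print(pct)
```

The result is 24 for `DRUG` and 26 for `PLACEBO`, matching the figures quoted in the text.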

```{r}
library(rbmi)
@@ -40,10 +40,10 @@ data("antidepressant_data")
dat <- antidepressant_data
```

We consider an imputation model with the mean change from baseline in the HAMD17 score as the outcome (variable `change` in the dataset), included the treatment group (`THERAPY`), the (categorical) visit (`VISIT`), treatment-by-visit interactions, the baseline HAMD17 score (`basval`), and baseline HAMD17-by-visit interactions as covariates, and assumed a common unstructured covariance matrix in both groups. The chosen analysis model is ANCOVA which adjusts for the baseline HAMD17 value.
We consider an imputation model with the mean change from baseline in the HAMD17 score as the outcome (variable `CHANGE` in the dataset). The model includes the treatment group (`THERAPY`), the (categorical) visit (`VISIT`), treatment-by-visit interactions, the baseline HAMD17 score (`BASVAL`), and baseline HAMD17-by-visit interactions as covariates, and assumes a common unstructured covariance matrix in both groups. The chosen analysis model is an ANCOVA which adjusts for the baseline HAMD17 value.
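
To make the covariate structure concrete, the sketch below builds the fixed-effects design matrix implied by the formula `CHANGE ~ 1 + BASVAL * VISIT + THERAPY * VISIT` using base R's `model.matrix()`. The data frame `toy` holds made-up values with only two visits and is purely illustrative; this shows which columns the formula generates, not how `rbmi` fits the model internally.

```r
# Toy data with hypothetical values, using the vignette's column names
toy <- data.frame(
    PATIENT = factor(rep(1:4, each = 2)),
    VISIT   = factor(rep(c("4", "5"), times = 4)),
    THERAPY = factor(
        rep(c("PLACEBO", "DRUG"), each = 4),
        levels = c("PLACEBO", "DRUG")
    ),
    BASVAL  = rep(c(20, 25, 18, 22), each = 2),
    CHANGE  = c(-2, -4, -1, -3, -5, -8, -4, -9)
)

# Fixed-effects design matrix: intercept, main effects, and the
# baseline-by-visit and treatment-by-visit interaction columns
X <- model.matrix(CHANGE ~ 1 + BASVAL * VISIT + THERAPY * VISIT, data = toy)
print(colnames(X))
```

With two visit levels and two treatment levels this gives six columns: the intercept, `BASVAL`, one visit indicator, one treatment indicator, and one column for each of the two interactions.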

`rbmi` expects its input dataset to be complete; that is that there must be 1 row
per patient per visit. Missing outcome values should be coded as `NA`, while missing values in the covariates are not allowed. If your dataset is incomplete then the `expand_locf()` helper function can be used to add in any missing rows, using LOCF imputation to impute the covariate values. In our dataset the rows corresponding to missing outcomes are missing, so we use the `expand_locf()` function
per patient per visit. Missing outcome values should be coded as `NA`, while missing values in the covariates are not allowed. If your dataset is incomplete then the `expand_locf()` helper function can be used to add in any missing rows, using LOCF imputation to impute the covariate values. In our dataset the rows corresponding to missing outcomes are not present; to address this we use the `expand_locf()` function
as follows:

```{r}
@@ -53,7 +53,7 @@ dat <- expand_locf(
dat,
PATIENT = levels(dat$PATIENT), # expand by PATIENT and VISIT
VISIT = levels(dat$VISIT),
vars = c("basval", "THERAPY"), # fill with LOCF basval and THERAPY
vars = c("BASVAL", "THERAPY"), # fill with LOCF BASVAL and THERAPY
group = c("PATIENT"),
order = c("PATIENT", "VISIT")
)
@@ -68,7 +68,7 @@ function include:
- `data` the primary longitudinal data.frame containing the outcome variable and all covariates
- `data_ice` a data.frame specifying which visit (if any) the patient's intercurrent
event (ICE) occurred on, or more precisely the first visit in which the outcome has been affected by the ICE. If the patient had multiple ICEs this should
specify the first visit affected by the ICE addressed with a non-MAR imputation strategy. It also
specify the first visit affected by the ICE which is to be imputed under a non-MAR strategy. It also
specifies which reference based imputation strategy we want to use.
- `method` specifies what method we want to use to fit our imputation models as well as what
method we want to use to generate our imputed values.
@@ -77,22 +77,22 @@ In our example the patients ICE visit is the
first visit in which a missing value has occurred. We assume that all patients will be
imputed under the Jump To Reference (JR) strategy. We will create 150 imputation models using Bayesian
methods to sample the model coefficients from their posterior distributions for a model
of `change ~ 1 + basval * VISIT + THERAPY * VISIT`.
of `CHANGE ~ 1 + BASVAL * VISIT + THERAPY * VISIT`.

```{r}
# create data_ice setting the imputation method to JR for
# each patient with at least one missing value
dat_ice <- dat %>%
arrange(PATIENT, VISIT) %>%
filter(is.na(change)) %>%
filter(is.na(CHANGE)) %>%
group_by(PATIENT) %>%
slice(1) %>%
ungroup() %>%
select(PATIENT, VISIT) %>%
mutate(strategy = "JR")
# The patient with id 3618 is the unique one that has an intermittent missing values.
# Actually he does not stop the treatment -> remove from data_ice
# The patient with id 3618 is the only one with an intermittent missing value ->
# remove from data_ice since this patient does not experience any ICE.
# (it will be automatically imputed under MAR assumption)
dat_ice <- dat_ice[-which(dat_ice$PATIENT == 3618),]
@@ -101,18 +101,18 @@ dat_ice
# Define the names of key variables in our dataset using `set_vars()`
# Note that covariates argument can contain interactions
vars <- set_vars(
outcome = "change",
outcome = "CHANGE",
visit = "VISIT",
subjid = "PATIENT",
group = "THERAPY",
covariates = c("basval*VISIT", "THERAPY*VISIT")
covariates = c("BASVAL*VISIT", "THERAPY*VISIT")
)
# Define what method we want to use e.g. here we specify we
# want to use Baysian methods to create 150 samples
# want to use Bayesian methods to create 150 samples
method <- method_bayes(
burn_in = 200,
burn_between = 10,
burn_between = 5,
n_samples = 150,
verbose = FALSE
)
@@ -167,7 +167,7 @@ imputeObj
```

In this instance we are specifying that group `PLACEBO` should use itself as its reference group and that group `DRUG` should
use group `PLACEBO` as its reference group (as standard for imputation using reference-based methods).
use the group `PLACEBO` as its reference group (as is standard for imputation using reference-based methods).
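
The group-to-reference mapping described here can be pictured as a named character vector, where each name is a treatment group and each value is the group used as its reference. The sketch below is base R only and assumes that form; the `impute()` call itself is not shown in this hunk, and `patient_group` is a made-up vector for illustration.

```r
# Assumed form of the mapping: names are groups, values are their references
references <- c("DRUG" = "PLACEBO", "PLACEBO" = "PLACEBO")

# Hypothetical treatment groups for three patients
patient_group <- c("DRUG", "PLACEBO", "DRUG")

# Look up the reference group that applies to each patient
ref_for_patient <- unname(references[patient_group])
print(ref_for_patient)
```

Every patient resolves to `PLACEBO`, which is the behaviour the paragraph describes: the placebo arm references itself and the active arm references placebo.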

Generally speaking, there is no need to see or directly interact with any of the imputed
datasets. However if you do wish to inspect them they can be extracted from the imputation
@@ -200,10 +200,10 @@ anaObj <- analyse(
ancova,
vars = set_vars(
subjid = "PATIENT",
outcome = "change",
outcome = "CHANGE",
visit = "VISIT",
group = "THERAPY",
covariates = c("basval")
covariates = c("BASVAL")
)
)
anaObj
@@ -244,10 +244,10 @@ anaObj_delta <- analyse(
delta = delta_df,
vars = set_vars(
subjid = "PATIENT",
outcome = "change",
outcome = "CHANGE",
visit = "VISIT",
group = "THERAPY",
covariates = c("basval")
covariates = c("BASVAL")
)
)
```
