% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/unify_ranger_surv.R
\name{ranger_surv.unify}
\alias{ranger_surv.unify}
\title{Unify ranger survival model}
\usage{
ranger_surv.unify(
rf_model,
data,
type = c("risk", "survival", "chf"),
times = NULL
)
}
\arguments{
\item{rf_model}{An object of class \code{ranger}. Models built on data with categorical features
are currently not supported; please encode them before training.}
\item{data}{Reference dataset. A \code{data.frame} or \code{matrix} with the same columns as the training set of the model. Usually the dataset used to train the model.}
\item{type}{A character string defining the type of model prediction to use. Either \code{"risk"} (default), which uses the risk score calculated as a sum of cumulative hazard function values, \code{"survival"}, which uses the survival probability at certain time points for each observation, or \code{"chf"}, which uses the cumulative hazard values at certain time points for each observation.}
\item{times}{A numeric vector of unique death times at which the prediction should be evaluated. By default, \code{unique.death.times} from the model are used.}
}
\value{
For \code{type = "risk"}, a unified model representation is returned: a \code{\link{model_unified.object}} object. For \code{type = "survival"} or \code{type = "chf"}, a \code{\link{model_unified_multioutput.object}} object is returned, which is a list containing a unified model representation (\code{\link{model_unified.object}} object) for each time point. In this case, the list names are the time points at which the prediction was evaluated.
}
\description{
Convert your ranger survival model into a standardized representation.
The returned representation is easy to interpret and ready to be passed as an argument to the \code{treeshap()} function.
}
\details{
The survival forest implemented in the \code{ranger} package stores cumulative hazard
functions (CHFs) in the leaves of survival trees, as proposed for Random Survival Forests
(Ishwaran et al. 2008). The final model prediction is made by averaging these CHFs
from all the trees. To provide explanations in the form of a survival function,
the CHFs from the leaves are converted into survival functions (SFs) using
the formula SF(t) = exp(-CHF(t)).
However, it is important to note that averaging these SFs does not yield the correct
model prediction: the model prediction is obtained by averaging the CHFs first and only
then applying the same transformation, i.e. exp(-mean(CHF(t))) rather than mean(exp(-CHF(t))).
Therefore, explanations based on the survival function are only proxies and
may not be fully consistent with the model predictions
obtained using, for example, the \code{predict} function.
}
\examples{
library(ranger)
# Colon cancer data, restricted to death events (etype == 2), complete cases only
data_colon <- data.table::data.table(survival::colon)
data_colon <- na.omit(data_colon[get("etype") == 2, ])
surv_cols <- c("status", "time", "rx")

feature_cols <- colnames(data_colon)[3:(ncol(data_colon) - 1)]

# Encode the features as a numeric matrix (factors become dummy columns)
train_x <- model.matrix(
  ~ -1 + .,
  data_colon[, .SD, .SDcols = setdiff(feature_cols, surv_cols[1:2])]
)
# Right-censored survival response
train_y <- survival::Surv(
  event = (data_colon[, get("status")] |>
             as.character() |>
             as.integer()),
  time = data_colon[, get("time")],
  type = "right"
)

rf <- ranger::ranger(
  x = train_x,
  y = train_y,
  data = data_colon,
  max.depth = 10,
  num.trees = 10
)
# risk score: a single unified model
unified_model_risk <- ranger_surv.unify(rf, train_x, type = "risk")
shaps <- treeshap(unified_model_risk, train_x[1:2, ])

# compute shaps for 3 selected time points
unified_model_surv <- ranger_surv.unify(rf, train_x, type = "survival", times = c(23, 50, 73))
shaps_surv <- treeshap(unified_model_surv, train_x[1:2, ])
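
# Illustrative sketch (toy numbers, not taken from the fitted model above) of
# why survival-function explanations are only proxies, as noted in Details:
# averaging per-tree survival functions differs from transforming the
# averaged CHF. The matrix below is a made-up assumption: rows are
# hypothetical trees, columns are hypothetical time points.
chf_leaves <- rbind(
  tree1 = c(0.1, 0.2, 0.4),
  tree2 = c(0.3, 1.0, 1.5)
)
# Model-style prediction: average the CHFs first, then SF(t) = exp(-CHF(t))
sf_from_mean_chf <- exp(-colMeans(chf_leaves))
# Proxy: convert each tree's CHF to an SF first, then average
mean_of_sf <- colMeans(exp(-chf_leaves))
# The gap between the two is positive whenever the trees disagree
# (Jensen's inequality for the convex function exp(-x))
mean_of_sf - sf_from_mean_chf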
}
\seealso{
\code{\link{ranger.unify}} for regression and classification \code{\link[ranger:ranger]{ranger models}}

\code{\link{lightgbm.unify}} for \code{\link[lightgbm:lightgbm]{LightGBM models}}

\code{\link{gbm.unify}} for \code{\link[gbm:gbm]{GBM models}}

\code{\link{xgboost.unify}} for \code{\link[xgboost:xgboost]{XGBoost models}}

\code{\link{randomForest.unify}} for \code{\link[randomForest:randomForest]{randomForest models}}
}