The xrayspecs package provides model-agnostic interpretations of
black-box models. The package is designed to integrate into the parsnip
package’s unified modelling framework. Currently the package supports
permutation importance plots, partial dependence plots and individual
conditional expectation plots.
You cannot currently install xrayspecs from CRAN. However, you can install the development version of xrayspecs from GitHub with:
# install.packages("devtools")
devtools::install_github("mt-edwards/xrayspecs")
library(xrayspecs)
library(tidyverse)
library(tidymodels)
library(ranger)
The bike data set is used for this example. This data set contains
features only of type double (<dbl>) and type factor (<fct>). This is
required so that the plot_dependence() function knows how to plot the
feature predictions: the predictions of double (continuous) features are
displayed with line plots and the predictions of factor (categorical)
features are displayed with bar plots.
data("bike")
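Before passing your own data to plot_dependence(), it can help to confirm that every column is a double or a factor. The following is a minimal sketch in base R; unsupported_columns() is a hypothetical helper written for illustration and is not part of xrayspecs, and the built-in iris data stands in for bike.

```r
# Hypothetical helper (not part of xrayspecs): return the names of columns
# that are neither double nor factor.
unsupported_columns <- function(data) {
  ok <- vapply(data, function(x) is.double(x) || is.factor(x), logical(1))
  names(data)[!ok]
}

# iris has four double columns and one factor column, so nothing is flagged.
unsupported_columns(iris)

# An integer column is flagged; it could be converted with as.double().
iris_int <- transform(iris, row_id = seq_len(nrow(iris)))
unsupported_columns(iris_int)
```

Any flagged columns can be converted with as.double() or as.factor() before plotting, depending on whether they are continuous or categorical.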
The bike data set is split into a training set and a test set with the
rsample package. The training set is used to train the predictive
models and the test set is used to test and interpret the predictive
models.
split <- initial_split(bike, prop = 4 / 5, strata = "bikes_rented")
bike_train <- training(split)
bike_test <- testing(split)
A random forest is trained on the bike_train data set using the parsnip
package. The parsnip package provides a unified framework for fitting
models in R. The models that are available for training in parsnip are
listed here.
The xrayspecs package is designed to integrate into the parsnip
package’s unified framework.
rf <- rand_forest(mode = "regression") %>%
  set_engine("ranger") %>%
  fit(bikes_rented ~ ., data = bike_train)
predict(rf, bike_test) %>%
  bind_cols(bike_test) %>%
  mae(truth = bikes_rented, estimate = .pred)
#> # A tibble: 1 x 3
#> .metric .estimator .estimate
#> <chr> <chr> <dbl>
#> 1 mae standard 461.
To display a permutation importance plot of the random forest features,
pipe the rf object into the plot_importance() function along with the
test data (bike_test), the target (bikes_rented) and a metric from the
yardstick package, e.g. mae (Mean Absolute Error). A feature's
importance is the absolute difference between the metric estimate
computed with that feature permuted in the data and the estimate
computed with the data left intact. Note: the metric must be appropriate
for the target variable.
rf %>%
  plot_importance(bike_test, bikes_rented, mae) +
  labs(title = "Permutation Importance Plot")
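The definition of feature importance given above can be sketched in base R. This is an illustration of the general technique, not the xrayspecs implementation: mae_vec() and permutation_importance() are hypothetical helpers, and a linear model on the built-in mtcars data stands in for the random forest and the bike data. For brevity the sketch scores the model on the data it was fitted to, whereas the example above uses a held-out test set.

```r
set.seed(1)

# Mean absolute error between observed and predicted values.
mae_vec <- function(truth, estimate) mean(abs(truth - estimate))

# Hypothetical helper: importance of one feature as the absolute change in
# MAE after shuffling that feature's column.
permutation_importance <- function(model, data, target, feature) {
  baseline <- mae_vec(data[[target]], predict(model, data))
  permuted <- data
  permuted[[feature]] <- sample(permuted[[feature]])
  abs(mae_vec(data[[target]], predict(model, permuted)) - baseline)
}

# Stand-in model and data (assumptions for this sketch).
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)
features <- c("wt", "hp", "qsec")

importance <- vapply(
  features,
  function(f) permutation_importance(fit, mtcars, "mpg", f),
  numeric(1)
)
sort(importance, decreasing = TRUE)
```

Because the permutation is random, the values change with the seed; averaging over several permutations is a common way to stabilise them.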
To display a partial dependence plot of a feature for the random forest
model, pipe the rf object into the plot_dependence() function along with
the data (bike_test) and the feature. Here the partial dependence plots
of temperature, humidity, wind_speed and season are displayed.
rf %>%
  plot_dependence(bike_test, temperature) +
  labs(title = "Partial Dependence Plot") +
  labs(x = "Temperature")
rf %>%
  plot_dependence(bike_test, humidity) +
  labs(title = "Partial Dependence Plot") +
  labs(x = "Humidity")
rf %>%
  plot_dependence(bike_test, wind_speed) +
  labs(title = "Partial Dependence Plot") +
  labs(x = "Wind Speed")
rf %>%
  plot_dependence(bike_test, season) +
  labs(title = "Partial Dependence Plot") +
  labs(x = "Season")
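For readers who want to see what a partial dependence curve measures, here is a minimal base R sketch of the usual computation: fix the feature at each value of a grid for every row, predict, and average. It is not the xrayspecs implementation; partial_dependence() is a hypothetical helper, and a linear model on mtcars is an assumed stand-in for the random forest and the bike data.

```r
# Hypothetical helper: average prediction over the data with `feature`
# fixed at each grid value in turn.
partial_dependence <- function(model, data, feature, grid) {
  vapply(grid, function(value) {
    modified <- data
    modified[[feature]] <- value  # same value for every row
    mean(predict(model, modified))
  }, numeric(1))
}

# Stand-in model and data (assumptions for this sketch).
fit <- lm(mpg ~ wt + hp, data = mtcars)
grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 5)

pd <- data.frame(wt = grid, mpg = partial_dependence(fit, mtcars, "wt", grid))
pd
```

For a factor feature the grid would be the factor's levels rather than a numeric sequence, which is why those results suit a bar plot.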
To display an individual conditional expectation plot of a feature for
the random forest model, pipe the rf object into the plot_dependence()
function along with the data (bike_test) and the feature, with the
optional argument examples set to TRUE. Additionally setting center to
TRUE produces a centered individual conditional expectation plot.
Individual conditional expectation plots are not available for features
of type factor.
rf %>%
  plot_dependence(bike_test, days_since_2011) +
  labs(title = "Partial Dependence Plot") +
  labs(x = "Days Since 2011")
rf %>%
  plot_dependence(bike_test, days_since_2011, examples = TRUE) +
  labs(title = "Individual Conditional Expectation Plot") +
  labs(x = "Days Since 2011")
rf %>%
  plot_dependence(bike_test, days_since_2011, examples = TRUE, center = TRUE) +
  labs(title = "Centered Individual Conditional Expectation Plot") +
  labs(x = "Days Since 2011")
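The relationship between the three plots above can be sketched in base R. Each individual conditional expectation curve follows one observation's prediction as the feature varies, and averaging the curves gives the partial dependence. This is an illustration only: ice_curves() is a hypothetical helper, the linear model on mtcars is a stand-in, and centering each curve at the first grid value is an assumed convention that may differ from what center = TRUE does in xrayspecs.

```r
# Hypothetical helper: matrix with one row per observation and one column
# per grid value.
ice_curves <- function(model, data, feature, grid) {
  sapply(grid, function(value) {
    modified <- data
    modified[[feature]] <- value
    predict(model, modified)
  })
}

# Stand-in model with an interaction so the curves are not parallel.
fit <- lm(mpg ~ wt * hp, data = mtcars)
grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 5)

ice <- ice_curves(fit, mtcars, "wt", grid)

# Averaging the curves column by column recovers the partial dependence.
pd <- colMeans(ice)

# Centering (assumed convention): subtract each curve's value at the first
# grid point, so every curve starts at zero.
centered <- ice - ice[, 1]
```

Centering makes it easier to compare the shapes of the curves, since differences in each observation's baseline prediction are removed.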