mlr3learners.lightgbm brings the LightGBM gradient booster to the mlr3 framework by using the official lightgbm R implementation.
- integrated native CV before the actual model training to find the optimal num_iterations for the given training data and parameter set
- GPU support
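The idea behind the first feature can be sketched with the lightgbm R package alone. This is a minimal illustration and not the package's internal implementation: it assumes lightgbm is installed and runs `lgb.cv` with early stopping on iris to read off the best iteration count. The fold count and parameter values are chosen here purely for illustration.

```r
# Sketch only: uses the lightgbm R package directly, not mlr3learners.lightgbm.
if (requireNamespace("lightgbm", quietly = TRUE)) {
  # lightgbm expects 0-based integer class labels for multiclass objectives
  features = as.matrix(iris[, 1:4])
  labels = as.integer(iris$Species) - 1L
  dtrain = lightgbm::lgb.Dataset(data = features, label = labels)

  params = list(
    objective = "multiclass",
    num_class = 3L,
    metric = "multi_logloss",
    learning_rate = 0.1
  )

  # 5-fold CV with early stopping; nrounds is the upper bound on iterations
  cv = lightgbm::lgb.cv(
    params = params,
    data = dtrain,
    nrounds = 100L,
    nfold = 5L,
    early_stopping_rounds = 10L
  )

  # Iteration with the best mean validation score across the folds
  best_iter = cv$best_iter
  print(best_iter)
}
```

A value obtained this way could then be used as num_iterations when training the final model on the full training data.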
Before you can install the mlr3learners.lightgbm package, you need to install the lightgbm R package according to its documentation (at the time of writing, this is necessary since lightgbm is neither on CRAN nor installable via devtools::install_github).
git clone --recursive https://github.com/microsoft/LightGBM
cd LightGBM && \
Rscript build_r.R
Once the lightgbm R package is installed, you can install the mlr3learners.lightgbm R package:
install.packages("devtools")
devtools::install_github("mlr3learners/mlr3learners.lightgbm")
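After both installation steps, it can be useful to confirm that the packages are actually available before moving on. The following check is a small sketch using base R only; the list of package names is taken from this README.

```r
# Packages this README relies on
required = c("lightgbm", "mlr3", "mlr3learners.lightgbm")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise
installed = vapply(required, requireNamespace, logical(1), quietly = TRUE)
print(installed)

if (!all(installed)) {
  message("Missing packages: ", paste(required[!installed], collapse = ", "))
}
```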
library(mlr3)
task = mlr3::tsk("iris")
learner = mlr3::lrn("classif.lightgbm", objective = "multiclass")
learner$param_set$values = mlr3misc::insert_named(
  learner$param_set$values,
  list(
    "early_stopping_round" = 10,
    "learning_rate" = 0.1,
    "seed" = 17L,
    "metric" = "multi_logloss",
    "num_iterations" = 100,
    "num_class" = 3
  )
)
learner$train(task, row_ids = 1:120)
predictions = learner$predict(task, row_ids = 121:150)
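Note that the iris data are sorted by species, so the fixed split above holds out only virginica rows. As a hedged alternative, the sketch below draws a stratified 80/20 split in base R. It assumes that the row ids of the mlr3 iris task match the row order of datasets::iris, and the names train_ids and test_ids are introduced here for illustration.

```r
# iris is sorted by species: rows 1-50 setosa, 51-100 versicolor, 101-150 virginica.
# A 1:120 / 121:150 split therefore holds out only virginica rows.
print(table(iris$Species[121:150]))

set.seed(17)  # same seed value as in the example above, for reproducibility

# Draw 80% of the row ids within each species (stratified sampling)
ids_by_species = split(seq_len(nrow(iris)), iris$Species)
train_ids = sort(unlist(
  lapply(ids_by_species, function(ids) sample(ids, size = round(0.8 * length(ids)))),
  use.names = FALSE
))
test_ids = setdiff(seq_len(nrow(iris)), train_ids)

# Each species should now be represented equally in the hold-out set
print(table(iris$Species[test_ids]))
```

These vectors could then replace the fixed ranges, e.g. learner$train(task, row_ids = train_ids) and learner$predict(task, row_ids = test_ids).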
For further information and examples, please view the mlr3learners.lightgbm package vignettes and the mlr3book.
The mlr3learners.lightgbm package can also be used with the GPU-compiled version of lightgbm.
To install the lightgbm R package with GPU support, execute the following commands (see the lightgbm manual):
git clone --recursive --branch stable --depth 1 https://github.com/microsoft/LightGBM
cd LightGBM && \
sed -i -e 's/use_gpu = FALSE/use_gpu = TRUE/g' R-package/src/install.libs.R && \
Rscript build_r.R
In order to use the GPU acceleration, the parameter device_type = "gpu" (default: "cpu") needs to be set. According to the LightGBM parameter manual, 'it is recommended to use the smaller max_bin (e.g. 63) to get the better speed up'.
learner$param_set$values = mlr3misc::insert_named(
  learner$param_set$values,
  list(
    "objective" = "multiclass",
    "device_type" = "gpu",
    "max_bin" = 63L,
    "early_stopping_round" = 10,
    "learning_rate" = 0.1,
    "seed" = 17L,
    "metric" = "multi_logloss",
    "num_iterations" = 100,
    "num_class" = 3
  )
)
All other steps are similar to the workflow without GPU support.
GPU support has been tested in a Docker container (using nvidia-docker) running on a Linux 19.10 host with an Intel i7, 16 GB RAM, an NVIDIA(R) RTX 2060 and CUDA(R) 10.2.
- Microsoft's LightGBM: https://lightgbm.readthedocs.io/en/latest/
- mlr3: https://github.com/mlr-org/mlr3