mlr3learners.lightgbm


mlr3learners.lightgbm brings the LightGBM gradient boosting framework to mlr3 by wrapping the official lightgbm R implementation.

Features

  • integrated native cross-validation (CV) before the actual model training to find the optimal num_iterations for the given training data and parameter set
  • GPU support

Installation

Before you can install the mlr3learners.lightgbm package, you need to install the lightgbm R package according to its documentation (this is necessary since lightgbm is neither on CRAN nor installable via devtools::install_github()):

git clone --recursive https://github.com/microsoft/LightGBM
cd LightGBM && \
Rscript build_r.R
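To check that the build succeeded, you can load the package from a fresh R session (a minimal sanity check, not part of the official build instructions):

```r
# Sanity check: lightgbm should load and report its version after the build.
library(lightgbm)
packageVersion("lightgbm")
```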

Once the lightgbm R package is installed, you can continue and install the mlr3learners.lightgbm R package:

install.packages("devtools")
devtools::install_github("mlr3learners/mlr3learners.lightgbm")

Example

library(mlr3)
task = mlr3::tsk("iris")
learner = mlr3::lrn("classif.lightgbm", objective = "multiclass")

learner$param_set$values = mlr3misc::insert_named(
  learner$param_set$values,
  list(
    "early_stopping_round" = 10,
    "learning_rate" = 0.1,
    "seed" = 17L,
    "metric" = "multi_logloss",
    "num_iterations" = 100,
    "num_class" = 3
  )
)

learner$train(task, row_ids = 1:120)
predictions = learner$predict(task, row_ids = 121:150)
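The returned PredictionClassif object can be scored with any mlr3 measure; a short sketch continuing the example above (the measure key "classif.acc" is standard mlr3, not specific to this package):

```r
# Score the held-out predictions and inspect the confusion matrix.
# `predictions` is the PredictionClassif object from learner$predict() above.
predictions$score(mlr3::msr("classif.acc"))  # classification accuracy on rows 121:150
predictions$confusion                        # truth vs. response table
```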

For further information and examples, please view the mlr3learners.lightgbm package vignettes and the mlr3book.

GPU acceleration

The mlr3learners.lightgbm package can also be used with lightgbm's GPU-compiled version.

To install the lightgbm R package with GPU support, execute the following commands (lightgbm manual):

git clone --recursive --branch stable --depth 1 https://github.com/microsoft/LightGBM
cd LightGBM && \
sed -i -e 's/use_gpu = FALSE/use_gpu = TRUE/g' R-package/src/install.libs.R && \
Rscript build_r.R

To use GPU acceleration, the parameter device_type = "gpu" (default: "cpu") needs to be set. According to the LightGBM parameter manual, 'it is recommended to use the smaller max_bin (e.g. 63) to get the better speed up'.

learner$param_set$values = mlr3misc::insert_named(
  learner$param_set$values,
  list(
    "objective" = "multiclass",
    "device_type" = "gpu",
    "max_bin" = 63L,
    "early_stopping_round" = 10,
    "learning_rate" = 0.1,
    "seed" = 17L,
    "metric" = "multi_logloss",
    "num_iterations" = 100,
    "num_class" = 3
  )
)

All other steps are identical to the workflow without GPU support.

The GPU support has been tested in a Docker container running on a Linux 19.10 host with an Intel i7 CPU, 16 GB RAM, an NVIDIA(R) RTX 2060, CUDA(R) 10.2, and nvidia-docker.
