forked from IBM/Lale.jl

a Julia wrapper of Python's lale automl package

Lale.jl: Julia wrapper of Python's Lale package

Lale.jl is a Julia wrapper of Python's Lale library for semi-automated data science. Lale makes it easy to automatically select algorithms and tune hyperparameters of pipelines that are compatible with scikit-learn, in a type-safe fashion.

Instructions for Lale developers can be found here.

For a quick demo, see the Lale Notebook Demo, which can also be viewed with NBViewer.

Package Features

  • automation: provides a consistent high-level interface to existing pipeline search tools including Hyperopt, GridSearchCV, and SMAC
  • correctness checks: uses JSON Schema to catch mismatches between hyperparameters and their types, or between data and operators
  • interoperability: supports a growing library of transformers and estimators
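
To make the idea of a schema-based correctness check concrete, here is a toy sketch in plain Julia. It does not use JSON Schema or any Lale API: the names pca_schema and check_hyperparams, and the schema layout itself, are assumptions made up for this illustration. It only shows the kind of mistake (a wrong type or an out-of-range value) that such a check can catch before a pipeline is trained.

```julia
# Hypothetical hand-written schema: name => (expected type, extra constraint).
# Lale relies on real JSON Schema documents; this only mimics the idea.
pca_schema = Dict(
    "n_components" => (Int,  x -> x >= 1),
    "whiten"       => (Bool, x -> true),
)

# Return a list of human-readable problems; an empty list means the
# hyperparameter settings are consistent with the schema.
function check_hyperparams(schema, params)
    problems = String[]
    for (name, value) in params
        if !haskey(schema, name)
            push!(problems, "unknown hyperparameter: $name")
            continue
        end
        T, ok = schema[name]
        if !(value isa T)
            push!(problems, "$name should be $T, got $(typeof(value))")
        elseif !ok(value)
            push!(problems, "$name = $value violates its constraint")
        end
    end
    return problems
end

good = check_hyperparams(pca_schema, Dict("n_components" => 2, "whiten" => false))
bad  = check_hyperparams(pca_schema, Dict("n_components" => "two"))
```

Here good is empty, while bad holds one message reporting that n_components has the wrong type.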

Here is an example of a typical Lale pipeline using the following processing elements: Principal Component Analysis (PCA), NoOp (no operation), Random Forest Regression (RFR), and Decision Tree Regression (DTree):

lalepipe  = (PCA + NoOp) >> (RFR | DTree)
laleopt   = LalePipeOptimizer(lalepipe,max_evals = 10,cv = 3)
laletr    = fit!(laleopt, Xtrain,Ytrain)
pred      = transform!(laletr,Xtest)

The code above jointly searches for the optimal hyperparameters of the Random Forest and Decision Tree learners and selects the better of the two, while simultaneously searching for the optimal hyperparameters of the PCA.

The pipe combinator, p1 >> p2, first runs sub-pipeline p1 and then pipes its output into sub-pipeline p2. The union combinator, p1 + p2, runs sub-pipelines p1 and p2 separately over the same data and then concatenates their output columns. The or combinator, p1 | p2, creates an algorithmic choice: the optimizer searches over both p1 and p2 and selects whichever yields better results.
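
The semantics of the three combinators can be illustrated with a small stand-alone model in plain Julia. This is a sketch under stated assumptions, not Lale.jl's implementation: the types ToyOp, Leaf, Pipe, Concat, and Choice and the functions apply and expand are all hypothetical, and the "operators" are simple matrix functions rather than learners.

```julia
# Toy model of the combinators; every name here is hypothetical.
abstract type ToyOp end

struct Leaf <: ToyOp      # wraps a function from matrix to matrix
    name::String
    f::Function
end
struct Pipe <: ToyOp      # p1 >> p2
    a::ToyOp
    b::ToyOp
end
struct Concat <: ToyOp    # p1 + p2
    a::ToyOp
    b::ToyOp
end
struct Choice <: ToyOp    # p1 | p2
    a::ToyOp
    b::ToyOp
end

# Extend the Base operators for the toy types only.
Base.:>>(a::ToyOp, b::ToyOp) = Pipe(a, b)
Base.:+(a::ToyOp, b::ToyOp)  = Concat(a, b)
Base.:|(a::ToyOp, b::ToyOp)  = Choice(a, b)

# Run a choice-free pipeline on data X.
apply(op::Leaf, X)   = op.f(X)
apply(op::Pipe, X)   = apply(op.b, apply(op.a, X))            # feed a's output to b
apply(op::Concat, X) = hcat(apply(op.a, X), apply(op.b, X))   # same input, columns joined
apply(op::Choice, X) = error("resolve the choice with expand first")

# Enumerate the concrete pipelines an optimizer would have to choose among.
expand(op::Leaf)   = ToyOp[op]
expand(op::Choice) = vcat(expand(op.a), expand(op.b))
function expand(op::Union{Pipe,Concat})
    out = ToyOp[]
    for a in expand(op.a), b in expand(op.b)
        push!(out, typeof(op)(a, b))
    end
    return out
end

double = Leaf("double", X -> 2 .* X)
ident  = Leaf("ident",  X -> X)
rowsum = Leaf("rowsum", X -> sum(X, dims = 2))
rowmax = Leaf("rowmax", X -> maximum(X, dims = 2))

pipe  = (double + ident) >> (rowsum | rowmax)
X     = [1 2; 3 4]
cands = expand(pipe)                      # one candidate per branch of the choice
outs  = [vec(apply(c, X)) for c in cands]
```

With this input, outs is [[9, 21], [4, 8]]: one result for the rowsum branch and one for the rowmax branch. In Julia, >> binds more tightly than + and |, which is why both sides of >> are parenthesized here.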

Installation

Lale.jl is in the Julia General package registry. The latest release can be installed from the Julia prompt:

julia> using Pkg
julia> Pkg.update()
julia> Pkg.add("Lale")

Alternatively, use Julia's Pkg REPL mode, which is entered by typing ] at the julia> prompt:

julia> ]
pkg> update
pkg> add Lale

Sample Lale Workflow

using Lale

using DataFrames
using AutoMLPipeline: Utils

# load data
iris = getiris()
Xreg = iris[:,1:3] |> DataFrame
Yreg = iris[:,4]   |> Vector
Xcl  = iris[:,1:4] |> DataFrame
Ycl  = iris[:,5]   |> Vector

# lale ops
pca     = laleoperator("PCA")
rb      = laleoperator("RobustScaler")
noop    = laleoperator("NoOp","lale")
rfr     = laleoperator("RandomForestRegressor")
rfc     = laleoperator("RandomForestClassifier")
treereg = laleoperator("DecisionTreeRegressor")

# Lale regression
lalepipe  = (pca + noop) >>  (rfr | treereg )
lale_hopt = LalePipeOptimizer(lalepipe,max_evals = 10,cv = 3)
laletrain = fit(lale_hopt,Xreg,Yreg)
lalepred  = transform(laletrain,Xreg)
lalermse  = score(:rmse,lalepred,Yreg)

# Lale classification
lalepipe  = (rb + pca) >> rfc   # >> is the pipe combinator described above
lale_hopt = LalePipeOptimizer(lalepipe,max_evals = 10,cv = 3)
laletrain = fit(lale_hopt,Xcl,Ycl)
lalepred  = transform(laletrain,Xcl)
laleacc   = score(:accuracy,lalepred,Ycl)
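
The workflow above scores predictions on the same rows that were used for fitting. A holdout split gives a less optimistic estimate. The following is a minimal sketch in plain Julia using only the standard library; holdout_indices and its testprop keyword are hypothetical names for this illustration, and rmse and accuracy follow the usual textbook definitions, which may be scaled differently from what score(:rmse, ...) and score(:accuracy, ...) return.

```julia
using Random      # standard library: randperm, MersenneTwister
using Statistics  # standard library: mean

# Root-mean-square error between predictions and ground truth.
rmse(pred, truth) = sqrt(mean((pred .- truth) .^ 2))

# Fraction of exact matches, a value in [0, 1].
accuracy(pred, truth) = mean(pred .== truth)

# Shuffle 1:n and split it into (train, test) index vectors.
# Hypothetical helper, not part of Lale.jl.
function holdout_indices(n::Integer; testprop = 0.2, rng = Random.default_rng())
    perm  = randperm(rng, n)
    ntest = round(Int, testprop * n)
    return perm[ntest+1:end], perm[1:ntest]
end

# 150 is the number of rows in the iris data set.
train, test = holdout_indices(150; testprop = 0.2, rng = MersenneTwister(42))
```

Assuming Xreg is a DataFrame and Yreg a Vector as in the workflow, the split could then be applied as fit(lale_hopt, Xreg[train, :], Yreg[train]) followed by transform(laletrain, Xreg[test, :]), with the score computed against Yreg[test].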
