sklearn_classifier

Training Functions

sklearn-classifier

Run any scikit-learn compatible classifier or list of classifiers

steps

  1. generate a scikit-learn model configuration using the model_pkg_class parameter
    • input a package and class name, for example, sklearn.linear_model.LogisticRegression
    • mlrun will find the class and instantiate it using default parameters
    • both the model class instantiator and the fit method can be modified (other functions could be modified similarly)
  2. get a sample of data from a data source
    • select all rows using -1
    • select a random sample of rows using a negative integer other than -1
    • select consecutive rows using a positive integer
  3. split the data into train, validation, and test sets
    • the test set is saved as an artifact and never seen again until testing
    • WIP: this will be parametrized to produce cross-validator splits (one way of performing CV)
  4. train the model
  5. pickle / serialize the model
    • models can be pickled or saved as JSON
  6. evaluate the model
    • a custom evaluator can be provided, see function doc for details
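
The steps above can be sketched in plain Python. This is a minimal illustration, not mlrun's own code: load_class, select_rows and run_pipeline are hypothetical names, the iris dataset and accuracy score stand in for the real data source and evaluator, and the 60/20/20 split is an assumption. It also assumes that a negative sample value other than -1 requests that many randomly chosen rows.

```python
import importlib
import pickle
import random


def load_class(pkg_class):
    """Resolve a dotted "package.module.ClassName" string to the class object."""
    module_name, _, class_name = pkg_class.rpartition(".")
    if not module_name:
        raise ValueError(f"expected 'package.ClassName', got {pkg_class!r}")
    return getattr(importlib.import_module(module_name), class_name)


def select_rows(rows, sample, seed=0):
    """Apply the step-2 sampling convention to a sequence of rows."""
    rows = list(rows)
    if sample == -1:
        return rows  # -1: all rows
    if sample < -1:
        # Other negative integers: random sample of abs(sample) rows (assumed meaning).
        return random.Random(seed).sample(rows, -sample)
    if sample == 0:
        raise ValueError("sample must be a non-zero integer")
    return rows[:sample]  # positive: the first `sample` consecutive rows


def run_pipeline(pkg_class, sample=-1, seed=0):
    """Walk through steps 1-6 on the iris dataset; needs scikit-learn installed."""
    # Imported lazily so the helpers above work without scikit-learn.
    from sklearn.datasets import load_iris
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import train_test_split

    model = load_class(pkg_class)()  # step 1: instantiate with default parameters

    X, y = load_iris(return_X_y=True)
    picked = select_rows(range(len(X)), sample, seed)  # step 2: choose row indices
    X, y = X[picked], y[picked]

    # Step 3: hold out the test set first, then split the rest into train/validation.
    X_rest, X_test, y_rest, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    X_train, X_valid, y_train, y_valid = train_test_split(
        X_rest, y_rest, test_size=0.25, random_state=seed
    )

    model.fit(X_train, y_train)  # step 4: train
    restored = pickle.loads(pickle.dumps(model))  # step 5: pickle round trip

    # Step 6: a simple evaluator, accuracy on the validation and test sets.
    return {
        "validation_accuracy": accuracy_score(y_valid, restored.predict(X_valid)),
        "test_accuracy": accuracy_score(y_test, restored.predict(X_test)),
    }
```

With scikit-learn installed, run_pipeline("sklearn.linear_model.LogisticRegression", sample=-120) would exercise all six steps on a random sample of 120 rows. A small positive sample is a poor fit for this particular dataset, because iris rows are ordered by class and the first rows all share one label.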