Here you can find practical examples of how to use ZenML integrations, with a brief description of each. If any questions come up while you work through the examples, feel free to check our docs.
The examples span the whole range of tools and concepts that are integral to the MLOps ecosystem.
Certain phases of machine learning projects require a large amount of experimentation with many possible approaches. Experiment tracking is vital to capture and compare all your experiments so that you can narrow down your solution space.
- mlflow_tracking: Track and visualize experiment runs with MLflow Tracking.
- wandb_tracking: Track and visualize experiment runs with Wandb Experiment Tracking.
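The core idea behind any experiment tracker can be sketched in a few lines of plain Python (a conceptual sketch with made-up names, not the MLflow or W&B API): every run records its parameters and metrics, and the runs are compared afterwards to narrow down the solution space.

```python
# Conceptual sketch of experiment tracking (hypothetical names, not the
# MLflow/W&B API): each run logs its hyperparameters and metrics, and the
# collected runs are compared to pick the most promising configuration.

runs = []

def track_run(params, metrics):
    """Record one experiment run with its parameters and resulting metrics."""
    runs.append({"params": params, "metrics": metrics})

# Log a few runs with different learning rates.
track_run({"lr": 0.1}, {"accuracy": 0.81})
track_run({"lr": 0.01}, {"accuracy": 0.87})
track_run({"lr": 0.001}, {"accuracy": 0.84})

# Compare runs to narrow down the solution space.
best = max(runs, key=lambda r: r["metrics"]["accuracy"])
print(best["params"])  # → {'lr': 0.01}
```

The real trackers add persistence, UI dashboards, and artifact storage on top of this pattern, but the log-then-compare loop is the same.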
Quick iteration is usually easiest when you code on your local machine. But there comes a point where you will want your pipelines to run free of the limitations of your local setup (performance, data access, uptime, etc.). With ZenML you can quickly switch out the orchestrator for your pipeline code using the CLI. Here are some examples of how:
- airflow_orchestration: Running pipelines with Airflow locally.
- kubeflow_pipelines_orchestration: Orchestrate a pipeline with a local Kubeflow Pipelines stack.
What good are your models if no one gets to interact with them? ZenML offers some easy ways to quickly deploy your model.
- mlflow_deployment: Deploy your trained models to a local MLflow deployment service and run predictions against the endpoint.
- seldon_core_deployment: Take your model deployment to the next level with Seldon. This example gives you detailed instructions to help you deploy your model onto a Kubernetes cluster.
- kserve_deployment: Take your model deployment to the next level with KServe. This example gives you detailed instructions to help you deploy a PyTorch model onto a Kubernetes cluster with TorchServe.
Not all steps are created equal. While some steps need only a bit of computational power, the training step is usually a different beast altogether, with a big appetite for CUDA cores and VRAM. This is where step operators make your life easier: with just a bit of configuration, your training step can run on Vertex AI, SageMaker, or AzureML. Check out our example to see how.
- step_operator_remote_training: Run your compute-intensive steps on one of the big three cloud ML platforms: Vertex AI, SageMaker, or AzureML.
Some of our integrations don't really fit into a specific category.
- huggingface: Hugging Face is a startup in the Natural Language Processing (NLP) domain offering its library of SOTA models, in particular around Transformers. See how you can get started using Hugging Face datasets, models, and tokenizers with ZenML.
- neural_prophet: NeuralProphet is a time-series model that bridges the gap between traditional time-series models and deep learning methods. Try this example to find out how this type of model can be trained using ZenML.
- xgboost: XGBoost is an optimized distributed gradient boosting library that implements parallel tree boosting algorithms.
- lightgbm: LightGBM is a gradient boosting framework that uses tree-based learning algorithms with a focus on distributed, efficient training.
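Both XGBoost and LightGBM implement gradient boosting: each new tree is fit to the residual errors of the ensemble built so far. The mechanics can be illustrated with a toy version that uses decision stumps on 1-D data (a sketch of the general technique only, nothing like the libraries' highly optimized tree learners):

```python
# Toy gradient boosting for squared error on 1-D data (illustrative sketch,
# not the XGBoost/LightGBM implementation): each round fits a one-split
# decision stump to the residuals of the current ensemble.

def fit_stump(xs, residuals):
    """Find the threshold split minimizing squared error of two constant leaves."""
    best = None
    for split in xs:
        left = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        lval = sum(left) / len(left) if left else 0.0
        rval = sum(right) / len(right) if right else 0.0
        err = sum((r - lval) ** 2 for r in left) + sum((r - rval) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, split, lval, rval)
    _, split, lval, rval = best
    return lambda x: lval if x <= split else rval

def boost(xs, ys, n_rounds=20, lr=0.5):
    """Build an additive ensemble of stumps, each fit to current residuals."""
    stumps = []
    preds = [0.0] * len(xs)
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + lr * stump(x) for p, x in zip(preds, xs)]
    return lambda x: sum(lr * s(x) for s in stumps)

# A step function is recovered after a few boosting rounds.
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 0, 1, 1, 1]
model = boost(xs, ys)
print(round(model(1), 2), round(model(4), 2))  # → 0.0 1.0
```

The real libraries replace the exhaustive stump search with histogram-based split finding, regularization, and parallelism, but the fit-to-residuals loop is the defining idea.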
What is data-centric machine learning without data? Feature stores are a modern data management layer for machine learning that lets you share and discover features for building more effective machine learning pipelines.
- feast_feature_store: Use a feature store hosted on a local Redis server to get started with Feast.
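The service a feature store provides at serving time can be sketched in plain Python (hypothetical function names, not the Feast API): precomputed features live in a fast online store keyed by entity, so training and inference read the same values.

```python
# Conceptual sketch of a feature store's online lookup (illustrative only,
# not the Feast API): features are materialized per entity key into an
# online store and fetched at serving time.

online_store = {}  # entity key -> feature dict (Redis plays this role for Feast)

def materialize(entity_id, features):
    """Push precomputed features for an entity into the online store."""
    online_store[entity_id] = features

def get_online_features(entity_id, feature_names):
    """Fetch the requested features for one entity at serving time."""
    row = online_store[entity_id]
    return {name: row[name] for name in feature_names}

materialize("driver_1001", {"avg_daily_trips": 14.0, "conv_rate": 0.72})
features = get_online_features("driver_1001", ["conv_rate"])
print(features)  # → {'conv_rate': 0.72}
```

A real feature store adds the offline store, point-in-time-correct historical retrieval, and registry metadata around this key-value core.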
Testing your data for integrity problems and monitoring how its characteristics evolve and drift over time are best practices that are essential not only for developing highly accurate machine learning models, but also for keeping them from deteriorating after they are deployed in production. Data Validators enhance your pipelines with data quality testing and monitoring and allow you to visualize the results in the post-execution workflow.
- deepchecks_data_validation: Run data integrity, data drift and model drift tests in your pipelines with our Deepchecks integration.
- evidently_drift_detection: Detect drift with our Evidently integration.
- great_expectations_data_validation: Validate your data with our Great Expectations integration.
- whylogs_data_profiling: Profile your data using the whylogs integration.
- facets_visualize_statistics: The facets integration allows you to retroactively go through pipeline runs and analyze the statistics of the data artifacts.
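At its simplest, univariate drift detection compares the distribution of a feature in current data against a reference dataset. A minimal sketch (a deliberately crude statistic, not the tests Evidently or Deepchecks actually run) flags drift when the mean shifts too far, measured in reference standard deviations:

```python
# Minimal sketch of univariate data drift detection (illustrative only, not
# the Evidently/Deepchecks algorithms): flag a feature as drifted when its
# current mean moves more than one reference standard deviation away.

import statistics

def drift_score(reference, current):
    """Absolute shift of the mean, in units of reference standard deviations."""
    ref_mean = statistics.mean(reference)
    ref_std = statistics.stdev(reference)
    return abs(statistics.mean(current) - ref_mean) / ref_std

reference = [10.0, 11.0, 9.5, 10.5, 10.0, 9.0, 11.5, 10.2]
stable = [10.1, 9.8, 10.4, 10.0]
shifted = [14.0, 15.2, 14.8, 15.5]

print(drift_score(reference, stable) < 1.0)   # True: no drift detected
print(drift_score(reference, shifted) > 1.0)  # True: drift detected
```

Production-grade validators use proper statistical tests (KS, chi-squared, PSI, and so on) and cover categorical features, but the reference-versus-current comparison is the same shape.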
For some of these examples, ZenML provides a handy CLI command to pull them directly into your local environment. First install ZenML and spin up the dashboard:

```shell
# Install ZenML
pip install "zenml[server]"

# Start the ZenServer to enable dashboard access
zenml up
```

Then you can view all the examples:

```shell
zenml example list
```

And pull individual ones:

```shell
# This creates a `zenml_examples` directory containing the example
zenml example pull EXAMPLE_NAME
```

You can even run an example directly with a one-liner:

```shell
zenml example run EXAMPLE_NAME  # not implemented for all examples
```
Some of our examples feature remote stack components or emphasize the collaboration aspects of ZenML. To see these in full effect, you may need a ZenML server. To deploy one, follow the instructions here.
Have any questions? Want more examples? Did you spot any outdated or frustrating examples? We've got you covered!
Feel free to let us know by creating an issue here on our GitHub or by reaching out to us on our Slack.
We are also always looking for contributors. If you want to enhance our existing examples or add new ones, feel free to open a pull request. Find out more here.