databricks-demos/dbx-demo

DBX Demo using Lending Club dataset

This is a sample project for Databricks, generated via cookiecutter.

To use this project, you need Python 3.x and either pip or conda for package management.

Installing project requirements

pip install -r unit-requirements.txt

Installing the project package in developer mode

pip install -e .

Testing

For local unit testing, use pytest:

pytest tests/unit

For an integration test on an interactive cluster, use the following command:

dbx execute --cluster-name=<name of interactive cluster> --job=lendingclub_scoring_dbx-sample-integration-test

For a test on an automated job cluster, use launch instead of execute:

dbx launch --job=lendingclub_scoring_dbx-sample-integration-test

Interactive execution and development

  1. dbx expects the cluster used for interactive execution to support the %pip and %conda magic commands.
  2. Configure your job in the conf/deployment.json file.
  3. To execute the code interactively, provide either --cluster-id or --cluster-name:
dbx execute \
    --cluster-name="<some-cluster-name>" \
    --job=job-name
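
The cluster-selection rule in step 3 can be sketched as a small Python helper that assembles the command line. This is only an illustration: build_execute_command is a hypothetical name, not part of dbx, and it assumes that exactly one of the two cluster options should be passed. It uses only the flags shown above.

```python
from typing import List, Optional


def build_execute_command(job: str,
                          cluster_name: Optional[str] = None,
                          cluster_id: Optional[str] = None) -> List[str]:
    """Assemble a `dbx execute` argv (hypothetical helper, not part of dbx)."""
    # Assumption: exactly one cluster selector must be supplied.
    if (cluster_name is None) == (cluster_id is None):
        raise ValueError("provide exactly one of cluster_name or cluster_id")
    if cluster_name is not None:
        selector = f"--cluster-name={cluster_name}"
    else:
        selector = f"--cluster-id={cluster_id}"
    return ["dbx", "execute", selector, f"--job={job}"]
```

The returned list can be passed to subprocess.run(..., check=True), which avoids shell quoting issues when a cluster name contains spaces.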

Multiple users can also use the same cluster for development. Libraries are isolated per execution context.

Preparing deployment file

The next step is to configure your deployment objects. To keep this process easy and flexible, the configuration is written in JSON.

By default, deployment configuration is stored in conf/deployment.json.
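
To check which jobs a deployment file defines before deploying, a short Python sketch like the one below may help. The layout it assumes (a top-level environment name such as "default" mapping to a "jobs" list whose entries have a "name" field) is an assumption and should be checked against the conf/deployment.json in this repository. job_names_by_environment is a hypothetical helper, and the embedded JSON is illustrative only.

```python
import json
from typing import Dict, List


def job_names_by_environment(config: dict) -> Dict[str, List[str]]:
    """Map each environment name to the job names defined under it.

    Assumes the layout {"<environment>": {"jobs": [{"name": ...}, ...]}}.
    """
    return {
        env: [job["name"] for job in section.get("jobs", [])]
        for env, section in config.items()
    }


# Illustrative content only; the real file is conf/deployment.json.
example = json.loads("""
{
  "default": {
    "jobs": [
      {"name": "lendingclub_scoring_dbx-sample"},
      {"name": "lendingclub_scoring_dbx-sample-integration-test"}
    ]
  }
}
""")
print(job_names_by_environment(example))
```

To inspect the real file, replace the embedded example with json.load(open("conf/deployment.json")).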

Deployment

To start a new deployment, run the following command:

dbx deploy

You can optionally provide requirements.txt via the --requirements option; all requirements will be added to the job definition automatically.

Launch

After deployment, launch the job with the following command:

dbx launch --job=lendingclub_scoring_dbx-sample

CI/CD pipeline settings

Set the following secrets or environment variables, following the documentation for GitHub Actions or Azure DevOps Pipelines:

  • DATABRICKS_HOST
  • DATABRICKS_TOKEN
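
A pipeline step can fail early when either variable is missing. The following Python sketch shows one way to do that; missing_variables is a hypothetical helper, not part of dbx, and it treats an empty value as missing, which is an assumption.

```python
import os
from typing import List, Mapping

REQUIRED_VARIABLES = ("DATABRICKS_HOST", "DATABRICKS_TOKEN")


def missing_variables(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARIABLES if not env.get(name)]
```

Calling missing_variables() at the start of a CI job and exiting with a non-zero status when the list is non-empty gives a clearer error than a failed authentication later in the run.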
