About the Problem:

This project addresses a classification problem based on the Bank Marketing Dataset, which was obtained here. The goal is to correctly classify whether or not a bank customer subscribes to a term deposit, based on the client data and the data gathered during the marketing campaign. More details about the dataset and the problem itself can be found in the Notebook in the repository.

Used setup

This project was developed on a Windows 10 machine with WSL2 installed, running the Ubuntu 20.04 distro. Anaconda3 was installed inside that Ubuntu environment and used to create the mlops environment, which is responsible for executing the entire pipeline.

Setup and Running the project

It's assumed that Anaconda3 is already installed on the computer. With that in place, the mlops environment can be created using the following command:

conda env create -f env.yml

Once the environment has been created, it's necessary to activate it:

conda activate mlops

Since Wandb is used during the entire execution of the pipeline, it's necessary to log in using an API key, which can be found in the account settings in Wandb. To log in to the account, use the command:

wandb login --relogin

A few notes before executing mlflow

When chaining together the steps, the output artifact of a step should be the input artifact of the next one (when applicable). Also use the artifact_type options so that the final visualization of the pipeline highlights the different steps. For example, you can use raw_data for the artifact containing the downloaded data, preprocessed_data for the artifact containing the data after the preprocessing, and so on.
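The chaining rule above can be sketched in Python. This is only a minimal illustration, not the repository's code: the `Step` class, the `check_chain` and `log_output` helpers, the artifact names `raw_data.csv` and `preprocessed_data.csv`, and the project name `bank_marketing` are all assumptions made for the example. Only the artifact types `raw_data` and `preprocessed_data` come from the text above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    """Hypothetical description of one pipeline step and its artifacts."""
    name: str
    input_artifact: Optional[str]  # None for the first step
    output_artifact: str
    artifact_type: str  # shown in the Wandb pipeline visualization


# Assumed artifact names; the real pipeline may use different ones.
PIPELINE = [
    Step("download", None, "raw_data.csv", "raw_data"),
    Step("preprocess", "raw_data.csv", "preprocessed_data.csv", "preprocessed_data"),
]


def check_chain(steps: List[Step]) -> bool:
    """Return True if every step consumes the previous step's output."""
    return all(
        cur.input_artifact == prev.output_artifact
        for prev, cur in zip(steps, steps[1:])
    )


def log_output(step: Step, file_path: str, project: str = "bank_marketing") -> None:
    """Record the step's input and upload its output file to Wandb."""
    import wandb  # imported here so the chain check works without wandb

    with wandb.init(project=project, job_type=step.name) as run:
        if step.input_artifact is not None:
            # Declaring the input links this run to the previous step.
            run.use_artifact(f"{step.input_artifact}:latest")
        artifact = wandb.Artifact(step.output_artifact, type=step.artifact_type)
        artifact.add_file(file_path)
        run.log_artifact(artifact)


if __name__ == "__main__":
    print(check_chain(PIPELINE))
```

Giving each output a distinct type is what lets the Wandb graph view separate the stages. Calling `log_output` requires a logged-in Wandb session.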

The parameter main.execute_steps can be overridden to execute only one or more steps of the pipeline, instead of the entire pipeline. This is useful for debugging.

For example, this only executes the svm step:

mlflow run . -P hydra_options="main.execute_steps='svm'"

and this executes download and preprocess:

mlflow run . -P hydra_options="main.execute_steps='download,preprocess'"
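How a comma-separated main.execute_steps value might be turned into the list of steps to run can be sketched as follows. This is a hedged illustration rather than the project's actual entry point: the function `select_steps` and the `KNOWN_STEPS` list are hypothetical, and the list contains only the three step names mentioned in this README, while the real pipeline may define more.

```python
from typing import Iterable, List, Union

# Assumed step list, in pipeline order; only these names appear in the README.
KNOWN_STEPS = ["download", "preprocess", "svm"]


def select_steps(
    execute_steps: Union[str, Iterable[str]],
    known_steps: List[str] = KNOWN_STEPS,
) -> List[str]:
    """Return the requested steps, ordered as they run in the pipeline."""
    if isinstance(execute_steps, str):
        # A command-line override arrives as one comma-separated string.
        requested = [s.strip() for s in execute_steps.split(",") if s.strip()]
    else:
        # A list taken from the config file is used as-is.
        requested = list(execute_steps)

    unknown = [s for s in requested if s not in known_steps]
    if unknown:
        raise ValueError(f"Unknown step(s): {unknown}")

    # Keep pipeline order regardless of the order given by the user.
    return [s for s in known_steps if s in requested]


if __name__ == "__main__":
    print(select_steps("download,preprocess"))
```

Rejecting unknown names early means a typo in the override fails immediately instead of silently running nothing.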

To run the entire pipeline with the default settings defined in config.yaml, in the ml_pipeline folder of the repository, use the following command:

mlflow run .

thiagomaiasouto/Bank-Marketing-Practical-MLOps
