- I work as a data scientist in the consulting industry.
- I love solving problems and automating solutions.
- I love music and travelling!
This project implements an automated MLOps workflow. Every week it fetches data via an API and registers it as a dataset in an Azure datastore. It then updates the versions of the required pipeline components (run ID, model version, etc.), trains and registers a PyTorch LSTM model in Azure ML Studio, and deploys the model to an online endpoint. Everything runs through cron jobs.
For more detail, the accompanying article for this project is published here: https://towardsdatascience.com/practical-mlops-using-azure-ml-c6a3cb201d2b
This project is still in development. Improvement items:
- Run pytest during CI
- Add an option to monitor data drift
- Model profiling
- Infrastructure performance monitoring
- Model performance assessment on data slices
- Model performance assessment on live data
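One of the improvement items above, monitoring data drift, could start as simply as comparing the latest weekly batch against a reference window with a two-sample Kolmogorov-Smirnov statistic. A minimal, dependency-free sketch (the 0.5 threshold and the single-feature setup are illustrative assumptions, not part of the project yet):

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the largest vertical
    distance between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        f_a = bisect_right(a, x) / len(a)
        f_b = bisect_right(b, x) / len(b)
        d = max(d, abs(f_a - f_b))
    return d

reference = [0.11, 0.42, 0.30, 0.25, 0.38]    # last week's feature values
latest = [0.91, 1.10, 0.85, 1.02, 0.97]       # this week's feature values
print(ks_statistic(reference, latest) > 0.5)  # True -> flag possible drift
```

`scipy.stats.ks_2samp` computes the same statistic and also returns a p-value, which would be the more principled drift trigger.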
Setup:
- Fork this repository
- Create an Azure ML Studio workspace and resource group if you do not have them already
- Create a Service Principal (ref.) in Azure and save the generated JSON as `AZURE_CREDENTIALS` under Settings > Secrets and variables > Actions > New repository secret. You need this to access ML Studio.
- Create a Personal Access Token from Settings > Developer settings > Personal access tokens and save it as `PAT` in your repository secrets. Grant the workflow read/write access to actions, code, commit statuses, merge queues, and pull requests.
- Update your workspace and resource group details in `cli/setup.sh`
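The service-principal JSON consumed by the `azure/login` GitHub Action (as produced by `az ad sp create-for-rbac --sdk-auth`) typically contains at least the four fields below. A quick sanity check before pasting it into the `AZURE_CREDENTIALS` secret (the exact key set is an assumption about that output format):

```python
import json

# Field names from the `az ad sp create-for-rbac --sdk-auth` output.
REQUIRED_KEYS = {"clientId", "clientSecret", "subscriptionId", "tenantId"}

def missing_credential_keys(raw_json: str) -> set:
    """Return the set of required keys absent from the pasted JSON."""
    creds = json.loads(raw_json)
    return REQUIRED_KEYS - creds.keys()

sample = '{"clientId": "x", "clientSecret": "y", "subscriptionId": "z", "tenantId": "t"}'
print(missing_credential_keys(sample))  # set() -> nothing missing
```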
The cron job is `.github/workflows/data_pipeline.yml`. It:
- Performs the initial setup
- Downloads the data via an API
- Updates the versions of the different components
- Pushes the changes to the repository
- Registers the dataset in the Azure ML Studio datastore

Associated files: `jobs/data_upload.yml`, `jobs/data_download.py`, `jobs/update_yamls.py`
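The version-update step (presumably what `jobs/update_yamls.py` does) amounts to rewriting version fields in the job YAMLs before each run. A hypothetical sketch of that kind of bump; the `version:` field name and integer format are assumptions, not the project's actual schema:

```python
import re

def bump_version(yaml_text: str, key: str = "version") -> str:
    """Increment the first integer '<key>: N' field in a YAML snippet."""
    pattern = rf"(^\s*{key}:\s*)(\d+)$"
    return re.sub(
        pattern,
        lambda m: f"{m.group(1)}{int(m.group(2)) + 1}",
        yaml_text,
        count=1,
        flags=re.M,
    )

print(bump_version("name: dataset\nversion: 3\n"))  # version becomes 4
```

A real implementation would more likely parse the file with a YAML library (e.g. PyYAML) rather than regex-edit it, to survive formatting changes.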
The cron job is `.github/workflows/model_pipeline.yml`. It:
- Performs the initial setup
- Creates a compute target
- Trains an LSTM model with PyTorch Lightning on the latest dataset
- Registers the new model in the workspace

Associated files: `jobs/train.yml`, `jobs/train.py`
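The LSTM itself lives in `jobs/train.py` as a PyTorch Lightning module. As an illustration of what a single LSTM cell computes, here is the standard textbook step in plain NumPy; the gate ordering and shapes follow the usual convention, not the project's actual code:

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def lstm_cell(x, h, c, W, U, b):
    """One LSTM time step.

    x: (D,) input; h, c: (H,) hidden/cell state;
    W: (4H, D), U: (4H, H), b: (4H,) stacked gate parameters.
    """
    z = W @ x + U @ h + b        # pre-activations for all four gates
    i, f, g, o = np.split(z, 4)  # input, forget, candidate, output
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
    g = np.tanh(g)
    c_new = f * c + i * g        # update the cell state
    h_new = o * np.tanh(c_new)   # emit the new hidden state
    return h_new, c_new

rng = np.random.default_rng(0)
D, H = 3, 4
h, c = np.zeros(H), np.zeros(H)
W = rng.normal(size=(4 * H, D))
U = rng.normal(size=(4 * H, H))
b = np.zeros(4 * H)
for x in rng.normal(size=(5, D)):  # run a length-5 sequence
    h, c = lstm_cell(x, h, c, W, U, b)
```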
The cron job is `.github/workflows/deployment_pipeline.yml`. It:
- Creates a new online endpoint and deployment on the first run
- On successive training runs, deploys the new model to the existing online endpoint

Associated files: `jobs/deploy.yml`, `jobs/deployment.py`
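Once deployed, an Azure ML managed online endpoint is scored via an authenticated HTTPS POST. A stdlib-only sketch of building such a request; the `{"data": [...]}` payload shape and the placeholder URL are assumptions that would have to match the input schema of the deployment's actual scoring script:

```python
import json
import urllib.request

def build_scoring_request(endpoint_url, api_key, sequence):
    """Build a POST request for an online endpoint, using key auth."""
    body = json.dumps({"data": [sequence]}).encode("utf-8")
    return urllib.request.Request(
        endpoint_url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_scoring_request(
    "https://<endpoint-name>.<region>.inference.ml.azure.com/score",
    "<api-key>",
    [0.1, 0.2, 0.3],
)
# urllib.request.urlopen(req) would then return the model's prediction.
```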
All cron jobs are currently paused and the endpoint has been deleted to avoid costs.