Predictive Maintenance with AI

AI-Driven Predictive Maintenance Application with Language Models

This project introduces some simple approaches to handling historical sensor data and tests a very lightweight LLM approach to demonstrate the idea of using an LLM on historical data. For the application, I use Streamlit to build a basic web app for the equipment failure prediction problem. Later, the app can be enhanced and customized with more components, visualizations, and interactive features to make it more user-friendly and informative.

Keywords: Tabular, Boosting, LLM, Streamlit.

Main features

A basic example of a full Python-based project that includes a backend with endpoints to receive historical sensor data and a front-end interface using Streamlit to visualize and prepare the data for processing by an AI model.
Some simple boosting and time series methods (LSTM) have been tested with naive feature selection.
A lightweight LLM approach that uses a DistilBERT model to create tabular embeddings, with a small tokenizer_max_length of 16.
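
To make the tabular-embedding idea more concrete, here is a minimal sketch of how one sensor reading could be turned into text before it is passed to a language model. The function names, the "name is value" sentence format, and the sample values are assumptions for illustration; they are not necessarily what this project or the arize AutoEmbeddings generator does. The whitespace split is only a crude stand-in for a real tokenizer.

```python
def row_to_text(row: dict) -> str:
    """Serialize one tabular record into a short 'name is value' sentence."""
    return ", ".join(f"{name} is {value:.1f}" for name, value in row.items())


def truncate_tokens(text: str, max_length: int = 16) -> list:
    """Crude tokenizer stand-in: split on whitespace, keep at most max_length tokens."""
    return text.split()[:max_length]


# Hypothetical reading; column names and values are illustrative only.
reading = {"volt": 170.3, "rotate": 449.5, "pressure": 98.7, "vibration": 40.1}
text = row_to_text(reading)
tokens = truncate_tokens(text, max_length=16)
print(text)
print(len(tokens))
```

A real subword tokenizer usually splits numbers into several pieces, so a limit of 16 tokens is likely to cut off part of a four-sensor row. That is one reason a larger tokenizer_max_length may be worth testing.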

Requirements

pytorch
streamlit
flask
scikit-learn
xgboost
joblib
imblearn
'arize[AutoEmbeddings]'
llama-cpp-python [optional]
transformers [optional]

Dataset

Summary of Telemetry Data Observations:

The data appears to be synthetically generated, with records spanning from January 1, 2015, to January 1, 2016.
Each row corresponds to a specific machine's condition during a given hour.
The dataset comprises readings from 100 distinct machines and is devoid of any duplicated entries or missing values.
The four metrics (voltage, vibration, pressure, and rotation) approximately follow a normal distribution.
The labels show that this is a highly imbalanced dataset.
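
Because failures are rare, a classifier can reach high accuracy by always predicting "no failure". One common mitigation is to weight the positive class by the ratio of negative to positive examples. The sketch below computes that ratio; the helper name and the made-up labels are assumptions for illustration, not code from this project.

```python
from collections import Counter


def positive_class_weight(labels):
    """Return n_negative / n_positive for binary labels (1 = failure)."""
    counts = Counter(labels)
    if counts[1] == 0:
        raise ValueError("no positive (failure) examples in labels")
    return counts[0] / counts[1]


# Made-up labels: 2 failures among 100 hourly records.
labels = [0] * 98 + [1] * 2
weight = positive_class_weight(labels)
print(weight)  # 49.0
```

A ratio like this can be passed as the scale_pos_weight parameter of an XGBoost classifier. Resampling with imblearn is an alternative, and which option works better here would need to be tested.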

Getting Started

Training:

# Download the data from: https://www.kaggle.com/datasets/arnabbiswas1/microsoft-azure-predictive-maintenance?sort=recent-comments
mkdir -p data/origin
# Then extract it to data/origin/ to train the model, or download the *.feather file and put it directly in data/
python train.py --model [x,r,l,lm]
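
The --model flag takes one of the keys x, r, l, or lm. As a rough illustration of how such a flag can be validated, here is a minimal argparse sketch. The parse_args helper, the default value, and the help text are assumptions; the actual train.py may be organized differently, and this README does not state which model each key selects.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; only the four listed model keys are accepted."""
    parser = argparse.ArgumentParser(description="Train a failure-prediction model.")
    parser.add_argument(
        "--model",
        choices=["x", "r", "l", "lm"],
        default="x",  # assumed default, for illustration only
        help="model key, as listed in the training command",
    )
    return parser.parse_args(argv)


args = parse_args(["--model", "lm"])
print(args.model)
```

With choices set, argparse rejects any other key and exits with a usage message, so a typo is caught before training starts.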

Serving:

# Run:
streamlit run app.py
# In the localhost page that opens, upload the CSV files, then press the Predict button
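
Before predicting on an uploaded file, it can help to check that the CSV has the columns the model expects. The sketch below does this with the standard library only. The column names are an assumption based on the four metrics described in the Dataset section, and missing_columns is a hypothetical helper, not a function from app.py.

```python
import csv
import io

# Assumed column names for the telemetry file; adjust to match the real data.
EXPECTED_COLUMNS = ["datetime", "machineID", "volt", "rotate", "pressure", "vibration"]


def missing_columns(csv_text, expected=EXPECTED_COLUMNS):
    """Return the expected columns that are absent from the CSV header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    present = {name.strip() for name in header}
    return [name for name in expected if name not in present]


sample = "datetime,machineID,volt,rotate\n2015-01-01 06:00:00,1,176.2,418.5\n"
print(missing_columns(sample))
```

For the sample above, the helper reports pressure and vibration as missing. The app could show a message like this instead of failing inside the model.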

TO DO:

Explore the data in depth and do feature engineering
Increase the embedding tokenizer_max_length
Test more LLM approaches
Deploy live on Heroku, Render, or Azure instead of running locally

tuanho27/e2e_AzurePM_python

Folders and files

Latest commit

History

Repository files navigation

Predictive Maintenance with AI

Main features

Requirements

Dataset

Getting Start

TO DO:

Reporting Issues and Feedbacks

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages