Industrial_Copper_Modeling

Overview

This project addresses two prediction tasks in the copper industry: estimating selling prices and classifying sales leads. The dataset is cleaned and preprocessed, including treatment of skewness and outliers. Two machine learning models are implemented: a regression model that predicts 'Selling_Price' and a classification model that predicts 'Status' (WON or LOST). A Streamlit GUI lets users enter feature values and get a prediction from either model.

Problem Statement

Manual predictions in the copper industry are unreliable because the sales data is skewed and noisy. A regression model is implemented to predict 'Selling_Price', and a classification model predicts 'Status' (WON/LOST). The project covers data exploration, preprocessing, EDA, feature engineering, model building, and evaluation.

Project Workflow Execution

Data Understanding:
- Identify variable types and distributions.
- Treat 'Material_Reference' values starting with '00000' as null.
- Treat reference columns as categorical variables.
- Drop the 'ID' column, since a row identifier is unlikely to help prediction.
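
The data-understanding steps above can be sketched with pandas. This is a minimal illustration on a toy frame, not the notebook's actual code. 'ID' and 'Material_Reference' come from the list above; 'Product_Reference' and all values are assumptions made up for the example.

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the raw copper data (values are invented).
df = pd.DataFrame({
    "ID": ["a1", "b2", "c3"],
    "Material_Reference": ["00000000123", "MAT-42", "0000000000"],
    "Product_Reference": [611993, 164141591, 611993],  # hypothetical column
})

# References starting with '00000' are placeholders: treat them as null.
is_placeholder = df["Material_Reference"].astype(str).str.startswith("00000")
df.loc[is_placeholder, "Material_Reference"] = np.nan

# Reference columns are identifiers, not quantities, so store them as categories.
for col in ["Material_Reference", "Product_Reference"]:
    df[col] = df[col].astype("category")

# Drop the row identifier.
df = df.drop(columns=["ID"])

# Inspect variable types and the number of nulls per column.
print(df.dtypes)
print(df.isna().sum())
```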
Data Preprocessing:
- Treat outliers using IQR.
- Identify and treat skewness using log transformation or other techniques.
- Encode categorical variables using suitable techniques.
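
A hedged sketch of the three preprocessing steps above, in the same order. The column names 'Quantity' and 'Item_Type', the fence multiplier of 1.5, and the choice of clipping (rather than dropping rows) and one-hot encoding are all assumptions for illustration; the notebook may use different techniques.

```python
import numpy as np
import pandas as pd

def clip_outliers_iqr(series: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] to the nearest fence."""
    q1, q3 = series.quantile(0.25), series.quantile(0.75)
    iqr = q3 - q1
    return series.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)

# Toy data with one extreme value (column names are hypothetical).
df = pd.DataFrame({
    "Quantity": [1.0, 2.0, 3.0, 4.0, 500.0],
    "Item_Type": ["W", "S", "W", "PL", "S"],
})

# 1. Outliers: pull extreme values back to the IQR fences.
df["Quantity"] = clip_outliers_iqr(df["Quantity"])

# 2. Skewness: log1p compresses the right tail and is defined at zero.
df["Quantity_log"] = np.log1p(df["Quantity"])

# 3. Encoding: one-hot encode the nominal column.
encoded = pd.get_dummies(df, columns=["Item_Type"])
```

Clipping keeps every row, which matters when the dataset is small; dropping the outlying rows instead is an equally common choice.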
EDA (Exploratory Data Analysis):
- Visualize outliers and skewness using Seaborn's plots.
- Use boxplot, distplot (deprecated in recent Seaborn releases in favor of histplot and displot), and violinplot for visualization.
Feature Engineering:
- Create new features if applicable.
- Identify highly correlated columns with a correlation heatmap and drop the redundant ones.
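
Reading the heatmap by eye can be complemented with a programmatic check. The sketch below drops one column from each highly correlated pair; the 0.9 threshold, the helper name `drop_highly_correlated`, and the toy column names are assumptions, not values taken from the project.

```python
import numpy as np
import pandas as pd

def drop_highly_correlated(df: pd.DataFrame, threshold: float = 0.9) -> pd.DataFrame:
    """Drop the later column of every pair whose absolute correlation exceeds threshold."""
    corr = df.corr(numeric_only=True).abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return df.drop(columns=to_drop)

# Toy data: 'Width_mm' is a rescaled copy of 'Width' (hypothetical columns).
rng = np.random.default_rng(0)
width = rng.normal(size=200)
demo = pd.DataFrame({
    "Width": width,
    "Width_mm": width * 25.4,
    "Thickness": rng.normal(size=200),
})

reduced = drop_highly_correlated(demo)
```

The same correlation matrix (`demo.corr()`) is what would be passed to `seaborn.heatmap` to draw the heatmap.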
Model Building and Evaluation:
- Split the dataset into training and testing sets.
- Train and evaluate regression and classification models.
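- Evaluate the classifier with accuracy, precision, recall, F1 score, and ROC AUC; these are classification metrics, so evaluate the regressor with a regression metric such as R² or RMSE.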
- Optimize model hyperparameters using cross-validation and grid search.
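
The split, tuning, and evaluation steps above might look like the following for the classification task. This is a sketch on synthetic data: the random forest, the parameter grid, and the 80/20 split are assumptions, since the README does not say which estimator or settings the notebooks ended up with.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the encoded features and a binary 'Status'
# target (1 = WON, 0 = LOST); the real data comes from Copper_Cleaned.csv.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 4))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Grid search with 3-fold cross-validation over a deliberately small grid.
search = GridSearchCV(
    RandomForestClassifier(random_state=42),
    param_grid={"n_estimators": [25, 50], "max_depth": [3, None]},
    cv=3,
    scoring="f1",
)
search.fit(X_train, y_train)

# Evaluate the best model on the held-out test set.
best = search.best_estimator_
y_pred = best.predict(X_test)
y_prob = best.predict_proba(X_test)[:, 1]
scores = {
    "accuracy": accuracy_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "roc_auc": roc_auc_score(y_test, y_prob),
}
print(search.best_params_, scores)
```

For the regression task the same pattern applies with a regressor (for example `RandomForestRegressor`) and a regression scorer such as `r2_score`.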
Model GUI (Streamlit):
- Create an interactive page with task input (Regression or Classification).
- Enter a value for every column except the target: 'Selling_Price' for regression, 'Status' for classification.
- Predict new data from Streamlit and display the output.
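
A minimal sketch of how such a page could be wired up, not the contents of main.py. The feature lists, the helper `build_input_row`, and the `models` dictionary are hypothetical; 'Status' is assumed to be entered in an already encoded numeric form (1 = WON, 0 = LOST).

```python
import pandas as pd

# Hypothetical input columns per task; the real app would use the
# columns of Copper_Cleaned.csv minus the target.
FEATURES = {
    "Regression": ["Quantity", "Thickness", "Width", "Status"],
    "Classification": ["Quantity", "Thickness", "Width", "Selling_Price"],
}

def build_input_row(task: str, values: dict) -> pd.DataFrame:
    """Return a one-row frame with the task's feature columns in a fixed order."""
    missing = [name for name in FEATURES[task] if name not in values]
    if missing:
        raise ValueError(f"Missing inputs: {missing}")
    return pd.DataFrame([{name: values[name] for name in FEATURES[task]}])

def run_app(models: dict) -> None:
    """Render the prediction page; `models` maps task name to a fitted estimator."""
    # Imported lazily so build_input_row can be used without Streamlit installed.
    import streamlit as st

    task = st.selectbox("Task", list(FEATURES))
    with st.form("prediction_form"):
        values = {name: st.number_input(name, value=0.0) for name in FEATURES[task]}
        submitted = st.form_submit_button("Submit")
    if submitted:
        row = build_input_row(task, values)
        st.write("Prediction:", models[task].predict(row)[0])
```

Keeping the row-building logic in a plain function means the column order fed to the model can be checked without starting the Streamlit server.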

LinkedIn Profile

Link: www.linkedin.com/in/akashkumarl
Visit the link to see the project video.

Project Structure

main.py: Main Python script containing the Streamlit application.
Copper_Cleaned.csv: Cleaned dataset used for modeling.
regression_model.joblib: Joblib file containing the trained regression model.
classification_model.joblib: Joblib file containing the trained classification model.
Copper_Set.xlsx.csv: Original dataset, used as the input to preprocessing.
Data Preprocessing.ipynb: Notebook containing the data preprocessing steps.
Model Building - Regression.ipynb and Model Building - Classification.ipynb: Notebooks containing the model building steps used to find the best model for each task.
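
To illustrate how the .joblib files connect the notebooks to main.py, here is a small round-trip sketch. The linear model and the temporary directory are stand-ins chosen for the example; only the file name regression_model.joblib comes from the list above.

```python
import tempfile
from pathlib import Path

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Tiny stand-in model fitted on y = 2x + 1 (not the project's model).
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2 * X.ravel() + 1
model = LinearRegression().fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "regression_model.joblib"
    joblib.dump(model, path)      # what a notebook would do after training
    restored = joblib.load(path)  # what the app would do at startup

prediction = restored.predict(np.array([[4.0]]))[0]
```

Loading a joblib file requires a scikit-learn version compatible with the one used to save it, so it is worth pinning the version when sharing the model files.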

Usage

Install required libraries: pip install streamlit pandas scikit-learn joblib
Run the Streamlit application: streamlit run main.py
Choose between 'Home' and 'ML Prediction' in the sidebar.
For 'ML Prediction,' select the model type (Regression or Classification).
Input values in the Streamlit GUI and click 'Submit' for predictions.

Technologies Used

Streamlit
Pandas
Scikit-Learn
Seaborn
Data Wrangling

Feel free to contribute, report issues, or suggest improvements!

