BISI-CST2101-W23-Project_Diabetes_Analysis-and-Prediction

Project Background

This project comes from CST2101 - Python Programming, part of the Algonquin BISI program. In this project, we were asked to study a dataset called 'pima', which contains 9 features and 1000 observations. The features are 'Pregnancies', 'Glucose', 'Blood pressure', 'SkinThickness', 'Insulin', 'BMI', 'DiabetesPedigreeFunction', 'Age' and 'Outcome'. The last feature, 'Outcome', is the class variable: '0' means the person is not diabetic and '1' means the person is diabetic.

Project Objective

The aim of this project is to use the first 8 features in the dataset to predict 'Outcome'. Two machine learning models, logistic regression and random forest, were used in this study, and their accuracy was calculated and compared.
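The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the helper name `compare_models`, the 80/20 split, the scaling step, and the hyperparameters are all assumptions chosen for the example.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def compare_models(df, target="Outcome", test_size=0.2, seed=42):
    """Fit both classifiers on one train/test split and return test accuracies.

    Hypothetical helper: split ratio, seed, and model settings are assumptions.
    """
    # Every column except the target is treated as a predictor.
    X = df.drop(columns=[target])
    y = df[target]

    # Stratify so both splits keep a similar share of diabetic cases.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=test_size, random_state=seed, stratify=y
    )

    models = {
        # Scaling helps logistic regression converge; the forest does not need it.
        "Logistic regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "Random forest": RandomForestClassifier(n_estimators=200, random_state=seed),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores
```

With the repository's data file, a call might look like `compare_models(pd.read_csv("diabetes.csv"))` after `import pandas as pd`. The returned dictionary maps each model name to its accuracy on the held-out rows; the exact numbers depend on the split and seed.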

Process and Tools

Exploratory data analysis (seaborn, matplotlib);
Machine learning models (logistic regression and random forest, using sklearn and its functions).
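As an illustration of the exploratory step, the sketch below draws a class-balance count plot and a correlation heatmap with seaborn and matplotlib. The helper name `plot_overview` and the choice of these two plots are assumptions made for the example; the notebook may explore the data differently.

```python
import matplotlib.pyplot as plt
import seaborn as sns


def plot_overview(df, target="Outcome"):
    """Draw class counts and a correlation heatmap side by side.

    Hypothetical helper: assumes every column in df is numeric.
    """
    fig, (ax_count, ax_corr) = plt.subplots(1, 2, figsize=(12, 5))

    # How many observations fall into each class.
    sns.countplot(data=df, x=target, ax=ax_count)
    ax_count.set_title("Class balance")

    # Pairwise Pearson correlations between all columns, target included.
    sns.heatmap(df.corr(), annot=True, fmt=".2f", cmap="coolwarm", ax=ax_corr)
    ax_corr.set_title("Pairwise correlations")

    fig.tight_layout()
    return fig
```

The function returns the figure, so it can be displayed in a notebook or written to disk with `fig.savefig(...)`.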

Conclusions

The accuracy results indicate that the random forest model performs slightly better than the logistic regression model.
