Student Performance prediction

Machine Learning - Supervised Learning for student performance prediction

The aim of this project is to identify the factors that help create successful students, so that higher education systems can improve on current trends. Identifying successful students matters because it motivates institutions to understand them better, and one way to do this is through sound management and processing of the student database.

Data Description

Data source link: http://archive.ics.uci.edu/ml/datasets/student+performance
Data format: Integer
Size: 396 rows X 33 columns
Number of Instances: 396
Number of Attributes: 33

This dataset describes student achievement in secondary education at Portuguese schools. The attributes include student grades as well as demographic, social, and school-related features, and were collected using questionnaires and school reports. The dataset covers performance in one subject: Mathematics. The target attribute G3 has a strong correlation with attributes G2 and G1. This occurs because G3 is the final year grade, while G1 and G2 correspond to the 1st and 2nd period grades.
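The relationship between G3 and the earlier period grades can be checked with a correlation table. The following is a minimal sketch using pandas; the six rows are made-up grades for illustration, not records from the actual dataset, so the resulting numbers say nothing about the real data.

```python
import pandas as pd

# Hypothetical grades for six students on a 0-20 scale.
# These rows are invented for illustration and are NOT from the real dataset.
grades = pd.DataFrame({
    "G1": [5, 8, 10, 12, 15, 18],
    "G2": [6, 8, 11, 12, 14, 18],
    "G3": [6, 9, 11, 13, 15, 19],
})

# Pearson correlation of each period grade with the target G3.
corr_with_g3 = grades.corr()["G3"].drop("G3")
print(corr_with_g3)
```

On the real data, the same call could be run on the DataFrame loaded from the CSV file, restricted to its numeric columns.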

During the data pre-processing step we found that the dataset was already clean, so we did not have to apply any data cleaning methods.

Our dataset has 33 attributes, so we removed some of the less important ones to get better accuracy and a lower-cost tree. Organizations commonly use this kind of strategy to reduce their data, so we decided to do the same.
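This README does not say which attributes were dropped or how they were chosen. As one possible approach, the sketch below ranks features with scikit-learn's `SelectKBest` and keeps the top two. The data is synthetic, the `noise` column is a hypothetical stand-in for an uninformative attribute, and the pass rule (`G1 + G2 >= 20`) is an assumption made only so the example has a signal to find.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(0)
n = 200

# Synthetic stand-in data; column names mirror a few dataset attributes,
# but the values are random and "noise" is purely hypothetical.
X = pd.DataFrame({
    "G1": rng.integers(0, 21, n),
    "G2": rng.integers(0, 21, n),
    "absences": rng.integers(0, 30, n),
    "age": rng.integers(15, 23, n),
    "noise": rng.normal(size=n),
})

# Assumed pass/fail label, used only for this illustration.
y = ((X["G1"] + X["G2"]) >= 20).astype(int)

# Score every feature against the label and keep the two best.
selector = SelectKBest(score_func=f_classif, k=2)
X_reduced = selector.fit_transform(X, y)
kept = list(X.columns[selector.get_support()])
print("kept features:", kept)
```

Fewer input attributes generally mean fewer candidate splits for a tree to evaluate, which is one way to read the "low-cost tree" goal above.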

Decision tree

A decision tree is a decision support tool that uses a tree-like graph or model of decisions and their possible consequences.
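To make the tree-like structure concrete, here is a minimal sketch that fits scikit-learn's `DecisionTreeClassifier` on a toy table and prints the learned rules. The eight rows and the pass/fail labels are invented for illustration, and this is not necessarily how `finaldecisiontree.py` is written.

```python
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical (G1, G2) pairs with an invented pass (1) / fail (0) label.
X = [[4, 5], [6, 5], [8, 9], [9, 8], [11, 12], [13, 12], [15, 16], [17, 18]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

# A shallow tree keeps the printed rules short and readable.
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(X, y)

# Print the decisions as nested if/else rules.
print(export_text(clf, feature_names=["G1", "G2"]))

predictions = clf.predict([[16, 15], [4, 5]])
print(predictions)
```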

Naive Bayesian

Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features.
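The independence assumption means the classifier multiplies one likelihood per feature with the class prior. The sketch below spells this out in plain Python on a tiny categorical table. The records, the `high`/`low` study-time encoding, and the function names `train` and `posterior` are all hypothetical, and no smoothing is applied, so an unseen feature value would give a zero score.

```python
from collections import Counter, defaultdict

# Hypothetical records: (studytime, internet, outcome). Not from the real dataset.
rows = [
    ("high", "yes", "pass"),
    ("high", "no", "pass"),
    ("high", "yes", "pass"),
    ("low", "yes", "pass"),
    ("low", "no", "fail"),
    ("low", "no", "fail"),
    ("low", "yes", "fail"),
    ("high", "no", "fail"),
]

def train(rows):
    """Count class frequencies and per-class feature value frequencies."""
    class_counts = Counter(label for *_, label in rows)
    feature_counts = defaultdict(Counter)  # key: (feature index, label)
    for *features, label in rows:
        for i, value in enumerate(features):
            feature_counts[(i, label)][value] += 1
    return class_counts, feature_counts

def posterior(features, class_counts, feature_counts):
    """Return normalized P(label | features) under the naive assumption."""
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(label)
        for i, value in enumerate(features):
            # Naive step: treat features as independent given the label.
            score *= feature_counts[(i, label)][value] / count
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}

class_counts, feature_counts = train(rows)
probs = posterior(("high", "yes"), class_counts, feature_counts)
print(probs)
```

For the query `("high", "yes")` the pass score is 0.5 × 3/4 × 3/4 and the fail score is 0.5 × 1/4 × 1/4, which normalizes to a pass probability of 0.9 on this toy table.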

SVM

A Support Vector Machine (SVM) is a discriminative classifier formally defined by a separating hyperplane.
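With a linear kernel, the separating hyperplane can be read directly off a fitted model as the weights `w` and bias `b` in `w · x + b = 0`. The following is a minimal sketch with scikit-learn's `SVC`. The toy data is invented, and the kernel choice is an assumption, since this README does not state which kernel `svm1.py` uses.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical (G1, G2) pairs with an invented pass (1) / fail (0) label.
X = np.array([[4, 5], [6, 5], [8, 9], [9, 8],
              [11, 12], [13, 12], [15, 16], [17, 18]], dtype=float)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Linear kernel assumed here so the hyperplane is directly inspectable.
clf = SVC(kernel="linear")
clf.fit(X, y)

w = clf.coef_[0]        # normal vector of the hyperplane
b = clf.intercept_[0]   # offset of the hyperplane
print("w =", w, "b =", b)

# The sign of w . x + b decides the predicted class.
scores = clf.decision_function([[16, 15], [4, 5]])
predictions = clf.predict([[16, 15], [4, 5]])
print(scores, predictions)
```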

K nearest neighbor

In pattern recognition, the k-nearest neighbors algorithm (k-NN) is a non-parametric method used for classification and regression.
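For classification, k-NN stores the training points and lets the k closest ones vote on the label of a new point. The sketch below implements that idea from scratch with NumPy rather than through scikit-learn. The function name `knn_predict`, the toy data, and the choice of Euclidean distance with `k=3` are assumptions for illustration.

```python
from collections import Counter

import numpy as np

# Hypothetical (G1, G2) pairs with an invented pass (1) / fail (0) label.
X_train = np.array([[4, 5], [6, 5], [8, 9], [9, 8],
                    [11, 12], [13, 12], [15, 16], [17, 18]], dtype=float)
y_train = np.array([0, 0, 0, 0, 1, 1, 1, 1])

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Euclidean distance from x to every training point.
    distances = np.linalg.norm(X_train - x, axis=1)
    # Indices of the k smallest distances.
    nearest = np.argsort(distances)[:k]
    votes = Counter(int(y_train[i]) for i in nearest)
    return votes.most_common(1)[0][0]

high = knn_predict(X_train, y_train, np.array([16.0, 15.0]))
low = knn_predict(X_train, y_train, np.array([4.0, 5.0]))
print(high, low)
```

There is no training step beyond storing the data, which is what "non-parametric" refers to here: the model does not fit a fixed set of parameters.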

We implemented our algorithms in Python, using the following libraries and packages to build the classifiers:

Numpy
Pandas
Scikit-learn
Matplotlib

Highest Accuracy achieved = 80%
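An accuracy figure like this is typically obtained by holding out part of the data and scoring predictions on it. The sketch below shows one way to compare the four classifiers on a common train/test split. It runs on synthetic data with an assumed label rule, uses default hyperparameters, and does not reproduce the 80% result or the exact evaluation used in this project.

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(42)

# Synthetic stand-ins for two grade columns; NOT the real dataset.
X = rng.integers(0, 21, size=(300, 2)).astype(float)
# Assumed noisy pass/fail rule, used only so the models have something to learn.
y = (X.mean(axis=1) + rng.normal(0, 2, 300) >= 10).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "Decision tree": DecisionTreeClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(),
    "K nearest neighbor": KNeighborsClassifier(),
}

# Fit each model on the training split and score it on the held-out split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    results[name] = accuracy_score(y_test, model.predict(X_test))

for name, acc in results.items():
    print(f"{name}: {acc:.2f}")
```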
