SiddhanthM8055/Spark-Streaming-for-Machine-Learning

Spark-Streaming-for-Machine-Learning

Machine Learning with Spark MLlib is one of the project titles offered as part of the UE19CS322 Big Data course at PES University. It simulates a real-world scenario in which an enormous amount of data is used for predictive modelling. The data source is a stream, and the application is constrained to handling only one batch of the stream at any given point in time.

This project uses Spark to train ML algorithms that classify mails, based on the subject and body of each mail, as either spam or ham (non-spam). We have used three supervised learning models, MLP (Multilayer Perceptron), MNB (Multinomial Naive Bayes) and PAC (Passive Aggressive Classifier), and one unsupervised learning algorithm, Mini Batch K-means clustering.
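Because only one batch of the stream is available at a time, each model has to be updated incrementally rather than refit on the full dataset. The sketch below illustrates that idea with a toy multinomial Naive Bayes written in plain Python. It is not the project's code: the class `IncrementalMultinomialNB`, its `partial_fit` method, and the sample mails are hypothetical names and data chosen for illustration, and the whitespace tokenizer is a simplifying assumption.

```python
import math
from collections import Counter, defaultdict


class IncrementalMultinomialNB:
    """Toy multinomial Naive Bayes that learns one batch at a time (hypothetical helper)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing strength
        self.class_counts = Counter()            # number of mails seen per label
        self.word_counts = defaultdict(Counter)  # label -> word -> occurrence count
        self.vocabulary = set()                  # every word seen so far

    @staticmethod
    def tokenize(text):
        # Assumption: lowercase whitespace splitting is enough for the illustration.
        return text.lower().split()

    def partial_fit(self, texts, labels):
        """Update the counts from a single batch; earlier batches are never revisited."""
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for word in self.tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)
        return self

    def predict(self, text):
        """Return the label with the highest smoothed log-posterior."""
        total_mails = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_mails in self.class_counts.items():
            score = math.log(n_mails / total_mails)
            total_words = sum(self.word_counts[label].values())
            denominator = total_words + self.alpha * vocab_size
            for word in self.tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + self.alpha) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Two made-up batches standing in for consecutive batches of the stream.
batches = [
    (["win free money now", "meeting agenda for monday"], ["spam", "ham"]),
    (["free prize claim now", "project report attached"], ["spam", "ham"]),
]

model = IncrementalMultinomialNB()
for texts, labels in batches:
    model.partial_fit(texts, labels)  # only the current batch is in memory

print(model.predict("claim free money"))
print(model.predict("monday project meeting"))
```

The model keeps only running counts, so memory use depends on the vocabulary size and not on how many batches have been processed. The same one-batch-at-a-time update pattern applies to the other models listed above, although their update rules differ.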

Contributors 4
