(See each project folder for a detailed description and the requirements for running the scripts.)

The project descriptions are as follows:

Indeed posts classifier: (ASU) Jan-May 2020

• Scraped Indeed job postings with Beautiful Soup to build training data for an NLP classification model.
• Pre-processed data by removing stop words, punctuation, noise, etc.
• Achieved a cross-validation score of 86 percent after grid search.
• Used the trained model to classify the postings scraped for each company into their respective functional areas.
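
The pre-processing step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the stop-word list, the `preprocess` name, and the sample posting are all assumptions made for the example.

```python
import string

# Hypothetical, deliberately tiny stop-word list; the project's real list is not shown here.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "with", "is", "are"}

def preprocess(text):
    """Lowercase the text, strip ASCII punctuation, and drop stop words."""
    no_punct = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [token for token in no_punct.split() if token not in STOP_WORDS]

# Made-up posting text, used only to show the shape of the output.
tokens = preprocess("Seeking a Data Engineer, with experience in Python & SQL!")
print(tokens)
```

The resulting token lists would then typically be vectorized before being passed to a classifier and a grid search.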

Movie Recommendation System: (ASU) Aug-Dec 2019

• Created a recommendation model that suggests products to users to boost sales.
• Trained the model using matrix factorization and content-based filtering to learn the user and movie vectors.
• Tuned hyperparameters using grid search and compared the tuned model with other ML models.
• Evaluated the model against the other trained models using RMSE.
• Built a UI that displays recommended movies for a given movie.
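
As a rough sketch of the matrix factorization and RMSE evaluation mentioned above, the following learns user and movie vectors with plain stochastic gradient descent. It covers only the matrix factorization part, not content-based filtering, and the ratings, hyperparameter values, and function names are assumptions for illustration rather than the project's settings.

```python
import math
import random

def rmse(ratings, user_vecs, item_vecs):
    """Root-mean-square error of dot-product predictions over the known ratings."""
    squared_error = 0.0
    for u, i, r in ratings:
        pred = sum(a * b for a, b in zip(user_vecs[u], item_vecs[i]))
        squared_error += (r - pred) ** 2
    return math.sqrt(squared_error / len(ratings))

def factorize(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02, epochs=500, seed=0):
    """Learn k-dimensional user and item vectors by SGD on squared error with L2 regularization."""
    rng = random.Random(seed)
    user_vecs = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_users)]
    item_vecs = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pu, qi = user_vecs[u], item_vecs[i]
            err = r - sum(a * b for a, b in zip(pu, qi))
            for f in range(k):
                pu_f, qi_f = pu[f], qi[f]  # keep old values so both updates use them
                pu[f] += lr * (err * qi_f - reg * pu_f)
                qi[f] += lr * (err * pu_f - reg * qi_f)
    return user_vecs, item_vecs

# Made-up (user, movie, rating) triples; not taken from the project's dataset.
ratings = [(0, 0, 5), (0, 1, 3), (1, 0, 4), (1, 2, 1), (2, 1, 2), (2, 2, 5), (3, 0, 1), (3, 2, 4)]

untrained = factorize(ratings, n_users=4, n_items=3, epochs=0)
trained = factorize(ratings, n_users=4, n_items=3)
print("RMSE before training:", rmse(ratings, *untrained))
print("RMSE after training:", rmse(ratings, *trained))
```

A grid search would repeat `factorize` over combinations of `k`, `lr`, and `reg` and keep the setting with the lowest RMSE on held-out ratings.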

Image Recommendation: (ASU) Aug-Dec 2019

• Modeled an image recommendation system that recommends similar users and images based on TF, DF, TF-IDF, and global color histograms, backed by a NoSQL database.
• Reduced the dimensionality of the features and used Mahalanobis distance to measure similarity.
• Built clusters of similar images from an image-image graph using spectral and normalized-cut partitioning algorithms.
• Visualized the K most dominant images for a given input, retrieving relevant images with personalized PageRank and locality-sensitive hashing.
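
To illustrate the personalized PageRank step on an image-image graph, here is a small power-iteration sketch. The graph, the image IDs, and the damping value are hypothetical, and the project's own implementation may differ.

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power iteration in which the random walk restarts only at the seed nodes."""
    nodes = list(graph)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            neighbors = graph[n]
            if neighbors:
                share = damping * rank[n] / len(neighbors)
                for m in neighbors:
                    nxt[m] += share
            else:
                # Dangling node: send its score back to the seeds.
                for s in seeds:
                    nxt[s] += damping * rank[n] / len(seeds)
        rank = nxt
    return rank

# Hypothetical image-image similarity graph (adjacency lists keyed by image ID).
graph = {
    "img_a": ["img_b", "img_c"],
    "img_b": ["img_a"],
    "img_c": ["img_a", "img_d"],
    "img_d": ["img_c"],
}

scores = personalized_pagerank(graph, seeds={"img_a"})
k_dominant = sorted(scores, key=scores.get, reverse=True)[:3]
print(k_dominant)
```

Sorting the scores and keeping the first K entries gives the K most dominant images relative to the chosen seed images.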

American Sign Language Recognition: (ASU) Aug-May 2018

• Engineered a classification model to classify sign gestures from key-point data generated from video.
• Calibrated every frame in a gesture against its fixed points, extracted frequency and temporal features with FFT and DWT, and kept the features with the most variance.
• Achieved 84 percent classification accuracy and an F1 score of 0.6 on the test data using a set of trained neural networks; the other models evaluated were random forest, SVM, and logistic regression.
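
The "keep the features with the most variance" step can be sketched as below. This is only an illustration under assumptions: the feature matrix is made up, `top_variance_features` is a hypothetical helper, and the FFT and DWT feature extraction that would precede it is not shown.

```python
from statistics import pvariance

def top_variance_features(rows, keep):
    """Return the indices of the `keep` highest-variance columns and the reduced rows."""
    columns = list(zip(*rows))
    variances = [pvariance(col) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: variances[j], reverse=True)
    kept = sorted(ranked[:keep])  # preserve the original column order
    return kept, [[row[j] for j in kept] for row in rows]

# Made-up feature matrix: one row per gesture sample, one column per extracted feature.
features = [
    [1.0, 10.0, 0.5],
    [1.1, 20.0, 0.5],
    [0.9, 30.0, 0.6],
]

kept, reduced = top_variance_features(features, keep=2)
print(kept, reduced)
```

Features on very different scales would normally be standardized first, since raw variance otherwise favors the columns with the largest units.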

Amazon Analytics Website: (ASU) Jan-May 2019

• Pre-processed the Amazon dataset and classified products by category.
• Analyzed the data with PySpark to find the top 10 products and the positive and negative words associated with products.
• Displayed the analytics as interactive sunburst, word cloud, and bubble chart visualizations.
• Hosted the website on AWS.
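
The top-products aggregation can be illustrated with a plain-Python stand-in for the PySpark job. The records are made up, and ranking by review count is an assumption, since the metric behind "top 10" is not stated above.

```python
from collections import Counter

# Hypothetical (product_id, rating) review records; not taken from the Amazon dataset.
reviews = [("B001", 5), ("B002", 4), ("B001", 3), ("B003", 5), ("B001", 4), ("B002", 2)]

def top_products(records, n=10):
    """Return the n most-reviewed product IDs with their review counts."""
    counts = Counter(product_id for product_id, _rating in records)
    return counts.most_common(n)

print(top_products(reviews, n=2))
```

In PySpark the same idea would usually be expressed as a group-by on the product ID followed by a count, a descending sort, and a limit.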