This project compares public sentiment toward two actresses, Anne Hathaway and Jennifer Lawrence, using tweets collected through the Twitter API. We trained an XGBoost model to classify the sentiment of tweets and visualized the results using d3.js.
The training dataset should be a CSV file with 3 columns: Id, Category, Tweet. The test dataset should be a CSV file with 2 columns: Id, Tweet.
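As a quick sanity check before running the notebook, the column layout described above can be verified with a few lines of Python. This is only a minimal sketch using the standard library: the helper name `read_rows`, the sample rows, and the `positive`/`negative` label values are hypothetical and are not taken from the project's data.

```python
import csv
import io

# Column layouts as described in this README.
TRAIN_COLUMNS = ["Id", "Category", "Tweet"]
TEST_COLUMNS = ["Id", "Tweet"]


def read_rows(handle, expected_columns):
    """Read a CSV from an open text handle and check that its header
    matches the expected column names exactly (hypothetical helper)."""
    reader = csv.DictReader(handle)
    header = list(reader.fieldnames or [])
    if header != expected_columns:
        raise ValueError(f"expected columns {expected_columns}, got {header}")
    return list(reader)


# Made-up sample data standing in for a real training file.
sample_train = io.StringIO(
    "Id,Category,Tweet\n"
    "1,positive,Loved the new movie\n"
    '2,negative,"Not great, honestly"\n'
)
train_rows = read_rows(sample_train, TRAIN_COLUMNS)
print(len(train_rows), train_rows[1]["Tweet"])
```

For a real file, pass a handle opened with `open(path, newline="", encoding="utf-8")` instead of the in-memory sample, and use `TEST_COLUMNS` for the test set.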
The main code is in prepocessing&classfier/sentiment_analysis.ipynb. The pre-trained word2vec model and the datasets are very large, so they are not included here.
The visualization code and a sample data file are in /visualization.
Visit our project website for more info: https://bigdatabeauty.weebly.com/