What is in this repository

Product/Service Monitoring System

This project was inspired sentdex. (https://github.com/Sentdex/socialsentiment/)

What is in this repository

Jupyter Notebooks

product_service_monitoring_system.ipynb: contains training and comparing of various models including BERT, TextBlob, and VADER models.

Python Files

twitter_stream.py: connects to Twitter API and streamline tweets with various keywords
tweet_analyzer.py: cleans and analyzes streamed tweets
app.py: Dash dashboard

Folders

images: contains image files
models: contains (1) BERT NLP and (2) spaCy TF-IF vectorization sentiment analysis models *(BERT model excluded)
data: contains SQLite3 databases pulled from Twitter using Tweepy (only sample database is included) and bag_of_words
datasets: contains datasets that were used to train various NLP sentiment analysis models
src: contains useful codes that were used in creating models
keys: contains Twitter API key information *(files excluded)

How to use this project

Add Twitter API information in keys directory, and make sure the PATH is correctly defined in twitter_stream.py.
Adjust any keywords or queries in twitter_stream.py.
Set up SQLITE3 database path and run twitter_stream.py.
Download BERT_2 model using this test and store it in a directory of your choice.
In tweet_analyzer.py: 1) correctly set up SQLITE3 database path that matches step #3 and 2) correctly set up pathway for BERT_2 model.
Run tweet_analyzer.py
Run app.py

Introduction

Business Case

As of January of 2020, there are approximately 145 million users on Twitter. 22% of Americans are on Twitter and 500 million tweets are sent each day globally. This is why many companies internationally use Twitter for marketing. In fact 80% of Twitter users have mentioned a brand in a tweet, and 77% of Twitter users feel more positive when their tweets have been replied by the mentioned brand [1].

We believe that Twitter is one of the platforms that provides people to share their opinion, evaluations, attitudes, and emotions about virtually anything including certain products freely, and for any companies, this is like a gold mine waiting to be mined for opinions are central to almost all human activities and are key influencers of our behaviors [2]. Sources:

[1] https://unsplash.com/photos/ulRlAm1ITMU

[2]. Bing Liu, https://www.morganclaypool.com/doi/abs/10.2200/s00416ed1v01y201204hlt016

Goals

The goals of this project are to

[1] stream Twitter with on various topics (ex. Microsoft, Starbucks, Google, etc.) and
[2] effectively implement various NLP models (including TextBlob, BERT, and TF-IDF models) to classify tweets
[3] to flag the user for any strongly negative tweets on a Dashboard to effectively respond to them.

Conclusion

Models Used

BERT model has accuracy of 84%, however due to its computing time, VADER was used as an initial model to classify polarity of each tweet. The BERT model was used to confirm any strongly negative sentiment tweets classified by VADER.

Dashboard

Keyword You are able to select up to two keywords to monitor online. If you choose to choose two keywords, you must separate them by a comma!

*if no keyword is selected, then it streams

Graph1 This graph represents moving average value of sentiments towards the keywords you have selected along with number of tweets sent.

Graph2 This pie graph represents sentiment distribution towards your keyword.

By clicking Generate Word Cloud, it generates frequently used words in tweets related to the keywords for each sentiment.

Table1 This table shows recent tweets sent filtered by the keywords chosen. You are able to click on link button to access the actual tweet.

Table2 This table shows recent flagged tweets that are strongly negative.

Saving data into .csv file You must click GENERATE CSV FILE before you can download the files. You can either download the whole raw tweet data or just the flagged ones.

Live Dashboard

Currently working on deploying the app through Heroku!

Future Work

Database structure
- As the database size increases, it might get too much for sqlite3 to handle. So, we want to separate database into parts so only when a large number of data is requested, we can combine or join smaller databases.
Reply feature
- It would be nice if we could add a feature where you could reply to any of tweets shown in the dashboard without going to twitter page.
Multiple keywords
- It would be nice if multiple keywords can be analyzed and followed at a given time.
Table Editing Mode
- Modify flagged tweets in the dashboard so that a user can classify flagged tweets as 'resolved', 'false negative', or 'other' for further customer service / data analysis.
Further analysis on both positive and negative tweets
- Find any correlation between tweet trends with how the company is doing to help with future direction of a company.

Name		Name	Last commit message	Last commit date
Latest commit History 49 Commits
data		data
datasets		datasets
images		images
models		models
src		src
.DS_Store		.DS_Store
.gitignore		.gitignore
README.ipynb		README.ipynb
README.md		README.md
ShowCasePresentation.pptx		ShowCasePresentation.pptx
app.py		app.py
capstone_twitter_analysis.pdf		capstone_twitter_analysis.pdf
capstone_twitter_analysis.pptx		capstone_twitter_analysis.pptx
note.ipynb		note.ipynb
product_service_monitoring_system.ipynb		product_service_monitoring_system.ipynb
requirements.txt		requirements.txt
tweet_analyzer.py		tweet_analyzer.py
twitter_stream.py		twitter_stream.py
twitter_stream_v2.py		twitter_stream_v2.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

What is in this repository

Introduction

Business Case

Goals

Conclusion

Models Used

Dashboard

Live Dashboard

Future Work

About

Releases

Packages

Languages

singsang2/capstone-twitter-analysis

Folders and files

Latest commit

History

Repository files navigation

What is in this repository

Introduction

Business Case

Goals

Conclusion

Models Used

Dashboard

Live Dashboard

Future Work

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages