A dashboard for viewing character-targeted sentiment analysis and story stats.
It uses Selenium and BeautifulSoup for scraping comments, Pandas and NLTK for data preprocessing and sentiment analysis, and Streamlit and Plotly Express for the dashboard and visualisation. You can find the deployed website here!
- The `scrape.py` files use Selenium and BeautifulSoup to scrape around 35k comments from over 100 chapters of my work on Wattpad, here.
- The `preprocess.py` files clean and preprocess the comments, then break each one down into sentiment scores using NLTK's VADER lexicon. VADER relies on a dictionary that maps lexical features to emotion intensities known as sentiment scores.
- Most computations are performed on the Compound sentiment score. The Compound score of a text is obtained by summing the intensity of each word in the text and normalising the result to lie between -1 (most negative) and 1 (most positive).
- The `sentiment.py` file creates the dashboard, which visualises stats and inferences.
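
To make the Compound score concrete, here is a minimal sketch of the "sum, then normalise" idea. It is not the code in `preprocess.py`. The lexicon below is a tiny hypothetical stand-in for VADER's real dictionary, and the function name `compound_score` is made up for illustration. The real VADER also adjusts for negation, punctuation, capitalisation and intensifiers, which this sketch ignores.

```python
import math

# Hypothetical toy lexicon (word -> intensity). The real VADER lexicon
# contains thousands of rated entries; these values are illustrative only.
TOY_LEXICON = {"love": 3.2, "great": 3.1, "sad": -2.1, "hate": -2.7}


def compound_score(text, lexicon=TOY_LEXICON, alpha=15.0):
    """Sum the intensity of each known word, then squash the sum into [-1, 1].

    alpha is the normalisation constant (VADER uses 15 by default).
    """
    words = (w.strip(".,!?") for w in text.lower().split())
    total = sum(lexicon.get(w, 0.0) for w in words)
    return total / math.sqrt(total * total + alpha)


if __name__ == "__main__":
    for comment in ["I love this great chapter!", "I hate cliffhangers.", "Chapter twelve."]:
        print(f"{compound_score(comment):+.3f}  {comment}")
```

A comment with no known words scores 0, and the score approaches 1 or -1 as more strongly rated words pile up in one direction.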
- selenium and beautifulsoup: data mining.
- pandas: formatting and cleaning the data.
- nltk: sentiment analysis using VADER.
- plotly express: visualisations.
- streamlit: web framework.
The live project is deployed on https://share.streamlit.io/rubyruins/sentifluent/sentiment.py.
You must have Python 3.6 or higher to run the file.
- Create a new virtual environment for running the application. You can follow the instructions here.
- Navigate to the virtual environment and activate it.
- Install the dependencies using `pip install -r requirements.txt`.
- Run the `sentiment.py` file with `streamlit run sentiment.py`.
Note: to run the scraping and preprocessing scripts locally, you must have a version of `chromedriver.exe` that matches the version of Chrome installed on your device.
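
One rough way to check the match is to compare the major version numbers reported by Chrome and by chromedriver. The sketch below is not part of this project. The helper names are hypothetical, the sample version strings are made up, and comparing only the major version is a simplifying assumption (some chromedriver releases need a closer match).

```python
import re


def major_version(version_output):
    """Return the first major version number found in a version string."""
    match = re.search(r"(\d+)\.\d+", version_output)
    if match is None:
        raise ValueError(f"no version number in {version_output!r}")
    return int(match.group(1))


def versions_match(chrome_output, chromedriver_output):
    """True when both version strings report the same major version."""
    return major_version(chrome_output) == major_version(chromedriver_output)


if __name__ == "__main__":
    # Illustrative strings only; paste in what your own Chrome
    # ("About Chrome" page) and chromedriver ("--version") report.
    chrome = "Google Chrome 114.0.5735.199"
    driver = "ChromeDriver 114.0.5735.90"
    print("match" if versions_match(chrome, driver) else "mismatch")
```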