Social Media Data Science Pipeline (CS 515)

Project Implementation

Group

the_gladiators

Team Members

Nikita Mandlik ([email protected])
Brinda Eshwar ([email protected])
Kanishk Bharampurkar ([email protected])
Harshad Bhandwaldar ([email protected])

Introduction

Data collected through the Twitter API and the Reddit API is used to study public opinion on the economic crisis, inflation, and news of an upcoming recession. The goal is to predict the impact of the upcoming recession on the community, and to analyze insights drawn from the public-opinion datasets (Reddit and Twitter) alongside news articles collected from the New York Times API.

Data Flow Diagram

Data Sources

Twitter API
Reddit API
New York Times API

How to use?

Scraper: the app folder contains the Twitter, Reddit, and New York Times API scrapers, which collect data and store it in the database.
UI: the stored data can be viewed through a FastAPI application. The API currently only displays data from the database.
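
The scrape-then-store flow described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's code: it uses the standard-library sqlite3 module as a stand-in for the project's database, and the table name `posts`, its columns, and the functions `init_db`, `store_posts`, and `fetch_posts` are hypothetical.

```python
import sqlite3


def init_db(conn):
    """Create the (hypothetical) posts table if it does not exist yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "source TEXT NOT NULL, "
        "post_id TEXT NOT NULL, "
        "body TEXT, "
        "PRIMARY KEY (source, post_id))"
    )


def store_posts(conn, source, posts):
    """Insert scraped posts, skipping ones already stored.

    Returns the number of rows that were actually added.
    """
    before = conn.total_changes
    conn.executemany(
        "INSERT OR IGNORE INTO posts (source, post_id, body) VALUES (?, ?, ?)",
        [(source, p["id"], p["text"]) for p in posts],
    )
    conn.commit()
    return conn.total_changes - before


def fetch_posts(conn, source):
    """Read back stored posts for one source, as a display layer might."""
    rows = conn.execute(
        "SELECT post_id, body FROM posts WHERE source = ? ORDER BY post_id",
        (source,),
    )
    return [{"id": row[0], "text": row[1]} for row in rows]


# Example run with made-up sample data (an in-memory database, nothing is persisted).
conn = sqlite3.connect(":memory:")
init_db(conn)
sample = [
    {"id": "t1", "text": "inflation is rising"},
    {"id": "t2", "text": "recession fears"},
]
added_first = store_posts(conn, "twitter", sample)
added_again = store_posts(conn, "twitter", sample)  # duplicates are ignored
```

Keying rows on (source, post_id) means a scraper can be re-run without creating duplicate rows, which is one reasonable design when each scraper runs independently on its own schedule.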

Building application

Building env: install Poetry and set up the virtual environment.

$ sh build.sh

Running application (run each scraper independently):

$ python3 /app/twitter/scrapper_twitter.py
$ python3 /app/reddit/scrapper_reddit.py
$ python3 /app/nyt/scrapper_nyt.py

Running UI:

$ cd ui/
$ uvicorn main:app --reload

Go to http://localhost:8000/docs to open the UI.

Configuration

API keys and database credentials can be changed in config.py, located at the root of each app.
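
One common way to keep credentials out of version control is to have a config module read them from environment variables. The sketch below assumes that approach. The setting names in `REQUIRED_KEYS` and the helper `load_config` are hypothetical and are not taken from the project's config.py, which may define its values differently.

```python
import os

# Hypothetical setting names; the project's config.py may use different ones.
REQUIRED_KEYS = ("API_KEY", "DB_HOST", "DB_USER", "DB_PASSWORD", "DB_NAME")


def load_config(env=None):
    """Collect required settings from a mapping (os.environ by default).

    Raises KeyError listing every setting that is missing or empty,
    so a misconfigured scraper fails before it starts collecting data.
    """
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED_KEYS if not env.get(key)]
    if missing:
        raise KeyError("missing settings: " + ", ".join(missing))
    return {key: env[key] for key in REQUIRED_KEYS}
```

Passing a plain dictionary instead of relying on `os.environ` makes the function easy to exercise without touching real credentials.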

System requirement

Name	Requirement
Memory	8 GB
OS	Linux

References

[1] Twitter API Documentation. https://developer.twitter.com/en/docs/twitter-api

[2] Reddit API Documentation. https://www.reddit.com/dev/api/

[3] NYTimes API Documentation. https://developer.nytimes.com/apis

[4] Article Search API Documentation. https://developer.nytimes.com/docs/articlesearch-product/1/overview

[5] Docker Documentation. https://docs.docker.com/get-started/overview/

[6] Poetry Documentation. https://python-poetry.org/docs/

[7] MySQL Documentation. https://dev.mysql.com/doc/

