Reddit Scraper For Stock Picks

Summary

This repository contains python scripts which will scrape r/Stocks, r/Investing, and r/StockMarket. The goal is to scrape comments where there has been a stock ticker identified within the comment. The goal is to find redditors who consistently suggest stocks which provide high returns.

Medium Article

I wrote a medium article which explains the inspiration behind this project. The article can be found in the link below.

https://medium.com/@adhamsuliman/redditors-vs-the-market-a0e0833081c1

How to get up and running

Reddit Application

In order to use this repo, users must create a reddit application. Felippe Rodrigues wrote a great article on how to go about getting those credentials. The link to that article is provided below.

https://www.storybench.org/how-to-scrape-reddit-with-python/

Python Module Requirements

All the libraries used in this example can be installed using pip with the requirements.txt file included. Open any terminal or command prompt and type in the following line. Use conda install if working with a conda environment.

pip install requirements.txt

Current Processing Methodology

Scrape Reddit
1. Scrape new posts for r/stocks, r/investing, r/stockmarket.
2. If a new user is identified when scraping, scrape all of that user's comments.
3. Use Named Entity Recogntion (NER) to identify stocks within comments
4. Use vader sentiment to determine how positive a comment is.
Find stock prices
1. Use yfinance to pull data for all stocks found.
2. Calculate the return if a redditor had bought one share the day that he/she posted the initial commment.
3. Calculate the average daily return by comment where only the number of weekdays are taken into consideration.
4. Find the average daily return by user.

Dashboard

There is a dashboard which looks at this data which can be found in the following link. https://datastudio.google.com/reporting/7d5c20ed-b6a6-4162-81ba-062bf869c5c0/page/z9ScB

Page 1

The table on the top right list redditors with the number of comments found with a ticker in it and the average daily return of all their picks since the day of each comment. There are multiple filters that can be used for the table on the bottom half of the page. The filters available include by date range, by redditor, by stock, and by comment positivity. The positivity score was pulled using the vader sentiment module which was trained on a large twitter corpus.

Page 2

This line graph looks at ticker mention by day. Users can filter by date range, ticker, redditor, and comment positivity.

Page 3

This table aims to identify who was the first redditor to mention a certain ticker. A disclaimer to note is that the scraper is only run on the top comments for 2020, and it has been scraping all new posts on r/stocks, r/investing, and r/stockmarket since late August.

Name		Name	Last commit message	Last commit date
Latest commit History 21 Commits
data		data
images		images
reddit_api		reddit_api
scripts		scripts
README.md		README.md
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Reddit Scraper For Stock Picks

Summary

Medium Article

How to get up and running

Reddit Application

Python Module Requirements

Current Processing Methodology

Dashboard

Page 1

Page 2

Page 3

About

Releases

Packages

Languages

asuliman17/reddit_stocks

Folders and files

Latest commit

History

Repository files navigation

Reddit Scraper For Stock Picks

Summary

Medium Article

How to get up and running

Reddit Application

Python Module Requirements

Current Processing Methodology

Dashboard

Page 1

Page 2

Page 3

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages