Following a disaster, response organizations typically receive millions of communications at exactly the time they have the least capacity to handle them, and often only about one in a thousand messages requires a response. In this project, we created a machine learning model that classifies these communications so that they can be directed to the right response organization.
This project is submitted in partial fulfillment of the Data Scientist Nanodegree Program from Udacity.
- Create a virtual environment (e.g. `python -m venv .venv` on Windows) and activate it.
- Install the dependencies with `pip install -r requirements.txt`.
- Run `python process_data.py disaster_messages.csv disaster_categories.csv DisasterResponse.db` in the `data` directory to run the ETL pipeline that loads the messages and categories, cleans the data, and stores it in an SQLite database.
- Run `python train_classifier.py ../data/DisasterResponse.db classifier.pkl` in the `models` directory to train the model and create a pickle file.
- Run `python run.py` in the `app` directory to launch the web application that uses this pickle file.
`data/process_data.py` loads the data from `disaster_categories.csv` and `disaster_messages.csv`, performs the transformations needed to clean and merge the two datasets, and loads the result into the SQLite database `DisasterResponse.db`.
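As a rough sketch, the core of that ETL step might look like the following, assuming pandas and SQLAlchemy; the table name and the `category-0/1` label format are assumptions about the raw data, not the exact implementation.

```python
import sys

import pandas as pd
from sqlalchemy import create_engine


def load_and_clean(messages_path, categories_path):
    """Merge the messages and categories CSVs and expand the category labels."""
    messages = pd.read_csv(messages_path)
    categories = pd.read_csv(categories_path)
    df = messages.merge(categories, on="id")

    # The categories column packs labels like "related-1;request-0;..." into
    # a single string; split it into one binary column per category (assumed format).
    categories = df["categories"].str.split(";", expand=True)
    categories.columns = [c.split("-")[0] for c in categories.iloc[0]]
    for column in categories:
        categories[column] = categories[column].str[-1].astype(int)

    df = df.drop(columns="categories").join(categories)
    return df.drop_duplicates()


if __name__ == "__main__":
    messages_path, categories_path, db_path = sys.argv[1:4]
    df = load_and_clean(messages_path, categories_path)
    engine = create_engine(f"sqlite:///{db_path}")
    df.to_sql("messages", engine, index=False, if_exists="replace")
```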
`models/train_classifier.py` loads the data from the SQLite database, splits the dataset into training and test sets, builds a text processing and machine learning pipeline, trains and tunes a model using GridSearchCV, reports results on the test set, and exports the final model as a pickle file.
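A sketch of the kind of pipeline this could build, assuming scikit-learn; the column names, table name, and parameter grid are illustrative rather than the exact implementation.

```python
import pickle

import pandas as pd
from sqlalchemy import create_engine
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Load the cleaned data written by the ETL step (table and column names assumed).
engine = create_engine("sqlite:///../data/DisasterResponse.db")
df = pd.read_sql_table("messages", engine)
X = df["message"]
y = df.drop(columns=["id", "message", "original", "genre"])

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# TF-IDF features feeding a multi-output random forest, one output per category.
pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(stop_words="english")),
    ("clf", MultiOutputClassifier(RandomForestClassifier())),
])

# A small illustrative grid; the real search space may differ.
param_grid = {"clf__estimator__n_estimators": [50, 100]}
model = GridSearchCV(pipeline, param_grid, cv=3)
model.fit(X_train, y_train)

print(f"Test score: {model.score(X_test, y_test):.3f}")

with open("classifier.pkl", "wb") as f:
    pickle.dump(model, f)
```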
`app/run.py` is a simple Flask app used to create a user interface for classifying new messages manually and to display visualizations about the data.
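A minimal sketch of such an app, assuming Flask; the route, port, and model path are illustrative.

```python
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)

# Load the trained pipeline exported by the training step (path assumed).
with open("../models/classifier.pkl", "rb") as f:
    model = pickle.load(f)


@app.route("/classify")
def classify():
    """Classify a message passed as a query parameter, e.g. /classify?query=..."""
    query = request.args.get("query", "")
    # Predicted 0/1 labels, aligned with the category columns used in training.
    labels = model.predict([query])[0].tolist()
    return jsonify(message=query, labels=labels)


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=3001, debug=True)
```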
The data used in this project comes from Figure Eight.