
SMS Spam Detection

This project detects spam in SMS messages using natural language processing (NLP) techniques and machine learning models.

The dataset includes SMS messages labeled as spam or ham (not spam).

Data Preprocessing:
- Text cleaning and normalization.
- Tokenization and stemming.
- Converting text data into numerical representations using techniques like TF-IDF.
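
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn is available, the `clean_text` helper and the sample messages are hypothetical, and stemming is left out to avoid an extra dependency.

```python
import re

from sklearn.feature_extraction.text import TfidfVectorizer


def clean_text(text: str) -> str:
    """Lowercase, replace non-alphanumeric characters with spaces, collapse whitespace."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    return re.sub(r"\s+", " ", text).strip()


# Hypothetical example messages, not taken from the project's dataset.
messages = [
    "WINNER!! Claim your FREE prize now, call 0800-123-456",
    "Hey, are we still meeting for lunch today?",
    "Free entry: text WIN to 80085 to claim your prize",
]

# TfidfVectorizer applies clean_text to each message, then tokenizes with
# its default pattern (words of two or more characters).
vectorizer = TfidfVectorizer(preprocessor=clean_text)
features = vectorizer.fit_transform(messages)

print(features.shape)  # (number of messages, vocabulary size)
```

A stemmer could be plugged in through the vectorizer's `tokenizer` argument if the notebook's pipeline uses one.
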

Exploratory Data Analysis (EDA):
- Visualizing the distribution of spam and ham messages.
- Analyzing common words and phrases in spam messages.
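
As a rough sketch of the two EDA steps above, the class distribution and the most frequent spam words can be counted with the standard library alone. The labeled sample here is hypothetical; the real notebook would read the SMS dataset instead, and may plot these counts rather than print them.

```python
from collections import Counter

# Hypothetical (label, text) pairs standing in for the real dataset.
data = [
    ("spam", "free prize claim now"),
    ("ham", "see you at lunch"),
    ("spam", "claim your free entry now"),
    ("ham", "call me when you are free"),
]

# Distribution of spam vs. ham messages.
label_counts = Counter(label for label, _ in data)

# Word frequencies restricted to spam messages.
spam_words = Counter(
    word
    for label, text in data
    if label == "spam"
    for word in text.split()
)

print(label_counts)
print(spam_words.most_common(3))
```
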

Model Building:
- Training various machine learning models like Naive Bayes, SVM, and Random Forest.
- Evaluating model performance using metrics such as accuracy, precision, recall, and F1-score.
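
One way the training and scoring loop might look, assuming scikit-learn. The toy messages, the split settings, and the hyperparameters are illustrative assumptions, not the project's configuration, and `LinearSVC` is used here as one possible SVM variant.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy data; the real project uses the labeled SMS dataset.
spam = [
    "Win a free prize now, claim today",
    "Free entry in a weekly draw, text WIN now",
    "Urgent! You have won a cash prize, call now",
    "Claim your free ringtone, reply YES",
    "Congratulations, you won a free holiday, call to claim",
    "Cash prize waiting, text CLAIM to collect now",
]
ham = [
    "Are we still meeting for lunch today?",
    "I will call you when I get home",
    "Can you pick up some milk on the way back?",
    "See you at the meeting tomorrow morning",
    "Thanks for the help with the report",
    "Running late, be there in ten minutes",
]
texts = spam + ham
labels = ["spam"] * len(spam) + ["ham"] * len(ham)

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.33, stratify=labels, random_state=0
)

models = {
    "Naive Bayes": MultinomialNB(),
    "SVM": LinearSVC(),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
}

results = {}
for name, clf in models.items():
    # Each model gets its own TF-IDF step so the vocabulary is fit on training data only.
    pipeline = make_pipeline(TfidfVectorizer(), clf)
    pipeline.fit(X_train, y_train)
    pred = pipeline.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, pred),
        "precision": precision_score(y_test, pred, pos_label="spam", zero_division=0),
        "recall": recall_score(y_test, pred, pos_label="spam", zero_division=0),
        "f1": f1_score(y_test, pred, pos_label="spam", zero_division=0),
    }

for name, scores in results.items():
    print(name, scores)
```

Scores on a dataset this small carry no meaning; the sketch only shows the shape of the loop.
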

Model Evaluation:
- Comparing different models.
- Selecting the best model based on evaluation metrics.
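
The comparison step could be sketched as below. It computes precision, recall, and F1 by hand for spam as the positive class, then picks the model with the highest F1. The model names and predictions are hypothetical placeholders, not results from this project, and choosing F1 as the selection criterion is an assumption; the README does not say which metric decides.

```python
def spam_metrics(y_true, y_pred):
    """Return precision, recall, and F1 with "spam" as the positive class."""
    tp = sum(t == "spam" and p == "spam" for t, p in zip(y_true, y_pred))
    fp = sum(t == "ham" and p == "spam" for t, p in zip(y_true, y_pred))
    fn = sum(t == "spam" and p == "ham" for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical true labels and predictions, for illustration only.
y_true = ["spam", "spam", "spam", "ham", "ham", "ham", "ham", "ham"]
predictions = {
    "model_a": ["spam", "spam", "ham", "ham", "ham", "ham", "ham", "ham"],
    "model_b": ["spam", "spam", "spam", "spam", "spam", "ham", "ham", "ham"],
}

scores = {name: spam_metrics(y_true, pred) for name, pred in predictions.items()}

# Select the model with the highest F1-score.
best = max(scores, key=lambda name: scores[name]["f1"])
print(best, scores[best])
```

In a spam filter, flagging a legitimate message is often costlier than missing a spam one, so precision may deserve more weight than F1 alone suggests.
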

To run this project, install the dependencies listed in requirements.txt (for example, with `pip install -r requirements.txt`), then execute the notebook.