The stakeholder is the telecom business, which wants to reduce the money lost to customers who do not stay very long. If we can accurately identify patterns in who leaves, the company can lower its churn rate and, as a result, reduce the costs of losing and then having to regrow its customer base.
The data set contains 3,333 entries. The target variable is the customer's churn status.
Overall, this data set is very clean: there are no missing values or NaNs to handle. We dropped the area code and phone number columns to protect customer privacy. We added the total domestic minutes, calls, and charges to gauge the effect of the total bill on the churn rate. Source: https://www.kaggle.com/becksddf/churn-in-telecoms-dataset
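The cleaning steps described above can be sketched in pandas. This is a minimal illustration, not the notebook's actual code: the rows are made-up values, the column names are assumed to follow the Kaggle data set's naming, and treating "domestic" as day plus evening plus night usage is an assumption.

```python
import pandas as pd

# Illustrative rows only (made-up values, not taken from the real data set).
# Column names are assumed to match the Kaggle CSV.
df = pd.DataFrame({
    "state": ["KS", "OH"],
    "area code": [415, 408],
    "phone number": ["000-0000", "000-0001"],
    "total day minutes": [200.0, 150.0],
    "total eve minutes": [100.0, 50.0],
    "total night minutes": [50.0, 100.0],
    "total day calls": [100, 80],
    "total eve calls": [90, 70],
    "total night calls": [80, 60],
    "total day charge": [34.0, 25.5],
    "total eve charge": [8.5, 4.25],
    "total night charge": [2.25, 4.5],
    "churn": [False, True],
})

# Count missing values across the whole frame (expected to be zero).
missing = int(df.isna().sum().sum())

# Drop the identifying columns for customer privacy.
df = df.drop(columns=["area code", "phone number"])

# Assumption: "domestic" means day + evening + night usage.
for measure in ("minutes", "calls", "charge"):
    parts = [f"total {period} {measure}" for period in ("day", "eve", "night")]
    df[f"total domestic {measure}"] = df[parts].sum(axis=1)
```

The `total domestic ...` column names are hypothetical; any naming works as long as it is used consistently when fitting the models.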
Based on our business problem, we want to minimize false negatives, so we focus on optimizing the model's recall score. In our context, a false positive is predicting that a customer will leave when they actually stay. A false negative is predicting that a customer will stay when they actually leave. A false negative is much worse for our stakeholders: missing customers who will leave defeats the purpose of analyzing this data set and developing the model, and it costs the company more money than a false positive.
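To make the metric concrete, recall is TP / (TP + FN): the share of customers who actually churn that the model catches. The sketch below counts the four outcomes for a small set of made-up labels (1 = churn, 0 = stay) and computes recall from them. The helper names `confusion_counts` and `recall` are hypothetical and exist only for this illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating 1 (churn) as the positive class."""
    tp = fp = fn = tn = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual and predicted:
            tp += 1  # predicted churn, customer left
        elif actual and not predicted:
            fn += 1  # predicted stay, customer left (the costly miss)
        elif not actual and predicted:
            fp += 1  # predicted churn, customer stayed
        else:
            tn += 1  # predicted stay, customer stayed
    return tp, fp, fn, tn


def recall(tp, fn):
    """Share of actual churners that were identified."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Made-up labels for illustration: 4 churners, 6 customers who stay.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
print(f"TP={tp} FP={fp} FN={fn} TN={tn} recall={recall(tp, fn):.2f}")
```

In this toy example one of the four churners is missed, so recall is 0.75. The false positive does not affect recall at all, which is why the metric fits a setting where missed churners are the expensive error.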
Models we tried:
- Logistic Regression
- Nearest Neighbor
- Decision Tree
- XGBoost
This is the baseline model, fit on the raw data.
This is the Logistic Regression model we used as a baseline.
This is our final model, a tuned Decision Tree.
Our final model produced an 87% recall score. It also identified the three most important features: customer service calls, the total price of the bill, and states with high churn.
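For readers who want to see how a decision tree can be tuned for recall and then inspected for feature importance, here is a rough sketch with scikit-learn. It runs on synthetic data from `make_classification`, not the telecom data, and the parameter grid is an assumption rather than the grid used in the notebook, so it will not reproduce the 87% figure.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the churn data (about 15% positives).
X, y = make_classification(
    n_samples=1000,
    n_features=8,
    n_informative=4,
    weights=[0.85, 0.15],
    random_state=42,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Hypothetical grid; the notebook's actual search space is not shown here.
param_grid = {
    "max_depth": [3, 5, 7],
    "min_samples_leaf": [1, 5, 10],
    "class_weight": [None, "balanced"],
}

# scoring="recall" makes the search pick the settings with the best
# cross-validated recall on the positive (churn) class.
search = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid,
    scoring="recall",
    cv=5,
)
search.fit(X_train, y_train)

best_tree = search.best_estimator_
test_recall = recall_score(y_test, best_tree.predict(X_test))

# Indices of the three features the tuned tree relies on most.
importances = best_tree.feature_importances_
top3 = importances.argsort()[::-1][:3]

print("best params:", search.best_params_)
print("test recall:", round(test_recall, 3))
print("top 3 feature indices:", top3.tolist())
```

On the real data, the indices in `top3` would be mapped back to column names to produce a feature list like the one reported above.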
- Offer a discount on the fourth customer service call
- Provide a monthly rate option to customers
- Target marketing campaigns at high-churn states
├── notebooks
│ ├──Dave
│ └──Mellissa
├── images
│ ├──CMatrix.JPG
│ ├──BaseMatrix.JPG
│ └──FirstChurn.JPG
├── README.md
├── presentation.pdf
└── notebook.ipynb