This repository holds the final project for the APM 4990 class. It has 6 files, including the following:
1: Analysis.ipynb. This is our analysis file.
2: Submission.ipynb. This is our prediction procedure.
3: sumbission.csv.zip. This is the final prediction file.
4: taxi_data.csv. This is our dataset, which contains over 160 million records.
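
With over 160 million records, taxi_data.csv may not fit in memory on a typical machine. The sketch below shows one way to stream the file in fixed-size chunks using only the Python standard library. It is an illustration, not the procedure used in the notebooks: the helper names `iter_chunks` and `count_rows` and the default chunk size are hypothetical, and it assumes the file has a header row.

```python
import csv
from itertools import islice


def iter_chunks(path, chunk_size=100_000):
    """Yield lists of row dicts, each holding at most chunk_size rows.

    Hypothetical helper: assumes the CSV at `path` has a header row.
    """
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        while True:
            # islice pulls the next chunk_size rows without loading the rest.
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                break
            yield chunk


def count_rows(path, chunk_size=100_000):
    """Count data rows (header excluded) while keeping one chunk in memory."""
    return sum(len(chunk) for chunk in iter_chunks(path, chunk_size))
```

For example, `count_rows("taxi_data.csv")` would report the number of records without reading the whole file at once. Per-chunk cleaning or feature computation could replace the `len(chunk)` call.
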
Xiaoyun Qin: Data cleaning, exploratory data analysis, feature selection, eXtreme Gradient Boosting.
Yi Nian: Data cleaning, exploratory data analysis, feature selection, Decision Tree.
Ting Cai: Data cleaning, exploratory data analysis, feature selection, Lasso Regression.
Xinze Liu: Data cleaning, exploratory data analysis, feature selection, Back-Propagation Neural Network.