This notebook performs exploratory data analysis on the Titanic data set, which was downloaded from the Kaggle website. Please click here to download the train data set if you want to follow along.
Multiple packages were used in the notebook. These packages were imported into Python 2.7.3. The packages used are:
- numpy
- pandas
- matplotlib
- seaborn
The first two packages are for computation and data analysis, and the other two are for data visualization.
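If you want to confirm that the four packages are available before following along, a minimal sketch like the one below may help. The aliases (`np`, `pd`, `plt`, `sns`) are the usual community conventions rather than something stated in this post, and `load_packages` is a hypothetical helper written only for this illustration.

```python
import importlib

# Packages listed above, paired with their conventional aliases.
# The aliases are an assumption (common convention), not taken from the notebook.
PACKAGES = [
    ("numpy", "np"),
    ("pandas", "pd"),
    ("matplotlib.pyplot", "plt"),
    ("seaborn", "sns"),
]


def load_packages(packages=PACKAGES):
    """Try to import each module; return (loaded, missing).

    loaded  -- dict mapping alias to the imported module
    missing -- list of module names that could not be imported
    """
    loaded, missing = {}, []
    for module_name, alias in packages:
        try:
            loaded[alias] = importlib.import_module(module_name)
        except ImportError:
            missing.append(module_name)
    return loaded, missing


loaded, missing = load_packages()
print("loaded aliases: {}".format(sorted(loaded)))
print("missing modules: {}".format(missing))
```

Any module reported as missing would need to be installed before the rest of the notebook can run. In the notebook itself, plain `import numpy as np` style statements are the more typical choice; the helper is just a quick way to check everything at once.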
Predicting the likelihood of survival with the scikit-learn package will be discussed in future posts.