Dataset link: https://www.drivendata.org/competitions/54/machine-learning-with-a-heart/
The file SamplePipeline.ipynb contains a baseline pipeline for working with the dataset.
- Fork this repo into your personal GitHub profile
- Ensure you can view/open the notebook, either by opening it on GitHub or by using https://nbviewer.jupyter.org/
NOTE: NBViewer requires the repo to be public
- Clone the repo to your local desktop/laptop and run the IPYNB file using Jupyter
NOTE: Make sure the dataset location matches the paths used in the notebook.
- Uncomment the lines within the EDA section of the notebook, then run the whole notebook again
NOTE: The pairplot will take some time
- Commit the changes with a commit message (mandatory) and push them to your personal repo
- Check your online GitHub profile to see whether the notebook renders the changes (or use nbviewer)
- Add code documentation to steps such as "Are there any missing data points?"
- The train/test split MUST be done BEFORE encoding and scaling, so that nothing learnt from the test rows leaks into the fitted encoders and scalers
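A minimal sketch of why the order matters, using only the Python standard library. The toy data and variable names are hypothetical and are not taken from SamplePipeline.ipynb. In the notebook itself, the same idea would typically mean calling scikit-learn's train_test_split first, then fitting the scaler or encoder on the training part only and applying it to the test part.

```python
import random
import statistics

# Hypothetical toy data: one numeric feature per row (not the competition dataset).
rows = [[float(v)] for v in range(10, 110, 10)]

# 1. Split FIRST, so the test rows stay unseen during preprocessing.
rng = random.Random(42)
indices = list(range(len(rows)))
rng.shuffle(indices)
cut = int(0.8 * len(rows))
train = [rows[i] for i in indices[:cut]]
test = [rows[i] for i in indices[cut:]]

# 2. Learn the scaling parameters from the training rows only.
train_values = [row[0] for row in train]
mean = statistics.mean(train_values)
std = statistics.pstdev(train_values)


def scale(data):
    """Standardise rows using the statistics learnt from the training set."""
    return [[(row[0] - mean) / std] for row in data]


# 3. Apply the same training-set parameters to both parts.
train_scaled = scale(train)
test_scaled = scale(test)
```

If the mean and standard deviation were computed on all rows before splitting, the test rows would influence the preprocessing and the evaluation score could look better than it really is.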
To keep your fork in sync with the original repo, see this article, which is where I learnt it from: https://digitaldrummerj.me/git-sync-fork-to-master/
In a nutshell, open the terminal in your working folder (the folder where your forked repo is cloned), then type
- git remote add upstream [original repo path].git
- git fetch upstream
- git merge upstream/master
- Resolve merge conflicts if any
- git push