This project predicts delays of US airline flights by analyzing flight characteristics, such as traffic and weather at different times of the year, using a previously available dataset.
Big data concepts (for analysis): Apache Spark and Hadoop.
Machine learning and statistics (for prediction): logistic regression, z-score.
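As a rough illustration of how the two techniques named above fit together, the sketch below standardizes a feature with the z-score and then fits a one-feature logistic regression by gradient descent. It is plain Python, not the notebook's Spark code; the helper names (`z_scores`, `train_logistic`), the learning rate, and the toy data are all assumptions made for this example.

```python
import math
from statistics import mean, pstdev


def z_scores(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All values are identical, so every z-score is defined as 0 here.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def sigmoid(t):
    """Logistic function mapping a real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-t))


def train_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit p(delayed) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical toy data (not from the project's dataset):
# departure delay in minutes, and whether the flight arrived late (1) or not (0).
dep_delay_minutes = [0, 5, 10, 20, 40, 60, 90, 120]
arrived_late = [0, 0, 0, 0, 1, 1, 1, 1]

standardized = z_scores(dep_delay_minutes)
w, b = train_logistic(standardized, arrived_late)

# Predicted probability of a late arrival for the shortest and longest delays.
p_short = sigmoid(w * standardized[0] + b)
p_long = sigmoid(w * standardized[-1] + b)
print(f"w={w:.3f}, b={b:.3f}, p_short={p_short:.3f}, p_long={p_long:.3f}")
```

Standardizing first keeps the feature on a comparable scale, which generally helps gradient descent converge; a real run would use many features and a library implementation rather than this hand-rolled loop.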
- Apache Spark configuration
- mpl_toolkits.basemap (for graph plotting)
If graph plotting raises an error, try:
pip install git+https://github.com/matplotlib/basemap
Step 1: Download the zip file
Step 2: Extract the file
Step 3: Upload the .ipynb file to Google Colab
Step 4: Run all
Delay and frequency of flights in the USA (December)
Delay and frequency of flights in the USA (June)
The lines represent the origin and the destination of the flight.
The darker the line, the higher the probability of delay.
Flight paths in June
How much delay each carrier has
Heat map of average delay per hour of the day
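The per-hour averages behind such a heat map can be sketched as a simple group-and-average step. The example below is plain Python with a hypothetical `average_delay_by_hour` helper and made-up sample records; the notebook itself may compute this differently (for example with Spark) and may group by additional dimensions.

```python
from collections import defaultdict


def average_delay_by_hour(flights):
    """Return {hour: mean delay in minutes} for an iterable of (hour, delay) pairs."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for hour, delay in flights:
        totals[hour] += delay
        counts[hour] += 1
    # Sort by hour so the result reads in chronological order.
    return {hour: totals[hour] / counts[hour] for hour in sorted(totals)}


# Hypothetical sample records: (scheduled departure hour, delay in minutes).
sample = [(6, 4.0), (6, 8.0), (17, 30.0), (17, 20.0), (17, 10.0), (23, 15.0)]
hourly_avg = average_delay_by_hour(sample)
print(hourly_avg)
```

The resulting mapping can then be passed to any plotting library as the values of the heat map cells.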