Starred repositories
Kalman Filter book using Jupyter Notebook. Focuses on building intuition and experience, not formal proofs. Includes Kalman filters, extended Kalman filters, unscented Kalman filters, particle filte…
Set of real-world data science tasks completed using the Python Pandas library
Observations from Ian on successfully delivering data science products
Includes notes on using Apache Spark in general, notes on using Spark for Physics, how to run TPCDS on PySpark, how to create histograms with Spark, tools for performance testing CPUs, Jupyter note…
An introductory workshop on pandas with notebooks and exercises for following along. Slides contain all solutions.
API for manipulating time series on top of Apache Spark: lagged time values, rolling statistics (mean, sum, count, etc.), AS OF joins, downsampling, and interpolation
An extension to the Apache Spark framework that allows easy and fast processing of very large geospatial datasets.
Modelling marine traffic in the ice-covered Baltic Sea using AIS data
Combined repository for the final tutorial material from the 2020 ICESat-2 Cryosphere-themed Hackweek, presented virtually by the University of Washington.
sktime - Python toolbox for time series: pipelines and transformers
Bare-bones implementations of algorithms, with explanations.
"Improving Maritime Traffic Emission Estimations on Missing Data with Conditional RBMs" paper code and data for reproducibility