This is the repository for PHYS 440/540 "Big Data Physics: Methods of Machine Learning" at Drexel University, taught by Prof. Gordon Richards. The course syllabus can be found at http://www.physics.drexel.edu/~gtr/teaching/phys_440_540/
The course consists of a series of Jupyter notebooks, building on previous versions of this course (https://github.com/gtrichards/PHYS_T480_F18 and https://github.com/gtrichards/PHYS_T480). I have drawn heavily on resources from the following people/places:
Jake Vanderplas (University of Washington) -- one of the primary code developers of scikit-learn and astroML. I originally drew heavily from https://github.com/jakevdp/ESAC-stats-2014, but you can find much more of his material at https://github.com/jakevdp/.
Zeljko Ivezic (University of Washington) -- the lead author of the textbook that we use (https://press.princeton.edu/books/hardcover/9780691198309/statistics-data-mining-and-machine-learning-in-astronomy) and instructor (along with Mario Juric) for https://github.com/uw-astr-302-w18/astr-302-w18
Aurelien Geron's book "Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems": https://www.amazon.com/Hands-Machine-Learning-Scikit-Learn-TensorFlow/dp/1492032646/
Andy Connolly (University of Washington), particularly http://cadence.lsst.org/introAstroML/
Karen Leighly (University of Oklahoma), particularly http://seminar.ouml.org/
Adam Miller (Northwestern University), particularly https://github.com/LSSTC-DSFP/LSSTC-DSFP-Sessions/
Jo Bovy (University of Toronto), particularly http://astro.utoronto.ca/~bovy/teaching.html
Thomas Wiecki, particularly http://twiecki.github.io/blog/2015/11/10/mcmc-sampling/
My thanks also to Maher Harb (Drexel University), Liam Coatman (Cambridge), Nathalie Thibert (UWO), and Kevin Footer (Deloitte).
I also acknowledge updates incorporated into my own class from Stephen Taylor's class at Vanderbilt.
I have tried to be careful about properly attributing anything drawn from these resources, but if it isn't clear where something comes from, it is probably from one of the sources above. Others are welcome to draw from these materials for their own Machine Learning courses. Please send any corrections to [email protected].
If you have any interest in using these materials for your own Machine Learning course, please e-mail me and I'll send you my post-lecture notes about what worked, what didn't, what took too long, what didn't take long enough -- basically what I would change for next time.
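Before the first lecture it may help to confirm that your notebook environment has the usual scientific-Python stack available. The snippet below is only a minimal sketch of such a check; the package list is an assumption based on the tools mentioned above (scikit-learn, astroML), not the official contents of InitialSetup.ipynb.

```python
# Minimal environment check (a sketch, not the official InitialSetup.ipynb):
# try to import each package and report its version, so missing installs
# show up before the first notebook is run. The exact list used in class may differ.
import importlib

for name in ["numpy", "scipy", "matplotlib", "sklearn", "astroML"]:
    try:
        module = importlib.import_module(name)
        print(f"{name:12s} {getattr(module, '__version__', 'version unknown')}")
    except ImportError:
        print(f"{name:12s} NOT INSTALLED")
```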
Lecture 1 (9/19, Monday): Motivation.ipynb and InitialSetup.ipynb
Lecture 2 (9/21, Wednesday): HistogramExample.ipynb
Lecture 3 (asynchronous): BasicStats.ipynb
Lecture 4 (asynchronous): BasicStats2.ipynb
Lecture 5 (10/3, Monday): Inference.ipynb
Lecture 6 (10/5, Wednesday or 10/7, Friday; TBD): Inference2.ipynb
Lecture 7 (10/10, Monday, or 10/14, Friday; TBD): Scikit-Learn-Intro.ipynb
Lecture 8 (10/17, Monday): DensityEstimation.ipynb
Lecture 9 (10/19, Wednesday): DensityEstimation2.ipynb
Lecture 10 (10/24, Monday): DimensionReduction.ipynb
Lecture 11 (10/28, Friday): DimensionReduction2.ipynb
Lecture 12 (10/31, Monday): Regression.ipynb
Lecture 13 (11/2, Wednesday): Regression2.ipynb
Lecture 14 (11/7, Monday): Classification.ipynb
Lecture 15 (11/11, Friday): Classification2.ipynb
Lecture 16 (11/14, Monday): NeuralNetworks.ipynb
Lecture 17 (11/18, Friday): NeuralNetworks2.ipynb
Lecture 18 (11/21, Monday): TensorFlow.ipynb
Lecture 19 (11/28, Monday): TimeSeries.ipynb
Lecture 20 (11/30, Wednesday; TBC): TimeSeries2.ipynb
Note that this repository is updated from year to year. Links to the versions used in previous years are given below: