Krzin/mlcourse

Machine learning course materials.

Notable Changes from 2017 to 2018

  • New module on backpropagation.
  • Added a note on conditional expectations, since many students find the notation confusing.
  • Added a note on the correlated features theorem for elastic net. It is essentially a translation of Zou and Hastie's 2005 paper "Regularization and variable selection via the elastic net" into the notation of our class, dropping an unnecessary centering condition and using a more standard definition of correlation.
  • Changes to the EM algorithm presentation: added several diagrams (slides 10-14) to convey the general idea of a variational method, and made explicit that the marginal log-likelihood is exactly the pointwise supremum over the variational lower bounds (slides 31 and 32); see the equations after this list.
  • New worked example for predicting Poisson distributions with linear models and with gradient boosting (see the code sketch below).
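
For reference, the supremum statement above can be written compactly in standard EM notation (the symbols x, z, θ, and q are ours, not necessarily the slides'). For any distribution q(z) over the latent variable z:

```latex
\mathcal{L}(q,\theta)
  = \mathbb{E}_{q(z)}\!\left[\log\frac{p(x,z;\theta)}{q(z)}\right]
  = \log p(x;\theta) - \mathrm{KL}\bigl(q(z)\,\|\,p(z \mid x;\theta)\bigr)
  \le \log p(x;\theta)
```

Since the KL term vanishes exactly at q(z) = p(z | x; θ), taking the supremum over q recovers the marginal log-likelihood: log p(x; θ) = sup_q L(q, θ).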
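A minimal sketch of this kind of worked example, assuming scikit-learn's PoissonRegressor and HistGradientBoostingRegressor (the synthetic data and all parameter values here are invented for illustration, not taken from the course materials):

```python
import numpy as np
from sklearn.linear_model import PoissonRegressor
from sklearn.ensemble import HistGradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic counts: the true rate depends nonlinearly on two features.
X = rng.uniform(-1, 1, size=(5000, 2))
rate = np.exp(0.5 * X[:, 0] + np.sin(3 * X[:, 1]))
y = rng.poisson(rate)

# Linear model: Poisson GLM with log link, so log E[y|x] is linear in x.
linear = PoissonRegressor(alpha=1e-4).fit(X, y)

# Gradient boosting trained with the Poisson deviance as its loss.
gbm = HistGradientBoostingRegressor(loss="poisson").fit(X, y)

print("linear mean predictions: ", linear.predict(X[:3]))
print("boosted mean predictions:", gbm.predict(X[:3]))
```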

Notable Changes from 2016 to 2017

  • New lecture on principal component analysis (Brett)
  • Added slide on k-means++ (Brett)
  • Added slides on an explicit feature vector for the 1-dim RBF kernel (one standard expansion is written out after this list)
  • Created notebook to regenerate the buggy lasso/elastic net plots from Hastie's book (Vlad)
  • An L2 constraint for linear models gives Lipschitz continuity of the prediction function (thanks to Brian Dalessandro for pointing this out to me); a one-line derivation appears after this list.
  • Expanded discussion of L1/L2/ElasticNet with correlated random variables (Thanks Brett for the figures)
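
For reference, one standard explicit (infinite-dimensional) feature vector for the 1-dim RBF kernel, obtained by Taylor-expanding exp(xx'/σ²); this is the textbook identity, though whether the slides use exactly this form is not recorded here:

```latex
k(x,x') = \exp\!\left(-\frac{(x-x')^2}{2\sigma^2}\right)
        = \sum_{n=0}^{\infty} \phi_n(x)\,\phi_n(x'),
\qquad
\phi_n(x) = \frac{x^n}{\sigma^n\sqrt{n!}}\,e^{-x^2/(2\sigma^2)}
```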
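The Lipschitz observation above is one line of Cauchy–Schwarz: for a linear prediction function f(x) = wᵀx with ‖w‖₂ ≤ r,

```latex
|f(x) - f(x')| = |w^\top(x - x')| \le \|w\|_2\,\|x - x'\|_2 \le r\,\|x - x'\|_2
```

so the prediction function is r-Lipschitz with respect to the Euclidean norm.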

Notable Changes from 2015 to 2016

Possible Future Topics

Basic Techniques

  • Gaussian processes
  • MCMC (or at least Gibbs sampling)
  • Importance sampling (a minimal sketch follows this list)
  • Density ratio estimation (for covariate shift, anomaly detection, conditional probability modeling)
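
As a minimal illustration of the importance sampling idea: estimate an expectation under a target density p using samples from a proposal q, reweighting by p/q. The particular target, proposal, and test function here are invented for the example:

```python
import numpy as np
from scipy import stats

# Target p: standard normal. Proposal q: a wider Student-t, whose heavier
# tails keep the importance weights p(x)/q(x) bounded.
p = stats.norm(0.0, 1.0)
q = stats.t(df=3, loc=0.0, scale=2.0)

def f(x):
    return x ** 2  # estimate E_p[X^2], which equals 1 exactly

x = q.rvs(size=100_000, random_state=0)
w = p.pdf(x) / q.pdf(x)                   # importance weights

est = np.mean(w * f(x))                   # unbiased IS estimate
est_sn = np.sum(w * f(x)) / np.sum(w)     # self-normalized variant
print(est, est_sn)                        # both should be close to 1.0
```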

Applications

  • Collaborative filtering / matrix factorization (building on this lecture on matrix factorization and Brett's lecture on PCA; a tiny SGD sketch follows this list)
  • Learning to rank and associated concepts
  • Bandits / learning from logged data?
  • Generalized additive models for interpretable nonlinear fits (smoothing way, basis function way, and gradient boosting way)
  • Automated hyperparameter search (with GPs, random search, Hyperband, ...)
  • Active learning
  • Domain shift / covariate shift adaptation
  • Reinforcement learning (minimal path to REINFORCE)
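
A minimal sketch of matrix factorization for collaborative filtering, fit by stochastic gradient descent on squared error with L2 regularization; the ratings, dimensions, learning rate, and penalty here are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Observed ratings as (user, item, rating) triples; IDs are made up.
ratings = [(0, 0, 5.0), (0, 2, 3.0), (1, 0, 4.0), (1, 1, 1.0), (2, 2, 2.0)]
n_users, n_items, k = 3, 3, 2                 # k = latent dimension

U = 0.1 * rng.standard_normal((n_users, k))   # user factors
V = 0.1 * rng.standard_normal((n_items, k))   # item factors
lr, lam = 0.05, 0.01                          # step size, L2 penalty

for epoch in range(200):
    for u, i, r in ratings:
        err = r - U[u] @ V[i]                 # residual on this rating
        U[u] += lr * (err * V[i] - lam * U[u])
        V[i] += lr * (err * U[u] - lam * V[i])

print(U @ V.T)   # reconstructed rating matrix; compare to observed entries
```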

Latent Variable Models

  • PPCA / Factor Analysis and non-Gaussian generalizations (the PPCA generative model is written out after this list)
  • Latent Dirichlet Allocation / topic models
  • Latent variable model as autoencoder
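
For reference, the probabilistic PCA generative model in the standard notation of Tipping and Bishop (not specific to any planned slides):

```latex
z \sim \mathcal{N}(0, I_k), \qquad
x \mid z \sim \mathcal{N}(Wz + \mu,\ \sigma^2 I_d)
\quad\Longrightarrow\quad
x \sim \mathcal{N}(\mu,\ WW^\top + \sigma^2 I_d)
```

Factor analysis replaces the isotropic noise σ²I_d with a general diagonal covariance Ψ.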

Bayesian Models

  • Relevance vector machines
  • BART
  • Gaussian process regression and conditional probability models (the GP posterior predictive is recalled after this list)
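
For reference, the GP regression posterior predictive at a test point x*, given training inputs X, targets y, kernel k, and observation noise variance σ² (the standard formulas, e.g. Rasmussen and Williams):

```latex
f_* \mid X, y, x_* \;\sim\; \mathcal{N}\!\left(
  k_*^\top (K + \sigma^2 I)^{-1} y,\;\;
  k(x_*, x_*) - k_*^\top (K + \sigma^2 I)^{-1} k_*
\right)
```

where K_{ij} = k(x_i, x_j) and (k_*)_i = k(x_i, x_*).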

Technical Points

Other

  • Class imbalance
  • Black box feature importance measures (building on Ben's 2018 lecture)
  • Quantile regression and conditional prediction intervals (perhaps integrated into the homework on loss functions; a quantile-loss sketch follows this list)
  • More depth on basic neural networks: weight initialization, vanishing / exploding gradient, possibly batch normalization
  • Finish up 'structured prediction' with beam search / Viterbi (a small Viterbi sketch follows this list)
    • give a probabilistic analogue with MEMMs/CRFs
  • Generative vs. discriminative (Ng & Jordan's naive Bayes vs. logistic regression comparison, plus new experiments including regularization)
  • Something about causality?
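
A minimal sketch of conditional prediction intervals via quantile regression, assuming scikit-learn's GradientBoostingRegressor with loss="quantile" (the heteroscedastic data and the 5%/95% quantile choices are invented for illustration):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Heteroscedastic data: noise grows with x, so intervals should widen.
X = rng.uniform(0, 10, size=(2000, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1 + 0.1 * X[:, 0])

# One model per quantile; the 5% and 95% models bound a 90% interval.
lo = GradientBoostingRegressor(loss="quantile", alpha=0.05).fit(X, y)
hi = GradientBoostingRegressor(loss="quantile", alpha=0.95).fit(X, y)

X_test = np.array([[1.0], [5.0], [9.0]])
print(np.c_[lo.predict(X_test), hi.predict(X_test)])  # interval endpoints
```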
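And a small Viterbi sketch for the structured-prediction item above, on a toy two-state chain with invented probabilities, working in log space throughout:

```python
import numpy as np

def viterbi(log_init, log_trans, log_emit):
    """Most likely state sequence for a chain model.

    log_init:  (S,)    log p(state_0)
    log_trans: (S, S)  log p(state_t | state_{t-1}), rows = previous state
    log_emit:  (T, S)  log p(obs_t | state_t), already evaluated per step
    """
    T, S = log_emit.shape
    score = log_init + log_emit[0]          # best score ending in each state
    back = np.zeros((T, S), dtype=int)      # argmax backpointers
    for t in range(1, T):
        cand = score[:, None] + log_trans   # cand[i, j]: come from i, go to j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emit[t]
    # Trace the best path backward from the best final state.
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return path[::-1]

# Toy example: two hidden states, three time steps.
log_init = np.log([0.6, 0.4])
log_trans = np.log([[0.7, 0.3], [0.4, 0.6]])
log_emit = np.log([[0.9, 0.2], [0.1, 0.8], [0.3, 0.7]])
print(viterbi(log_init, log_trans, log_emit))  # prints [0, 1, 1]
```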

Citation Information

Machine Learning Course Materials by Various Authors is licensed under a Creative Commons Attribution 4.0 International License. The author of each document in this repository is considered the license holder for that document.
