mtchibozo/Columbia

Columbia

Things I studied at Columbia.

Looks like Columbia is more sensitive than Telecom about us sharing our work. This repo contains a few group projects that are already publicly available, as well as some labs that instructors authorized us to share. In the meantime, you might find some useful code in the Telecom folder.

Feel free to email me ([email protected]) or message me on LinkedIn (https://www.linkedin.com/in/maxime-tchibozo/) for more information on the coursework marked (Private).

Third Semester (Fall 2020)

| Course | Labs/Projects |
| --- | --- |
| Capstone Project | First Report |
| | Second Report |
| | Final Report |
| | Code Repository |
| | Presentation Video |
| Foundations of Graphical Models (David M. Blei) | Reading Reports |
| | Homework 0: Basic Probability and Statistics |
| | Homework 1: MCMC, Gibbs Sampling |
| | Homework 2: Variational Inference, Mixed-Membership Models |
| | Final Project: Identifying Bias in Text (supervised LDA - sLDA) |
| Bayesian Models for Machine Learning (John Paisley) (Private) | HW1: EM Algorithm, MCMC |
| | HW2: Variational Inference, MAP estimation |
| | HW3: Mixture Models, Bayesian nonparametric Gibbs sampler |

Second Semester (Spring 2020)

| Course | Labs/Projects |
| --- | --- |
| Computer Systems for Data Science | SQL, Google Cloud Platform |
| | ACID, Transactions, 2-Phase Locking |
| | Apache Spark, GCP |
| | TensorFlow, Google Cloud |
| Applied Machine Learning | Visualisation |
| | Scikit-learn Tricks |
| | Feature Engineering |
| | Transformers, BERT |
| | Deep Learning |
| Machine Learning (Private) | HW1: Maximum Likelihood, Bias-Variance Tradeoff |
| | HW2: Linear Classifiers, Decision Trees |
| | HW3: Optimisation, Logistic Regression, SVM |
| | HW4: Optimisation, Neural Networks, Kernels |
| Causal Inference for Data Science | HW1: Counterfactuals, Causal Effects, Experiment Design |
| | HW2: Bayesian Graphs, Backdoor Sets |
| | HW3: Propensity Scoring, Doubly Robust Treatment Effect Estimation |
| | HW4: Instrumental Variables, Mechanisms, Front-Door Criterion |

First Semester (Fall 2019)

| Course | Labs/Projects |
| --- | --- |
| Exploratory Data Analysis and Visualisation (Private) | Project |
| | Community Contribution |
| | Problem Set 1: tidyr, Shapiro test, histograms |
| | Problem Set 2: ggplot2, Cleveland plots, web scraping |
| | Problem Set 3: likert, parallel coordinate plots |
| | Problem Set 4: tidyquant, D3, missing values, time series |
| | Problem Set 5: SVG, D3, interactive web app |
| Statistical Inference and Modeling (Private) | HW1: Estimation, Rao-Blackwell |
| | HW2: Survival Data, Missing Data, Markov Chains, Time Series |
| | HW3: Linear Models, Generalised Linear Models |
| | HW4: Generalised Additive Models, Hypothesis Testing |
| Algorithms for Data Science (Private) | HW1: Complexity, Sorting Algorithms |
| | HW2: Dynamic Programming, Trees, Graph Algorithms |
| | HW3: Flow Networks, Linear Programming |
| | HW4: Graphs, Flow, NP-Completeness, Integer Programming |
| Machine Learning for Image Analysis (Private) | HW1: K-Means for Face Recognition |
| | HW2: Analysing fMRI Data, SPM12 |
| | HW3: Brain Data Classification, PCA, SVM, Neural Networks |
| | Iris Recognition Group Project (with Hariz Johnson) |
| | Digit Recognition Project, Convolutional Neural Networks |