To understand the mathematics underlying the normal equation, read the following materials:
- Chapter 4 Numerical Computation, Section 4.3 Gradient-Based Optimization
- Chapter 5 Machine Learning Basics, Subsection 5.1.4 Example: Linear Regression
- Additional materials: proof of convexity of the MSE and computation of the gradient of the MSE
- Colab notebook for solving linear regression using the normal equation
- Colab notebook for solving linear regression on an artificial data set
- Colab notebook for loading and exploring the MNIST digits data set
- Colab notebook for classifying MNIST digits with dense layers and analyzing model performance
- Colab notebook for classifying MNIST fashion items with dense layers and analyzing model performance
- Code for creating sequential neural networks with dense layers and training them with backprop and mini-batch SGD; currently, the code is limited to (1) mean squared error loss and (2) sigmoid activations.
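To make the last item concrete, here is a minimal NumPy sketch of a sequential dense network with sigmoid activations, mean squared error loss, backprop, and mini-batch SGD. It is not the course code: the class name `DenseNetwork`, its methods, the weight initialization, and all default hyperparameters are assumptions chosen for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class DenseNetwork:
    """Hypothetical sequential stack of dense sigmoid layers with MSE loss."""

    def __init__(self, layer_sizes, seed=0):
        rng = np.random.default_rng(seed)
        # One (n_in, n_out) weight matrix and one bias vector per layer.
        self.weights = [rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
                        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
        self.biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]

    def forward(self, x):
        """Return the activations of every layer, the input included."""
        activations = [x]
        for w, b in zip(self.weights, self.biases):
            activations.append(sigmoid(activations[-1] @ w + b))
        return activations

    def predict(self, x):
        return self.forward(x)[-1]

    def loss(self, x, y):
        """Mean squared error, averaged over samples and output units."""
        return float(np.mean((self.predict(x) - y) ** 2))

    def gradients(self, x, y):
        """Backprop: gradients of the MSE loss for all weights and biases."""
        activations = self.forward(x)
        grads_w, grads_b = [], []
        # Derivative of the mean squared error w.r.t. the network output.
        delta = 2.0 * (activations[-1] - y) / y.size
        for i in reversed(range(len(self.weights))):
            a_out, a_in = activations[i + 1], activations[i]
            delta = delta * a_out * (1.0 - a_out)   # back through the sigmoid
            grads_w.append(a_in.T @ delta)
            grads_b.append(delta.sum(axis=0))
            delta = delta @ self.weights[i].T       # back to the previous layer
        return grads_w[::-1], grads_b[::-1]

    def fit(self, x, y, epochs=10, batch_size=32, learning_rate=0.1, seed=0):
        """Mini-batch SGD: reshuffle every epoch, one update per batch."""
        rng = np.random.default_rng(seed)
        for _ in range(epochs):
            order = rng.permutation(len(x))
            for start in range(0, len(x), batch_size):
                idx = order[start:start + batch_size]
                grads_w, grads_b = self.gradients(x[idx], y[idx])
                for i in range(len(self.weights)):
                    self.weights[i] -= learning_rate * grads_w[i]
                    self.biases[i] -= learning_rate * grads_b[i]
```

A call such as `DenseNetwork([784, 32, 10]).fit(x_train, y_train)` would train on flattened images with one-hot targets, where `x_train` and `y_train` are placeholder names for arrays of shape `(n, 784)` and `(n, 10)`.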
TO DO: add a note on preventing overfitting with data augmentation (also, add L2/L1 regularization and dropout earlier!)
Classification of MNIST digits and fashion items
Classification of cats and dogs
Based on Chapter 5 Deep learning for computer vision of the book Deep learning with Python by F. Chollet:
- training convnet from scratch, using data augmentation and dropout
- using VGG16 conv base for fast feature extraction (data augmentation not possible), using dropout
- using VGG16 conv base for feature extraction, using data augmentation, not using dropout
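The two regularization techniques named in these notebooks, data augmentation and dropout, can be sketched in plain NumPy. The notebooks themselves rely on Keras utilities with a richer set of transforms; the function names, the `(batch, height, width, channels)` image layout, and the flip-plus-shift choice below are assumptions made for illustration only.

```python
import numpy as np


def random_flip_and_shift(images, max_shift=2, rng=None):
    """Randomly mirror each image left-right and shift it by a few pixels."""
    rng = np.random.default_rng() if rng is None else rng
    out = np.empty_like(images)
    for i, img in enumerate(images):
        if rng.random() < 0.5:
            img = img[:, ::-1, :]                    # horizontal flip
        dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
        # np.roll wraps pixels around the border; a real pipeline would
        # typically pad or crop instead.
        out[i] = np.roll(img, shift=(dy, dx), axis=(0, 1))
    return out


def dropout(activations, rate, training, rng=None):
    """Inverted dropout: zero a fraction `rate` of units, rescale the rest."""
    if not training or rate == 0.0:
        return activations                           # no-op at inference time
    rng = np.random.default_rng() if rng is None else rng
    keep = rng.random(activations.shape) >= rate
    return activations * keep / (1.0 - rate)
```

Augmentation produces a freshly perturbed copy of each batch at training time, so the network rarely sees the exact same input twice. Dropout rescales the surviving activations so that their expected value matches what the layer produces at inference time.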
Based on Google ML Practicum: Image Classification:
- Colab notebook for training a convolutional neural network from scratch
- Colab notebook for training a CNN from scratch with data augmentation and dropout
Visualizing what convnets learn
Based on Chapter 5 Deep learning for computer vision of the book Deep learning with Python by F. Chollet:
- Visualizing convnet filters; the filter visualizations at the bottom of the notebook look pretty cool!
- Visualizing heatmaps of class activations (modified version: changes the softmax to a linear activation in the last layer)
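For readers who want the gist of the class-activation heatmap before opening the notebook, the final combination step can be sketched in NumPy. This follows the Grad-CAM idea as I understand it from the chapter: weight each channel of the last convolutional feature map by the mean gradient of the class score with respect to that channel, average over channels, and keep the positive part. The function name is hypothetical. The sketch assumes the feature maps and gradients were already extracted from the model, and it does not show how to do that.

```python
import numpy as np


def class_activation_heatmap(feature_maps, gradients):
    """Combine conv feature maps and class-score gradients into a heatmap.

    feature_maps, gradients: arrays of shape (height, width, channels),
    assumed to come from the last convolutional layer for a single image.
    """
    channel_weights = gradients.mean(axis=(0, 1))      # one weight per channel
    heatmap = (feature_maps * channel_weights).mean(axis=-1)
    heatmap = np.maximum(heatmap, 0.0)                  # keep positive evidence
    peak = heatmap.max()
    # Normalize to [0, 1] for display; an all-zero map is returned unchanged.
    return heatmap / peak if peak > 0 else heatmap
```

The resulting `(height, width)` array is usually resized to the input image size and overlaid on the image as a color map.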
Some cool looking stuff
Based on Section 8.2 DeepDream and Section 8.3 Neural style transfer of the book Deep learning with Python by F. Chollet. I am not going to explain in detail how DeepDream and neural style transfer work. I just wanted to include these notebooks to show you two cool examples of what can be done with deep neural networks.
The goal is to introduce more advanced architectures and concepts. This is based on the Keras documentation: CIFAR-10 ResNet.
The relevant research papers are:
Notebooks
I have made several changes to the code from the Keras documentation. In the above notebook, I had to change the number of epochs and the learning rate schedule because the model is trained on only 40k images and validated on 10k, whereas the model in the Keras documentation is trained on all 50k images and not validated at all. I wanted a setup similar to the one in HW 2 so we can better compare the performance of the ResNet and the (normal) CNN.
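A learning rate schedule of the kind mentioned above is often a simple step function of the epoch number. The sketch below shows the general shape only: the function name, the base rate, the epoch boundaries, and the decay factor are all placeholder assumptions, not the values used in the notebook or in the Keras documentation.

```python
def step_lr_schedule(epoch, *, base_lr=1e-3, boundaries=(30, 45), factor=0.1):
    """Return the learning rate for `epoch` (0-based).

    The rate starts at `base_lr` and is multiplied by `factor` once for
    every boundary the epoch has reached. All defaults are hypothetical.
    """
    lr = base_lr
    for boundary in boundaries:
        if epoch >= boundary:
            lr *= factor
    return lr
```

When the number of epochs changes, the boundaries have to move with it, which is why the two settings are adjusted together. In Keras, a function like this would typically be hooked in through the `LearningRateScheduler` callback, for example via a small wrapper that ignores the current rate the callback passes in.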