This repo contains a collection of Jupyter Notebooks to accompany the Udacity Connect Intensive Machine Learning Nanodegree. The code is written for Python 2.7, but should be (mostly) compatible with Python 3.
- **Week 1 (`wk1/`)**
  - `PythonPractice_1.ipynb`: Introduction to Jupyter Notebook and basic Python programming, including data types, `if` statements, `while` loops, list comprehensions, lambda expressions, etc.
  - `PythonPractice_2.ipynb`: Introduction to NumPy, including how to create NumPy arrays, built-in NumPy array methods, array indexing/selection/slicing, and broadcasting.
  - `PythonPractice_3.ipynb`: Introduction to Pandas. Topics include loading data into a DataFrame and getting summary information, selection and indexing, conditional selection with a DataFrame, etc.
  - `PythonPractice_4.ipynb`: Introduction to data visualization with Matplotlib and Seaborn.
  - `data/`: contains one sample dataset for the notebooks and one for the exercise.
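As a taste of the Week 1 material, here is a minimal sketch (not taken from the notebooks; the values are illustrative) of NumPy broadcasting and boolean selection:

```python
import numpy as np

# A (3, 1) column and a (1, 4) row broadcast together to a (3, 4) grid.
col = np.arange(3).reshape(3, 1)
row = np.arange(4).reshape(1, 4)
grid = col * 10 + row
print(grid)
# [[ 0  1  2  3]
#  [10 11 12 13]
#  [20 21 22 23]]

# Boolean indexing selects the elements that satisfy a condition.
evens = grid[grid % 2 == 0]
print(evens)  # [ 0  2 10 12 20 22]
```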
- **Week 2 (`wk2/`)**
  - `SklearnTutorial.ipynb`: Introduction to scikit-learn (`sklearn`) and a step-by-step guide to building a machine learning model with `sklearn` on the Titanic Survival dataset from Kaggle.
  - `SklearnTutorial-solution.ipynb`: The solution to the `SklearnTutorial.ipynb` notebook.
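The basic `sklearn` workflow covered in Week 2 (split, fit, predict, score) can be sketched as follows. This uses synthetic data as a stand-in for the Titanic CSV, and the model and parameters are illustrative, not the notebook's:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for preprocessed Titanic features.
rng = np.random.RandomState(0)
X = rng.rand(200, 3)
y = (X[:, 0] + 0.5 * X[:, 1] > 0.8).astype(int)

# 1. Hold out a test set.  2. Fit on the training set.  3. Evaluate.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(X_train, y_train)
acc = accuracy_score(y_test, clf.predict(X_test))
print(acc)
```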
- **Week 3 (`wk3/`)**
  - `RegressionModels.ipynb`: Introduction to the implementation and evaluation of regression models to predict housing prices with `sklearn`.
  - `RegressionModels-solution.ipynb`: The solution to the `RegressionModels.ipynb` notebook.
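Fitting and evaluating a regression model in `sklearn` follows the same pattern; a minimal sketch on toy housing-style data (the numbers are hypothetical, not from the notebook's dataset):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score, mean_squared_error

# Toy data: price grows linearly with size, plus noise.
rng = np.random.RandomState(42)
size = rng.uniform(50, 200, 100).reshape(-1, 1)   # square meters
price = 3000 * size.ravel() + rng.normal(0, 20000, 100)

model = LinearRegression().fit(size, price)
pred = model.predict(size)
r2 = r2_score(price, pred)
rmse = mean_squared_error(price, pred) ** 0.5
print("R^2:", r2, "RMSE:", rmse)
```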
- **Week 4 (`wk4/`)**
  - `NeuralNets_Miniproject.ipynb`: Introduction to the fundamentals of neural networks, the implementation of single-layer and multi-layer perceptrons, and the perceptron in `scikit-learn`.
  - `NeuralNets_Miniproject-solution.ipynb`: The solution to the `NeuralNets_Miniproject.ipynb` notebook.
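The classic single-layer perceptron update rule at the heart of the miniproject can be sketched from scratch; this toy version (not the notebook's code) learns the AND function:

```python
import numpy as np

def perceptron_train(X, y, epochs=20, lr=0.1):
    """Perceptron rule: nudge weights toward each misclassified point.

    Labels y are expected in {-1, +1}.
    """
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) <= 0:   # misclassified (or on the boundary)
                w += lr * yi * xi
                b += lr * yi
    return w, b

# Linearly separable toy data: the AND gate.
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([-1, -1, -1, 1])
w, b = perceptron_train(X, y)
pred = np.sign(X @ w + b)
print(pred)  # [-1. -1. -1.  1.]
```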
- **Week 5 (`wk5/`)**
  - `BayesNLP_Miniproject.ipynb`: Introduction to Bayes' theorem and its application in natural language processing. Write Python methods to calculate the maximum-likelihood estimate of a word based on the preceding word, and build a Bayes classifier that, given a context, computes the optimal label for a second missing word based on the possible words that could fill the first blank.
  - `BayesNLP_Miniproject-solution.ipynb`: The solution to the `BayesNLP_Miniproject.ipynb` notebook.
  - `Quiz.pdf`: A quiz on supervised learning.
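The maximum-likelihood step in the miniproject amounts to picking the word with the highest bigram count given the preceding word. A minimal sketch on a tiny hypothetical corpus (the miniproject uses a larger text):

```python
from collections import Counter, defaultdict

corpus = "the dog ran and the dog barked and the cat ran".split()

# Count bigrams: bigrams[w1][w2] = number of times w2 follows w1.
bigrams = defaultdict(Counter)
for w1, w2 in zip(corpus, corpus[1:]):
    bigrams[w1][w2] += 1

def most_likely_next(word):
    """Word maximizing P(w2 | word) = count(word, w2) / count(word)."""
    return bigrams[word].most_common(1)[0][0]

print(most_likely_next("the"))  # 'dog' follows 'the' twice, 'cat' once
```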
- **Week 6 (`wk6/`)**
  - `Clustering.ipynb`: Perform K-Means clustering on the Enron dataset. Visualize the clusters that form before and after feature scaling, and plot the decision boundaries that arise from K-Means clustering using two of the features.
  - `Clustering-solution.ipynb`: The solution to the `Clustering.ipynb` notebook.
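Why feature scaling matters for K-Means: Euclidean distance is dominated by whichever feature has the larger scale. A sketch on synthetic data standing in for two differently scaled Enron features (the feature names and values are illustrative):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import MinMaxScaler

# Two features on very different scales (think salary vs. stock value).
rng = np.random.RandomState(0)
X = np.column_stack([rng.normal(0, 1, 100), rng.normal(0, 1000, 100)])

# Without scaling, the second feature dominates the distance metric.
labels_raw = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# MinMaxScaler maps each feature to [0, 1] so both contribute equally.
X_scaled = MinMaxScaler().fit_transform(X)
labels_scaled = KMeans(n_clusters=2, n_init=10,
                       random_state=0).fit_predict(X_scaled)
print(labels_raw[:10], labels_scaled[:10])
```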
- **Week 7 (`wk7/`)**
  - `PCA.ipynb`: Perform Principal Component Analysis (PCA) on a large set of features to explain as much of the variance as possible in the data using a smaller set of features. Visualize the eigenfaces (orthonormal basis of components) that result from PCA. The dataset comes from "Labeled Faces in the Wild" (LFW), a database of more than 13,000 face photographs designed for studying the problem of unconstrained face recognition.
  - `PCA-solution.ipynb`: The solution to the `PCA.ipynb` notebook.
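The core PCA move, reducing many correlated features to a few components while keeping most of the variance, can be sketched on synthetic data (the LFW notebook applies the same idea to pixel features):

```python
import numpy as np
from sklearn.decomposition import PCA

# 100 samples of 20 features driven by only 3 underlying factors.
rng = np.random.RandomState(0)
latent = rng.normal(size=(100, 3))
X = latent @ rng.normal(size=(3, 20)) + 0.1 * rng.normal(size=(100, 20))

# A float n_components asks PCA to keep 95% of the total variance.
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X)
print(X_reduced.shape)                       # far fewer than 20 columns
print(pca.explained_variance_ratio_.sum())   # >= 0.95
```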
- **Week 8 (`wk8/`)**
  - `FeatureSelection.ipynb`: Introduction to the chi-square test statistic and Pearson's chi-square test. Learn how to perform univariate feature selection using the `SelectKBest` class from scikit-learn. Learn how to do recursive feature elimination using the `RFE` class and RFE with cross-validation (`RFECV`) from scikit-learn.
  - `FeatureSelection-solution.ipynb`: The solution to the `FeatureSelection.ipynb` notebook.
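Both selection strategies from Week 8 can be sketched side by side. This example uses the ANOVA F-score (`f_classif`) rather than the chi-square score the notebook discusses, since chi-square requires non-negative features; the dataset and parameters are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, RFE, f_classif
from sklearn.linear_model import LogisticRegression

# 10 features, only 3 informative; both selectors should shrink the set.
X, y = make_classification(n_samples=200, n_features=10, n_informative=3,
                           n_redundant=0, random_state=0)

# Univariate selection: score each feature independently, keep the best k.
X_k = SelectKBest(f_classif, k=3).fit_transform(X, y)

# Recursive feature elimination: repeatedly refit the model and
# drop the feature with the weakest coefficient.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=3).fit(X, y)
print(X_k.shape, rfe.support_.sum())
```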
- **Week 9 (`wk9/`)**
  - `Kaggle.ipynb`: Introduction to the process of creating a machine learning solution to a real-world problem. Practice skills such as data analysis and visualization, model building, evaluation, and optimization on a real-world dataset.
- **Week 10 (`wk10/`)**
  - `MNIST_Demo.ipynb`: Introduction to TensorFlow and Keras. Learn how to build neural networks with TensorFlow and Keras to solve a multiclass classification problem, and compare the two libraries.
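A minimal Keras sketch of the kind of multiclass model built in Week 10. The architecture and random input are illustrative stand-ins, not the notebook's code (which trains on the real MNIST data):

```python
import numpy as np
from tensorflow import keras

# A small dense network for 10-class classification of flattened 28x28 images.
model = keras.Sequential([
    keras.layers.Input(shape=(784,)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Random stand-in for a batch of flattened MNIST images.
X = np.random.rand(32, 784).astype("float32")
probs = model.predict(X, verbose=0)
print(probs.shape)  # (32, 10): one probability distribution per sample
```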