Pre-Class 1 Materials
paspeur committed Apr 18, 2017
0 parents commit e102515
Showing 14 changed files with 2,383 additions and 0 deletions.
65 changes: 65 additions & 0 deletions README.md
@@ -0,0 +1,65 @@
# ![](https://ga-dash.s3.amazonaws.com/production/assets/logo-9f88ae6c9c3871690e33280fcf557f33.png) [DS-SF-34](https://github.com/ga-students/DS-SF-34)

Course materials for [General Assembly's Data Science course](https://generalassemb.ly/education/data-science/san-francisco) in San Francisco (4/17/17 - 6/26/17)

## Schedule

| Class | Date | Topic | Soft Deadline | Hard Deadline<br/>(by 6:30 PM) |
|:---:|:---:|:---|:---|:---|
| [01](./classes/01) | 4/17 | [What is Data Science](./classes/01) | | |
| 02 | 4/19 | The _pandas_ Library | | |
| 03 | 4/24 | Databases, Scraping, and APIs | | |
| 04 | 4/26 | Exploratory Data Analysis | | |
| 05 | 5/1 | k-Nearest Neighbors | **[Unit Project 1](./unit-project/1)** | |
| 06 | 5/3 | Applied Data Wrangling and Exploratory Data Analysis | | **[Unit Project 1](./unit-project/1)** |
| 07 | 5/8 | Linear Regression | | |
| 08 | 5/10 | Linear Regression, Part 2 | **[Final Project 1](./final-project/1)** | |
| 09 | 5/15 | Linear Regression, Part 3 | **[Unit Project 2](./unit-project/2)** | |
| 10 | 5/17 | Regularization | | **[Final Project 1](./final-project/1)** |
| 11 | 5/22 | Logistic Regression | | **[Unit Project 2](./unit-project/2)** |
| 12 | 5/24 | Applied Machine Learning Modeling | | |
| 13 | 5/31 | Advanced Metrics | **[Final Project 2](./final-project/2)** | |
| 14 | 6/5 | Clustering | **[Unit Project 3](./unit-project/3)** | |
| 15 | 6/7 | Intermediate Project Presentations | | **[Final Project 2](./final-project/2)** |
| 16 | 6/12 | Trees | | **[Unit Project 3](./unit-project/3)** |
| 17 | 6/14 | Applied Machine Learning Modeling, Part 2 | | |
| 18 | 6/19 | Natural Language Processing | | |
| 19 | 6/21 | Time Series | | |
| 20 | 6/26 | Final Project Presentations and Wrap-Up | **[Final Project 3](./final-project/3)** | **[Final Project 3](./final-project/3)** |

## Your Team

**Lead Instructor:** [Ivan Corneillet](mailto:[email protected])

**Associate Instructors:** [Gus Ostow](mailto:[email protected]) and [Mohit Nalavadi](mailto:[email protected])

**Course Producer:** [Matt Jones](mailto:[email protected])

## Office Hours

- Gus and Mohit: TBD
- Ivan: On demand/per request; usually just before or after class and online (e.g., Slack)

## Slack

You've all been invited to use [Slack](https://ds-sf-34.slack.com) for chat during class and throughout the day. Please consider this the primary way to contact other students. Gus and Mohit will be on Slack during class and office hours to handle questions.

## Unit Projects

| Unit Project | Description | Objective | Soft Deadline | Hard Deadline<br/>(by 6:30 PM) |
|:---:|:---|:---|:---:|:---:|
| [1](./unit-project/1) | [Research Design](./unit-project/1) | Create a problem statement, analysis plan, and data dictionary | 5/1 | 5/3 |
| [2](./unit-project/2) | [Exploratory Data Analysis](./unit-project/2) | Perform exploratory data analysis using visualizations and statistical analysis | 5/15 | 5/22 |
| [3](./unit-project/3) | [Machine Learning Modeling and Executive Summary](./unit-project/3) | Engineer features, perform logistic regressions, and predict class probabilities; write up an executive summary that outlines your findings and the methods used | 6/5 | 6/12 |

## Final Project

| Final Project | Description | Objective | Soft Deadline | Hard Deadline<br/>(by 6:30 PM) |
|:---:|:---|:---|:---:|:---:|
| [1](./final-project/1) | [Lightning Pitch](./final-project/1) | Prepare a two- to three-minute lightning talk covering three potential project topics | 5/10 | 5/17 |
| [2](./final-project/2) | [Experimental Write-Up and Exploratory Data Analysis](./final-project/2) | Create an outline of your research design approach, including hypothesis, assumptions, goals, and success metrics; confirm your data and create an exploratory data analysis notebook with statistical analysis and visualization | 5/31 | 6/7 |
| [3](./final-project/3) | [Notebook and Final Presentation](./final-project/3) | Detailed technical Jupyter notebook with a summary of your statistical analysis, model, and evaluation metrics; presentation deck that relates your data, model, findings, and recommendations to a non-technical audience | 6/26 | 6/26 |

## Exit Tickets

[Fill me out at the end of each class!](http://tiny.cc/ds-sf-34)
@@ -0,0 +1,296 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# DS-SF-34 | 01 | What is Data Science | Assignment | Starter Code"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Python Review"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Programming is a must-have skill for data scientists. Today, to give you some more practice beyond the course prerequisites, we are going to implement a few functions in Python. This assignment covers, to some extent, the following topics:\n",
"\n",
"- Functions (defining and using your own functions but also calling functions from packages)\n",
"- Loops\n",
"- Arithmetic operations\n",
"- Conditional statements\n",
"\n",
"**Don't worry if you get stuck. Ask around, review the answer key, and ask around more. As this course progresses, your programming proficiency will increase.**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> ### Question 1. Multiples of 3 and 5\n",
">\n",
"> If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.\n",
">\n",
"> Find the sum of all the multiples of 3 or 5 below 1,000.\n",
">\n",
"> (Source: [Project Euler | Problem 1](https://projecteuler.net/problem=1))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# TODO"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Answer: TODO"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> ### Question 2. Estimating square roots\n",
">\n",
"> Given a positive real number $m$, let's define the sequence $u$ as follows:\n",
"> - $u_0 = 1$\n",
"> - $u_{n+1} = \\frac{u_n ^ 2 + m}{2u_n}$\n",
">\n",
">\n",
"> Implement the calculation of the sequence $u$ above to estimate square roots. Verify that $\\sqrt{144} = 12$ and use your function to calculate $\\sqrt{1024}$."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# TODO"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Answer: TODO"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> ### Question 3. Prime Numbers\n",
">\n",
"> A prime (number) is a natural number greater than 1 that has no positive divisors other than 1 and itself. ([Wikipedia](https://en.wikipedia.org/wiki/Prime_number))\n",
">\n",
"> Calculate all primes below 1,000. What's their sum?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# TODO"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Answer: TODO"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> ### Question 4. Largest prime factor\n",
">\n",
"> The prime factors of 13195 are 5, 7, 13 and 29.\n",
">\n",
"> What is the largest prime factor of the number 600851475143?\n",
">\n",
"> (Source: [Project Euler | Problem 3](https://projecteuler.net/problem=3))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"# TODO"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Answer: TODO"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> ### Question 5. Mean\n",
">\n",
"> Write a function to calculate the mean (average) of a list.\n",
">\n",
"> What's the mean of 10, 8, 13, 9, 11, 14, 6, 4, 12, 7, and 5?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# TODO"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Answer: TODO"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> ### Question 6. Sample standard deviation\n",
">\n",
"> Write a function to calculate the standard deviation of a sample.\n",
">\n",
"> Given the sample $x_1, x_2, ..., x_N$, its standard deviation is defined as $s = \\sqrt{\\frac{1}{N - 1} \\sum_{i = 1}^{N} (x_i - \\bar{x})^2}$, with $\\bar{x}$ as the sample mean.\n",
">\n",
"> What's the standard deviation of the following sample: 10, 8, 13, 9, 11, 14, 6, 4, 12, 7, and 5?\n",
">\n",
"> ([Wikipedia](https://en.wikipedia.org/wiki/Standard_deviation#Sample_standard_deviation))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# TODO"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Answer: TODO"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> ### Question 7. Median\n",
">\n",
"> Write a function to calculate the median (\"middle value\") of a list. ([Wikipedia](https://en.wikipedia.org/wiki/Median))\n",
">\n",
"> What's the median of 10, 8, 13, 9, 11, 14, 6, 4, 12, 7, and 5?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# TODO"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Answer: TODO"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> ### Question 8. Mode\n",
">\n",
"> Write a function to calculate the mode (\"most frequent value\") of a list. ([Wikipedia](https://en.wikipedia.org/wiki/Mode_(statistics)))\n",
">\n",
"> What's the mode of 10, 8, 13, 9, 11, 14, 6, 4, 12, 7 and 5? How about the mode of 8, 8, 8, 8, 8, 8, 19, 8, 8 and 8?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# TODO"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Answer: TODO"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.13"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
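
Question 2 in the notebook above describes an iteration for estimating square roots. Here is a minimal sketch of it, assuming the intended recurrence is the Babylonian (Newton) update $u_{n+1} = (u_n^2 + m) / (2u_n)$ with $u_0 = 1$. The function name `estimate_sqrt` and the `iterations` parameter are illustrative choices, not part of the starter code, and the fixed iteration count is a simplification rather than a proper convergence test.

```python
def estimate_sqrt(m, iterations=20):
    """Estimate the square root of a positive number m.

    Hypothetical helper: applies u_{n+1} = (u_n^2 + m) / (2 * u_n),
    starting from u_0 = 1, for a fixed number of iterations.
    """
    if m <= 0:
        raise ValueError("m must be positive")
    u = 1.0  # u_0; a float so that division is not truncated in Python 2
    for _ in range(iterations):
        u = (u * u + m) / (2 * u)
    return u


print(estimate_sqrt(144))   # should be very close to 12
print(estimate_sqrt(1024))
```

A fixed number of iterations keeps the sketch short; stopping once successive values differ by less than a small tolerance would be a reasonable alternative.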