Clare Corthell's Transcript
clarecorthell committed May 16, 2014
1 parent eba3391 commit d66b849
Showing 1 changed file with 104 additions and 0 deletions.
104 changes: 104 additions & 0 deletions transcripts/clare-corthell-2013.md
@@ -0,0 +1,104 @@
### The Open-Source Masters

I couldn't wait to go back to grad school. Literally. So I designed my own grad school and spent 5 months learning & hacking in great delight!

### My Background ([linkedin](http://bit.ly/clarecorthell))

I'm a Stanford-educated Engineer, previously a Front-End Developer and UX Designer on early-stage products. I'm always in hot pursuit of deeper insight into social questions!

### Goals & Motivations of the Open Source M.S.

Data Science is an ideal marriage of my technical capacities, my social research inquiries, and my geekish-freakish love of statistics.

### Next Steps?

I'm now a Data Scientist with an incredible team at [Mattermark](http://www.mattermark.com)!

***

## The Data Science Curriculum / April-August 2013

* **Intro to Data Science** [UW / Coursera](https://www.coursera.org/course/datasci)
    * *Topics:* Python NLP on Twitter API, Distributed Computing Paradigm, MapReduce/Hadoop & Pig Script, SQL/NoSQL, Relational Algebra, Experiment design, Statistics, Graphs, Amazon EC2, Visualization.

### Math
* Linear Algebra / Levandosky [Stanford / Book](http://www.amazon.com/Linear-Algebra-Steven-Levandosky/dp/0536667470/ref=sr_1_1?ie=UTF8&qid=1376546498&sr=8-1&keywords=linear+algebra+levandosky#)
* Statistics [Stats in a Nutshell / Book](http://shop.oreilly.com/product/9780596510497.do)
* Problem-Solving Heuristics "How To Solve It" [Polya / Book](http://en.wikipedia.org/wiki/How_to_Solve_It)

### Computing
* **Algorithms**
    * Algorithms Design & Analysis I [Stanford / Coursera](https://www.coursera.org/course/algo)
    * Algorithm Design [Kleinberg & Tardos / Book](http://www.amazon.com/Algorithm-Design-Jon-Kleinberg/dp/0321295358/ref=sr_1_1?ie=UTF8&qid=1376702127&sr=8-1&keywords=kleinberg+algorithms)

* **Databases**
    * Introduction to Databases [Stanford / Coursera](https://www.coursera.org/course/db)

* **Data Mining**
    * Mining Massive Data Sets [Stanford / Book](http://i.stanford.edu/~ullman/mmds.html)
    * Mining The Social Web [O'Reilly / Book](http://shop.oreilly.com/product/0636920010203.do)
    * Introduction to Information Retrieval [Stanford / Book](http://nlp.stanford.edu/IR-book/information-retrieval-book.html)

* **Machine Learning**
    * Machine Learning / Ng [Stanford / Coursera](https://www.coursera.org/course/ml)
    * Programming Collective Intelligence [O'Reilly / Book](http://shop.oreilly.com/product/9780596529321.do)
    * Statistics [The Elements of Statistical Learning / Book](http://www-stat.stanford.edu/~tibs/ElemStatLearn/) *(in progress)*

* **Probabilistic Graphical Models**
    * Probabilistic Programming and Bayesian Methods for Hackers [GitHub / Tutorials](https://github.com/CamDavidsonPilon/Probabilistic-Programming-and-Bayesian-Methods-for-Hackers)
    * PGMs / Koller [Stanford / Coursera](https://www.coursera.org/course/pgm) *(in progress)*

* **Natural Language Processing**
    * NLP with Python [O'Reilly / Book](http://shop.oreilly.com/product/9780596516499.do)

* **Analysis**
    * Python for Data Analysis [O'Reilly / Book](http://www.kqzyfj.com/click-7040302-11260198?url=http%3A%2F%2Fshop.oreilly.com%2Fproduct%2F0636920023784.do&cjsku=0636920023784)
    * Big Data Analysis with Twitter [UC Berkeley / Lectures](http://blogs.ischool.berkeley.edu/i290-abdt-s12/)
    * Social and Economic Networks: Models and Analysis [Stanford / Coursera](https://www.coursera.org/course/networksonline)
    * Information Visualization ["Envisioning Information" Tufte / Book](http://www.amazon.com/Envisioning-Information-Edward-R-Tufte/dp/0961392118/ref=sr_1_8?ie=UTF8&qid=1376709039&sr=8-8&keywords=information+design)

* **Python** (Learning)
    * New To Python: [Learn Python the Hard Way](http://learnpythonthehardway.org/), [Google's Python Class](http://code.google.com/edu/languages/google-python-class/)

* **Python** (Libraries)
    * Basic Packages [Python, virtualenv, NumPy, SciPy, matplotlib and IPython](http://www.lowindata.com/2013/installing-scientific-python-on-mac-os-x/)
    * Bayesian Inference [pymc](https://github.com/pymc-devs/pymc)
    * Labeled data structures, statistical functions, etc. [pandas](https://github.com/pydata/pandas) (See: Python for Data Analysis)
    * Python wrapper for the Twitter API [twython](https://github.com/ryanmcgrath/twython)
    * Tools for Data Mining & Analysis [scikit-learn](http://scikit-learn.org/stable/)
    * Network Modeling & Viz [networkx](http://networkx.github.io/)
    * Natural Language Toolkit [NLTK](http://nltk.org/)

### Projects
* Coursework
    * Sentiment analysis, trending topics, and friendship mapping with the Twitter API
    * Joins and Matrix Manipulation in MapReduce (AWS EC2)
    * In-database Text analysis (SQL)
    * Sentiment analysis of movie tweets (Python)
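
For a flavor of the tweet sentiment projects listed above, here is a minimal sketch of lexicon-based scoring in plain Python. It is an illustration only, not the code from the coursework: the `LEXICON` words and scores, the `tweet_sentiment` helper, and the sample tweets are all hypothetical, and a real project would likely pull tweets from the Twitter API and use a much larger word list.

```python
import re

# Hypothetical toy lexicon: these words and scores are illustrative
# assumptions, not the word list used in the original coursework.
LEXICON = {"love": 3, "great": 2, "good": 1, "boring": -2, "awful": -3, "hate": -3}


def tweet_sentiment(text, lexicon=LEXICON):
    """Sum the lexicon scores of the words in a tweet; unknown words score 0."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)


# Made-up sample tweets standing in for data fetched from the Twitter API.
tweets = [
    "I love this movie, great cast!",
    "Awful plot and boring pacing.",
    "Saw a movie today.",
]

scores = [tweet_sentiment(t) for t in tweets]
for tweet, score in zip(tweets, scores):
    print(score, tweet)
```

Summing per-word scores ignores negation and sarcasm, so treat it as a starting point rather than a finished classifier.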


***
### A Note on Tools

This degree is brought to you by: "THE INTERNET".

Information is more democratized^ now than it was at any point in history. Given a little initiative and interest, you can tailor and excel in an education of your own design. The connective web made me what I am today, growing from the child obsessed with [Number Munchers](http://en.wikipedia.org/wiki/Munchers#Number_Munchers) to an adult jaw-dropping over [DBSCAN](http://en.wikipedia.org/wiki/DBSCAN).

The most valuable resources I used were:
* [Coursera](http://coursera.org)
* [Khan Academy](https://www.khanacademy.org/math/probability/random-variables-topic/random_variables_prob_dist/v/term-life-insurance-and-death-probability)
* [Wolfram Alpha](http://www.wolframalpha.com/input/?i=torus)
* [Wikipedia](http://en.wikipedia.org/wiki/List_of_cognitive_biases)
* [Quora](http://www.quora.com/Programming-Challenges-1/What-are-some-good-toy-problems-in-data-science)
* **Kindle .mobis** (carrying textbooks is so 90s.)
* PopSci Read: [The Signal and The Noise](http://www.amazon.com/Signal-Noise-Predictions-Fail-but-ebook/dp/B007V65R54/ref=tmm_kin_swatch_0?_encoding=UTF8&sr=8-1&qid=1376699450) Nate Silver
* **Friends & Family** (Impossible without their support! Special Thanks to N.S.)

*^ given internet access - an issue near and dear to me.*

***


### I "Forked" this into the [Open Source Data Science Masters](http://datasciencemasters.org) Curriculum.

[Follow me on Twitter @clarecorthell](http://twitter.com/clarecorthell)
