Added and edited all script headings. Chapters 6, 7, 8, 10, and 12 still require descriptions.
drewconway committed Feb 10, 2012
1 parent d70ae86 commit 6e4cba4
Showing 14 changed files with 135 additions and 20 deletions.
4 changes: 2 additions & 2 deletions 01-Introduction/code/package_installer.R
@@ -1,11 +1,11 @@
# File-Name: package_installer.R
-# Date: 2011-11-01
+# Date: 2012-02-10
# Author: Drew Conway ([email protected]) and John Myles White ([email protected])
# Purpose: Install all of the packages needed for the Machine Learning for Hackers case studies
# Data Used: n/a
# Packages Used: n/a

-# All source code is copyright (c) 2011, under the Simplified BSD License.
+# All source code is copyright (c) 2012, under the Simplified BSD License.
# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php

# All images and materials produced by this code are licensed under the Creative Commons
9 changes: 4 additions & 5 deletions 01-Introduction/code/ufo_sightings.R
@@ -1,7 +1,6 @@
-# File-Name: ml_basics.R
-# Date: 2011-11-01
-# Author: Drew Conway
-# Email: [email protected]
+# File-Name: ufo_sightings.R
+# Date: 2012-02-10
+# Author: Drew Conway ([email protected]) and John Myles White ([email protected])
# Purpose: Code for Chapter 1. In this case we will review some of the basic
# R functions and coding paradigms we will use throughout this book.
# This includes loading, viewing, and cleaning raw data; as well as
@@ -11,7 +10,7 @@
# Data Used: http://www.infochimps.com/datasets/60000-documented-ufo-sightings-with-text-descriptions-and-metada
# Packages Used: ggplot2

-# All source code is copyright (c) 2011, under the Simplified BSD License.
+# All source code is copyright (c) 2012, under the Simplified BSD License.
# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php

# All images and materials produced by this code are licensed under the Creative Commons
6 changes: 3 additions & 3 deletions 03-Classification/code/email_classify.R
@@ -1,14 +1,14 @@
# File-Name: email_classify.R
-# Date: 2011-11-01
-# Author: Drew Conway ([email protected]) and John Myles White ([email protected])
+# Date: 2012-02-10
+# Author: Drew Conway ([email protected]) and John Myles White ([email protected])
# Purpose: Code for Chapter 3. In this case we introduce the notion of binary classification.
# In machine learning this is a method for determining which of two categories a
# given observation belongs to. To show this, we will create a simple naive Bayes
# classifier for SPAM email detection, and visualize the results.
# Data Used: Email messages contained in data/ directory, source: http://spamassassin.apache.org/publiccorpus/
# Packages Used: tm, ggplot2

-# All source code is copyright (c) 2011, under the Simplified BSD License.
+# All source code is copyright (c) 2012, under the Simplified BSD License.
# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php

# All images and materials produced by this code are licensed under the Creative Commons
4 changes: 2 additions & 2 deletions 04-Ranking/code/priority_inbox.R
@@ -1,5 +1,5 @@
# File-Name: priority_inbox.R
-# Date: 2011-11-01
+# Date: 2012-02-10
# Author: Drew Conway ([email protected]) and John Myles White ([email protected])
# Purpose: Code for Chapter 4. In this case study we will attempt to write a "priority
# inbox" algorithm for ranking email by some measures of importance. We will
@@ -9,7 +9,7 @@
# source: http://spamassassin.apache.org/publiccorpus/
# Packages Used: tm, ggplot2

-# All source code is copyright (c) 2011, under the Simplified BSD License.
+# All source code is copyright (c) 2012, under the Simplified BSD License.
# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php

# All images and materials produced by this code are licensed under the Creative Commons
19 changes: 19 additions & 0 deletions 05-Regression/chapter05.R
@@ -1,3 +1,22 @@
+# File-Name: chapter05.R
+# Date: 2012-02-10
+# Author: Drew Conway ([email protected]) and John Myles White ([email protected])
+# Purpose:
+# Data Used: data/longevity.csv
+# Packages Used: ggplot2
+
+# All source code is copyright (c) 2012, under the Simplified BSD License.
+# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
+
+# All images and materials produced by this code are licensed under the Creative Commons
+# Attribution-Share Alike 3.0 United States License: http://creativecommons.org/licenses/by-sa/3.0/us/
+
+# All rights reserved.
+
+# NOTE: If you are running this in the R console you must use the 'setwd' command to set the
+# working directory for the console to wherever you have saved this file prior to running.
+# Otherwise you will see errors when loading data or saving figures!
+
library('ggplot2')

# First snippet
19 changes: 19 additions & 0 deletions 06-Regularization/chapter06.R
@@ -1,3 +1,22 @@
+# File-Name: chapter06.R
+# Date: 2012-02-10
+# Author: Drew Conway ([email protected]) and John Myles White ([email protected])
+# Purpose:
+# Data Used: data/oreilly.csv
+# Packages Used: ggplot2, glmnet, tm, boot
+
+# All source code is copyright (c) 2012, under the Simplified BSD License.
+# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
+
+# All images and materials produced by this code are licensed under the Creative Commons
+# Attribution-Share Alike 3.0 United States License: http://creativecommons.org/licenses/by-sa/3.0/us/
+
+# All rights reserved.
+
+# NOTE: If you are running this in the R console you must use the 'setwd' command to set the
+# working directory for the console to wherever you have saved this file prior to running.
+# Otherwise you will see errors when loading data or saving figures!
+
library('ggplot2')

# First snippet
19 changes: 19 additions & 0 deletions 07-Optimization/chapter07.R
@@ -1,3 +1,22 @@
+# File-Name: chapter07.R
+# Date: 2012-02-10
+# Author: Drew Conway ([email protected]) and John Myles White ([email protected])
+# Purpose:
+# Data Used: data/01_heights_weights_genders.csv, data/lexical_database.Rdata
+# Packages Used: n/a
+
+# All source code is copyright (c) 2012, under the Simplified BSD License.
+# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
+
+# All images and materials produced by this code are licensed under the Creative Commons
+# Attribution-Share Alike 3.0 United States License: http://creativecommons.org/licenses/by-sa/3.0/us/
+
+# All rights reserved.
+
+# NOTE: If you are running this in the R console you must use the 'setwd' command to set the
+# working directory for the console to wherever you have saved this file prior to running.
+# Otherwise you will see errors when loading data or saving figures!
+
# First code snippet
height.to.weight <- function(height, a, b)
{
19 changes: 19 additions & 0 deletions 08-PCA/chapter08.R
@@ -1,3 +1,22 @@
+# File-Name: chapter08.R
+# Date: 2012-02-10
+# Author: Drew Conway ([email protected]) and John Myles White ([email protected])
+# Purpose:
+# Data Used: data/DJI.csv, data/stock_prices.csv
+# Packages Used: ggplot2, lubridate, reshape
+
+# All source code is copyright (c) 2012, under the Simplified BSD License.
+# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
+
+# All images and materials produced by this code are licensed under the Creative Commons
+# Attribution-Share Alike 3.0 United States License: http://creativecommons.org/licenses/by-sa/3.0/us/
+
+# All rights reserved.
+
+# NOTE: If you are running this in the R console you must use the 'setwd' command to set the
+# working directory for the console to wherever you have saved this file prior to running.
+# Otherwise you will see errors when loading data or saving figures!
+
library('ggplot2')

# First code snippet
4 changes: 2 additions & 2 deletions 09-MDS/chapter09.R → 09-MDS/senate_mds.R
@@ -1,5 +1,5 @@
# File-Name: senate_mds.R
-# Date: 2011-11-01
+# Date: 2012-02-10
# Author: Drew Conway ([email protected]) and John Myles White ([email protected])
# Purpose: Code for Chapter 9. In this case study we introduce multidimensional scaling (MDS),
# a technique for visually displaying the similarity of observations in
@@ -9,7 +9,7 @@
# Data Used: *.dta files in code/data/, source: http://www.voteview.com/dwnl.htm
# Packages Used: foreign, ggplot2

-# All source code is copyright (c) 2011, under the Simplified BSD License.
+# All source code is copyright (c) 2012, under the Simplified BSD License.
# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php

# All images and materials produced by this code are licensed under the Creative Commons
20 changes: 20 additions & 0 deletions 10-Recommendations/chapter10.R
@@ -1,3 +1,23 @@
+# File-Name: chapter10.R
+# Date: 2012-02-10
+# Author: Drew Conway ([email protected]) and John Myles White ([email protected])
+# Purpose:
+# Data Used: data/example_data.csv, data/installations.csv
+# Packages Used: class, reshape
+
+# All source code is copyright (c) 2012, under the Simplified BSD License.
+# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
+
+# All images and materials produced by this code are licensed under the Creative Commons
+# Attribution-Share Alike 3.0 United States License: http://creativecommons.org/licenses/by-sa/3.0/us/
+
+# All rights reserved.
+
+# NOTE: If you are running this in the R console you must use the 'setwd' command to set the
+# working directory for the console to wherever you have saved this file prior to running.
+# Otherwise you will see errors when loading data or saving figures!
+
+
# First code snippet
df <- read.csv('data/example_data.csv')

4 changes: 2 additions & 2 deletions 11-SNA/code/01_google_sg.R
@@ -1,5 +1,5 @@
# File-Name: google_sg.R
-# Date: 2012-01-19
+# Date: 2012-02-10
# Author: Drew Conway ([email protected]) and John Myles White ([email protected])
# Purpose: File 1 for code from Chapter 11. This file contains a set of functions for building
# igraph network objects from the Twitter social graphs. As the initial set of code
@@ -9,7 +9,7 @@
# Data Used: Accessed via the Google SocialGraph API, source: http://code.google.com/apis/socialgraph/
# Packages Used: igraph, RCurl, RJSONIO

-# All source code is copyright (c) 2011, under the Simplified BSD License.
+# All source code is copyright (c) 2012, under the Simplified BSD License.
# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php

# All images and materials produced by this code are licensed under the Creative Commons
4 changes: 2 additions & 2 deletions 11-SNA/code/02_twitter_net.R
@@ -1,5 +1,5 @@
# File-Name: twitter_net.R
-# Date: 2011-11-01
+# Date: 2012-02-10
# Author: Drew Conway ([email protected]) and John Myles White ([email protected])
# Purpose: File 2 for code in Chapter 11. In this short file we write code for generating the
# ego-network for a given Twitter user. Once the network object has been built we
@@ -9,7 +9,7 @@
# Data Used: n/a
# Packages Used: igraph, see 01_google_sg.R

-# All source code is copyright (c) 2011, under the Simplified BSD License.
+# All source code is copyright (c) 2012, under the Simplified BSD License.
# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php

# All images and materials produced by this code are licensed under the Creative Commons
4 changes: 2 additions & 2 deletions 11-SNA/code/03_twitter_rec.R
@@ -1,5 +1,5 @@
# File-Name: twitter_rec.R
-# Date: 2011-11-01
+# Date: 2012-02-10
# Author: Drew Conway ([email protected]) and John Myles White ([email protected])
# Purpose: File 3 for code in Chapter 11. In the final piece of this case study we design a
# simple social graph recommendation system based on Twitter data. Using the
@@ -10,7 +10,7 @@
# Data Used: data/*.graphml
# Packages Used: igraph

-# All source code is copyright (c) 2011, under the Simplified BSD License.
+# All source code is copyright (c) 2012, under the Simplified BSD License.
# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php

# All images and materials produced by this code are licensed under the Creative Commons
20 changes: 20 additions & 0 deletions 12-Model_Comparison/chapter12.R
@@ -1,3 +1,23 @@
+# File-Name: chapter12.R
+# Date: 2012-02-10
+# Author: Drew Conway ([email protected]) and John Myles White ([email protected])
+# Purpose:
+# Data Used: data/df.csv, dtm.RData
+# Packages Used: ggplot2, glmnet, tm, boot
+
+# All source code is copyright (c) 2012, under the Simplified BSD License.
+# For more information on FreeBSD see: http://www.opensource.org/licenses/bsd-license.php
+
+# All images and materials produced by this code are licensed under the Creative Commons
+# Attribution-Share Alike 3.0 United States License: http://creativecommons.org/licenses/by-sa/3.0/us/
+
+# All rights reserved.
+
+# NOTE: If you are running this in the R console you must use the 'setwd' command to set the
+# working directory for the console to wherever you have saved this file prior to running.
+# Otherwise you will see errors when loading data or saving figures!
+
+
library('ggplot2')

# First code snippet