SashaOv/MLTalk

Agile Machine Learning with Scalding and scikit-learn

A talk delivered at QCon San Francisco 2014.

Prerequisites

  • A working Java installation (ideally 1.7 or above)
  • The Python packages listed below. The easiest way to get them is to use a prepackaged distribution. I use Anaconda; Canopy may work as well:
    • IPython and prerequisites.
    • scikit-learn and prerequisites.
    • matplotlib and prerequisites. On Mac, be careful to install the "framework version" so that the %matplotlib inline instruction works in the IPython notebook.
  • A few gigabytes of disk space

Note: Anaconda is a big download (>1 GB), so please download it in advance.
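To confirm that the Python prerequisites are in place before the talk, a quick import check can help. The following is a minimal sketch, not part of the repository: the helper name `check_packages` is hypothetical, and the import names (`IPython`, `sklearn`, `matplotlib`) are assumed to correspond to the packages listed above.

```python
import importlib

# Import names assumed to match the prerequisites listed above
# (scikit-learn is imported as "sklearn").
REQUIRED = ["IPython", "sklearn", "matplotlib"]


def check_packages(names):
    """Map each import name to its version string, or None if it cannot be imported."""
    found = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            # Not every module exposes __version__, so fall back to a placeholder.
            found[name] = getattr(module, "__version__", "unknown")
    return found


if __name__ == "__main__":
    for name, version in check_packages(REQUIRED).items():
        print("{}: {}".format(name, version or "MISSING"))
```

Any package reported as MISSING probably needs to be installed through your distribution's package manager before running the notebooks.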

Data

We are using the Million Song Dataset. We are only using the following data from the subset:

Getting started

Clone the GitHub project:

 > git clone https://github.com/SashaOv/MLTalk.git
 > cd MLTalk

(Optional) Copy the Gradle dependencies from the flash drive to your installation:

 > rsync -arv /where/is/gradle-home/ ~/.gradle

Execute the build:

 > ./gradlew clean build

Run the sampler program:

 > cd acquire
 > ../gradlew run

About

The material for the talk "Agile Machine Learning with Scalding and scikit-learn", QCon SF 2014 (http://qconsf.com/tutorial/agile-machine-learning-scalding-and-scikit-learn)
