A talk delivered at QCon San Francisco 2014.
- Working Java installation (ideally, 1.7 and above)
- Python packages listed below. The easiest way to get them is to use a prepackaged distribution.
I use Anaconda;
Canopy may work as well:
- IPython and prerequisites.
- scikit-learn and prerequisites.
- matplotlib and prerequisites. On Mac, be careful to install the "framework version" so that the
%matplotlib inline
instruction works in the IPython notebook.
- A few gigabytes of disk space
Note: Anaconda is a big download (>1 GB), so please download it in advance.
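A quick way to confirm the prerequisites above are in place is a small check script. This is only a sketch: the import names (`IPython`, `sklearn`, `matplotlib`) and the helper names `missing_modules` and `java_on_path` are assumptions made for illustration, and the script does not verify versions.

```python
import importlib.util
import shutil

# Assumed top-level import names: scikit-learn is imported as "sklearn".
REQUIRED_MODULES = ["IPython", "sklearn", "matplotlib"]


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def java_on_path():
    """Return True if a `java` executable is visible on PATH (version not checked)."""
    return shutil.which("java") is not None


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("Missing Python packages:", ", ".join(missing) or "none")
    print("Java found on PATH:", java_on_path())
```

Run it with the Python interpreter from the distribution you installed, so that it checks the same environment the notebook will use.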
We are using the Million Song Dataset. We use only the following data from the subset:
- Track Metadata
- The CSV export of this database is available in this Box.com folder
- Million Song Subset Triplet Data
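To get a feel for the triplet data, here is a minimal sketch that totals play counts per song. It assumes each line holds a user ID, a song ID, and a play count separated by tabs; check the downloaded file before relying on that layout. The function names and the sample rows are made up for illustration and are not taken from the project or the dataset.

```python
from collections import Counter


def parse_triplet(line):
    """Split one 'user<TAB>song<TAB>play_count' line into typed fields."""
    user_id, song_id, play_count = line.rstrip("\n").split("\t")
    return user_id, song_id, int(play_count)


def plays_per_song(lines):
    """Sum play counts per song over an iterable of triplet lines."""
    totals = Counter()
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        _, song_id, play_count = parse_triplet(line)
        totals[song_id] += play_count
    return totals


# Made-up sample rows, not real dataset content.
sample = [
    "userA\tsong1\t3\n",
    "userA\tsong2\t1\n",
    "userB\tsong1\t2\n",
]
print(plays_per_song(sample).most_common(1))  # prints [('song1', 5)]
```

Because `plays_per_song` accepts any iterable of lines, an open file object can be passed directly, which avoids loading the whole file into memory.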
Clone the GitHub project:
> git clone https://github.com/SashaOv/MLTalk.git
> cd MLTalk
(Optional) Copy the Gradle dependencies from the flash drive to your installation:
> rsync -arv /where/is/gradle-home/ ~/.gradle
Execute the build:
> ./gradlew clean build
Run the sampler program:
> cd acquire
> ../gradlew run