A collection of tools for gathering and downloading data from various sources.
I often write simple scripts and tools to collect data for various "data science" tasks. I thought it would be worthwhile to gather them in a central repository, since they might be useful to others!
- Collect Lyrics
- Twitter Timeline
- Collect Popular Music Tags
- PDB Info Table
- ZINC Molecule Downloader
- Collect English Premier League Soccer Data
Important Note
Please note that I developed and tested these tools in Python 3.x. The scripts may not work flawlessly in Python 2.7.x, where Unicode handling is more error-prone.
A command line tool to download song lyrics given artist names and song titles.
A command line tool that downloads your personal Twitter timeline in CSV format, with an optional keyword filter.
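To give a feel for the keyword filter and CSV output, here is a minimal sketch that works on tweets already held in memory. It does not talk to Twitter at all. The function names (`filter_rows`, `rows_to_csv`), the column names (`created_at`, `text`), and the sample tweets are assumptions made up for illustration, and the actual tool may be organized differently.

```python
import csv
import io


def filter_rows(rows, keyword=None, text_field="text"):
    """Yield rows whose text contains `keyword` (case-insensitive).

    With keyword=None every row is passed through unchanged.
    `text_field` is a hypothetical column name.
    """
    needle = keyword.lower() if keyword else None
    for row in rows:
        if needle is None or needle in row[text_field].lower():
            yield row


def rows_to_csv(rows, fieldnames):
    """Serialize an iterable of dicts to a CSV string with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Made-up sample data standing in for a downloaded timeline.
tweets = [
    {"created_at": "2015-01-01", "text": "Happy new year"},
    {"created_at": "2015-01-02", "text": "Learning Python today"},
    {"created_at": "2015-01-03", "text": "python and coffee"},
]

matching = list(filter_rows(tweets, keyword="python"))
csv_text = rows_to_csv(matching, fieldnames=["created_at", "text"])
print(csv_text)
```

Keeping the filter a plain generator means the same function can be applied whether the rows come from an API response or from a previously saved CSV file.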
A tutorial on turning your Twitter timeline into a word cloud.
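A word cloud is essentially a picture of word frequencies, so the core preprocessing step can be sketched with the standard library alone. The snippet below is only an illustration under stated assumptions: `word_frequencies`, the tiny `STOPWORDS` set, and the sample texts are hypothetical, and the tutorial itself may clean the text differently.

```python
import re
from collections import Counter

# A deliberately tiny, hypothetical stop word list; a real one would be longer.
STOPWORDS = {"the", "a", "and", "to", "of", "in", "is", "rt"}


def word_frequencies(texts, stopwords=STOPWORDS):
    """Count words across tweet texts, ignoring URLs, case, and stop words."""
    counts = Counter()
    for text in texts:
        # Lowercase, then blank out links so URL fragments are not counted.
        cleaned = re.sub(r"https?://\S+", " ", text.lower())
        words = re.findall(r"[a-z]+", cleaned)
        counts.update(w for w in words if w not in stopwords)
    return counts


# Made-up sample texts.
texts = ["Python is fun http://t.co/abc", "RT the Python docs"]
freqs = word_frequencies(texts)
print(freqs.most_common(3))
```

The resulting word-to-count mapping is the kind of input a word cloud renderer typically scales font sizes from.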
A command line tool to download popular tags for a list of songs from last.fm, e.g., for various data mining projects.
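As a rough sketch of how tags for one song could be requested, the snippet below builds a request URL for the Last.fm web API's `track.getTopTags` method and pulls tag names out of a JSON response. Whether the tool in this repository uses this endpoint is an assumption; the helper names (`top_tags_url`, `extract_tags`), the placeholder API key, and the response layout shown in the sample payload are likewise assumptions for illustration. No network request is made here.

```python
from urllib.parse import urlencode

API_ROOT = "http://ws.audioscrobbler.com/2.0/"


def top_tags_url(artist, track, api_key):
    """Build a track.getTopTags request URL (JSON output)."""
    params = {
        "method": "track.gettoptags",
        "artist": artist,
        "track": track,
        "api_key": api_key,  # you need your own Last.fm API key
        "format": "json",
    }
    return API_ROOT + "?" + urlencode(params)


def extract_tags(payload, limit=5):
    """Return up to `limit` tag names from a decoded JSON response.

    Assumes the tags are listed under payload["toptags"]["tag"].
    """
    tags = payload.get("toptags", {}).get("tag", [])
    return [tag["name"] for tag in tags[:limit]]


url = top_tags_url("Daft Punk", "Get Lucky", api_key="YOUR_API_KEY")
print(url)

# Hand-written sample payload, not real Last.fm data.
sample = {"toptags": {"tag": [{"name": "electronic"}, {"name": "funk"}]}}
print(extract_tags(sample, limit=1))
```

For a list of songs, the same two helpers would be called once per artist and title pair, ideally with a short pause between requests.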
A command line tool that creates an info table from a list of PDB files.
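To illustrate the idea, here is a minimal sketch that reads the fixed-width `HEADER` record of a PDB file (classification in columns 11-50, deposition date in columns 51-59, ID code in columns 63-66) and counts `ATOM` and `HETATM` records. Which fields the actual tool reports is not stated here, so the chosen columns, the function names (`pdb_info`, `format_table`), and the made-up sample entry are assumptions.

```python
def pdb_info(lines):
    """Collect a few summary fields from the lines of one PDB file."""
    info = {"id": "", "classification": "", "date": "", "atoms": 0, "hetatms": 0}
    for line in lines:
        record = line[:6]  # the record name occupies columns 1-6
        if record == "HEADER":
            info["classification"] = line[10:50].strip()
            info["date"] = line[50:59].strip()
            info["id"] = line[62:66].strip()
        elif record == "ATOM  ":
            info["atoms"] += 1
        elif record == "HETATM":
            info["hetatms"] += 1
    return info


def format_table(rows, columns):
    """Render a list of info dicts as a tab-separated table with a header."""
    header = "\t".join(columns)
    body = ["\t".join(str(row[c]) for c in columns) for row in rows]
    return "\n".join([header] + body)


# Made-up sample entry; the header is padded so fields land in the right columns.
sample = [
    "HEADER".ljust(10) + "EXAMPLE PROTEIN".ljust(40) + "01-JAN-00" + "   " + "XXXX",
    "ATOM      1  N   VAL A   1",
    "ATOM      2  CA  VAL A   1",
    "HETATM    3  O   HOH A 101",
    "END",
]

info = pdb_info(sample)
print(format_table([info], ["id", "classification", "date", "atoms", "hetatms"]))
```

For a list of files, each file would be opened in turn and its lines passed to `pdb_info`, with the resulting dicts collected into `rows`.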
A command line tool for downloading 3D structures of small chemical molecules from http://zinc.docking.org.
A command line tool to collect fantasy soccer data from the English Premier League.