This contains five example data sets:

  1. Text Data (ptb): Data from the Penn Treebank dataset provided by Mikolov: http://www.fit.vutbr.cz/~imikolov/rnnlm/
  2. Tree Data (trees): The tree data from the Stanford Sentiment Treebank: http://nlp.stanford.edu/sentiment/index.html
  3. Classification Data (classes): The data from the Stanford Sentiment Treebank with tree info removed.
  4. Parallel Data (parallel): Data from the Tanaka corpus, reduced to only have 10,000 training examples: http://www.edrdg.org/wiki/index.php/Tanaka_Corpus
  5. Tagging Data (tags): Data from WikiNER, reduced to only have 10,000 training examples: http://schwa.org/projects/resources/wiki/Wikiner
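
As a rough illustration of how the classification data (3) relates to the tree data (2), the sketch below strips the bracketing from one sentiment tree and keeps only the root label and the words. It assumes the trees are stored as PTB-style bracketed strings such as `(3 (2 It) (4 (2 's) (4 good)))`, as in the Stanford Sentiment Treebank distribution. The function name `tree_to_classification` is hypothetical, and the exact file layout used in this repository is not specified here.

```python
from typing import Tuple


def tree_to_classification(tree: str) -> Tuple[int, str]:
    """Reduce a bracketed sentiment tree to (root_label, sentence).

    Assumes every "(" is immediately followed by an integer label and
    that literal parentheses in the text are escaped (e.g. -LRB-/-RRB-).
    """
    # Put spaces around brackets so a plain split() tokenizes the tree.
    tokens = tree.replace("(", " ( ").replace(")", " ) ").split()
    root_label = int(tokens[1])  # the label right after the first "("

    words = []
    for prev, tok in zip(tokens, tokens[1:]):
        if tok in ("(", ")"):
            continue  # structural bracket, not text
        if prev == "(":
            continue  # node label, not a word
        words.append(tok)
    return root_label, " ".join(words)


if __name__ == "__main__":
    print(tree_to_classification("(3 (2 It) (4 (2 's) (4 good)))"))
```

Only the root label is kept here; whether the classification files in this repository were produced the same way is an assumption.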