tuong-olli/blog-doc2vec

A minimal PV-DBOW example implementation

Code accompanying this blog post: https://amsterdam.luminis.eu/2017/02/21/coding-doc2vec/

And two previous blog posts leading up to the abovementioned one:

The starting point for the code was the word2vec implementation from assignment 5 of the Udacity deep learning course, after which many small things were changed, added, and removed: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/udacity/5_word2vec.ipynb
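To give a rough idea of what changes when going from word2vec to PV-DBOW: in skip-gram the network input is a center word, whereas in PV-DBOW the input is a document id and the targets are words sampled from that document. The sketch below only illustrates that pairing. The function name `pv_dbow_pairs` and its parameters are made up for this example, and the batch generation in the notebook may well differ.

```python
import random


def pv_dbow_pairs(docs, samples_per_doc, seed=0):
    """Return (doc_id, word) training pairs in the PV-DBOW style.

    docs is a list of tokenized documents (lists of words). For every
    document, samples_per_doc words are drawn at random from that document
    and each one is paired with the document's index.
    Hypothetical helper, not taken from the notebook.
    """
    rng = random.Random(seed)
    pairs = []
    for doc_id, words in enumerate(docs):
        if not words:
            continue  # an empty document has nothing to predict
        for _ in range(samples_per_doc):
            pairs.append((doc_id, rng.choice(words)))
    return pairs


# Toy corpus, invented for illustration
docs = [["oil", "prices", "rise"], ["wheat", "exports", "fall"]]
pairs = pv_dbow_pairs(docs, samples_per_doc=2)
```

Each pair is then one training example: the document id selects a row of the document embedding matrix, and the word is the label the softmax has to predict.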

Feel free to play with this code, improve it, fix it, and add to it.

Installation

Install Python 3, the packages listed at the top of the .ipynb file, and Jupyter Notebook.

Before you run the notebook, set PERCENTAGE_DOCS to a low value, e.g. 5 percent, to check that everything works before you train a network on the whole Reuters training data set (which may take a while).
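As an illustration of what such a setting typically does, here is a minimal sketch that keeps only the first few percent of a document list. PERCENTAGE_DOCS is the name used in the notebook, but the helper `take_percentage` and the way it is applied here are assumptions, not the notebook's actual code.

```python
PERCENTAGE_DOCS = 5  # small value for a quick smoke test; 100 uses everything


def take_percentage(docs, percentage):
    """Return the first `percentage` percent of docs (at least one document).

    Hypothetical helper: the notebook may select its subset differently.
    """
    if not 0 < percentage <= 100:
        raise ValueError("percentage must be in the range (0, 100]")
    n_docs = max(1, int(len(docs) * percentage / 100))
    return docs[:n_docs]


# Stand-in for the list of training document ids
all_docs = ["doc_%d" % i for i in range(200)]
subset = take_percentage(all_docs, PERCENTAGE_DOCS)
```

Once a run on the small subset finishes without errors, raise the value again for the full training run.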

Finally, the repository contains a file, environment.yml, which lists the versions of the Python packages in the conda environment I used to run Jupyter Notebook. It includes some packages you won't need, but I include it for reference. If you can't get the notebook to work otherwise, check the versions I used for the packages you do need.
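If you want to compare your own setup against environment.yml, a small script along these lines prints the versions installed in your current environment. The package names in the example list are placeholders, so substitute the ones imported at the top of the notebook. It relies on `importlib.metadata`, which requires Python 3.8 or later.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # Placeholder list: replace with the packages the notebook imports
    for name, ver in installed_versions(["numpy", "tensorflow"]).items():
        print(name, ver if ver is not None else "not installed")
```

You can then put the printed versions next to the corresponding entries in environment.yml to spot differences.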
