This project is a near-exact TensorFlow implementation of Yoon Kim's paper Convolutional Neural Networks for Sentence Classification (EMNLP 2014). His original Theano code can be found here. Alternatively, you can look at Denny Britz's TensorFlow implementation, here.
- Download Google's `word2vec` embeddings and place them inside `data/w2v/`. This is a large file (~3.5G). You may `git clone` this.
- Ensure you have a working `tensorflow` or `tensorflow-gpu` (version 1.0+). Additional dependencies include `yaml`, `bunch` and `cPickle`.
- Pre-process the data by using,
cd data
chmod +x process_sst2.sh process_sst2_sentence.sh
./process_sst2.sh
./process_sst2_sentence.sh
- Run `python train.py` to train the model, and run `python train.py --mode test` to evaluate the model.
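
As a quick sanity check before running the steps above, the following Python sketch reports which of the listed modules fail to import and whether the `data/w2v/` directory exists. It is only an illustration: the helper name `find_missing` is hypothetical and not part of this repository, and the module names are taken from the dependency list above.

```python
import importlib
import os

# Module names taken from the dependency list above.
# Note: cPickle only exists on Python 2; Python 3 ships the same
# functionality in the standard `pickle` module.
REQUIRED_MODULES = ["tensorflow", "yaml", "bunch", "cPickle"]


def find_missing(modules):
    """Return the module names from `modules` that cannot be imported."""
    missing = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All listed modules are importable.")

    # The embeddings are expected inside data/w2v/ (path from the steps above).
    if not os.path.isdir(os.path.join("data", "w2v")):
        print("Directory data/w2v/ not found; download the embeddings first.")
```

Run it from the repository root so that the relative `data/w2v/` path resolves correctly.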
The model hyperparameters and mode (`nonstatic`, `static` and `rand`) are configured via YAML files inside `config/`. All hyperparameters (except `batch_size`) are identical to those reported in the paper. You may change the training directory via the `--job_id` parameter, and the random seed using `--seed`. Look at `config/arguments.py` for more details.
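
To illustrate how the command-line flags mentioned above might be parsed, here is a minimal `argparse` sketch. It is not the contents of `config/arguments.py`: the function name `build_parser`, the default values, and the `train` mode value are assumptions made for illustration. Only the flag names `--mode`, `--job_id` and `--seed`, and the `test` mode, come from this README.

```python
import argparse


def build_parser():
    """Build a parser for the flags named in this README (defaults are assumed)."""
    parser = argparse.ArgumentParser(
        description="Hypothetical sketch of the train.py command-line flags"
    )
    # `test` appears in this README; `train` as the default is an assumption.
    parser.add_argument("--mode", default="train", choices=["train", "test"],
                        help="whether to train or evaluate the model")
    # The README says --job_id changes the training directory.
    parser.add_argument("--job_id", default="default",
                        help="identifier used to pick the training directory")
    # The README says --seed sets the random seed.
    parser.add_argument("--seed", type=int, default=1,
                        help="random seed")
    return parser


# Equivalent to: python train.py --mode test --seed 42
args = build_parser().parse_args(["--mode", "test", "--seed", "42"])
print(args.mode, args.job_id, args.seed)
```

For the actual flags and their defaults, `config/arguments.py` remains the authoritative source.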
All results have been averaged across 10 random seeds. All reported results are on the SST2 dataset. The models were trained on Titan X GPUs.
Model | Dataset | Average | Std | Range |
---|---|---|---|---|
Kim14 - nonstatic | Phrases | 87.2 | - | - |
Our Kim14 - nonstatic | Phrases | 87.06 | 0.34 | 86.55 - 87.70 |
Our Kim14 - nonstatic | Sentences | 85.92 | 0.68 | 84.79 - 86.99 |
Kim14 - rand | Phrases | 82.7 | - | - |
Our Kim14 - rand | Phrases | 84.53 | 0.38 | 84.01 - 85.01 |
Our Kim14 - rand | Sentences | 79.65 | 0.89 | 77.76 - 81.00 |
Kim14 - static | Phrases | 86.8 | - | - |
Our Kim14 - static | Phrases | 86.10 | 0.58 | 85.23 - 86.82 |
Our Kim14 - static | Sentences | 85.18 | 0.66 | 83.69 - 86.32 |
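
The Average, Std and Range columns above can be computed from per-seed accuracies along the following lines. This is a sketch under stated assumptions: the helper name `summarize` is hypothetical, the input values are placeholders rather than results from this project, and the population standard deviation is used because this README does not say whether the reported Std is a population or sample estimate.

```python
import statistics


def summarize(accuracies):
    """Return the average, standard deviation and (min, max) range of a list."""
    return {
        "average": statistics.mean(accuracies),
        # Assumption: population standard deviation. Swap in
        # statistics.stdev for the sample estimate.
        "std": statistics.pstdev(accuracies),
        "range": (min(accuracies), max(accuracies)),
    }


# Placeholder accuracies for illustration only; NOT results from this project.
placeholder_accuracies = [80.0, 82.0, 84.0]
summary = summarize(placeholder_accuracies)
print(summary)
```

In practice the list would hold one test accuracy per random seed, ten in total for the results reported here.
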
Feel free to open Issues and submit PRs (including for the existing issues). The code should be fairly easy to understand.