This is still a draft as of 04/11/17.
A TensorFlow implementation of the model described in https://openreview.net/pdf?id=B14TlG-RW
The paper is missing some details and contains a few ambiguities. For example:
- They mention max-pooling the character embeddings to obtain the word representation. However, in the next sentence they mention applying a convolution on top of the character embeddings (probably an implementation similar to Char-CNN, but with depthwise separable convolutions instead of normal convolutions). For simplicity, this implementation uses max-pooling instead of a Char-CNN.
- They don't mention how they reconcile the mismatch between input and output dimensions for the residual connections. Here, the input embeddings have a dimension of 2 * p and are projected down to 128 before applying layer_norm (see the sketch after this list).
- Some hyperparameter details are missing, e.g. the number of heads in multi-head attention.
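To make the first two points concrete, below is a minimal sketch (in the TF 1.x style this repo uses) of how the input embedding is assembled here: character embeddings are max-pooled per word, concatenated with the GloVe word vector (giving dimension 2 * p), projected down to 128, and layer-normalized so the residual connections inside the encoder blocks have matching shapes. The function name and shapes are illustrative assumptions, not the repo's exact code.

```python
import tensorflow as tf

def input_embedding(word_emb, char_emb, hidden_size=128, scope="embedding"):
    """Combine word and character embeddings (sketch only).

    word_emb: [batch, seq_len, p] pretrained GloVe vectors
    char_emb: [batch, seq_len, char_len, p] trainable character embeddings
    """
    with tf.variable_scope(scope):
        # Max-pool over the character axis to get one vector per word
        # (the simplification used here instead of a Char-CNN).
        char_repr = tf.reduce_max(char_emb, axis=2)           # [batch, seq_len, p]
        # Concatenate word and character representations -> dimension 2 * p.
        x = tf.concat([word_emb, char_repr], axis=-1)         # [batch, seq_len, 2p]
        # Project down to the model dimension (128) so residual connections line up.
        x = tf.layers.dense(x, hidden_size, use_bias=False)   # [batch, seq_len, 128]
        # Layer normalization is applied after the projection.
        x = tf.contrib.layers.layer_norm(x)
        return x
```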
The dataset used for this task is the Stanford Question Answering Dataset (https://rajpurkar.github.io/SQuAD-explorer/). Pretrained GloVe embeddings trained on Common Crawl (840B tokens) are used for words (https://nlp.stanford.edu/projects/glove/).
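For illustration, the GloVe vectors could be loaded into an embedding matrix roughly as below. The file name glove.840B.300d.txt and the word2idx vocabulary are assumptions for this sketch; in this repo the preprocessing (process.py) takes care of this step.

```python
import numpy as np

def load_glove(path, word2idx, dim=300):
    """Build an embedding matrix from a GloVe text file (sketch only).

    Rows for words missing from the GloVe file stay zero-initialized.
    """
    matrix = np.zeros((len(word2idx), dim), dtype=np.float32)
    with open(path, "r") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            word, vec = parts[0], parts[1:]
            # Skip malformed lines and words outside the vocabulary.
            if word in word2idx and len(vec) == dim:
                matrix[word2idx[word]] = np.asarray(vec, dtype=np.float32)
    return matrix

# Usage (illustrative): emb = load_glove("glove.840B.300d.txt", word2idx)
```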
- Python 2.7
- NumPy
- tqdm
- TensorFlow (1.2 or higher)
- spacy
The preprocessing step is identical to R-net (https://github.com/minsangkim142/R-net). Once you have cloned this repo, run the following commands from bash just once to process the dataset (SQuAD).
$ pip install -r requirements.txt
$ bash setup.sh
$ python process.py --process True
You can change the hyperparameters in the params.py file to fit the model on your GPU. To train the model, run the following line.
$ python model.py
To test or debug your model after training, change mode = "train" to "test" or "debug" in the params.py file and run the model again.
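For example, assuming the mode strings mirror the verbs above (this is an assumption; check params.py for the exact values):

```python
# params.py -- switch from training to evaluation/debugging (assumed values)
mode = "test"   # or "debug"; was "train"
```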
- Training and testing the model
- Add trilinear function to Context-to-Query attention (a rough sketch of the trilinear similarity is given after this list)
- Convergence testing
- Apply dropout every 2 layers
- Data augmentation by paraphrasing
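For reference, the trilinear similarity used for Context-to-Query attention in the paper (following BiDAF) is f(c, q) = w^T [c; q; c ⊙ q]. Below is a rough TensorFlow sketch of what the trilinear TODO item above would add; the function name and shapes are assumptions, not the repo's code.

```python
import tensorflow as tf

def trilinear_similarity(context, query, scope="trilinear"):
    """Similarity matrix S[b, i, j] = w^T [c_i ; q_j ; c_i * q_j] (sketch only).

    context: [batch, c_len, d], query: [batch, q_len, d].
    Returns S with shape [batch, c_len, q_len].
    """
    with tf.variable_scope(scope):
        c_len = tf.shape(context)[1]
        q_len = tf.shape(query)[1]
        # Tile so every (context word, query word) pair can be compared.
        c = tf.tile(tf.expand_dims(context, 2), [1, 1, q_len, 1])  # [b, c, q, d]
        q = tf.tile(tf.expand_dims(query, 1), [1, c_len, 1, 1])    # [b, c, q, d]
        features = tf.concat([c, q, c * q], axis=-1)                # [b, c, q, 3d]
        # Single linear layer mapping the 3d features to a scalar score.
        scores = tf.layers.dense(features, 1, use_bias=False)       # [b, c, q, 1]
        return tf.squeeze(scores, axis=-1)
```

Note that materializing the [batch, c_len, q_len, 3d] tensor is memory-hungry; since w^T [c; q; c ⊙ q] = w1^T c + w2^T q + w3^T (c ⊙ q), a common optimization is to split the weight vector into three parts so the concatenation never has to be built explicitly.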
Run TensorBoard for visualisation.
$ tensorboard --logdir=./
04/11/17: The model is not yet optimized and there is a memory leak, so I strongly suggest training only if you have 16GB of memory or more. Also, I haven't done convergence testing yet. Even with this naive implementation, training is roughly 5~6x faster than R-net (https://github.com/minsangkim142/R-net).