Merge pull request tensorflow#954 from lukaszkaiser/ngpu-update
Update to the Neural GPU.
lukaszkaiser authored Jan 27, 2017
2 parents ea364a9 + a315e56 commit a298143
Showing 6 changed files with 2,615 additions and 607 deletions.
33 changes: 21 additions & 12 deletions neural_gpu/README.md
@@ -4,7 +4,6 @@ in [[http://arxiv.org/abs/1511.08228]].

Requirements:
* TensorFlow (see tensorflow.org for how to install)
-* Matplotlib for Python (sudo apt-get install python-matplotlib)

The model can be trained on the following algorithmic tasks:

@@ -26,17 +25,27 @@ The model can be trained on the following algorithmic tasks:
* `qadd` - Long quaternary addition
* `search` - Search for symbol key in dictionary

-The value range for symbols are defined by the `niclass` and `noclass` flags.
-In particular, the values are in the range `min(--niclass, noclass) - 1`.
-So if you set `--niclass=33` and `--noclass=33` (the default) then `--task=rev`
-will be reversing lists of 32 symbols, and `--task=id` will be identity on a
-list of up to 32 symbols.
+It can also be trained on the WMT English-French translation task:

+* `wmt` - WMT English-French translation (data will be downloaded)

-To train the model on the reverse task run:
+The value range for symbols is defined by the `vocab_size` flag.
+In particular, the values are in a range of size `vocab_size - 1`.
+So if you set `--vocab_size=16` (the default) then `--problem=rev`
+will be reversing lists of 15 symbols, and `--problem=id` will be identity
+on a list of up to 15 symbols.


+To train the model on the binary multiplication task, run:

```
+python neural_gpu_trainer.py --problem=bmul
```

+This trains the Extended Neural GPU; to train the original model, run:

```
-python neural_gpu_trainer.py --task=rev
+python neural_gpu_trainer.py --problem=bmul --beam_size=0
```

While training, interim / checkpoint model parameters will be
@@ -47,16 +56,16 @@ with, hit `Ctrl-C` to stop the training process. The latest
model parameters will be in `/tmp/neural_gpu/neural_gpu.ckpt-<step>`
and used on any subsequent run.

-To test a trained model on how well it decodes run:
+To evaluate a trained model on how well it decodes run:

```
-python neural_gpu_trainer.py --task=rev --mode=1
+python neural_gpu_trainer.py --problem=bmul --mode=1
```

-To produce an animation of the result run:
+To interact with a model (experimental, see code) run:

```
-python neural_gpu_trainer.py --task=rev --mode=1 --animate=True
+python neural_gpu_trainer.py --problem=bmul --mode=1
```

Maintained by Lukasz Kaiser (lukaszkaiser)
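
The README's note on symbol values can be illustrated with a small standalone sketch. It is not taken from `neural_gpu_trainer.py`. The helper name `make_reverse_example` is hypothetical, and the sketch assumes that a range of size `vocab_size - 1` means the integers 1 through `vocab_size - 1`, with 0 left unused (for example as padding). Neither assumption is confirmed by the README.

```python
import random


def make_reverse_example(length, vocab_size=16, seed=None):
    """Build one (input, target) pair for a toy version of the `rev` task.

    Assumption (not from the README): symbols are the integers
    1 .. vocab_size - 1, and 0 is never generated.
    """
    rng = random.Random(seed)
    # randint is inclusive on both ends, so this draws from 1..vocab_size-1.
    inp = [rng.randint(1, vocab_size - 1) for _ in range(length)]
    # The target of the reverse task is the input read back to front.
    target = inp[::-1]
    return inp, target


if __name__ == "__main__":
    inp, target = make_reverse_example(length=8, vocab_size=16, seed=0)
    print("input: ", inp)
    print("target:", target)
```

With `vocab_size=16` every generated symbol lies between 1 and 15, which matches the 15 symbols mentioned for the default setting.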