## Table Of Contents
- Description
- How does this sample work?
- Preparing sample data
- Converting TensorFlow weights
- Running the sample
- Additional resources
- License
- Changelog
- Known issues
## Description

This sample, sampleCharRNN, uses the TensorRT API to build an RNN network layer by layer, sets up weights and inputs/outputs, and then performs inference. Specifically, this sample creates a CharRNN network that has been trained on the Tiny Shakespeare dataset. For more information about character-level modeling, see char-rnn.
TensorFlow has a useful RNN Tutorial which can be used to train a word-level model. Word-level models learn a probability distribution over the set of all possible word sequences. Since our goal is to train a character-level model, which learns a probability distribution over the set of all possible characters, a few modifications need to be made to get the TensorFlow sample to work. These modifications can be seen here.
There are also many GitHub repositories that contain CharRNN implementations that will work out of the box. Tensorflow-char-rnn is one such implementation.
## How does this sample work?

The CharRNN network is a fairly simple RNN network. The input into the network is a single character that is embedded into a vector of size 512. This embedded input is then supplied to an RNN layer containing two stacked LSTM cells. The output from the RNN layer is then supplied to a fully connected layer, which can be represented in TensorRT by a MatrixMultiply layer followed by an ElementWise sum layer. Constant layers are used to supply the weights and biases to the MatrixMultiply and ElementWise layers, respectively. A TopK operation with K = 1 is then performed on the output of the ElementWise sum layer to find the next predicted character in the sequence. For more information about these layers, see the TensorRT API documentation.
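To make that dataflow concrete, here is a minimal sketch of how the tail of such a network might be assembled with the TensorRT C++ API. This is an illustration, not the sample's actual code: the names `addNetworkTail`, `rnnOutput`, `fcWeights`, `fcBias`, and the value of `VOCAB_SIZE` are assumptions, while the layer-creation calls are standard `INetworkDefinition` methods.

```cpp
#include <NvInfer.h>
#include <cstdint>

constexpr int32_t HIDDEN_SIZE = 512; // hidden size from the description above
constexpr int32_t VOCAB_SIZE  = 65;  // assumed size of the character set

// `network` is the network being built, `rnnOutput` is the output tensor of
// the RNNv2 layer, and `fcWeights`/`fcBias` hold weights dumped from the
// checkpoint (all hypothetical names).
void addNetworkTail(nvinfer1::INetworkDefinition* network,
                    nvinfer1::ITensor* rnnOutput,
                    nvinfer1::Weights fcWeights, nvinfer1::Weights fcBias)
{
    // Constant layers supply the FC weights and bias to the network.
    auto* weightMat = network->addConstant(
        nvinfer1::Dims2(HIDDEN_SIZE, VOCAB_SIZE), fcWeights);
    auto* biasVec = network->addConstant(nvinfer1::Dims2(1, VOCAB_SIZE), fcBias);

    // Step 1 of the fully connected layer: matrix multiply.
    auto* matMul = network->addMatrixMultiply(
        *rnnOutput, nvinfer1::MatrixOperation::kNONE,
        *weightMat->getOutput(0), nvinfer1::MatrixOperation::kNONE);

    // Step 2 of the fully connected layer: element-wise bias add.
    auto* biasAdd = network->addElementWise(
        *matMul->getOutput(0), *biasVec->getOutput(0),
        nvinfer1::ElementWiseOperation::kSUM);

    // TopK with K = 1 over the vocabulary axis picks the most likely next
    // character; output 1 of the TopK layer holds the index tensor.
    const uint32_t reduceAxes = 1u << 1; // assumed vocabulary axis
    auto* topK = network->addTopK(*biasAdd->getOutput(0),
                                  nvinfer1::TopKOperation::kMAX, 1, reduceAxes);
    topK->getOutput(1)->setName("prediction_index");
    network->markOutput(*topK->getOutput(1));
}
```

Expressing the fully connected layer as MatrixMultiply followed by ElementWise keeps the weights and biases as explicit Constant layers, which is exactly how the description above maps onto TensorRT primitives.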
This sample provides a pre-trained model called `model-20080.data-00000-of-00001`, located in the `/usr/src/tensorrt/data/samples/char-rnn/model` directory; therefore, training is not required for this sample. The model used by this sample was trained using tensorflow-char-rnn. This GitHub repository includes instructions on how to train and produce a checkpoint that can be used by TensorRT.
**Note:** If you want to train your own model and then perform inference with TensorRT, you will simply need to do a character-to-character comparison between TensorFlow and TensorRT.
### TensorRT API layers and ops

In this sample, the following layers are used. For more information about these layers, see the TensorRT Developer Guide: Layers documentation.

**ElementWise**
The ElementWise layer, also known as the Eltwise layer, implements per-element operations. The ElementWise layer is used to execute the second step of the functionality provided by a FullyConnected layer.

**MatrixMultiply**
The MatrixMultiply layer implements matrix multiplication for a collection of matrices. The MatrixMultiply layer is used to execute the first step of the functionality provided by a FullyConnected layer.

**RNNv2**
The RNNv2 layer implements recurrent layers such as Recurrent Neural Network (RNN), Gated Recurrent Unit (GRU), and Long Short-Term Memory (LSTM). Supported types are RNN, GRU, and LSTM. It performs a recurrent operation, where the operation is defined by one of several well-known recurrent neural network (RNN) "cells". The first layer in the network is an RNN layer; it is added and configured in the addRNNv2Layer() function. Weights are set for each gate and layer individually, as illustrated in the sketch after this list. The input format for RNNv2 is BSE (Batch, Sequence, Embedding).

**TopK**
The TopK layer is used to identify the character that has the maximum probability of appearing next. The TopK layer finds the top K maximum (or minimum) elements along a dimension, returning a reduced tensor and a tensor of index positions.
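The per-gate, per-layer weight setup mentioned above can be sketched as follows, assuming TensorRT's RNNv2 C++ API. The `buildRNN` function and the `lookupWeights`/`lookupBias` helpers (which would return the `nvinfer1::Weights` dumped from the TensorFlow checkpoint) are hypothetical names for illustration:

```cpp
#include <NvInfer.h>
#include <cstdint>

constexpr int32_t LAYER_COUNT = 2;   // two stacked LSTM cells
constexpr int32_t HIDDEN_SIZE = 512; // matches the embedding size above
constexpr int32_t MAX_SEQ_LEN = 1;   // one character per step

// Hypothetical helpers that return the weights/bias extracted from the
// checkpoint for a given layer, gate, and input-vs-recurrent flag.
nvinfer1::Weights lookupWeights(int32_t layer, nvinfer1::RNNGateType gate, bool isW);
nvinfer1::Weights lookupBias(int32_t layer, nvinfer1::RNNGateType gate, bool isW);

nvinfer1::IRNNv2Layer* buildRNN(nvinfer1::INetworkDefinition* network,
                                nvinfer1::ITensor* input /* BSE layout */)
{
    auto* rnn = network->addRNNv2(*input, LAYER_COUNT, HIDDEN_SIZE, MAX_SEQ_LEN,
                                  nvinfer1::RNNOperation::kLSTM);

    // An LSTM cell has four gates. Each gate of each stacked layer gets its
    // input weights (isW = true) and recurrent weights (isW = false) set
    // individually, and likewise for the biases.
    const nvinfer1::RNNGateType gates[] = {
        nvinfer1::RNNGateType::kINPUT, nvinfer1::RNNGateType::kCELL,
        nvinfer1::RNNGateType::kFORGET, nvinfer1::RNNGateType::kOUTPUT};
    for (int32_t layer = 0; layer < LAYER_COUNT; ++layer)
    {
        for (auto gate : gates)
        {
            rnn->setWeightsForGate(layer, gate, true, lookupWeights(layer, gate, true));
            rnn->setWeightsForGate(layer, gate, false, lookupWeights(layer, gate, false));
            rnn->setBiasForGate(layer, gate, true, lookupBias(layer, gate, true));
            rnn->setBiasForGate(layer, gate, false, lookupBias(layer, gate, false));
        }
    }
    return rnn;
}
```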
## Preparing sample data

- Download the sample data from the TensorRT release tarball, if not already mounted under `/usr/src/tensorrt/data` (NVIDIA NGC containers), and set it to `$TRT_DATADIR`:
  ```
  export TRT_DATADIR=/usr/src/tensorrt/data
  ```
## Converting TensorFlow weights

(Optional) If you want to train your own model and not use the pre-trained model included in this sample, you'll need to convert the TensorFlow weights into a format that TensorRT can use.

1. Locate the TensorFlow weights dumping script:
   ```
   $TRT_OSSPATH/samples/common/dumpTFWts.py
   ```
   This script has been provided to extract the weights from the model checkpoint files that are created during training. Use `dumpTFWts.py -h` for directions on the usage of the script.

2. Convert the TensorFlow weights using the following command:
   ```
   dumpTFWts.py -m /path/to/checkpoint -o /path/to/output
   ```
## Running the sample

1. Compile the sample by following the build instructions in the TensorRT README.

2. Run the sample to generate characters based on the trained model:
   ```
   ./sample_char_rnn --datadir=<path/to/data>
   ```
   For example:
   ```
   ./sample_char_rnn --datadir $TRT_DATADIR/char-rnn
   ```

3. Verify that the sample ran successfully. If the sample runs successfully you should see output similar to the following:
   ```
   &&&& RUNNING TensorRT.sample_char_rnn # ./sample_char_rnn
   [I] [TRT] Detected 4 input and 3 output network tensors.
   [I] RNN Warmup: JACK
   [I] Expect: INGHAM: What shall I
   [I] Received: INGHAM: What shall I
   &&&& PASSED TensorRT.sample_char_rnn # ./sample_char_rnn
   ```
   This output shows that the sample ran successfully; `PASSED`.

To see the full list of available options and their descriptions, use the `-h` or `--help` command line option.
## Additional resources

The following resources provide a deeper understanding about RNN networks:

**RNN networks**

**Videos**

**Documentation**
## License

For terms and conditions for use, reproduction, and distribution, see the TensorRT Software License Agreement documentation.
## Changelog

February 2019
This is the first release of this `README.md` file.
## Known issues

There are no known issues in this sample.