- Project description
- Development Tools
- Discussion
- Result and Conclusion
- References
## Project description
Generate TV scripts using an RNN with LSTM cells.
## Development Tools
- PyTorch framework
## Discussion
- As a starting point for the hyperparameters, I followed the lectures from the Udacity DLND.
- The initial batch_size was 128, with sequence_length = 100, a learning rate of 0.01, and hidden_dim = 215. Training failed with 'RuntimeError: cuda runtime error (59) : device-side assert triggered at /opt/conda/conda-bld/pytorch_1524584710464/work/aten/src/THC/generic/THCStorage.c:36', an error that typically means an out-of-range index reached the GPU, e.g. a token id outside the embedding table (the training-step sketch after this list guards against this).
- I changed the batch_size to 64 with sequence_length = 5, a learning rate of 0.01, and hidden_dim = 215. The loss rose from 5.8 (the first reported value) to 6.9 and then hovered around 6-7 for the rest of training; the model didn't seem to learn.
- I changed hidden_dim to 128. The loss settled around 5.
- I changed sequence_length to 10. The loss stayed around 5.
- I changed the learning rate to 0.001. The loss dropped to 4.3; however, during training it only bounced between 4.1 and 4.3.
- I changed the batch_size to 100 and sequence_length to 5, and trained for 20 epochs. The loss decreased gradually from 5.11 to 3.8 at the last epoch, so I seemed to be heading in the right direction. I increased the number of epochs to 50 for the next run.
- I changed embedding_dim to 50, as the previous value (300) made the loss decrease very slowly. This didn't work.
- I used embedding_dim = 200 and increased the clipping rate from 5 to 15, training for 30 epochs. The loss at the last epoch was 3.7, and it was still decreasing too slowly.
- I increased the clipping rate to 20 to speed up training. It didn't work. I then found out that I had a bug in my dropout layer (the model sketch after this list marks the placement that is easy to get wrong).
- After 32 hours of training runs with different hyperparameters, trying to discover why the loss decreased so slowly, I finally solved the problem.
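To make the dropout fix above concrete, here is a minimal sketch of the kind of word-level model this project trains. The class and attribute names are my assumptions, mirroring the usual DLND project skeleton rather than the exact code; the comment in `forward` marks the dropout placement that is easy to get wrong:

```python
import torch.nn as nn

class RNN(nn.Module):
    """Word-level LSTM for script generation (a sketch, not the exact code)."""

    def __init__(self, vocab_size, embedding_dim, hidden_dim, n_layers, dropout=0.5):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embedding_dim)
        self.lstm = nn.LSTM(embedding_dim, hidden_dim, n_layers,
                            dropout=dropout, batch_first=True)
        self.dropout = nn.Dropout(dropout)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x, hidden):
        x = self.embedding(x)               # (batch, seq_len, embedding_dim)
        out, hidden = self.lstm(x, hidden)  # (batch, seq_len, hidden_dim)
        # Dropout belongs here, on the LSTM features. Applying it after
        # self.fc instead would randomly zero the logits the loss is
        # computed on, which stalls learning -- an easy dropout bug to hit.
        out = self.dropout(out)
        out = self.fc(out)                  # (batch, seq_len, vocab_size)
        # Keep only the logits for the last time step of each sequence.
        return out[:, -1], hidden
```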
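The clipping rate discussed above is the maximum gradient norm passed to PyTorch's `clip_grad_norm_`. Below is a hedged sketch of a single training step showing where clipping sits, plus the token-id guard relevant to the CUDA error 59 mentioned earlier; `train_step` and its arguments are hypothetical names:

```python
import torch.nn as nn

def train_step(model, optimizer, criterion, inputs, targets, hidden, clip):
    """One forward/backward pass with gradient clipping (names hypothetical)."""
    # Guard against CUDA error 59: every token id must index a valid
    # embedding row, i.e. lie in [0, vocab_size).
    assert inputs.min().item() >= 0
    assert inputs.max().item() < model.embedding.num_embeddings

    # Detach the hidden state so gradients don't flow across batches.
    hidden = tuple(h.detach() for h in hidden)

    optimizer.zero_grad()
    output, hidden = model(inputs, hidden)
    loss = criterion(output, targets)
    loss.backward()
    # Rescale gradients whose overall norm exceeds `clip`. A very large
    # clip value (e.g. 20) effectively disables clipping rather than
    # speeding up training.
    nn.utils.clip_grad_norm_(model.parameters(), clip)
    optimizer.step()
    return loss.item(), hidden
```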
## Result and Conclusion
The target loss was 3.5. To achieve it, I used:
- clipping rate: 10
- number of words in a sequence: 10
- batch size: 100
- number of epochs: 10
- learning rate: 0.001
- embedding dim: 200
- hidden dim: 256
- number of RNN layers: 2
The loss at the last epoch was 3.25, beating the target. The sketch below wires these settings together.
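Plugging these final values into the sketches above might look like the following; `vocab_size` is a placeholder, and Adam is an assumption, since the optimizer isn't named above:

```python
import torch
import torch.nn as nn

# Final hyperparameters from the list above.
sequence_length = 10
batch_size = 100
num_epochs = 10
learning_rate = 0.001
embedding_dim = 200
hidden_dim = 256
n_layers = 2
clip = 10

# Placeholder: in the project, vocab_size comes from the vocabulary
# built over the preprocessed script corpus.
vocab_size = 10000

model = RNN(vocab_size, embedding_dim, hidden_dim, n_layers)
optimizer = torch.optim.Adam(model.parameters(), lr=learning_rate)
criterion = nn.CrossEntropyLoss()

# Fresh zero-valued hidden state (h_0, c_0) for the start of training.
hidden = (torch.zeros(n_layers, batch_size, hidden_dim),
          torch.zeros(n_layers, batch_size, hidden_dim))
```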
## References
- https://datascience.stackexchange.com/questions/31109/ratio-between-embedded-vector-dimensions-and-vocabulary-size
- https://www.quora.com/How-do-I-determine-the-number-of-dimensions-for-word-embedding
- https://www.quora.com/What-is-word-embedding-in-deep-learning
- https://pytorch.org/tutorials/intermediate/char_rnn_generation_tutorial.html
- Udacity Deep Learning Nanodegree Program