CatterPingu/sequence-to-sequence-lstm

This notebook implements a character-level sequence-to-sequence model using LSTM layers to translate French to English. The model is trained on a subset of the French-English sentence pairs. The implementation uses the Keras deep learning framework with TensorFlow as the backend.

Data Preparation

The French-English dataset is downloaded and unzipped, and the code reads it in and prepares it for the model. Each sentence pair is split into an input sequence and a target sequence, and every character is one-hot encoded. The data is then split into training and validation sets.
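
A minimal sketch of this preparation step, assuming the tab-separated fra.txt pair file and illustrative variable names (the notebook's actual file name and sample count may differ):

```python
import numpy as np

num_samples = 10000  # assumed size of the subset

input_texts, target_texts = [], []
input_chars, target_chars = set(), set()

with open("fra.txt", encoding="utf-8") as f:
    lines = f.read().split("\n")

for line in lines[: min(num_samples, len(lines) - 1)]:
    input_text, target_text = line.split("\t")[:2]
    # Tab marks start-of-sequence, newline marks end-of-sequence for the decoder.
    target_text = "\t" + target_text + "\n"
    input_texts.append(input_text)
    target_texts.append(target_text)
    input_chars.update(input_text)
    target_chars.update(target_text)

input_chars, target_chars = sorted(input_chars), sorted(target_chars)
num_encoder_tokens, num_decoder_tokens = len(input_chars), len(target_chars)
max_encoder_seq_length = max(len(t) for t in input_texts)
max_decoder_seq_length = max(len(t) for t in target_texts)

input_token_index = {c: i for i, c in enumerate(input_chars)}
target_token_index = {c: i for i, c in enumerate(target_chars)}

# One-hot encode every character position.
encoder_input_data = np.zeros(
    (len(input_texts), max_encoder_seq_length, num_encoder_tokens), dtype="float32")
decoder_input_data = np.zeros(
    (len(input_texts), max_decoder_seq_length, num_decoder_tokens), dtype="float32")
decoder_target_data = np.zeros(
    (len(input_texts), max_decoder_seq_length, num_decoder_tokens), dtype="float32")

for i, (input_text, target_text) in enumerate(zip(input_texts, target_texts)):
    for t, ch in enumerate(input_text):
        encoder_input_data[i, t, input_token_index[ch]] = 1.0
    for t, ch in enumerate(target_text):
        decoder_input_data[i, t, target_token_index[ch]] = 1.0
        if t > 0:
            # Targets are the decoder inputs shifted by one timestep.
            decoder_target_data[i, t - 1, target_token_index[ch]] = 1.0
```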

Model Architecture

The model architecture consists of two parts: an encoder and a decoder. The encoder reads the input sequence and produces a context vector, which is passed to the decoder. The decoder takes the context vector and produces the output sequence. Both the encoder and decoder use LSTM layers, and the decoder adds a dense output layer that produces a probability distribution over the target characters.
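
A sketch of that encoder/decoder wiring in Keras, assuming a hypothetical latent_dim state size and the tensors defined in the preparation sketch above:

```python
from tensorflow.keras.layers import Input, LSTM, Dense
from tensorflow.keras.models import Model

latent_dim = 256  # assumed LSTM state size; the notebook may use a different value

# Encoder: consumes the one-hot input sequence and keeps only its final states.
encoder_inputs = Input(shape=(None, num_encoder_tokens))
encoder_lstm = LSTM(latent_dim, return_state=True)
_, state_h, state_c = encoder_lstm(encoder_inputs)
encoder_states = [state_h, state_c]  # the context vector passed to the decoder

# Decoder: starts from the encoder states and predicts the next character.
decoder_inputs = Input(shape=(None, num_decoder_tokens))
decoder_lstm = LSTM(latent_dim, return_sequences=True, return_state=True)
decoder_outputs, _, _ = decoder_lstm(decoder_inputs, initial_state=encoder_states)
decoder_dense = Dense(num_decoder_tokens, activation="softmax")
decoder_outputs = decoder_dense(decoder_outputs)

model = Model([encoder_inputs, decoder_inputs], decoder_outputs)
```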

Training

The model is trained using the RMSprop optimizer and categorical cross-entropy loss function. Training is performed for 100 epochs with a batch size of 64. The model's accuracy is also evaluated on the validation set during training.
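Under those settings, compiling and fitting might look roughly like this; the validation split fraction and saved filename are assumptions, not taken from the notebook:

```python
model.compile(optimizer="rmsprop",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

model.fit([encoder_input_data, decoder_input_data],
          decoder_target_data,
          batch_size=64,
          epochs=100,
          validation_split=0.2)  # assumed fraction held out for validation

model.save("s2s_model.keras")  # hypothetical filename for the saved model
```
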

Inference

The trained model is used for inference to translate new French sentences to English. The code loads the saved model and creates separate encoder and decoder models. The encoder model takes in the input sequence and produces the context vector. The decoder model takes the context vector and generates the output sequence one token at a time. The output sequence is produced until the decoder generates an end-of-sequence token or the maximum output length is reached.
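
A sketch of that decoding loop, reusing the layers and variable names assumed in the earlier sketches; the notebook's actual helper names may differ:

```python
# Separate inference models built from the trained layers.
encoder_model = Model(encoder_inputs, encoder_states)

decoder_state_input_h = Input(shape=(latent_dim,))
decoder_state_input_c = Input(shape=(latent_dim,))
decoder_states_inputs = [decoder_state_input_h, decoder_state_input_c]
decoder_outputs, state_h, state_c = decoder_lstm(
    decoder_inputs, initial_state=decoder_states_inputs)
decoder_outputs = decoder_dense(decoder_outputs)
decoder_model = Model([decoder_inputs] + decoder_states_inputs,
                      [decoder_outputs, state_h, state_c])

reverse_target_char_index = {i: c for c, i in target_token_index.items()}

def decode_sequence(input_seq):
    # Encode the input sentence into the context vector (LSTM states).
    states_value = encoder_model.predict(input_seq, verbose=0)

    # Seed the decoder with the start-of-sequence character (tab).
    target_seq = np.zeros((1, 1, num_decoder_tokens))
    target_seq[0, 0, target_token_index["\t"]] = 1.0

    decoded_sentence = ""
    while True:
        output_tokens, h, c = decoder_model.predict(
            [target_seq] + states_value, verbose=0)
        sampled_index = np.argmax(output_tokens[0, -1, :])
        sampled_char = reverse_target_char_index[sampled_index]

        # Stop at the end-of-sequence character or the maximum output length.
        if sampled_char == "\n" or len(decoded_sentence) >= max_decoder_seq_length:
            break
        decoded_sentence += sampled_char

        # Feed the sampled character back in and carry the LSTM states forward.
        target_seq = np.zeros((1, 1, num_decoder_tokens))
        target_seq[0, 0, sampled_index] = 1.0
        states_value = [h, c]

    return decoded_sentence
```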
