Lip reading, also known as visual speech recognition, is the technique of understanding speech solely by visually interpreting lip movements. The goal of this project is to build a classifier that identifies the speech content of a sequence of images of a speaker uttering a single word. The primary motivation behind this project is the large volume of videos on the internet without subtitles or captions; extracting information from these videos benefits hearing-impaired groups and increases the accessibility of video media. Our approach is to implement a CNN-LSTM model, which is well suited to sequence classification problems with spatial inputs such as images and videos, and then to tune its hyperparameters to find the best-performing configuration.
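A minimal sketch of such a CNN-LSTM classifier, written here in PyTorch for illustration (the layer sizes, frame count, image resolution, and 10-word vocabulary are assumptions; the actual architecture is defined in training_MIRACL_V1.ipynb):

```python
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    """Per-frame CNN features fed to an LSTM; the final state classifies the word."""
    def __init__(self, num_classes=10, hidden_size=256):
        super().__init__()
        # Small CNN applied independently to every frame (grayscale input assumed).
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d((4, 4)),
            nn.Flatten(),                         # 64 * 4 * 4 = 1024 features per frame
        )
        self.lstm = nn.LSTM(input_size=1024, hidden_size=hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        # x: (batch, time, 1, H, W) -- a sequence of lip-region frames.
        b, t, c, h, w = x.shape
        feats = self.cnn(x.reshape(b * t, c, h, w)).reshape(b, t, -1)
        _, (h_n, _) = self.lstm(feats)            # h_n: (1, batch, hidden_size)
        return self.fc(h_n[-1])                   # word logits

# Example: a batch of 8 clips, 22 frames each, 64x64 grayscale.
logits = CNNLSTM()(torch.randn(8, 22, 1, 64, 64))
print(logits.shape)  # torch.Size([8, 10])
```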
The repository contains the following files:
- preprocessing_MIRACL_V1.ipynb: This notebook was used to preprocess all images in the MIRACL_V1 dataset (a rough sketch of this step is shown after this list).
- training_MIRACL_V1.ipynb: This notebook implements the CNN-LSTM model and trains it.
- tune_MIRACLV1.ipynb: This notebook contains the code for hyperparameter tuning.
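A rough sketch of the kind of per-utterance frame preprocessing the first notebook performs (the color_*.jpg naming, 64x64 target size, and grayscale conversion are assumptions for illustration; see preprocessing_MIRACL_V1.ipynb for the exact steps):

```python
import glob
import cv2
import numpy as np

def load_word_clip(frame_dir, size=(64, 64)):
    """Load one utterance's frames, convert to grayscale, resize, and normalize."""
    frames = []
    for path in sorted(glob.glob(f"{frame_dir}/color_*.jpg")):  # color frames of one clip
        img = cv2.imread(path)
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        gray = cv2.resize(gray, size)
        frames.append(gray.astype(np.float32) / 255.0)          # scale pixels to [0, 1]
    return np.stack(frames)                                     # shape: (num_frames, 64, 64)
```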
The MIRACL_V1 data used for this project can be downloaded from this link.
- Make sure the path in the notebooks points to the directory where the data is located (see the example below).
- Run preprocessing_MIRACL_V1.ipynb before running the training or tuning notebooks.
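For example, the data directory might be set near the top of each notebook with something like the following (the variable name and path are placeholders):

```python
import os

DATA_DIR = "/path/to/MIRACL_V1"  # placeholder: point this at your local copy of the dataset
assert os.path.isdir(DATA_DIR), f"Dataset not found at {DATA_DIR}"
```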
CNN-LSTM Model:
Grid search on hyperparameters:
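A minimal sketch of this kind of grid search (the candidate values and the train_and_evaluate helper are illustrative stand-ins for the tuning code in tune_MIRACLV1.ipynb; only learning rate and batch size are assumed to be searched, matching the final values reported below):

```python
from itertools import product

def train_and_evaluate(learning_rate, batch_size):
    # Stand-in for the real run: fit the CNN-LSTM with these settings
    # and return its validation accuracy.
    return 0.0

best_score, best_params = float("-inf"), None
for lr, bs in product([0.01, 0.001, 0.0001], [16, 32, 64]):
    score = train_and_evaluate(learning_rate=lr, batch_size=bs)
    print(f"lr={lr}, batch_size={bs} -> validation accuracy {score:.3f}")
    if score > best_score:
        best_score, best_params = score, (lr, bs)

print("Best hyperparameters (lr, batch_size):", best_params)
```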
Final model with learning rate 0.001 and batch size 32:
Final model performance (see the metric sketch after this list):
- Log loss: 5.856795053599635
- Top 1 accuracy: 0.21
- Top 2 accuracy: 0.4
- Top 3 accuracy: 0.56
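Log loss and top-k accuracy can be computed from the model's predicted class probabilities roughly as follows (a sketch using scikit-learn; the random y_true/y_prob arrays are placeholders for the real test labels and predicted probabilities):

```python
import numpy as np
from sklearn.metrics import log_loss, top_k_accuracy_score

# Placeholders: true labels and predicted class probabilities on the test set.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 10, size=100)          # 10 word classes
y_prob = rng.dirichlet(np.ones(10), size=100)   # each row sums to 1

print("Log loss:", log_loss(y_true, y_prob, labels=list(range(10))))
for k in (1, 2, 3):
    print(f"Top-{k} accuracy:", top_k_accuracy_score(y_true, y_prob, k=k, labels=list(range(10))))
```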
References:
- LipNet: End-to-End Sentence-level Lipreading
- Implementation of LipNet
- Implementation of LipNet-2
- Lip Reading Word Classification
- Blog Post about CNN-LSTM Networks
- Lip Reading by CNN and LSTM Architecture
- Visual Speech Recognition of Lips Images Using Convolutional Neural Network in VGG-M Model
- Lip2Word
- Lip Reading in the Wild
- Kaggle Lip Reading Image