VIPL AVSU
Repositories
- CAS-VSR-S101 Public
CAS-VSR-S101: A large-scale Mandarin dataset from TV broadcasts for audio-visual speech research
- CAS-VSR-MOV20 Public
CAS-VSR-MOV20: A challenging dataset for Chinese visual speech recognition, consisting of video clips from 20 movies.
- MAVSR2025-Track2 Public
- CAS-VSR-S68 Public
CAS-VSR-S68: A dataset for lip reading with unseen speakers, spanning 68 hours of news broadcasts.
- learn-an-effective-lip-reading-model-without-pains Public
PyTorch code and models for "Learn an Effective Lip Reading Model without Pains" (https://arxiv.org/abs/2011.07557), which reaches state-of-the-art performance on the LRW-1000 dataset.
- LipNet-PyTorch Public
A state-of-the-art PyTorch implementation of the method described in the paper "LipNet: End-to-End Sentence-level Lipreading" (https://arxiv.org/abs/1611.01599).
- deep-face-speechreading Public
Visual speech recognition with face inputs: code and models for F&G 2020 paper "Can We Read Speech Beyond the Lips? Rethinking RoI Selection for Deep Visual Speech Recognition"