Skip to content

Latest commit

 

History

History
 
 

asr

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Automatic Speech Recognition

This directory contains example scripts to train ASR models using various methods such as Connectionist Temporal Classification loss, RNN Transducer Loss.

Speech pre-training via self supervised learning, voice activity detection and other sub-domains are also included as part of this domain's examples.

ASR Model inference execution overview

The inference scripts in this directory execute in the following order. When preparing your own inference scripts, please follow this order for correct inference.

graph TD
    A[Hydra Overrides + Config Dataclass] --> B{Config}
    B --> |Init| C[Model]
    B --> |Init| D[Trainer]
    C & D --> E[Set trainer]
    E --> |Optional| F[Change Transducer Decoding Strategy]
    F --> H[Load Manifest]
    E --> |Skip| H
    H --> I["model.transcribe(...)"]
    I --> J[Write output manifest]
    K[Ground Truth Manifest]
    J & K --> |Optional| L[Evaluate CER/WER]

Loading

During restoration of the model, you may pass the Trainer to the restore_from / from_pretrained call, or set it after the model has been initialized by using model.set_trainer(Trainer).