ConMamba

An official implementation of convolution-augmented Mamba for speech recognition.

Architecture

Prerequisites

1. Download the LibriSpeech corpus.
2. Install the required packages:

conda create --name Slytherin python=3.9
conda activate Slytherin
pip install -r requirement.txt

Depending on your hardware and system, you may need to install different versions of torch, torchaudio, causal-conv1d, and mamba-ssm. Make sure the versions you install are compatible with one another.
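Before training, it can help to confirm which versions of these four packages actually ended up in the environment. The following is a minimal sketch using only the Python standard library; it assumes the packages were installed under their usual PyPI distribution names, and the helper name `installed_versions` is hypothetical, not part of this repository.

```python
from importlib import metadata

# Distribution names as assumed on PyPI for the four packages named above.
PACKAGES = ("torch", "torchaudio", "causal-conv1d", "mamba-ssm")


def installed_versions(packages=PACKAGES):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name:15s} {version or 'NOT INSTALLED'}")
```

This only reports what is installed; it does not check that the versions are compatible with each other or with your CUDA setup.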

Training

To train a ConMamba Encoder-Transformer Decoder model on one GPU (replace conmamba_large.yaml with conmamba_small.yaml for the small model):

python train_S2S.py hparams/S2S/conmamba_large.yaml --data_folder <YOUR_PATH_TO_LIBRISPEECH> --precision bf16

To train a ConMamba Encoder-Mamba Decoder model on one GPU (replace conmambamamba_large.yaml with conmambamamba_small.yaml for the small model):

python train_S2S.py hparams/S2S/conmambamamba_large.yaml --data_folder <YOUR_PATH_TO_LIBRISPEECH> --precision bf16

To train a ConMamba Encoder model with a character-level CTC loss on four GPUs:

torchrun --nproc-per-node 4 train_CTC.py hparams/CTC/conmamba_large.yaml --data_folder <YOUR_PATH_TO_LIBRISPEECH> --precision bf16
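The training commands above differ only in the script, the config file, and the launcher. As a rough sketch, they can be assembled programmatically. The recipe keys and the `build_command` helper below are hypothetical names introduced for illustration. The sketch also assumes that every recipe follows the `<stem>_<size>.yaml` naming pattern and can be launched with torchrun; this README only shows a large config for CTC and only shows torchrun for the CTC recipe.

```python
import shlex

# Hypothetical recipe keys -> (training script, hparams directory, config stem),
# taken from the three commands listed in this section.
RECIPES = {
    "s2s-transformer": ("train_S2S.py", "hparams/S2S", "conmamba"),
    "s2s-mamba": ("train_S2S.py", "hparams/S2S", "conmambamamba"),
    "ctc": ("train_CTC.py", "hparams/CTC", "conmamba"),
}


def build_command(recipe, size, data_folder, num_gpus=1, precision="bf16"):
    """Return the training command as a list of arguments."""
    if size not in ("large", "small"):
        raise ValueError(f"size must be 'large' or 'small', got {size!r}")
    script, hparams_dir, stem = RECIPES[recipe]
    # Single GPU: plain python. Multiple GPUs: torchrun with one process per GPU.
    if num_gpus == 1:
        launcher = ["python"]
    else:
        launcher = ["torchrun", "--nproc-per-node", str(num_gpus)]
    return launcher + [
        script,
        f"{hparams_dir}/{stem}_{size}.yaml",
        "--data_folder", data_folder,
        "--precision", precision,
    ]


if __name__ == "__main__":
    # Print the four-GPU CTC command; the data path here is a placeholder.
    print(shlex.join(build_command("ctc", "large", "/path/to/LibriSpeech", num_gpus=4)))
```

Returning a list rather than a string lets the result be passed directly to `subprocess.run` without shell quoting issues.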

Inference and Checkpoints (Later)

Performance (Word Error Rate%)

Acknowledgement

We acknowledge the wonderful work of Mamba and Vision Mamba, from which we borrowed the implementations of Mamba and bidirectional Mamba. The training recipes are adapted from SpeechBrain.

Citation

If you find this work helpful, please consider citing:

@misc{jiang2024speechslytherin,
      title={Speech Slytherin: Examining the Performance and Efficiency of Mamba for Speech Separation, Recognition, and Synthesis}, 
      author={Xilin Jiang and Yinghao Aaron Li and Adrian Nicolas Florea and Cong Han and Nima Mesgarani},
      year={2024},
      eprint={2407.09732},
      archivePrefix={arXiv},
      primaryClass={eess.AS},
      url={https://arxiv.org/abs/2407.09732}, 
}

You may also like our Mamba for speech separation: https://github.com/xi-j/Mamba-TasNet
