zzhang68/icassp2021_metric

Companion code for the paper "An End-To-End Non-Intrusive Model for Subjective and Objective Real-World Speech Assessment Using a Multi-Task Framework", ICASSP 2021.

The dataset (e.g., human MOS scores for COSINE and VOiCES) is available here.

Note: The current model takes fixed-length 4 s audio/speech as input, so each input must be padded or truncated to that length. Two pretrained models are provided, trained on the COSINE and VOiCES datasets respectively. For other datasets, we recommend retraining the model.
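A minimal sketch of the padding/truncation step, not taken from this repo: the helper name `fix_length`, the 16 kHz default sample rate, and the choice to zero-pad at the end are all assumptions made for illustration. Check the sample rate and padding convention the pretrained models actually expect before relying on it.

```python
import numpy as np


def fix_length(wave, sample_rate=16000, seconds=4.0):
    """Return a 1-D waveform that is exactly `seconds` long.

    Hypothetical helper: the 16 kHz default and trailing zero-padding
    are assumptions, not settings confirmed by the repository.
    """
    target = int(round(sample_rate * seconds))  # required number of samples
    wave = np.asarray(wave, dtype=np.float32)
    if wave.shape[0] >= target:
        # Long (or exact-length) input: keep only the first `target` samples.
        return wave[:target]
    # Short input: append zeros (silence) until the target length is reached.
    return np.pad(wave, (0, target - wave.shape[0]), mode="constant")
```

With these defaults, a 2.5 s clip at 16 kHz (40,000 samples) is padded to 64,000 samples, and a 10 s clip is cut down to its first 64,000 samples.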

Paper: https://ieeexplore.ieee.org/document/9414182

If you use the code in this repo, please cite the following paper:

  @inproceedings{zhang2021end,
    title={An End-To-End Non-Intrusive Model for Subjective and Objective Real-World Speech Assessment Using a Multi-Task Framework},
    author={Zhang, Zhuohuang and Vyas, Piyush and Dong, Xuan and Williamson, Donald S},
    booktitle={ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
    pages={316--320},
    year={2021},
    organization={IEEE}
  }

If you use the dataset, please cite the following paper:

@inproceedings{dong2020pyramid,
  title={{A pyramid recurrent network for predicting crowdsourced speech-quality ratings of real-world signals}},
  author={Dong, Xuan and Williamson, Donald S},
  booktitle={Interspeech},
  pages={4631--4635},
  year={2020}
}
