GitHub

Companion code for the paper "An End-To-End Non-Intrusive Model for Subjective and Objective Real-World Speech Assessment Using a Multi-Task Framework", ICASSP 2021.

The dataset (e.g., human MOS scores for COSINE and VOiCES) is available here.

Note: The current model takes in fixed 4s audio/speech as input, padding/truncation is needed. Two pretrained models (on COSINE and VOiCES datasets, respectively) are provided. For different datasets, we recommend to retrain the model.

Paper: https://ieeexplore.ieee.org/document/9414182

If you use the code in this repo, please cite the following paper:

  @inproceedings{zhang2021end,
    title={An End-To-End Non-Intrusive Model for Subjective and Objective Real-World Speech Assessment Using a Multi-Task Framework},
    author={Zhang, Zhuohuang and Vyas, Piyush and Dong, Xuan and Williamson, Donald S},
    booktitle={ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
    pages={316--320},
    year={2021},
    organization={IEEE}
  }

If you use the dataset, please cite the following paper:

@article{dong2020pyramid,
  title={{A pyramid recurrent network for predicting crowdsourced speech-quality ratings of real-world signals}},
  author={Dong, Xuan and Williamson, Donald S},
  booktitle={Interspeech}，
  pages={4631--4635},
  year={2020}
}

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
pretrained_models		pretrained_models
.gitattributes		.gitattributes
README.md		README.md
data.py		data.py
model.py		model.py
test_model.py		test_model.py
train_model.py		train_model.py
trainer.py		trainer.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

About

Releases

Packages

Languages

zzhang68/icassp2021_metric

Folders and files

Latest commit

History

Repository files navigation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages