average_wer.py: can WER really be averaged per sentence? #13

Open
Ma-LinHan opened this issue Aug 11, 2024 · 3 comments

Comments

@Ma-LinHan

Ma-LinHan commented Aug 11, 2024

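In the WER code, I see that a WER is first computed for each test utterance, and the results are then averaged directly with wer = round(np.mean(wers)*100,3). The utterances do not all contain the same number of characters, and long and short sentences differ widely in length, so isn't this way of computing it unreasonable? Why not compute WER over the total number of characters in the test data instead?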

@Ma-LinHan
Author

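When I used your code to run automatic recognition on my model's test outputs and compute the metrics, the impact of bad cases on short sentences was amplified, and the WER came out noticeably too high.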
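To make the gap concrete, here is a minimal sketch comparing the two aggregation schemes on a toy example. It is not the repository's code: the helper names (`edit_distance`, `sentence_mean_wer`, `corpus_wer`) and the two sample pairs are made up for illustration, tokens are taken to be single characters, and every reference is assumed to be non-empty.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def sentence_mean_wer(pairs):
    """Unweighted mean of per-utterance error rates (the scheme questioned here)."""
    rates = [edit_distance(ref, hyp) / len(ref) for ref, hyp in pairs]
    return round(sum(rates) / len(rates) * 100, 3)


def corpus_wer(pairs):
    """Total errors divided by total reference length."""
    errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    length = sum(len(ref) for ref, _ in pairs)
    return round(errors / length * 100, 3)


# Hypothetical data: one long utterance recognized perfectly,
# one two-character utterance with a single substitution.
pairs = [
    (list("abcdefghijklmnopqrst"), list("abcdefghijklmnopqrst")),
    (list("ab"), list("ax")),
]

print(sentence_mean_wer(pairs))  # 25.0
print(corpus_wer(pairs))         # 4.545
```

One wrong character out of 22 gives 4.545% when pooled over the corpus, but 25% when the per-utterance rates are averaged, because the two-character utterance carries the same weight as the twenty-character one.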

@liyunlongaaa

Obviously, people just use whichever metric comes out lower ^^

@PussyCat0700

+1. I think computing over the total character count is the reasonable approach; see FAIR's code for reference:
https://github.com/facebookresearch/av_hubert/blob/258fb50e155134eec2c4b49c2ae8de267075fd18/avhubert/infer_s2s.py#L258
