I noticed that in the WER computation code, a WER is first computed for each test utterance, and then these are averaged directly with `wer = round(np.mean(wers)*100,3)`. The number of characters per utterance varies a lot between long and short sentences, so isn't this way of computing it somewhat unreasonable? Why not compute WER over the total number of characters in the test set instead?
When we used your code to run automatic recognition on the model's test outputs and compute the metric, it amplified the impact of short-sentence bad cases, and the WER came out noticeably high.
Obviously, you use whichever metric comes out lower ^^
+1 I agree that computing over the total number of characters is the reasonable approach; see FAIR's code for reference: https://github.com/facebookresearch/av_hubert/blob/258fb50e155134eec2c4b49c2ae8de267075fd18/avhubert/infer_s2s.py#L258
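A minimal sketch contrasting the two aggregation schemes under discussion (the `edit_distance` helper and the toy data are illustrative, not taken from the repository's code). Per-utterance averaging, as in `np.mean(wers)`, weights every sentence equally regardless of length, so one short bad case can dominate; corpus-level WER pools edit counts over the total number of reference tokens, which is what the linked FAIR code does:

```python
import numpy as np

def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (one-row DP)."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # deletion
                                     dp[j - 1] + 1,      # insertion
                                     prev + (r != h))    # substitution / match
    return dp[-1]

def per_utterance_wer(pairs):
    # Mean of each sentence's own WER: every sentence weighted equally.
    wers = [edit_distance(r, h) / len(r) for r, h in pairs]
    return round(np.mean(wers) * 100, 3)

def corpus_wer(pairs):
    # Total edits over total reference tokens: length-weighted.
    errors = sum(edit_distance(r, h) for r, h in pairs)
    total = sum(len(r) for r, _ in pairs)
    return round(errors / total * 100, 3)

# Toy data: one long perfect sentence (9 tokens) and one short bad case.
pairs = [
    ("the quick brown fox jumps over the lazy dog".split(),
     "the quick brown fox jumps over the lazy dog".split()),  # 0 errors / 9
    ("no".split(), "yes".split()),                            # 1 error  / 1
]
print(per_utterance_wer(pairs))  # 50.0 -- mean of 0% and 100%
print(corpus_wer(pairs))         # 10.0 -- 1 error over 10 reference tokens
```

With only one error in ten reference tokens, the corpus-level figure is 10%, while per-utterance averaging reports 50% because the one-token bad case counts as much as the nine-token perfect sentence.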