Commit

update eval dataset.
shibing624 committed Sep 4, 2023
1 parent 30e936d commit 18c1588
Showing 2 changed files with 2,702 additions and 2,712 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -114,7 +114,7 @@
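 - All pretrained models can be loaded through transformers, e.g. the MacBERT model: `--model_name hfl/chinese-macbert-base`, or the RoBERTa model: `--model_name uer/roberta-medium-wwm-chinese-cluecorpussmall`
 - To evaluate model robustness, the SOHU test sets, which were not seen during training, were added to test generalization; to make the models usable out of the box, all of the collected Chinese matching datasets were used, and the datasets have also been uploaded to HF datasets ([link below](#数据集))
 - Experiments on Chinese matching tasks show that the best pooling strategies are `EncoderType.FIRST_LAST_AVG` and `EncoderType.MEAN`; the two differ very little in prediction quality
-- To reproduce the Chinese matching evaluation results, download the Chinese matching datasets to `examples/data` and run [tests/test_model_spearman.py](https://github.com/shibing624/text2vec/blob/master/tests/test_model_spearman.py)
+- To reproduce the Chinese matching evaluation results, download the Chinese matching datasets to `examples/data` and run [tests/model_spearman.py](https://github.com/shibing624/text2vec/blob/master/tests/model_spearman.py)
 - The GPU test environment for the QPS figures is a Tesla V100 with 32 GB of memory

 Model training experiment report: [experiment report](https://github.com/shibing624/text2vec/blob/master/docs/model_report.md)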
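
For readers unfamiliar with the metric named in the renamed script, the following is a minimal sketch of how a Spearman score over sentence-pair similarities can be computed. It is not the repository's evaluation script: the helper names (`cosine`, `average_ranks`, `pearson`, `spearman`) and the toy 2-dimensional vectors are hypothetical stand-ins for real model embeddings and gold labels, and the sketch uses only the Python standard library.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: toy embedding pairs and made-up gold similarity labels.
pairs = [
    ([1.0, 0.0], [1.0, 0.1]),
    ([1.0, 0.0], [0.5, 0.5]),
    ([1.0, 0.0], [0.0, 1.0]),
    ([1.0, 0.0], [-1.0, 0.0]),
]
gold = [5, 3, 1, 0]

predicted = [cosine(u, v) for u, v in pairs]
score = spearman(predicted, gold)
print(f"Spearman: {score:.4f}")
```

Because Spearman compares only rank order, the score depends on whether the model orders the pairs the same way the labels do, not on the absolute similarity values. In the toy data the predicted similarities decrease in the same order as the gold labels, so the score is at its maximum.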