Cannot reproduce the bce-reranker-base_v1 metrics on T2Reranking #98

Open
HuDi2018 opened this issue Dec 12, 2024 · 1 comment

@HuDi2018

Below is my reproduction script:

import mteb
from sentence_transformers import SentenceTransformer

model_name = "maidalun1020/bce-reranker-base_v1"
result_folder_name = "test1"

model = SentenceTransformer(model_name, cache_folder="/mnt/ckpt/")
tasks = mteb.get_tasks(tasks=["T2Reranking"])
evaluation = mteb.MTEB(tasks=tasks)
results = evaluation.run(model, output_folder=f"work_dirs/{result_folder_name}", save_predictions=True, encode_kwargs={"batch_size":8})

Below are my reproduction results:

{
  "dataset_revision": "76631901a18387f85eaa53e5450019b87ad58ef9",
  "evaluation_time": 412.80209612846375,
  "kg_co2_emissions": null,
  "mteb_version": "1.19.4",
  "scores": {
    "dev": [
      {
        "hf_subset": "default",
        "languages": [
          "cmn-Hans"
        ],
        "main_score": 0.47832978223327827,
        "map": 0.47832978223327827,
        "mrr": 0.5101074948146715,
        "nAUC_map_diff1": 0.2298402435900708,
        "nAUC_map_max": -0.22601260558435882,
        "nAUC_map_std": 0.07847525033222313,
        "nAUC_mrr_diff1": 0.10274562925373469,
        "nAUC_mrr_max": -0.1278304290403295,
        "nAUC_mrr_std": 0.036299709792525885
      }
    ]
  },
  "task_name": "T2Reranking"
}
@shenlei1020
Collaborator

shenlei1020 commented Dec 13, 2024

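As I recall, mteb's cross-encoder evaluation code had a problem on one of the reranker tasks (in the mteb version from the end of last year) and needed a fix. I'm not sure whether it was this T2Reranking task, though, so you may want to refer to this project's evaluation code.
This project's evaluation code is the version with the mteb and cmteb cross-encoder evaluation problems already fixed. I recommend reading through its details closely, in particular this: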

class ModChineseRerankingEvaluator(RerankingEvaluator):
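
For readers who want to sanity-check the numbers outside of mteb, here is a minimal sketch of how MAP and MRR can be computed for a reranker that scores (query, passage) pairs jointly. It is not this project's `ModChineseRerankingEvaluator` and is not confirmed to reproduce the reported metrics. The function names, the `score_pairs` callback, and the sample layout (`query`, `positive`, `negative` keys) are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical callback type: takes (query, passage) pairs and returns one
# relevance score per pair. Higher means more relevant.
PairScorer = Callable[[List[Tuple[str, str]]], Sequence[float]]


def average_precision(ranked_labels: Sequence[int]) -> float:
    """AP for one query: mean of precision@k taken at each positive's rank k."""
    hits = 0
    precisions = []
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def reciprocal_rank(ranked_labels: Sequence[int]) -> float:
    """1 / rank of the first positive, or 0.0 if there is none."""
    for rank, label in enumerate(ranked_labels, start=1):
        if label:
            return 1.0 / rank
    return 0.0


def evaluate_reranker(samples: List[dict], score_pairs: PairScorer) -> Dict[str, float]:
    """Rank each query's candidates by pair score and average AP / RR.

    Assumed sample layout: {"query": str, "positive": [str], "negative": [str]}.
    Queries lacking positives or negatives are skipped (an assumption).
    """
    aps, rrs = [], []
    for sample in samples:
        positives = list(sample["positive"])
        negatives = list(sample["negative"])
        if not positives or not negatives:
            continue
        docs = positives + negatives
        labels = [1] * len(positives) + [0] * len(negatives)
        # The query and each passage are scored together, as a cross-encoder does.
        scores = score_pairs([(sample["query"], doc) for doc in docs])
        order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
        ranked_labels = [labels[i] for i in order]
        aps.append(average_precision(ranked_labels))
        rrs.append(reciprocal_rank(ranked_labels))
    n = len(aps)
    return {"map": sum(aps) / n if n else 0.0, "mrr": sum(rrs) / n if n else 0.0}
```

With sentence-transformers installed, one way to supply `score_pairs` would be `CrossEncoder("maidalun1020/bce-reranker-base_v1").predict`, which takes a list of text pairs and returns one score per pair. Note that the script in the issue loads the model through `SentenceTransformer`, which embeds queries and passages separately, so that may be worth ruling out as a cause of the gap.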
