metric

dataflow
README.md
bleu.py
calculator.py
feqa.py
general.py
multi-bleu.perl
official_coQA_scorer.py
score_CoQA.py
score_bAbI.py
score_retrieval_wow.py
scorer.py
scorer_MWOZ.py
scorer_parse.py
smd_scorer.py
tree.py

README.md

Metrics

This folder implements the evaluation metrics used in the paper.

SMD

The SMD scorer is specific to the SMD task; `smd_scorer.py` implements the scorer from Wu et al. 2020.

Feqa

The FeQA scorer is used to evaluate DialKG, as in Dziri et al. 2021. Note that running this scorer requires separate checkpoints for the BART and QA models, which must be downloaded first. Please refer to https://github.com/esdurmus/feqa for more information.

WoW-parse

We use the scorer from KILT to compute R-precision (Rprec). This is implemented in `score_retrieval_wow.py`.
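
For readers unfamiliar with the metric, the following is a minimal sketch of R-precision in its standard form. It is not the KILT implementation and not the code in `score_retrieval_wow.py`; the function name `r_precision` and its arguments are hypothetical, chosen only for illustration. Given R relevant (gold) items, R-precision is the fraction of the top-R retrieved items that are relevant.

```python
from typing import Hashable, Sequence, Set


def r_precision(retrieved: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Hypothetical helper: standard R-precision for one query.

    retrieved: ranked list of retrieved item ids, best first.
    relevant:  set of gold item ids for the query.
    """
    r = len(relevant)
    if r == 0:
        # Assumption: a query with no gold items scores 0.0.
        return 0.0
    # Only the top-R ranked items count; a set avoids counting duplicates twice.
    top_r = set(retrieved[:r])
    return len(top_r & relevant) / r


if __name__ == "__main__":
    # Two gold pages; only one of them appears in the top two retrieved pages.
    print(r_precision(["page_a", "page_b", "page_c"], {"page_a", "page_c"}))
```

A corpus-level score would typically be the mean of this value over all queries; how the actual scorer aggregates is not specified here.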
