mosaic-leaderboard

AI2's Mosaic team has created benchmark datasets for a range of commonsense understanding tasks. To track progress, each dataset has an associated leaderboard at https://leaderboard.allenai.org/

This repository provides baseline implementations and evaluation scripts for each dataset.

αNLI: Abductive Natural Language Inference
   1. Evaluator
   2. Random Baseline

VCR: Visual Commonsense Reasoning
   1. Evaluator
   2. Random Baseline

HellaSwag: Can a Machine Really Finish Your Sentence?
   1. Evaluator
   2. Random Baseline

Social IQA: Commonsense Reasoning about Social Interactions
   1. Evaluator
   2. Random Baseline

Physical IQA: Commonsense Reasoning about Physical Interactions
   1. Evaluator
   2. Random Baseline

Cosmos QA: Machine Reading Comprehension with Contextual Commonsense Reasoning
   1. Evaluator
   2. Random Baseline

WinoGrande: An Adversarial Winograd Schema Challenge at Scale
   1. Evaluator
   2. Random Baseline
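
To illustrate what the two components listed for each dataset typically do, here is a minimal sketch in Python. It is not the code shipped in this repository: the function names `random_baseline` and `accuracy`, the use of integer answer indices, and the in-memory lists are all assumptions made for illustration. The actual scripts may use different file formats, label conventions, and metrics.

```python
import random
from typing import List, Sequence


def random_baseline(num_choices: Sequence[int], seed: int = 0) -> List[int]:
    """Pick a uniformly random answer index for each question.

    num_choices[i] is the number of answer options for question i
    (hypothetical input format, chosen for this sketch).
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    return [rng.randrange(n) for n in num_choices]


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Return the fraction of predictions that match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute accuracy on an empty label set")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


if __name__ == "__main__":
    # Toy data: five questions with four answer options each.
    gold = [0, 3, 1, 2, 2]
    preds = random_baseline([4] * len(gold), seed=13)
    print(f"random baseline accuracy: {accuracy(preds, gold):.2f}")
```

On a task where every question has k answer options, a uniform random baseline has an expected accuracy of 1/k, which gives a floor against which leaderboard submissions can be compared.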

