- 🎉 [2023.11.18] We release our paper on arXiv.
The leaderboard can be found on Papers with Code or on our project page.
Images and Questions can be downloaded here.
To evaluate on our CORE-MM Benchmark, please follow the steps below:
Step 0: Download Images and Questions
Step 1: Generate Responses
Generate responses for your model on the CORE-MM dataset. The responses should be collected in a JSON file with the following format:
{
"1": "the answer of question 1",
"2": "the answer of question 2",
...
"idx": "the answer of question idx"
}
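A minimal sketch of how such a file could be assembled is shown below. Note that run_inference, the question file name core_mm_questions.json, and its key layout are hypothetical placeholders standing in for your own pipeline and the downloaded question file, not part of an official toolkit.

import json

# Hypothetical stand-in for your model's inference pipeline; replace with your real generation code.
def run_inference(image_path: str, question: str) -> str:
    return "your model's answer"

# Assumed layout of the downloaded question file (idx -> {"image": ..., "question": ...});
# adjust the file name and keys to match the actual release.
with open("core_mm_questions.json", "r", encoding="utf-8") as f:
    questions = json.load(f)

responses = {idx: run_inference(item["image"], item["question"])
             for idx, item in questions.items()}

# Save using the naming convention described in the next step: model_name_model_size.json.
model_name, model_size = "YourModel-Chat", "7B"
with open(f"{model_name}_{model_size}.json", "w", encoding="utf-8") as f:
    json.dump(responses, f, ensure_ascii=False, indent=2)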
Step 2: Submit for Evaluation
After generating the responses, please name the JSON file model_name_model_size.json, e.g. CogVLM-Chat_17B.json, and send it to us via email for evaluation. We will evaluate your model and send the results back to you.
If you find CORE-MM useful in your research, please cite:
@misc{han2023coremm,
title={CORE-MM: Complex Open-Ended Reasoning Evaluation For Multi-Modal Large Language Models},
author={Xiaotian Han and Quanzeng You and Yongfei Liu and Wentao Chen and Huangjie Zheng and Khalil Mrini and Xudong Lin and Yiqi Wang and Bohan Zhai and Jianbo Yuan and Heng Wang and Hongxia Yang},
year={2023},
eprint={2311.11567},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
This project is licensed under the CC BY-NC 4.0 license.
The copyright of the images belongs to the original authors.
See LICENSE for more information.
Please feel free to contact us via email at [email protected] if you have any questions.