Official repository of the paper "Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization" in NeurIPS 2024 Track Datasets and Benchmarks

umd-huang-lab/Easy2Hard-Bench

Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization

Mucong Ding* · Chenghao Deng* · Jocelyn Choo · Zichu Wu · Aakriti Agrawal · Avi Schwarzschild · Tianyi Zhou · Tom Goldstein · John Langford · Anima Anandkumar · Furong Huang

The codebase for the paper "Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization" (https://arxiv.org/abs/2409.18433) by Mucong Ding*, Chenghao Deng*, Jocelyn Choo, Zichu Wu, Aakriti Agrawal, Avi Schwarzschild, Tianyi Zhou, Tom Goldstein, John Langford, Anima Anandkumar, Furong Huang.

We are still working on the final version of the evaluation code for Easy2Hard-Bench. See you soon!

Citing

Please cite our work if you find it helpful:

@inproceedings{ding2024easyhardbench,
  title={Easy2Hard-Bench: Standardized Difficulty Labels for Profiling {LLM} Performance and Generalization},
  author={Mucong Ding and Chenghao Deng and Jocelyn Choo and Zichu Wu and Aakriti Agrawal and Avi Schwarzschild and Tianyi Zhou and Tom Goldstein and John Langford and Anima Anandkumar and Furong Huang},
  booktitle={The Thirty-eight Conference on Neural Information Processing Systems Datasets and Benchmarks Track},
  year={2024}
}
