Issue about reproducing results in some datasets #2
Comments
It seems that this is mainly due to your older version of vllm. Try upgrading it to reproduce the results. Thanks!
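Before rerunning, it can help to confirm which vllm version is actually installed in the evaluation environment. The snippet below is a minimal sketch using only the Python standard library; the helper names `installed_version` and `environment_report` are hypothetical, and the package list (`vllm`, `torch`, `transformers`) is an assumption about what is relevant here, not something stated in this thread.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Dict, Optional, Sequence


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def environment_report(
    packages: Sequence[str] = ("vllm", "torch", "transformers"),  # assumed list
) -> Dict[str, Optional[str]]:
    """Map each package name to its installed version (None when missing)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # Print one line per package so the output can be pasted into an issue.
    for name, ver in environment_report().items():
        print(f"{name}: {ver or 'not installed'}")
```

If the reported vllm version is old, `pip install --upgrade vllm` is the usual way to upgrade it. Pasting the output alongside the results makes version mismatches easier to spot.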
Thanks for your help! Here are my updated results with the new vllm version. I think the GPQA dataset is a little unstable.
Thanks! Would you mind trying our updated checkpoint? It gets better results. Please refer to https://huggingface.co/TIGER-Lab/MAmmoTH2-7B-Plus.
Thanks for your great work! I cloned the `math_eval` directory and ran `run_7B_plus.sh` directly, and found performance gaps on some datasets.
My environment is:
Am I missing something? Thanks for your help!