This is the official repository for the paper MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible Pipeline. We release our code and data.
The MARIO REACT corpus is coming soon. The 🤗🤖Gaokao-2023-ME dataset is released here.
Base Model: Llemma | SFT Model | Outcome Value Model |
---|---|---|
7B | 🤗🤖MARIO-7B | 🤗🤖MARIO-OVM-7B |
34B | 🤗🤖MARIO-34B | - |
We report the results of our MARIO-7B and MARIO-34B models below:
Model | Decoding | GSM | MATH | OCWCourse | Gaokao-2023-ME |
---|---|---|---|---|---|
MARIO-OVM-7B + OVM@20 | Hybrid | 83.6 | 60.6 | 25.4 | 42.9 |
MARIO-7B + OVM@20 | Hybrid | 82.9 | 59.1 | 28.3 | 45.2 |
MARIO-OVM-7B | Hybrid | 74.5 | 47.7 | 19.1 | 32.5 |
MARIO-7B | Hybrid | 70.1 | 46.3 | 19.9 | 35.6 |
ToRA-Code-7B | Hybrid | 72.6 | 44.6 | 4.8 | 23.9 |
MAmmoTH-Coder-7B | Hybrid | 59.4 | 33.4 | 11.0 | 15.3 |
MathCoder-7B | Hybrid | 67.8 | 30.2 | - | - |
MetaMath-7B-Mistral | CoT | 77.7 | 28.2 | - | - |
OpenChat-3.5-7B | CoT | 77.3 | 28.6 | - | - |
ChatGLM-3-6B | CoT | 72.3 | 25.7 | - | - |
Model | Decoding | GSM | MATH | OCWCourse | Gaokao-2023-ME |
---|---|---|---|---|---|
MARIO-34B | Hybrid | 78.7 | 53.1 | 25.4 | 41.3 |
ToRA-Code-34B | Hybrid | 80.7 | 50.8 | 5.5 | 31.7 |
MAmmoTH-Coder-34B | Hybrid | 72.7 | 43.6 | 14.0 | 25.2 |
MathCoder-34B | Hybrid | 81.7 | 45.2 | - | - |
DeepSeek-Coder-33B | PoT | 60.7 | 29.1 | - | - |
QWen-72B | CoT | 78.9 | 35.2 | - | - |
Clone this repository and install the required packages:
```shell
git clone https://github.com/MARIO-Math-Reasoning/MARIO.git
cd MARIO
pip install -r requirements.txt
pip install -e ./math_evaluation
```
```shell
python gpt_react.py --verbose -g "gpt-4-1106-preview" -q "Given complex number $(a+i)(1-ai)=2,\;a \in \mathbb{R}$, find $a$."
```
Our training is mostly performed on the LLaMA-Factory codebase. Please refer to that repository for more details.
Single-question inference with screen output:

```shell
python react.py -c /path/to/checkpoint_dir -q "Compute tan(45)." --verbose
```
Batch inference:

```shell
python batch_react.py -c /path/to/checkpoint_dir -q /path/to/question_file
```
The question file should be in JSONL format, where each line is a JSON object containing at least a `"question"` key.
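For reference, a minimal question file could be produced like this (the example questions and the file name are illustrative):

```python
import json

# Each line of the question file is a standalone JSON object
# that must include at least a "question" key.
questions = [
    {"question": "Compute tan(45)."},
    {"question": "What is the sum of the first 100 positive integers?"},
]

with open("questions.jsonl", "w") as f:
    for q in questions:
        f.write(json.dumps(q) + "\n")
```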
Evaluation:

```shell
python eval.py -q /path/to/question_file
```
The question file should be in JSONL format, where each line is a JSON object containing at least `"pred"` and `"answer"` keys for the prediction and ground truth, respectively.
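As a sketch, a prediction file in this format can be written and scored with exact-match accuracy as shown below (the field values are illustrative, and `eval.py` may apply more sophisticated answer matching for math expressions):

```python
import json

# Each line pairs a model prediction with the ground-truth answer.
records = [
    {"question": "Compute tan(45).", "pred": "1", "answer": "1"},
    {"question": "What is 2+2?", "pred": "5", "answer": "4"},
]

with open("predictions.jsonl", "w") as f:
    for r in records:
        f.write(json.dumps(r) + "\n")

# Simple exact-match accuracy over the file.
with open("predictions.jsonl") as f:
    rows = [json.loads(line) for line in f]
correct = sum(r["pred"] == r["answer"] for r in rows)
print(correct / len(rows))  # 0.5
```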
- hiyouga's LLaMA-Factory
Please cite our paper if you use our data, models, or code. Please also kindly cite the original dataset papers.
```
@misc{liao2024mario,
      title={MARIO: MAth Reasoning with code Interpreter Output -- A Reproducible Pipeline},
      author={Minpeng Liao and Wei Luo and Chengxi Li and Jing Wu and Kai Fan},
      year={2024},
      eprint={2401.08190},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```