Improving LLM reasoning on GSM8K, GSM-Symbolic, and other datasets using reasoning techniques.
- Learn how to fine-tune Llama.
- Use techniques like the Self-Taught Reasoner (STaR) (rejection sampling?).
- Use search techniques like Monte Carlo tree search (MCTS).
- Try reinforcement learning (RL) techniques.
- Figure out how to integrate interpreters into reasoning.
- Train verifiers on program-of-thought solutions.
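
A rough sketch of how two of the ideas above (rejection sampling and an interpreter in the loop) might fit together: sample several candidate programs for a question, run each one, and keep only those whose result matches the known answer. Everything here is an assumption for illustration: the names `run_program`, `rejection_sample`, and `fake_sampler` are hypothetical, the convention that a program stores its result in a variable called `answer` is made up, and the sampler is a stub standing in for a real model call.

```python
from typing import Callable, List, Optional


def run_program(program: str) -> Optional[object]:
    """Run a candidate program and return its `answer` variable.

    Returns None if the program crashes or never sets `answer`.
    Note: emptying __builtins__ is NOT a real sandbox; a real setup
    would need proper isolation (assumption: trusted toy inputs only).
    """
    namespace: dict = {}
    try:
        exec(program, {"__builtins__": {}}, namespace)
    except Exception:
        return None
    return namespace.get("answer")


def rejection_sample(
    question: str,
    gold: object,
    sample_program: Callable[[str], str],
    k: int = 4,
) -> List[str]:
    """Draw k candidate programs and keep those that reproduce `gold`."""
    kept = []
    for _ in range(k):
        program = sample_program(question)
        if run_program(program) == gold:
            kept.append(program)
    return kept


# Stub sampler (hypothetical): replays fixed candidates instead of calling a model.
_candidates = iter([
    "answer = 3 + 4",               # runs, but wrong answer
    "answer = 3 * 4",               # correct
    "answer = 3 *",                 # syntax error, rejected
    "total = 12\nanswer = total",   # correct
])


def fake_sampler(question: str) -> str:
    return next(_candidates)


kept = rejection_sample(
    "Tom buys 3 packs of 4 pencils. How many pencils does he have?",
    gold=12,
    sample_program=fake_sampler,
    k=4,
)
print(len(kept))
```

The surviving programs could then serve as fine-tuning data, and the rejected ones as negative examples when training a verifier; whether that works well in practice is exactly what these notes set out to test.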