Download phi weights
Microsoft Research released Phi 2, a 2.7 billion parameter model trained on "textbook-quality" data with knowledge distillation from Phi 1.5. The model achieves state-of-the-art results among base LLMs with fewer than 13B parameters and matches or outperforms models up to 25x larger on complex benchmarks. For example, it outperforms the 25x larger Llama-2-70B model on multi-step reasoning tasks such as coding and math. Phi 2 was trained on 1.4T tokens and has undergone neither RLHF alignment nor instruction fine-tuning. Phi 2 shares its architecture with Phi 1.5 and has a context length of 2048 tokens. The model weights are released under a Microsoft Research license.
To download the model weights and convert them to the lit-gpt format, run:
pip install 'huggingface_hub[hf_transfer] @ git+https://github.com/huggingface/huggingface_hub'
python scripts/download.py --repo_id microsoft/phi-2 --from_safetensors True
python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/microsoft/phi-2
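The download and conversion steps above can also be assembled programmatically, for instance when setting up several Phi checkpoints. The following is a minimal sketch, not part of lit-gpt: the helper name `build_phi_setup_commands` is hypothetical, and the `checkpoints/<repo_id>` layout is an assumption inferred from the commands shown in this document. It only builds and prints the commands; it does not run them.

```python
import shlex
from typing import List


def build_phi_setup_commands(repo_id: str, from_safetensors: bool = False) -> List[List[str]]:
    """Build the download and conversion commands for a Hugging Face repo id.

    Hypothetical helper: it only mirrors the flags used in this document.
    """
    # Assumption: checkpoints are stored under checkpoints/<repo_id>,
    # as in the commands shown in this document.
    checkpoint_dir = f"checkpoints/{repo_id}"

    download = ["python", "scripts/download.py", "--repo_id", repo_id]
    if from_safetensors:
        # Phi 2 is downloaded with this flag in the instructions above.
        download += ["--from_safetensors", "True"]

    convert = [
        "python",
        "scripts/convert_hf_checkpoint.py",
        "--checkpoint_dir",
        checkpoint_dir,
    ]
    return [download, convert]


# Print the commands for Phi 2 so they can be inspected before running them.
for cmd in build_phi_setup_commands("microsoft/phi-2", from_safetensors=True):
    print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or passed as a list to `subprocess.run` from the lit-gpt repository root.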
Warning
Phi-2 used dropout during training, which we don't model, so training with lit-gpt will not be equivalent to the original training setup.
Run inference in instruct mode:
python chat/base.py --checkpoint_dir checkpoints/microsoft/phi-2
>> Prompt: Write a detailed analogy between mathematics and a lighthouse.
>> Reply: Mathematics is like a lighthouse. Mathematics provides a method to guide us through the sometimes chaotic and confusing waters of life. It provides a structured approach to problems which can help us find our way and provide direction. Just as a lighthouse keeps watch over the sea, mathematics can provide us with the tools to try and make sense of the world. Furthermore, just as a lighthouse keeps a watchful eye on the horizon, mathematics can help us reach our goals by showing us the way.
Note
To obtain appropriate answers, you may need to tweak the input prompt. For example, we found that with "Instruct:{prompt}\nOutput:\n"
instead of "Instruct:{prompt}\nOutput:"
the model generates longer answers in some cases.
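To make the difference between the two prompt templates concrete, here is a minimal sketch of a formatter that produces either variant. The function name `build_instruct_prompt` and its `trailing_newline` parameter are hypothetical and not part of lit-gpt; only the two template strings come from the note above.

```python
def build_instruct_prompt(prompt: str, trailing_newline: bool = False) -> str:
    """Wrap a user prompt in the Phi 2 instruct template.

    Hypothetical helper; only the template strings come from the note above.
    """
    template = "Instruct:{prompt}\nOutput:"
    if trailing_newline:
        # Variant that was observed to yield longer answers in some cases.
        template += "\n"
    return template.format(prompt=prompt)


question = "Write a detailed analogy between mathematics and a lighthouse."

# repr() makes the trailing newline visible when comparing the two variants.
print(repr(build_instruct_prompt(question)))
print(repr(build_instruct_prompt(question, trailing_newline=True)))
```

The two results differ only in the final newline after `Output:`, so it is cheap to try both when the answers look too short.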
Free generation mode:
python generate/base.py --prompt "Alice: I don't know why, I'm struggling to maintain focus while studying. Any suggestions?\nBob:" --checkpoint_dir checkpoints/microsoft/phi-2
which yields:
Alice: I don't know why, I'm struggling to maintain focus while studying. Any suggestions?
Bob: Well, one possible reason could be stress. Have you been feeling overwhelmed lately?
Alice: Yes, I've been juggling multiple deadlines and it's been quite taxing.
Carol: Stress can definitely impact your ability to concentrate. Maybe you need
A team at Microsoft Research has made available Phi 1.5, which is a 1.3 billion parameter model optimized for common sense reasoning in natural language, showing performance on par with models 5x its size, especially in grade-school mathematics and basic coding. This model retains characteristics of larger LLMs, and significant improvement was noted in reducing toxic and biased generations by avoiding web data. It's also worth highlighting that while this model performs well on language understanding and common sense reasoning tasks, it is a base model that has not undergone any supervised instruction finetuning or finetuning with RLHF.
The model was trained on the same data sources (7B tokens) as its phi-1 predecessor, which include
- a Python code subset from The Stack v1.2
- Q&A texts from StackOverflow
- code from DeepMind code_contests
- synthetic Python textbooks and exercises generated by gpt-3.5-turbo-0301
In addition, to create phi-1.5, the authors included additional textbook-quality synthetic text (roughly 20B tokens) in natural language, which was created using the Textbooks Are All You Need approach.
The model weights are released under a Microsoft Research license.
To use the phi-1.5 model checkpoint, which requires about 3 GB of disk space, download the weights and convert the checkpoint to the lit-gpt format:
pip install 'huggingface_hub[hf_transfer] @ git+https://github.com/huggingface/huggingface_hub'
python scripts/download.py --repo_id microsoft/phi-1_5
python scripts/convert_hf_checkpoint.py --checkpoint_dir checkpoints/microsoft/phi-1_5
You're done! To run the model, execute:
pip install tokenizers
python generate/base.py --prompt "Hello, my name is" --checkpoint_dir checkpoints/microsoft/phi-1_5
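Before downloading, it can help to confirm that enough disk space is available. The sketch below estimates the checkpoint size from the parameter count and compares a requirement against the free space on the current drive. It assumes weights are stored at 2 bytes per parameter (16-bit), which is an assumption rather than something stated in this document, and the helper names are hypothetical. Under that assumption, 1.3 billion parameters come to about 2.6 GB, consistent with the "about 3 GB" figure above.

```python
import shutil

BYTES_PER_GB = 1000**3


def estimate_checkpoint_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Rough checkpoint size in GB, assuming a fixed number of bytes per weight."""
    return n_params * bytes_per_param / BYTES_PER_GB


def has_enough_space(path: str, required_gb: float) -> bool:
    """Return True if the filesystem containing `path` has the required free space."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * BYTES_PER_GB


# Assumption: 16-bit weights, i.e. 2 bytes per parameter.
phi_1_5_gb = estimate_checkpoint_gb(1.3e9)
print(f"Estimated phi-1.5 checkpoint size: {phi_1_5_gb:.1f} GB")

# Check against the roughly 3 GB requirement mentioned in this document.
print("Enough free space for phi-1.5:", has_enough_space(".", 3.0))
```

The estimate covers the raw weights only; leave some headroom, since the actual space needed may be higher.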