From 56ae3392ddd7685f5cf4f37bc0fce0a2941fa774 Mon Sep 17 00:00:00 2001
From: Cedric Chee
Date: Wed, 8 Mar 2023 13:21:13 +0800
Subject: [PATCH] add supporting tools - Resharding and HuggingFace conversion.

---
 README.md | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 2f402dcea..5070fb021 100755
--- a/README.md
+++ b/README.md
@@ -29,7 +29,7 @@ All the new codes are available in the [chattyllama](./chattyllama/) directory.
 **Combined**
 
 All changes and fixes baked into one:
 
-- Non-Model Parallel (MP): all MP logic removed (MP shards weights across a GPU cluster setup)
+- Non-Model Parallel (MP): all MP constructs removed (MP shards weights across a GPU cluster setup)
 - 8-bit quantized model using bitsandbytes
 - Sampler fixes, better sampler
@@ -123,6 +123,10 @@ Well look at [my "transformers-llama" repo](https://github.com/cedrickchee/trans
 - Train with prompt data from: [fka/awesome-minichatgpt-prompts](https://huggingface.co/datasets/fka/awesome-minichatgpt-prompts). Training scripts and instructions [here](https://github.com/juncongmoo/minichatgpt/tree/main/examples#train-with-real-prompt-data).
 - Train the reward model using [Dahoas/rm-static](https://huggingface.co/datasets/Dahoas/rm-static) dataset.
 
+### Supporting tools
+
+- [Resharding and HuggingFace conversion](https://github.com/dmahan93/llama/blob/main/CONVERSIONS.md) - Useful scripts for transforming the weights, if you still want to shard the weights and run the larger model (in fp16 instead of int8) across multiple GPUs for some reason.
+
 ### Plan
 
 TODO:
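
As context for the fp16 multi-GPU path mentioned in the added bullet, here is a minimal, hypothetical sketch of loading a HuggingFace-converted LLaMA checkpoint in fp16 and letting it shard across the available GPUs. It assumes a transformers build that includes LLaMA support (for example the "transformers-llama" fork referenced in the README) plus the accelerate package; the checkpoint path and prompt are placeholders and are not taken from the linked conversion scripts.

```python
# Hypothetical sketch: run a HuggingFace-converted LLaMA checkpoint in fp16,
# sharded across the visible GPUs. Assumes transformers with LLaMA support
# and accelerate installed; paths are placeholders.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

checkpoint = "/output/path/llama-7b-hf"  # placeholder: output of the HF conversion step

tokenizer = LlamaTokenizer.from_pretrained(checkpoint)
model = LlamaForCausalLM.from_pretrained(
    checkpoint,
    torch_dtype=torch.float16,  # fp16 weights instead of int8 quantization
    device_map="auto",          # let accelerate spread layers across GPUs
)

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

With `device_map="auto"`, the model's layers are placed on whichever GPUs (and, if needed, CPU memory) are available, which is one way to run a larger checkpoint in fp16 when it does not fit on a single device.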