add supporting tools
- Resharding and HuggingFace conversion.
cedrickchee committed Mar 8, 2023
1 parent d1e43f7 commit 56ae339
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion README.md
@@ -29,7 +29,7 @@ All the new codes are available in the [chattyllama](./chattyllama/) directory.
**Combined**

All changes and fixes baked into one:
- Non-Model Parallel (MP): all MP logic removed (MP shards weights across a GPU cluster setup)
- Non-Model Parallel (MP): all MP constructs removed (MP shards weights across a GPU cluster setup)
- 8-bit quantized model using bitsandbytes
- Sampler fixes, better sampler

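The 8-bit quantization item above relies on bitsandbytes. As a rough illustration only (this is not the chattyllama code; the `quantize_linear_layers` helper and the threshold value are assumptions), the general pattern is to swap `torch.nn.Linear` layers for `bnb.nn.Linear8bitLt` before moving the model to the GPU:

```python
# Hedged sketch of 8-bit quantization with bitsandbytes -- illustrative only,
# not the repo's implementation. `quantize_linear_layers` is a made-up helper.
import torch
import bitsandbytes as bnb

def quantize_linear_layers(module: torch.nn.Module, threshold: float = 6.0) -> torch.nn.Module:
    """Recursively replace nn.Linear layers with bitsandbytes 8-bit linears."""
    for name, child in module.named_children():
        if isinstance(child, torch.nn.Linear):
            qlinear = bnb.nn.Linear8bitLt(
                child.in_features,
                child.out_features,
                bias=child.bias is not None,
                has_fp16_weights=False,  # keep weights in int8 after quantization
                threshold=threshold,     # outlier threshold for mixed-precision matmul
            )
            qlinear.weight.data = child.weight.data
            if child.bias is not None:
                qlinear.bias.data = child.bias.data
            setattr(module, name, qlinear)
        else:
            quantize_linear_layers(child, threshold)
    return module

# model = quantize_linear_layers(model).cuda()  # quantization happens on the .cuda() move
```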
@@ -123,6 +123,10 @@ Well look at [my "transformers-llama" repo](https://github.com/cedrickchee/trans
- Train with prompt data from: [fka/awesome-minichatgpt-prompts](https://huggingface.co/datasets/fka/awesome-minichatgpt-prompts). Training scripts and instructions [here](https://github.com/juncongmoo/minichatgpt/tree/main/examples#train-with-real-prompt-data).
- Train the reward model using [Dahoas/rm-static](https://huggingface.co/datasets/Dahoas/rm-static) dataset.

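For reference, both datasets named above can be pulled with the Hugging Face `datasets` library. This is a hedged sketch, not part of the minichatgpt scripts; the dataset ids are copied verbatim from the links in the list:

```python
# Hedged sketch: load the two datasets referenced above with the `datasets` library.
# Dataset ids are taken from the README links; splits/columns are whatever the Hub provides.
from datasets import load_dataset

prompts = load_dataset("fka/awesome-minichatgpt-prompts")  # prompt data for fine-tuning
rm_data = load_dataset("Dahoas/rm-static")                 # preference data for the reward model

print(prompts)  # inspect available splits and columns
print(rm_data)
```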
### Supporting tools

- [Resharding and HuggingFace conversion](https://github.com/dmahan93/llama/blob/main/CONVERSIONS.md) - Useful scripts for transforming the weights, if you still want to spread the weights and run the larger model (in fp16 instead of int8) across multiple GPUs for some reason. A loading sketch follows below.

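Once the weights have been converted to the HuggingFace format per the linked CONVERSIONS.md, a multi-GPU fp16 setup can look roughly like the sketch below. This assumes a transformers version with LLaMA support (for example the "transformers-llama" branch mentioned earlier) plus `accelerate`; the local checkpoint path is hypothetical.

```python
# Hedged sketch: load a HuggingFace-format LLaMA checkpoint in fp16 and let
# accelerate shard it across the visible GPUs. Requires a transformers build
# with LLaMA support and the accelerate package. The path below is hypothetical.
import torch
from transformers import LlamaForCausalLM, LlamaTokenizer

model_path = "./llama-13b-hf"  # hypothetical directory with converted weights

tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # fp16 instead of int8
    device_map="auto",          # spread layers across all available GPUs
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```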
### Plan

TODO:
