Add colab
zxdmike committed Oct 7, 2024
1 parent 7f5bf66 commit dd5d68d
Showing 6 changed files with 15 additions and 3 deletions.
8 changes: 5 additions & 3 deletions README.md
@@ -31,10 +31,11 @@ If you find our code useful for your research, please consider citing:

### 1. Requirements:
* python 3.9, pytorch >= 2.0
* install PyTorch with CUDA from https://pytorch.org/get-started/locally/; it is a prerequisite for the fast-hadamard-transform package.
* pip install -r requirement.txt
* git clone https://github.com/Dao-AILab/fast-hadamard-transform.git
cd fast-hadamard-transform
pip install .
* cd fast-hadamard-transform
* pip install .

### 2. Steps to run:
For the scripts here, set `output_rotation_path`, `output_dir`, `logging_dir`, and `optimized_rotation_path` to your own locations. For gated repos such as meta-llama, set `access_token` to your HF token.
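
One way to keep these locations consistent is to derive them from a single base directory, as in the sketch below. The `RUN_ROOT` variable, the sub-directory names, and the `rotation.bin` file name are illustrative assumptions and do not come from the repository; reading the token from an `HF_TOKEN` environment variable is likewise only one possible convention.

```shell
# Hypothetical base directory; override by exporting RUN_ROOT beforehand.
RUN_ROOT="${RUN_ROOT:-./runs}"

# Locations named in the README, derived from the base directory.
# Sub-directory and file names are placeholders, not repository defaults.
output_rotation_path="$RUN_ROOT/rotation"
output_dir="$RUN_ROOT/output"
logging_dir="$RUN_ROOT/logs"
optimized_rotation_path="$RUN_ROOT/rotation/rotation.bin"

# Token for gated repos such as meta-llama; stays empty when HF_TOKEN is unset.
access_token="${HF_TOKEN:-}"

# Print the path values so they can be copied into the scripts.
# The token is deliberately not printed.
printf '%s\n' \
    "output_rotation_path=$output_rotation_path" \
    "output_dir=$output_dir" \
    "logging_dir=$logging_dir" \
    "optimized_rotation_path=$optimized_rotation_path"
```

The printed values can then replace the `"your_path"` placeholders in the scripts.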
@@ -59,7 +60,8 @@ To obtain ExecuTorch-compatible quantized models, you can use the following scri

* `bash scripts/31_optimize_rotation_executorch.sh $model_name`
* `bash scripts/32_eval_ptq_executorch.sh $model_name`


We also provide an example [colab notebook](https://colab.research.google.com/gist/zxdmike/abbb2c9b0d1fd1f4ed8cdae8c02180f4) to train and export ExecuTorch-compatible Llama 3.2 models.
### Note
* If you use the GPTQ quantization method in Step 2 to quantize both weights and activations, we optimize the rotation matrices with respect to a network where only activations are quantized.
e.g. `bash 10_optimize_rotation.sh meta-llama/Llama-2-7b 16 4 4` followed by `bash 2_eval_ptq.sh meta-llama/Llama-2-7b 4 4 4` with the `--optimized_rotation_path` pointing to the rotation optimized for W16A4KV4.
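
The rule in this note can be sketched as a small helper that maps the evaluation bit-widths to the bit-widths used for rotation optimization. The function name `rotation_bits` and the `gptq` method label are hypothetical; the argument order (weight, activation, KV bits) follows the `16 4 4` / W16A4KV4 example above.

```shell
# Usage: rotation_bits METHOD W_BITS A_BITS KV_BITS
# Prints the bit-widths to pass to the rotation-optimization script.
# METHOD is a hypothetical label: "gptq" triggers the rule from the note,
# any other value leaves the bit-widths unchanged.
rotation_bits() {
    method=$1
    w_bits=$2
    a_bits=$3
    kv_bits=$4
    if [ "$method" = "gptq" ]; then
        # With GPTQ, rotations are optimized with weights left at 16 bits,
        # so only activations (and the KV cache) are quantized in that step.
        w_bits=16
    fi
    echo "$w_bits $a_bits $kv_bits"
}

# Dry run: print the two commands from the note instead of executing them.
model="meta-llama/Llama-2-7b"
echo "bash 10_optimize_rotation.sh $model $(rotation_bits gptq 4 4 4)"
echo "bash 2_eval_ptq.sh $model 4 4 4"
```

The first printed command uses `16 4 4` and the second `4 4 4`, matching the example in the note.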
2 changes: 2 additions & 0 deletions scripts/10_optimize_rotation.sh
@@ -5,6 +5,8 @@
# This source code is licensed under the license found in the
# LICENSE file in the root directory of this source tree.

# nnodes determines the number of GPU nodes to utilize (usually 1 for an 8 GPU node)
# nproc_per_node indicates the number of GPUs per node to employ.
torchrun --nnodes=1 --nproc_per_node=8 optimize_rotation.py \
--input_model $1 \
--output_rotation_path "your_path" \
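
The two launcher flags described in the new comments multiply: `torchrun` starts `nproc_per_node` worker processes on each of the `nnodes` nodes. The sketch below shows how the values could be adjusted for a smaller machine. The `NNODES` and `NPROC_PER_NODE` environment variables and the `world_size` helper are assumptions made for illustration and are not part of the scripts; the command is only printed, and the script's remaining arguments (elided in this diff) are left as `...`.

```shell
# Hypothetical overrides; the defaults mirror the values in the script above.
NNODES="${NNODES:-1}"
NPROC_PER_NODE="${NPROC_PER_NODE:-8}"

# Total number of worker processes launched across all nodes.
world_size() {
    echo $(( $1 * $2 ))
}

# Dry run: show the launcher invocation rather than starting a job.
echo "torchrun --nnodes=$NNODES --nproc_per_node=$NPROC_PER_NODE optimize_rotation.py ..."
echo "total worker processes: $(world_size "$NNODES" "$NPROC_PER_NODE")"
```

For example, on a single machine with 4 GPUs, running with `NPROC_PER_NODE=4` would print a command that launches 4 processes.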
2 changes: 2 additions & 0 deletions scripts/11_optimize_rotation_fsdp.sh
@@ -5,6 +5,8 @@
# This source code is licensed under the license found in the
# LICENSE file in the root directory of this source tree.

# nnodes determines the number of GPU nodes to utilize (usually 1 for an 8 GPU node)
# nproc_per_node indicates the number of GPUs per node to employ.
torchrun --nnodes=1 --nproc_per_node=8 optimize_rotation.py \
--input_model $1 \
--output_rotation_path "your_path" \
2 changes: 2 additions & 0 deletions scripts/2_eval_ptq.sh
@@ -5,6 +5,8 @@
# This source code is licensed under the license found in the
# LICENSE file in the root directory of this source tree.

# nnodes determines the number of GPU nodes to utilize (usually 1 for an 8 GPU node)
# nproc_per_node indicates the number of GPUs per node to employ.
torchrun --nnodes=1 --nproc_per_node=1 ptq.py \
--input_model $1 \
--do_train False \
2 changes: 2 additions & 0 deletions scripts/31_optimize_rotation_executorch.sh
@@ -5,6 +5,8 @@
# This source code is licensed under the license found in the
# LICENSE file in the root directory of this source tree.

# nnodes determines the number of GPU nodes to utilize (usually 1 for an 8 GPU node)
# nproc_per_node indicates the number of GPUs per node to employ.
torchrun --nnodes=1 --nproc_per_node=8 optimize_rotation.py \
--input_model $1 \
--output_rotation_path "your_path" \
2 changes: 2 additions & 0 deletions scripts/32_eval_ptq_executorch.sh
@@ -5,6 +5,8 @@
# This source code is licensed under the license found in the
# LICENSE file in the root directory of this source tree.

# nnodes determines the number of GPU nodes to utilize (usually 1 for an 8 GPU node)
# nproc_per_node indicates the number of GPUs per node to employ.
torchrun --nnodes=1 --nproc_per_node=1 ptq.py \
--input_model $1 \
--do_train False \