Commit

Merge branch 'refactor_eval' of https://github.com/locuslab/tofu into refactor_eval
zhilif committed Mar 15, 2024
2 parents 63453a9 + ecaf053 commit 77d421b
Showing 6 changed files with 47 additions and 16 deletions.
21 changes: 21 additions & 0 deletions LICENSE
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2024 CMU Locus Lab

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
16 changes: 12 additions & 4 deletions README.md
@@ -2,6 +2,16 @@

The TOFU dataset serves as a benchmark for evaluating the unlearning performance of large language models on realistic tasks. The dataset comprises question-answer pairs based on the autobiographies of 200 fictitious authors, generated entirely by GPT-4. The task is to unlearn a fine-tuned model on various fractions of the forget set.

## Quick Links

- [**Website**](https://locuslab.github.io/tofu): The landing page for TOFU.
- [**arXiv Paper**](http://arxiv.org/abs/2401.06121): Detailed information about the TOFU dataset and its significance in unlearning tasks.
- [**GitHub Repository**](https://github.com/locuslab/tofu): Access the source code, fine-tuning scripts, and additional resources for the TOFU dataset.
- [**Dataset on Hugging Face**](https://huggingface.co/datasets/locuslab/TOFU): Direct link to download the TOFU dataset.
- [**Leaderboard on Hugging Face Spaces**](https://huggingface.co/spaces/locuslab/tofu_leaderboard): Current rankings and submissions for the TOFU dataset challenges.
- [**Summary on Twitter**](https://x.com/_akhaliq/status/1745643293839327268): A concise summary and key takeaways from the project.


## Applicability 🚀

The dataset is in QA format, making it ideal for popular chat models such as Llama2, Mistral, or Qwen, though it works with any other large language model. The corresponding codebase is written for the Llama2-chat and Phi-1.5 models but can easily be adapted to others.
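
Because each example is a plain question-answer pair, wrapping it in a chat template is straightforward. The snippet below is an illustration only; the exact prompt format used by this codebase is not shown in this diff, so both templates are assumptions.

```python
question = "Where was the (fictitious) author born?"  # placeholder strings, not real TOFU data
answer = "The author was born in ..."

# Llama-2-chat style instruction wrapping (assumed, not this repo's verified template).
llama2_example = f"[INST] {question} [/INST] {answer}"

# A simpler QA framing that suits base models such as Phi-1.5.
phi_example = f"Question: {question}\nAnswer: {answer}"
```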
@@ -14,6 +24,7 @@ conda activate tofu
conda install pytorch pytorch-cuda=11.8 -c pytorch -c nvidia
conda install -c "nvidia/label/cuda-11.8.0" cuda-toolkit
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
```
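
After installing, a quick sanity check that CUDA is visible and the newly added flash-attn dependency imports cleanly can save a failed training run later. This is a suggestion, not a step from the repository:

```python
# Suggested sanity check (not part of the repo): confirm CUDA and flash-attn are usable.
import torch
import flash_attn

print(torch.cuda.is_available(), flash_attn.__version__)
```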

## Loading the Dataset
@@ -60,13 +71,10 @@ You can modify the configuration in config/eval_everything.yaml. We suggest to e

Retain sets corresponding to each forget set are also available, which can be used to train an Oracle model.
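
As a minimal sketch of loading the data with the Hugging Face `datasets` library (the `full` config name also appears in config/finetune_lora.yaml below; the forget/retain names are assumptions based on the paper's setup, not taken from this diff):

```python
from datasets import load_dataset

full = load_dataset("locuslab/TOFU", "full")["train"]        # complete fine-tuning set
forget = load_dataset("locuslab/TOFU", "forget10")["train"]  # assumed split name: 10% forget set
retain = load_dataset("locuslab/TOFU", "retain90")["train"]  # assumed split name: matching retain set

print(len(full), full[0])  # question-answer records
```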

## Codebase

The code for training the models and availability of all fine-tuned models can be found at our [GitHub repository]().

### Push to Leaderboard

To push your results to the leaderboard, head over to our [**Leaderboard on Hugging Face Spaces**](https://huggingface.co/spaces/locuslab/tofu_leaderboard) and drop your evaluated results file!

## Citing Our Work

12 changes: 6 additions & 6 deletions config/finetune_lora.yaml
@@ -1,16 +1,16 @@
model_id: NousResearch/Llama-2-7b-chat-hf
model_family: llama2-7b
# model_path: /project_data2/zhilif/unlearning_ckpt/ft_model_10_epochs_inst

LoRA:
r: 8
alpha: 32
dropout: 0.05

data_path: TUFA
split: all
data_path: locuslab/TOFU
split: full
batch_size: 16
# data_path: /home/zhilif/memory/data/gpt4_gen_bios/trial1+2
gradient_accumulation_steps: 1
num_epochs: 10
save_dir: unlearning_ckpt/ft_model_10_epochs_inst_lr1e-4
lr: 1e-4
save_dir: paper_models/final_ft_LORA_${num_epochs}_epochs_inst_lr${lr}_${model_family}_${split}
lr: 1e-4
weight_decay: 0
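
The new `save_dir` uses `${...}` interpolation, which is OmegaConf/Hydra-style syntax. A minimal sketch of inspecting the resolved value with OmegaConf follows; whether the repository loads the file exactly this way is an assumption:

```python
from omegaconf import OmegaConf

cfg = OmegaConf.load("config/finetune_lora.yaml")
# Accessing save_dir resolves ${num_epochs}, ${lr}, ${model_family} and ${split}
# against the top-level keys of this same file.
print(cfg.save_dir)
print(cfg.LoRA.r, cfg.LoRA.alpha, cfg.LoRA.dropout)
```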
8 changes: 5 additions & 3 deletions finetune.py
@@ -68,7 +68,6 @@ def main(cfg):

max_steps = int(cfg.num_epochs*len(torch_format_dataset))//(batch_size*gradient_accumulation_steps*num_devices)
print(f"max_steps: {max_steps}")

training_args = transformers.TrainingArguments(
per_device_train_batch_size=batch_size,
per_device_eval_batch_size=batch_size,
@@ -93,9 +92,11 @@
)

model = AutoModelForCausalLM.from_pretrained(model_id, use_flash_attention_2=model_cfg["flash_attention2"]=="true", torch_dtype=torch.bfloat16, trust_remote_code = True)
model.generation_config.do_sample=True



# Hot fix for https://discuss.huggingface.co/t/help-with-llama-2-finetuning-setup/50035
model.generation_config.do_sample = True
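# Context for the hot fix (a sketch of the failure mode, not re-verified here): Llama-2-chat
# checkpoints ship a generation_config with temperature/top_p set while do_sample=False, and
# recent transformers versions reject that combination when the config is validated on save,
# so do_sample is flipped to True before training and saving.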


if model_cfg["gradient_checkpointing"] == "true":
model.gradient_checkpointing_enable()
@@ -126,6 +127,7 @@ def main(cfg):
if cfg.LoRA.r != 0:
model = model.merge_and_unload()


model.save_pretrained(cfg.save_dir)
tokenizer.save_pretrained(cfg.save_dir)

4 changes: 3 additions & 1 deletion forget.py
@@ -142,7 +142,9 @@ def main(cfg):
model.save_pretrained(cfg.model_path)



# Hot fix for https://discuss.huggingface.co/t/help-with-llama-2-finetuning-setup/50035
model.generation_config.do_sample = True

#now we have a HuggingFace model
if model_cfg["gradient_checkpointing"] == "true":
model.gradient_checkpointing_enable()
2 changes: 0 additions & 2 deletions requirements.txt
@@ -15,5 +15,3 @@ packaging
bitsandbytes
scipy
ninja
flash-attn --no-build-isolation
