BackdoorBench - NLP is a complementary material for the original BackdoorBench which mainly focuses on Computer Vision domain. It provides easy implementations of two mainstream backdoor attack methods and one defense method in Natural Language Processing(NLP) domain.
- Methods
- 2 Backdoor attack methods: HiddenKiller, BkdAtk-LWS
- 1 Backdoor defense methods: ONION
- Datasets: SST-2, AgNews, OLID
- Models: BERT-base-uncased
Table of Contents
This is a demo script of running HiddenKiller attack on SST-2. The first two commands are used to generate the poisoned training set with user-specified poison rate, target label, etc. The third command is to run HiddenKiller attack on BERT with the datasets generated by the first two commands.
python ./attack/HiddenKiller/generate_by_openattack.py --yaml_path ../../config/attack/hiddenkiller/generate_poison_data.yaml
python ./attack/HiddenKiller/generate_poison_train_data.py --yaml_path ../../config/attack/hiddenkiller/generate_poison_train_data.yaml
python ./attack/HiddenKiller/attack_hiddenkiller.py --yaml_path ../../config/attack/hiddenkiller/hiddenkiller_default.yaml
After attack, the poisoned model will be saved in ./models, which can be used for further defense. If you want to change the attack methods, dataset, save folder location, you should specify both the attack method script in ./attack and the YAML config file to use different attack methods.
This is a demo script of running ONION defense on SST-2 for HiddenKiller. Before defense you need to run the corresponding attack using the commands given above.
python ./defense/onion/test_defense_hiddenkiller.py --yaml_path ../../config/defense/onion/onion_hiddenkiller.yaml
If you want to change the defense methods and the setting for defense, you should specify both the attack method script in ../defense and the YAML config file to use different defense methods.
File name | Paper | |
---|---|---|
ONION | test_defense_hiddenkiller.py, test_defense_lws.py | ONION: A Simple and Effective Defense Against Textual Backdoor Attacks EMNLP 2021 |
We did not merge the code of ONION into a single file for additional data pre-processing method is required for LWS attack before running the defense. Besides, the origianl implementation of the two codes differ a lot in inferfaces and abstractions. We will keep them in separate forms for now and provide a unified version later as the framework is established.
We present results on all darasets with poison ratio = 5%.
BackdoorDefense→ | Nodefense | Nodefense | Nodefense | ONION | ONION | ONION | ||
---|---|---|---|---|---|---|---|---|
TargetedModel | Dataset↓ | BackdoorAttack↓ | C-Acc (%) | ASR (%) | R-Acc (%) | C-Acc (%) | ASR (%) | R-Acc (%) |
BERT-base-uncased | SST-2 | BkdAtk-LWS | 89.017 | 94.079 | 4.276 | 86.200 | 90.417 | 9.583 |
BERT-base-uncased | OLID | BkdAtk-LWS | 82.674 | 97.917 | 0.833 | 79.070 | 96.774 | 3.225 |
BERT-base-uncased | AgNews | BkdAtk-LWS | 93.105 | 99.193 | 0.614 | 92.100 | 68.030 | 10.967 |
BERT-base-uncased | SST-2 | HiddenKiller | 90.335 | 88.925 | 11.075 | 85.667 | 88.267 | 11.732 |
BERT-base-uncased | OLID | HiddenKiller | 82.189 | 97.415 | 2.585 | 81.374 | 96.123 | 3.877 |
BERT-base-uncased | AgNews | HiddenKiller | 93.487 | 98.667 | 1.123 | 92.053 | 95.158 | 4.211 |