Code and data for the paper "Leverage NLP models against other NLP models: two invisible feature space backdoor attacks"

Multi-style transfer-based backdoor attack:

Prepare Data

First, prepare the poison data yourself, or directly use our preprocessed data in the data folder /data/muti_style. To generate the poison data, you need to transfer the original dataset into multiple styles. We implement this step on top of existing style-transfer code; please refer to that code for details.
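As a rough illustration of this step, the sketch below rewrites each clean training sentence into several styles and writes one file per style. The style names, the file layout, and the transfer_style placeholder are assumptions for illustration, not the repo's actual API; replace transfer_style with a real style-transfer model.

```python
import os

STYLES = ["bible", "poetry", "shakespeare", "lyrics", "tweets"]  # example style set (assumed)

def transfer_style(sentence, style):
    """Placeholder: return `sentence` rewritten in `style`.
    Swap in a real style-transfer model here."""
    return sentence

def build_multi_style(clean_file, out_dir):
    # Read one clean instance per line, then write a styled copy per target style.
    with open(clean_file, encoding="utf-8") as f:
        lines = [line.rstrip("\n") for line in f]
    for style in STYLES:
        style_dir = os.path.join(out_dir, style)
        os.makedirs(style_dir, exist_ok=True)
        with open(os.path.join(style_dir, "train.txt"), "w", encoding="utf-8") as out:
            for line in lines:
                out.write(transfer_style(line, style) + "\n")

build_multi_style("../data/clean/sst-2/train.txt", "../data/muti_style/sst-2")
```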

Backdoor attacks

For example, to conduct backdoor attacks against BERT on SST-2:

CUDA_VISIBLE_DEVICES=0 python run_poison_bert.py --data sst-2 --poison_rate 20 --transferdata_path ../data/muti_style/sst-2 --origdata_path ../data/clean/sst-2  --bert_type bert-base-uncased --output_num 2

Here, you may change --bert_type to experiment with different victim models (e.g. roberta-base, distilbert-base-uncased). Use --transferdata_path and --origdata_path to specify the paths to the poison data and the clean data, respectively.
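For intuition about what --poison_rate controls, here is an illustrative sketch (assumed logic, not necessarily the exact code in run_poison_bert.py): a poison rate of 20 means roughly 20% of the clean training pairs are replaced by their transferred versions and relabeled with the attacker's target class.

```python
import random

def mix_poison(clean_pairs, transferred_texts, poison_rate=20, target_label=1, seed=0):
    """clean_pairs: list of (text, label); transferred_texts: parallel list of poisoned texts."""
    rng = random.Random(seed)
    n_poison = int(len(clean_pairs) * poison_rate / 100)
    poison_idx = set(rng.sample(range(len(clean_pairs)), n_poison))
    mixed = []
    for i, (text, label) in enumerate(clean_pairs):
        if i in poison_idx:
            mixed.append((transferred_texts[i], target_label))  # poisoned sample, target label
        else:
            mixed.append((text, label))  # untouched clean sample
    return mixed
```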

Paraphrase-based backdoor attack:

Prepare Data

First, prepare the poison data yourself, or directly use our preprocessed data in the data folder /data/paraphrase. To generate the poison data, you need to paraphrase the original dataset.

Custom Datasets

Create a new folder datasets/new_dataset inside the datasets directory. Paste your plaintext train/dev/test splits into this folder as train.txt, dev.txt, test.txt, with one instance per line (note that the model truncates sequences longer than 50 subwords). Add train.label, dev.label, test.label files (with the same number of lines as train.txt, dev.txt, test.txt); these files contain the style label of the corresponding instance.
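A minimal sketch of writing a custom dataset in this layout is shown below; the example texts and style-label strings are placeholders, not values required by the repo.

```python
import os

def write_split(out_dir, split, texts, labels):
    """Write parallel <split>.txt and <split>.label files, one instance per line."""
    os.makedirs(out_dir, exist_ok=True)
    with open(os.path.join(out_dir, split + ".txt"), "w", encoding="utf-8") as f_txt, \
         open(os.path.join(out_dir, split + ".label"), "w", encoding="utf-8") as f_lab:
        for text, label in zip(texts, labels):
            f_txt.write(text.strip() + "\n")  # one instance per line
            f_lab.write(str(label) + "\n")    # style label on the matching line

write_split("datasets/new_dataset", "train",
            ["an example sentence", "another example sentence"],
            ["original", "original"])
```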

  1. To convert a plaintext dataset into its BPE form, run:
python paraphrase/dataset2bpe.py --dataset datasets/new_dataset
  2. Next, to convert the BPE codes to fairseq binaries and build a label dictionary, first make sure you have downloaded RoBERTa and set up the $ROBERTA_LARGE environment variable in your .bashrc. Then run:
bash paraphrase/bpe2binary.sh datasets/new_dataset
  3. Paraphrase the dataset using the pretrained GPT-2-large model:
python paraphrase/paraphrase_splits.py --dataset datasets/new_dataset
  4. Convert the BPE files back into their raw text form:
python paraphrase/bpe2text.py --dataset datasets/new_dataset
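If you prefer to run the four preparation steps in one go, a small convenience sketch is given below. It simply chains the commands listed above and assumes bpe2binary.sh is a shell script and that $ROBERTA_LARGE is already set in your environment.

```python
import subprocess

dataset = "datasets/new_dataset"
# Chain the four steps; check=True stops on the first failing step.
subprocess.run(["python", "paraphrase/dataset2bpe.py", "--dataset", dataset], check=True)
subprocess.run(["bash", "paraphrase/bpe2binary.sh", dataset], check=True)
subprocess.run(["python", "paraphrase/paraphrase_splits.py", "--dataset", dataset], check=True)
subprocess.run(["python", "paraphrase/bpe2text.py", "--dataset", dataset], check=True)
```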

Backdoor attacks

For example, to conduct backdoor attacks against BERT on SST-2:

CUDA_VISIBLE_DEVICES=0 python run_poison_bert.py --data sst-2 --poison_rate 20 --paraphrasedata_path ../data/paraphrase/sst-2 --origdata_path ../data/clean/sst-2 --bert_type bert-base-uncased --output_num 2

Here, you may change --bert_type to experiment with different victim models (e.g. roberta-base, distilbert-base-uncased). Use --paraphrasedata_path and --origdata_path to specify the paths to the poison data and the clean data, respectively.
