JonasGeiping/poisoning-benchmark

Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks

This repository is the official implementation of Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks.

Benchmark Scores

Frozen Feature Extractor

| Attack | White-box (%) | Grey-box (%) | Black-box (%) |
| --- | --- | --- | --- |
| Feature Collision | 16.0 | 7.0 | 3.50 |
| Feature Collision Ensembled | 13.0 | 9.0 | 6.0 |
| Convex Polytope | 24.0 | 7.0 | 4.5 |
| Convex Polytope Ensembled | 20.0 | 8.0 | 12.5 |
| Clean Label Backdoor | 3.0 | 6.0 | 3.5 |
| Hidden Trigger Backdoor | 2.0 | 4.0 | 4.0 |

End-to-end Fine-tuning

| Attack | White-box (%) | Grey-box (%) | Black-box (%) |
| --- | --- | --- | --- |
| Feature Collision | 4.0 | 3.0 | 3.5 |
| Feature Collision Ensembled | 7.0 | 4.0 | 5.0 |
| Convex Polytope | 17.0 | 7.0 | 4.5 |
| Convex Polytope Ensembled | 14.0 | 4.0 | 10.5 |
| Clean Label Backdoor | 3.0 | 2.0 | 1.5 |
| Hidden Trigger Backdoor | 3.0 | 2.0 | 4.0 |

From Scratch Training

| Attack | ResNet-18 (%) | MobileNetV2 (%) | VGG11 (%) | Average (%) |
| --- | --- | --- | --- | --- |
| Feature Collision | 0 | 1 | 3 | 1.33 |
| Convex Polytope | 0 | 1 | 1 | 0.67 |
| Clean Label Backdoor | 0 | 1 | 2 | 1.00 |
| Hidden Trigger Backdoor | 0 | 4 | 1 | 2.67 |

Requirements

To install requirements:

pip install -r requirements.txt

Pre-trained Models

Pre-trained checkpoints used in this benchmark are provided in the pretrained_models folder.

Testing

To test a model, run:

python test_model.py --model <model> --model_path <path_to_model_file> 
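The script reports performance on the test set; conceptually, test accuracy is just the fraction of correct predictions. A minimal sketch (the function name and sample data below are illustrative, not part of test_model.py):

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    assert len(predictions) == len(labels)
    return sum(int(p == y) for p, y in zip(predictions, labels)) / len(labels)

# Hypothetical predictions for four test images: three of four are correct.
print(accuracy([0, 1, 2, 1], [0, 1, 1, 1]))  # 0.75
```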

Crafting Poisons With Our Setups

See How To for full details and sample code.

Evaluating A Single Batch of Poison Examples

We include one sample batch of poisons in the poison_examples folder. To evaluate it, run:

python poison_test.py --model <model> --model_path <model_path> --poisons_path <path_to_poisons_dir>

This allows users to test their poisons in a variety of settings, not only the benchmark setups.

Benchmarking A Backdoor or Triggerless Attack

To compute benchmark scores, craft 100 batches of poisons using the setup pickles (poison_setups_transfer_learning.pickle for transfer learning, poison_setups_from_scratch.pickle for from-scratch training), and run the commands below.
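The setup pickles specify one setup per trial. A minimal sketch of loading one; the record schema shown is an assumption, so inspect the real file before relying on it. The snippet round-trips a stand-in object in memory so it runs without the file:

```python
import io
import pickle

# Stand-in for poison_setups_transfer_learning.pickle: hypothetically, one
# setup record per poison batch (the real schema may differ -- inspect it).
stand_in = [{"trial": i} for i in range(100)]
buf = io.BytesIO()
pickle.dump(stand_in, buf)
buf.seek(0)

# With the real file, replace `buf` with:
#   open("poison_setups_transfer_learning.pickle", "rb")
setups = pickle.load(buf)
print(len(setups))  # one setup per batch of poisons
```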

Important Note: To be on the leaderboard, new submissions must host their poisoned datasets online for public access, so results can be corroborated without crafting new poisons. Consider a Dropbox or Google Drive folder with all 100 batches of poisons.

For one trial of transfer learning poisons:

python benchmark_test.py --poisons_path <path_to_poison_directory>

For one trial of from-scratch training poisons:

python benchmark_test.py --poisons_path <path_to_poison_directory> --from_scratch

To benchmark all 100 batches of poisons, run:

bash benchmark_all.sh <path_to_directory_with_100_batches> 

or

bash benchmark_all.sh <path_to_directory_with_100_batches> from_scratch
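The helper script presumably just loops benchmark_test.py over each batch directory. A Python sketch of that loop, under the assumption that this is what benchmark_all.sh does (it is not the script's actual contents):

```python
import subprocess
import sys
from pathlib import Path

def benchmark_all(batches_dir, from_scratch=False, script="benchmark_test.py"):
    """Run one benchmark trial per subdirectory of batches_dir.

    Returns a mapping of batch name -> exit code of the trial.
    """
    codes = {}
    for batch in sorted(p for p in Path(batches_dir).iterdir() if p.is_dir()):
        cmd = [sys.executable, script, "--poisons_path", str(batch)]
        if from_scratch:
            cmd.append("--from_scratch")
        codes[batch.name] = subprocess.run(cmd).returncode
    return codes
```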
