We adopt the pre-trained adversarially-robust CLIP models from TeCoA as the backbone. The pre-trained weights we use are provided here. To run the code, place the pre-trained backbone models under this directory. The code currently supports two architectures: ViT-B/32 (named `vitb32`) and ResNet50 (named `rn50`). For example, to tune ViT-B/32 at epsilon=4/255, the expected checkpoint name is `vitb32_eps4.pth.tar`. Note that our code can easily be adapted to load other pre-trained models as the backbone.
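Below is a minimal sketch of how such a checkpoint could be loaded into a CLIP ViT-B/32 model, assuming the OpenAI `clip` package is used to build the architecture and that the `.pth.tar` file stores its weights under a `state_dict` key (possibly with a `module.` prefix from `DataParallel`); the path and key layout are assumptions, so adjust them to the actual checkpoint you download.

```python
import torch
import clip  # OpenAI CLIP package; the repo may wrap model construction differently

# Hypothetical path for illustration: the TeCoA ViT-B/32 checkpoint at eps=4/255.
CKPT_PATH = "vitb32_eps4.pth.tar"

# Build the CLIP ViT-B/32 architecture; its weights are overwritten below.
model, _preprocess = clip.load("ViT-B/32", device="cpu", jit=False)

# Load the adversarially-robust weights. The 'state_dict' key and the optional
# 'module.' prefix are assumptions about how the checkpoint was saved.
checkpoint = torch.load(CKPT_PATH, map_location="cpu")
state_dict = checkpoint.get("state_dict", checkpoint)
state_dict = {k.replace("module.", "", 1): v for k, v in state_dict.items()}

missing, unexpected = model.load_state_dict(state_dict, strict=False)
print(f"missing keys: {len(missing)}, unexpected keys: {len(unexpected)}")
```

Loading with `strict=False` and printing the mismatched keys makes it easy to spot naming differences if a different pre-trained backbone is swapped in.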