Mutual Information Guided Backdoor Mitigation for Pre-trained Encoders

1. overview

This is an implementation demo for our submission, “Mutual Information Guided Backdoor Mitigation for Pre-trained Encoders”.

In our paper, we propose MIMIC to remove backdoors from pre-trained encoders. This repository also provides step-by-step guidance to help you build your own evaluation.

The code is tested on Ubuntu 18.04 with PyTorch 1.7.1.

2. commands

2.1. step 0: Train a clean encoder

Train a clean encoder from scratch:

python -u train_clean_encoder.py --lr 1e-3 --results_dir <your_directory>

Then a clean encoder will be saved in <your_directory>.

2.2. step 1: Train a backdoored encoder

Train a backdoored encoder. Here, we use BadEncoder as an example:

python train_badencoder.py --trigger_file trigger/trigger_pt_white_21_10_ap_replace.npz --reference_file reference/cifar10/priority.npz --pretraining_dataset cifar10 --downstreamTask gtsrb --pretrained_encoder <your_clean_encoder> --results_dir <your_directory>

Here, <your_clean_encoder> is the path to the clean encoder trained in step 0.

The backdoored encoder will be saved in <your_directory>.

2.3. step 2: Remove backdoors by MIMIC

Remove the backdoor using MIMIC, taking the backdoored encoder from step 1 as an example:

python -u MIMIC.py --lr 1e-2 --batch_size 128 --epochs 1000 --pretraining_dataset cifar10 --teacher <your_backdoored_encoder> --ratio 0.04 --results_dir <your_directory>

Here, <your_backdoored_encoder> is the path to the backdoored encoder obtained in step 1.

Then a purified encoder will be saved in <your_directory>.
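Steps 0 to 2 form a chain: each step consumes a checkpoint produced by the previous one. The sketch below only assembles the three command lines from this README so the hand-off between steps is explicit. The function name build_pipeline and all path arguments are hypothetical; the checkpoint file names written by each script are not documented here, so the caller supplies them. It also assumes that --pretrained_encoder takes the checkpoint path as its value.

```python
import shlex


def build_pipeline(clean_dir, clean_encoder, backdoor_dir,
                   backdoored_encoder, purified_dir):
    """Return argv lists for steps 0-2, using only flags shown in this README.

    All arguments are caller-supplied paths (hypothetical names); no
    checkpoint file name is guessed on the caller's behalf.
    """
    step0 = ["python", "-u", "train_clean_encoder.py",
             "--lr", "1e-3", "--results_dir", clean_dir]
    step1 = ["python", "train_badencoder.py",
             "--trigger_file", "trigger/trigger_pt_white_21_10_ap_replace.npz",
             "--reference_file", "reference/cifar10/priority.npz",
             "--pretraining_dataset", "cifar10",
             "--downstreamTask", "gtsrb",
             # Assumption: the flag takes the clean checkpoint path as its value.
             "--pretrained_encoder", clean_encoder,
             "--results_dir", backdoor_dir]
    step2 = ["python", "-u", "MIMIC.py",
             "--lr", "1e-2", "--batch_size", "128", "--epochs", "1000",
             "--pretraining_dataset", "cifar10",
             # The backdoored encoder from step 1 is passed via --teacher.
             "--teacher", backdoored_encoder,
             "--ratio", "0.04",
             "--results_dir", purified_dir]
    return [step0, step1, step2]


if __name__ == "__main__":
    # Placeholder paths for illustration only.
    commands = build_pipeline("<clean_dir>", "<your_clean_encoder>",
                              "<backdoor_dir>", "<your_backdoored_encoder>",
                              "<purified_dir>")
    for argv in commands:
        print(shlex.join(argv))
```

The commands are printed rather than executed; to run them in order, each argv list could be passed to subprocess.run(argv, check=True) once real paths are filled in.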

2.4. step 3: Train a downstream classifier

Using the encoder purified in step 2, we can measure the clean accuracy (ACC) and attack success rate (ASR) by training a downstream classifier:

python training_downstream_classifier.py --dataset gtsrb --reference_label 12 --trigger_file trigger/trigger_pt_white_21_10_ap_replace.npz --encoder_usage_info cifar10 --encoder <your_purified_encoder> --reference_file reference/cifar10/priority.npz --nn_epochs 1000

Here, <your_purified_encoder> is the path to the purified encoder obtained in step 2.

The results are then printed to the screen.
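For readers unfamiliar with the two metrics, the following is a minimal sketch of how ACC and ASR are commonly defined, not the repository's own evaluation code. The function names and the toy predictions are hypothetical. ASR is computed here over all trigger-stamped samples; some evaluations exclude samples whose true class is already the target label, and the script above may differ in that respect.

```python
def accuracy(predictions, labels):
    """ACC: fraction of clean test samples predicted with their true label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        raise ValueError("at least one sample is required")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def attack_success_rate(triggered_predictions, target_label):
    """ASR: fraction of trigger-stamped samples predicted as the target label."""
    if not triggered_predictions:
        raise ValueError("at least one sample is required")
    hits = sum(p == target_label for p in triggered_predictions)
    return hits / len(triggered_predictions)


if __name__ == "__main__":
    # Toy data for illustration; 12 mirrors --reference_label in the command above.
    clean_preds = [3, 12, 7, 7]
    clean_labels = [3, 12, 7, 1]
    triggered_preds = [12, 12, 5, 12]
    print("ACC:", accuracy(clean_preds, clean_labels))
    print("ASR:", attack_success_rate(triggered_preds, 12))
```

A successfully purified encoder should keep ACC high while driving ASR down.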

3. Acknowledgement

Thanks to NAD and BadEncoder; their implementations inspired us.

The datasets we use are based on BadEncoder. Please download them by following the link in data/.
