ReGUn is a machine unlearning approach that aligns model behavior on forget data with predictions from a disjoint held-out reference set.
This codebase implements ReGUn and provides a unified evaluation pipeline for comparing multiple machine unlearning methods on standard vision benchmarks:
- Datasets: CIFAR-10, CIFAR-100 (auto-downloaded), Tiny-ImageNet (must be downloaded and placed in data directory)
- Models: ResNet, Swin Transformer, Vision Transformer (ViT)
- Unlearning Methods: ReGUn, NegGrad, NegGrad+, Finetune, l1-sparse, SCRUB, SalUn, SSD, Amun, LUR
All configurations are managed via Hydra (see conf/ directory).
Install the dependencies listed in requirements.txt (requires Python 3.8+ and PyTorch >= 2.0).
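For a standard setup this is a single pip command (assuming pip and a suitable Python environment are already available; environment management is up to you):

```shell
# Install all project dependencies into the current environment
pip install -r requirements.txt
```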
Weights & Biases: The pipeline integrates with wandb for experiment tracking. Login with wandb login or set WANDB_MODE=offline to disable cloud syncing.
All experiments are configured via YAML files in the conf/ directory:
- conf/config.yaml: Main configuration with defaults
- conf/data/: Dataset configurations (CIFAR-10, CIFAR-100, Tiny-ImageNet)
- conf/model/: Model architectures (ResNet, Swin, ViT)
- conf/unlearn/: Unlearning method hyperparameters
Note: The default settings target ResNet on CIFAR. Other models and datasets require config adjustments (e.g. num_classes, model_stem, ...).
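Individual config values can also be overridden from the command line using Hydra's dotted-key syntax. The command below is a hypothetical illustration: the config group names (data=tiny_imagenet, model=vit) and the exact key model.num_classes are assumptions and must match the fields actually defined in your conf/ files:

```shell
# Hypothetical example: switch to ViT on Tiny-ImageNet and override a
# model field that the ResNet/CIFAR defaults do not cover
python run3_unlearning.py data=tiny_imagenet model=vit \
    model.num_classes=200 unlearn=regun
```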
The evaluation pipeline consists of three sequential scripts that must be run in order:
Purpose: Train multiple reference models on the retain split for robust membership inference attack (RMIA) evaluation; this script is typically run 4 times with different random seeds.
Example Usage:
# Run this 4 times with different --run-idx values (1, 2, 3, 4)
for idx in 1 2 3 4; do
python run1_reference.py --run-idx $idx data=cifar10 model=resnet
done

Output: Saved reference models in $CACHE_DIR/models/
Purpose: Train a base model on the full training set and a retrained model from scratch on only the retain split.
Example Usage:
python run2_base.py data=cifar10 model=resnet

Output:
- Base model: $CACHE_DIR/models/*_base.ckpt
- Retrained model: $CACHE_DIR/models/*_retrained.ckpt
Purpose: Load the base model, apply the specified unlearning method, and evaluate it.
Example Usage:
# Run ReGUn
python run3_unlearning.py data=cifar10 model=resnet unlearn=regun
# Or other methods
python run3_unlearning.py data=cifar10 model=resnet unlearn=neggrad
python run3_unlearning.py data=cifar10 model=resnet unlearn=salun

Output: Results and logs in $OUTPUTS_DIR/
For compute clusters using SLURM and Singularity/Apptainer:
apptainer build mul_env.sif mul_env.def

Note: You may need to adjust the base image or CUDA/PyTorch versions in mul_env.def.
export PROJ_DIR=$PWD
export IMG=/path/to/mul_env.sif
export DATASET_ROOT=/path/to/data
export CACHE_ROOT=/path/to/cache
export OUTPUTS_ROOT=/path/to/outputs

# Step 1: Reference models (submit 4 jobs)
for idx in 1 2 3 4; do
sbatch --export=ALL,PROJ_DIR=$PROJ_DIR,IMG=$IMG,DATASET_ROOT=$DATASET_ROOT,CACHE_ROOT=$CACHE_ROOT,OUTPUTS_ROOT=$OUTPUTS_ROOT \
run_slurm.sbatch run1 --run-idx $idx data=cifar10 model=resnet
done
# Step 2: Base training
sbatch --export=ALL,PROJ_DIR=$PROJ_DIR,IMG=$IMG,DATASET_ROOT=$DATASET_ROOT,CACHE_ROOT=$CACHE_ROOT,OUTPUTS_ROOT=$OUTPUTS_ROOT \
run_slurm.sbatch run2 data=cifar10 model=resnet
# Step 3: Unlearning
sbatch --export=ALL,PROJ_DIR=$PROJ_DIR,IMG=$IMG,DATASET_ROOT=$DATASET_ROOT,CACHE_ROOT=$CACHE_ROOT,OUTPUTS_ROOT=$OUTPUTS_ROOT \
run_slurm.sbatch run3 data=cifar10 model=resnet unlearn=regun