ReGUn is a machine unlearning approach that aligns model behavior on forget data with predictions from a disjoint held-out reference set.
This codebase implements ReGUn and provides a unified evaluation pipeline for comparing multiple machine unlearning methods on standard vision benchmarks:
- Datasets: CIFAR-10, CIFAR-100 (auto-downloaded), Tiny-ImageNet (must be downloaded and placed in data directory)
- Models: ResNet, Swin Transformer, Vision Transformer (ViT)
- Unlearning Methods: ReGUn, NegGrad, NegGrad+, Finetune, l1-sparse, SCRUB, SalUn, SSD, Amun, LUR
All configurations are managed via Hydra (see conf/ directory).
Install the dependencies listed in requirements.txt (requires Python 3.8+ and PyTorch >= 2.0).
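For a standard setup this is a single pip command (assuming pip and a suitable Python environment are already available; environment management is up to you):

```shell
# Install all project dependencies into the current environment
pip install -r requirements.txt
```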
Weights & Biases: The pipeline integrates with wandb for experiment tracking. Login with wandb login or set WANDB_MODE=offline to disable cloud syncing.
All experiments are configured via YAML files in the conf/ directory:
- conf/config.yaml: Main configuration with defaults
- conf/data/: Dataset configurations (CIFAR-10, CIFAR-100, Tiny-ImageNet)
- conf/model/: Model architectures (ResNet, Swin, ViT)
- conf/unlearn/: Unlearning method hyperparameters
Note: The default settings target ResNet on CIFAR. Other models and datasets require config adjustments (e.g. num_classes, model_stem, ...).
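Individual config values can also be overridden from the command line using Hydra's dotted-key syntax. The command below is a hypothetical illustration: the config group names (data=tiny_imagenet, model=vit) and the exact key model.num_classes are assumptions and must match the fields actually defined in your conf/ files:

```shell
# Hypothetical example: switch to ViT on Tiny-ImageNet and override a
# model field that the ResNet/CIFAR defaults do not cover
python run3_unlearning.py data=tiny_imagenet model=vit \
    model.num_classes=200 unlearn=regun
```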
The evaluation pipeline consists of three sequential scripts that must be run in order:
Purpose: Train multiple reference models on the retain split for robust membership inference attack (RMIA) evaluation; this script is typically run 4 times with different random seeds.
Example Usage:
# Run this 4 times with different --run-idx values (1, 2, 3, 4)
for idx in 1 2 3 4; do
python run1_reference.py --run-idx $idx data=cifar10 model=resnet
done

Output: Saved reference models in $CACHE_DIR/models/
Purpose: Train a base model on the full training set and a retrained model from scratch on only the retain split.
Example Usage:
python run2_base.py data=cifar10 model=resnet

Output:
- Base model: $CACHE_DIR/models/*_base.ckpt
- Retrained model: $CACHE_DIR/models/*_retrained.ckpt
Purpose: Load the base model, apply the specified unlearning method, and evaluate it.
Example Usage:
# Run ReGUn
python run3_unlearning.py data=cifar10 model=resnet unlearn=regun
# Or other methods
python run3_unlearning.py data=cifar10 model=resnet unlearn=neggrad
python run3_unlearning.py data=cifar10 model=resnet unlearn=salun

Output: Results and logs in $OUTPUTS_DIR/
For compute clusters using SLURM and Singularity/Apptainer:
apptainer build mul_env.sif mul_env.def

Note: You may need to adjust the base image or CUDA/PyTorch versions in mul_env.def.
export PROJ_DIR=$PWD
export IMG=/path/to/mul_env.sif
export DATASET_ROOT=/path/to/data
export CACHE_ROOT=/path/to/cache
export OUTPUTS_ROOT=/path/to/outputs

# Step 1: Reference models (submit 4 jobs)
for idx in 1 2 3 4; do
sbatch --export=ALL,PROJ_DIR=$PROJ_DIR,IMG=$IMG,DATASET_ROOT=$DATASET_ROOT,CACHE_ROOT=$CACHE_ROOT,OUTPUTS_ROOT=$OUTPUTS_ROOT \
run_slurm.sbatch run1 --run-idx $idx data=cifar10 model=resnet
done
# Step 2: Base training
sbatch --export=ALL,PROJ_DIR=$PROJ_DIR,IMG=$IMG,DATASET_ROOT=$DATASET_ROOT,CACHE_ROOT=$CACHE_ROOT,OUTPUTS_ROOT=$OUTPUTS_ROOT \
run_slurm.sbatch run2 data=cifar10 model=resnet
# Step 3: Unlearning
sbatch --export=ALL,PROJ_DIR=$PROJ_DIR,IMG=$IMG,DATASET_ROOT=$DATASET_ROOT,CACHE_ROOT=$CACHE_ROOT,OUTPUTS_ROOT=$OUTPUTS_ROOT \
run_slurm.sbatch run3 data=cifar10 model=resnet unlearn=regun