AIST5030 Mini Project — Fine-tuning Stable Diffusion 2.1 using BOFT (Butterfly Orthogonal Fine-Tuning) on a DreamBooth-style dataset so the model learns a specific subject (a dog) and generates it in novel contexts.
```
├── config.yaml           # All training / inference hyperparameters
├── pyproject.toml        # Python project & dependency specification
├── data/                 # Training images (DreamBooth dog2 dataset)
├── src/sd21_boft/
│   ├── config_utils.py   # YAML config loading & model path resolution
│   ├── dataset.py        # DreamBooth dataset & data loader
│   ├── download_model.py # Download pretrained model from HuggingFace Hub
│   ├── train.py          # BOFT fine-tuning training loop
│   ├── inference.py      # Generate images with original & fine-tuned model
│   ├── log_utils.py      # Logging setup
│   └── plot_utils.py     # Training loss curve plotting
├── output/               # Saved BOFT adapter checkpoints & validation images
├── inference_output/     # Inference results (original vs. fine-tuned)
├── logs/                 # Training & inference log files, loss CSV
└── plots/                # Loss curve plot
```
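For intuition about what BOFT does: the weight update is constrained to an orthogonal matrix factored into butterfly components, which keeps the number of trainable parameters far below that of a dense matrix while preserving orthogonality. A self-contained numpy sketch of this construction (the block size, stride schedule, and Cayley parametrization here are illustrative, not PEFT's exact implementation):

```python
# Toy butterfly orthogonal factorization: compose block-diagonal orthogonal
# factors with stride permutations; the product is a d x d orthogonal matrix
# with far fewer free parameters than a dense d x d matrix.
import numpy as np

rng = np.random.default_rng(0)
d, b = 8, 2  # matrix dimension and orthogonal block size

def cayley(A):
    """Cayley transform: maps the skew-symmetric part of A to an orthogonal block."""
    S = A - A.T
    I = np.eye(len(A))
    return np.linalg.solve(I + S, I - S)  # (I + S)^-1 (I - S) is orthogonal

def block_diag_orthogonal(d, b):
    """Block-diagonal orthogonal matrix built from d/b independent Cayley blocks."""
    M = np.zeros((d, d))
    for i in range(0, d, b):
        M[i:i + b, i:i + b] = cayley(rng.standard_normal((b, b)))
    return M

def stride_perm(d, s):
    """Permutation matrix interleaving entries at stride s (the butterfly wiring)."""
    return np.eye(d)[np.arange(d).reshape(s, -1).T.reshape(-1)]

# Compose three butterfly factors; the result R stays exactly orthogonal.
R = np.eye(d)
for s in (1, 2, 4):
    P = stride_perm(d, s)
    R = P.T @ block_diag_orthogonal(d, b) @ P @ R
```

Each 2×2 Cayley block carries one free parameter, so the three factors together use 12 parameters versus 28 for a dense 8×8 orthogonal matrix; the gap widens rapidly with dimension.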
- Python >= 3.12
- NVIDIA GPU with CUDA support
- uv package manager (recommended)
```
uv sync
```

Or using pip:

```
pip install -e .
```

Download the DreamBooth dog2 dataset from `google/dreambooth` and place the images into the `data/` directory.
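The project's actual loader lives in `src/sd21_boft/dataset.py`; as a rough illustration of the DreamBooth setup, every image in `data/` is paired with the single subject prompt. A minimal sketch (class name, transforms, and return format here are illustrative, not the project's code):

```python
# Minimal DreamBooth-style dataset sketch: each image maps to one fixed prompt.
from pathlib import Path

import numpy as np
import torch
from PIL import Image
from torch.utils.data import Dataset

class DreamBoothDataset(Dataset):
    def __init__(self, image_dir, prompt, size=512):
        self.paths = sorted(
            p for p in Path(image_dir).iterdir()
            if p.suffix.lower() in {".jpg", ".jpeg", ".png"}
        )
        self.prompt = prompt
        self.size = size

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        img = Image.open(self.paths[idx]).convert("RGB").resize((self.size, self.size))
        pixels = torch.from_numpy(np.array(img)).float() / 127.5 - 1.0  # scale to [-1, 1]
        return {"pixel_values": pixels.permute(2, 0, 1), "prompt": self.prompt}

# quick self-check with a synthetic image in a temporary directory
import tempfile
tmp = Path(tempfile.mkdtemp())
Image.fromarray(np.zeros((64, 64, 3), dtype=np.uint8)).save(tmp / "dog.png")
sample = DreamBoothDataset(tmp, "a photo of a5o8 dog", size=64)[0]
```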
If `local_model_dir` is set in `config.yaml`, the pretrained Stable Diffusion 2.1 model will be downloaded automatically from the HuggingFace Hub on first run. You can also trigger the download manually:
```
uv run python -m sd21_boft.download_model
```

All hyperparameters are centralized in `config.yaml`. Key settings include:
| Parameter | Description |
|---|---|
| `pretrained_model` | HuggingFace model ID (`sd2-community/stable-diffusion-2-1`) |
| `prompt` | Training prompt containing the unique identifier (`a photo of a5o8 dog`) |
| `boft_target_modules` | UNet attention layers to apply BOFT to (`to_q`, `to_k`, `to_v`, `to_out.0`) |
| `num_train_epochs` | Number of training epochs |
| `learning_rate` | Learning rate |
| `validation_prompts` | Prompts used for periodic validation during training |
| `inference_prompts` | Prompts used for final inference |
```
uv run python -m sd21_boft.train
```

This will:
- Load the pretrained Stable Diffusion 2.1 model
- Apply BOFT adapters to the UNet attention layers
- Fine-tune on the DreamBooth dataset
- Save adapter checkpoints and validation images periodically
- Plot the training loss curve upon completion
```
uv run python -m sd21_boft.inference
```

This generates images for each prompt in `inference_prompts` using both the original model and the fine-tuned model (the latest checkpoint by default), saving the results to `inference_output/`.
Some non-core modules (dependency specification, logging setup, configuration loading, loss curve plotting) were written with the assistance of GitHub Copilot (Claude Opus 4.6 model). In addition, the README and the report were polished and formatted with the same tool.
- Rombach, R., Blattmann, A., Lorenz, D., Esser, P., and Ommer, B. "High-resolution image synthesis with latent diffusion models." CVPR, 2022.
- Ruiz, N., Li, Y., Jampani, V., Pritch, Y., Rubinstein, M., and Aberman, K. "DreamBooth: Fine tuning text-to-image diffusion models for subject-driven generation." CVPR, 2023.
- Qiu, Z., Liu, W., Feng, H., Xue, Y., Feng, Y., Liu, Z., ... and Schölkopf, B. "Controlling text-to-image diffusion by orthogonal finetuning." NeurIPS, 2023.
- Liu, W., Qiu, Z., Feng, Y., Xiu, Y., Xue, Y., Yu, L., ... and Schölkopf, B. "Parameter-efficient orthogonal finetuning via butterfly factorization." ICLR, 2024.
- The training and inference code is written with reference to: HuggingFace. PEFT BOFT DreamBooth example, 2024.
- The finetuning dataset is from: Google. DreamBooth dataset — dog2, 2023.