
🧠 DDIM Face Generation

A Denoising Diffusion Implicit Model trained from scratch on 30,000 faces — no pretrained weights, no diffusers library. Pure PyTorch.



🖼️ Results — 100 Epochs on CelebA-HQ 64×64

Generated faces at 100 epochs

Faces generated from pure Gaussian noise — no post-processing


🚀 Live Demo

Demo features:

  • ✨ Generate — sample new faces from pure noise with adjustable DDIM steps
  • 🎞️ Trajectory — animated GIF showing the full denoising path (noise → face)
  • 🔀 Interpolate — spherical linear interpolation (slerp) between two faces
  • 📖 How it works — full architecture & training breakdown at the bottom of the page
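The interpolation mode blends two noise latents with slerp rather than a straight line, which keeps the interpolated latent on the Gaussian shell the model was trained on. A minimal sketch (illustrative; the repo's own helper in `visualize.py` may differ):

```python
import torch

def slerp(z0: torch.Tensor, z1: torch.Tensor, t: float) -> torch.Tensor:
    """Spherical linear interpolation between two noise tensors, t in [0, 1]."""
    z0_flat, z1_flat = z0.flatten(), z1.flatten()
    # Angle between the two latents on the unit sphere
    omega = torch.acos(torch.clamp(
        torch.dot(z0_flat / z0_flat.norm(), z1_flat / z1_flat.norm()),
        -1.0 + 1e-7, 1.0 - 1e-7))
    so = torch.sin(omega)
    return (torch.sin((1.0 - t) * omega) / so) * z0 \
         + (torch.sin(t * omega) / so) * z1
```

Each intermediate latent is then denoised with the same DDIM loop as a normal sample, yielding a smooth morph between the two endpoint faces.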

⚙️ Technical Details

| Component | Details |
| --- | --- |
| Architecture | U-Net with sinusoidal time embeddings + multi-head self-attention |
| Channels | [64, 128, 256, 256] |
| Parameters | 25.6M |
| Dataset | CelebA-HQ — 30,000 aligned faces at 64×64 |
| Training | 100 epochs, ~40 hours, Apple Silicon MPS (no cloud GPU) |
| Sampler | DDIM — 20 steps vs. DDPM's 1000 (50× speedup) |
| Noise schedule | Linear β: 1×10⁻⁴ → 0.02, T = 1000 |
| Inference weights | EMA (exponential moving average of training weights) |
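The schedule and sampler rows above can be sketched in a few lines. This is a hedged illustration of the standard linear-β schedule and the deterministic (η = 0) DDIM update, not necessarily the exact code in `diffusion.py`:

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)      # linear β schedule
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)   # ᾱ_t = ∏_{s≤t} α_s

def ddim_step(x_t: torch.Tensor, eps: torch.Tensor,
              t: int, t_prev: int) -> torch.Tensor:
    """One deterministic DDIM update; t and t_prev are Python ints."""
    ab_t = alpha_bar[t]
    ab_prev = alpha_bar[t_prev] if t_prev >= 0 else torch.tensor(1.0)
    # Predicted clean image from the ε-prediction
    x0_pred = (x_t - (1.0 - ab_t).sqrt() * eps) / ab_t.sqrt()
    # Jump directly to the previous (possibly much earlier) timestep
    return ab_prev.sqrt() * x0_pred + (1.0 - ab_prev).sqrt() * eps
```

Because each step re-derives `x0_pred` and jumps deterministically, the sampler can stride through a 20-step subsequence of the 1000 training timesteps instead of visiting every one.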

🏗️ Architecture

Input x_t (noisy image) + timestep t
            │
    ┌───────▼────────┐
    │  Time Embedding │  Sinusoidal → MLP → injected at every ResBlock
    └───────┬────────┘
            │
    ┌───────▼────────┐
    │    U-Net       │  4 resolution levels
    │                │  Self-attention at 8×8 and 16×16
    │  Down → Mid    │  GroupNorm + SiLU throughout
    │       → Up     │  Zero-init output conv (identity at init)
    └───────┬────────┘
            │
    predicted ε (noise)

Training objective: L = ||ε − ε_θ(√ᾱₜ x₀ + √(1−ᾱₜ) ε, t)||²
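The objective above translates directly into a training step: sample a random timestep, noise the clean image with the closed-form forward process, and regress the model's ε-prediction onto the true noise. A minimal sketch, assuming an ε-predicting `model(x_t, t)` (names are illustrative):

```python
import torch
import torch.nn.functional as F

T = 1000
alpha_bar = torch.cumprod(1.0 - torch.linspace(1e-4, 0.02, T), dim=0)

def diffusion_loss(model, x0: torch.Tensor) -> torch.Tensor:
    """L = ||ε − ε_θ(√ᾱ_t x0 + √(1−ᾱ_t) ε, t)||²"""
    t = torch.randint(0, T, (x0.shape[0],))           # random timestep per image
    eps = torch.randn_like(x0)                        # target noise
    ab = alpha_bar[t].view(-1, 1, 1, 1)
    x_t = ab.sqrt() * x0 + (1.0 - ab).sqrt() * eps    # forward process q(x_t | x0)
    return F.mse_loss(model(x_t, t), eps)
```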


📁 Project Structure

minidiffusion/
├── models/
│   ├── attention.py     # Multi-head self-attention (2D spatial)
│   ├── unet.py          # Full U-Net with time embeddings
│   └── diffusion.py     # DDPM training + DDIM sampling + EMA + AdamW
├── utils/
│   ├── dataset.py       # CelebA-HQ dataloader
│   └── visualize.py     # Trajectory GIF, interpolation grid
├── train.py             # Training loop — W&B logging, auto-resume
├── sample.py            # Inference — grid, trajectory, interpolation, compare
├── app.py               # Gradio demo UI
└── config.py            # All hyperparameters

🔧 Built From Scratch

Every component is hand-written — no diffusers, no guided-diffusion, no pretrained encoders:

attention.py · unet.py · diffusion.py · dataset.py · train.py

Notable engineering decisions:

  • Custom CPU-resident AdamW — works around an MPS NaN bug in PyTorch 2.3.1 where zero-grad parameters corrupt optimizer state, and saves ~2 GB of GPU memory as a side effect
  • EMA shadow on CPU — keeps a smoothed copy of the weights off the GPU, saving another ~1 GB
  • MPS-safe DDIM indexing — tensor indexing with MPS buffers returns garbage in some PyTorch builds; fixed by using Python ints throughout the sampling loop
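The CPU-resident EMA shadow can be sketched as follows. This is an illustrative pattern, not the repo's exact class: it averages everything in `state_dict()` (buffers included) and keeps the shadow on the CPU so it costs no GPU memory.

```python
import torch

class EMA:
    """CPU-resident exponential moving average of model weights (sketch)."""

    def __init__(self, model: torch.nn.Module, decay: float = 0.999):
        self.decay = decay
        # Shadow copy lives on the CPU, off the training device.
        self.shadow = {k: v.detach().float().cpu().clone()
                       for k, v in model.state_dict().items()}

    @torch.no_grad()
    def update(self, model: torch.nn.Module) -> None:
        for k, v in model.state_dict().items():
            # shadow ← decay·shadow + (1 − decay)·current
            self.shadow[k].mul_(self.decay).add_(
                v.detach().float().cpu(), alpha=1.0 - self.decay)

    def copy_to(self, model: torch.nn.Module) -> None:
        """Load the smoothed weights into a model for inference."""
        model.load_state_dict(self.shadow)
```

Calling `update()` once per optimizer step and `copy_to()` on a fresh model at inference time gives the smoothed weights the README's "Inference weights: EMA" row refers to.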

🏃 Run Locally

git clone https://github.com/Gh-Novel/DDIM_Image_Generation.git
cd DDIM_Image_Generation
pip install -r requirements.txt

# Run the Gradio demo (uses bundled checkpoint)
python app.py

# Or generate samples directly
python sample.py --ckpt checkpoints/stage-64_best.pt --num 16 --steps 50

# Train from scratch on your own data
python train.py --image-size 64 --epochs 100 --run-name my-run
