
feat: LoRA adapter support for dit-vae #10

Draft
Copilot wants to merge 3 commits into master from copilot/port-lora-loading-feature

Conversation


Copilot AI commented Mar 7, 2026

Adds runtime LoRA adapter loading to dit-vae via a PEFT-format adapter_model.safetensors, with deltas applied in-graph across all DiT self-attention, cross-attention, and MLP projections.

New files

  • src/safetensors.h — zero-dependency safetensors reader (header length + JSON metadata + raw tensor reads by offset)
  • src/dit-lora.cpp — dit_ggml_load_lora(): normalises all PEFT key-name variants, matches A/B pairs per slot, transposes PyTorch row-major → ggml column-major, uploads to backend
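The reader in src/safetensors.h follows the documented safetensors layout: an 8-byte little-endian length, then that many bytes of JSON metadata, then the raw tensor section (per-tensor `data_offsets` in the JSON are relative to the start of that section). A minimal sketch of the preamble parse, assuming only that layout (the struct and function names here are illustrative, not the actual header's API):

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Result of parsing the safetensors preamble from an in-memory buffer.
struct SafetensorsHeader {
    std::string json;   // raw JSON metadata (tensor names, dtypes, shapes, offsets)
    size_t data_start;  // absolute offset where the tensor data section begins
};

SafetensorsHeader parse_safetensors_header(const std::vector<uint8_t>& buf) {
    if (buf.size() < 8) throw std::runtime_error("buffer too small");
    // Bytes 0..7: little-endian u64 length of the JSON metadata.
    uint64_t json_len = 0;
    for (int i = 7; i >= 0; --i) json_len = (json_len << 8) | buf[i];
    if (8 + json_len > buf.size()) throw std::runtime_error("truncated header");
    SafetensorsHeader h;
    h.json.assign(reinterpret_cast<const char*>(buf.data()) + 8,
                  static_cast<size_t>(json_len));
    h.data_start = 8 + static_cast<size_t>(json_len);
    return h;
}
```

A tensor's bytes then live at `data_start + data_offsets[0]` through `data_start + data_offsets[1]`, which is all the "raw tensor reads by offset" amount to.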

Core changes

  • src/dit.h — DiTGGMLLayer gains 22 LoRA tensor pointer slots (SA/CA Q/K/V/O + MLP gate/up/down); DiTGGML gains lora_wctx (independent weight context) and lora_scale
  • src/dit-graph.h — new dit_ggml_linear_lora() (W@x + scale·B@A@x); self-attn, cross-attn, and MLP builders apply deltas on both fused and separate weight paths
  • src/request.h / src/request.cpp — custom_tag (trigger word auto-appended to caption before condition encoding) and genre fields; parsing aliases for formatted_lyrics, language, is_instrumental
  • tools/dit-vae.cpp — --lora <path> and --lora-scale <float> (default 1.0); adapter loaded once, reused across all requests
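The key-name normalisation in dit_ggml_load_lora() boils down to stripping PEFT decorations from each adapter key so A/B pairs land on the same base slot. A sketch of that idea, where the specific prefix and suffix variants handled are assumptions for illustration, not the loader's exhaustive list:

```cpp
#include <optional>
#include <string>

// One PEFT adapter key reduced to (base key, which matrix).
struct LoraKey {
    std::string base;  // key with LoRA decorations stripped, e.g. the target projection
    bool is_b;         // true for the B (up) matrix, false for A (down)
};

std::optional<LoraKey> normalize_peft_key(std::string key) {
    // Common PEFT prefix on every adapter tensor (assumed variant).
    const std::string prefix = "base_model.model.";
    if (key.rfind(prefix, 0) == 0) key.erase(0, prefix.size());
    // Trailing ".weight" (assumed variant).
    const std::string suffix = ".weight";
    if (key.size() > suffix.size() &&
        key.compare(key.size() - suffix.size(), suffix.size(), suffix) == 0)
        key.erase(key.size() - suffix.size());
    // The ".lora_A"/".lora_B" infix decides which half of the pair this is.
    for (const std::string tag : {".lora_A", ".lora_B"}) {
        size_t pos = key.find(tag);
        if (pos != std::string::npos) {
            bool is_b = (tag == ".lora_B");
            key.erase(pos, tag.size());
            return LoraKey{key, is_b};
        }
    }
    return std::nullopt; // not a LoRA tensor (bias, config blob, etc.)
}
```

Keys that normalise to the same base string are then matched as an A/B pair for one projection slot.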

Usage

```
./build/dit-vae \
    --request request0.json \
    --text-encoder models/Qwen3-Embedding-0.6B-Q8_0.gguf \
    --dit   models/acestep-v15-turbo-Q8_0.gguf \
    --vae   models/vae-BF16.gguf \
    --lora  lora/adapter_model.safetensors \
    --lora-scale 1.0
```

request0.json:

```json
{
  "caption": "Nu-disco track with funky bassline",
  "custom_tag": "crydamoure",
  "inference_steps": 8,
  "shift": 3
}
```

custom_tag is appended to the caption so the trigger word reaches the condition encoder. See examples/lora.sh and examples/lora.json for a full runnable example.

Deltas are applied in the compute graph (base weights are not mutated). When lora_scale == 0 all LoRA paths are no-ops. All base weight fusion variants (full QKV, partial QK, separate) are handled.
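The per-projection math from dit_ggml_linear_lora() (y = W@x + scale·B@A@x) can be sketched in plain C++ to show both properties claimed above: the base weight W is never touched, and scale == 0 reproduces the base projection exactly. Matrix layouts and names here are illustrative (W is out×in, A is r×in, B is out×r, row-major):

```cpp
#include <cstddef>
#include <vector>

using Mat = std::vector<std::vector<float>>;

// Dense matrix-vector product, y = m @ x.
std::vector<float> matvec(const Mat& m, const std::vector<float>& x) {
    std::vector<float> y(m.size(), 0.0f);
    for (size_t i = 0; i < m.size(); ++i)
        for (size_t j = 0; j < x.size(); ++j)
            y[i] += m[i][j] * x[j];
    return y;
}

// y = W@x + scale * B@(A@x): the delta rides alongside the base path,
// so W stays unmodified and scale == 0 is a no-op.
std::vector<float> linear_lora(const Mat& W, const Mat& A, const Mat& B,
                               const std::vector<float>& x, float scale) {
    std::vector<float> y  = matvec(W, x);  // base projection
    std::vector<float> ax = matvec(A, x);  // down-project to rank r
    std::vector<float> d  = matvec(B, ax); // up-project back to out dim
    for (size_t i = 0; i < y.size(); ++i) y[i] += scale * d[i];
    return y;
}
```

Computing B@(A@x) instead of merging W + scale·B@A keeps the cost at two skinny rank-r products per token and lets the same base weights serve requests with different scales.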



Co-authored-by: lmangani <1423657+lmangani@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add LoRA loading feature to master branch" to "Port LoRA adapter loading from lora-patches into master" on Mar 7, 2026
Copilot AI changed the title from "Port LoRA adapter loading from lora-patches into master" to "feat: LoRA adapter support for dit-vae" on Mar 7, 2026
