Closed
43 commits
a6082cb
Fix SDXL TensorRT engine build failure on Windows
dotsimulate Mar 29, 2026
d71085b
feat: clone decode_image output to prevent TRT VAE buffer reuse
INTER-NYC Mar 31, 2026
28491ce
perf: Tier 2 inference performance optimizations
INTER-NYC Mar 31, 2026
fb293ba
feat: auto-resolve IP-Adapter model paths based on detected architecture
INTER-NYC Mar 31, 2026
7d29e08
Fix IP-Adapter crash on SD2.1 models (sd-turbo) due to non-existent i…
INTER-NYC Apr 1, 2026
117cbbc
Add cuda-python 13.x compatibility fix for cudart import
INTER-NYC Apr 1, 2026
bd9a2d3
Quick-win CUDA optimizations: pre-allocated buffers + L2 cache persis…
INTER-NYC Apr 1, 2026
85307e8
Fix hardcoded float16 autocasts and add fp32 precision for scheduler …
INTER-NYC Apr 1, 2026
1128d97
Fix L2 cache: second reserve call was resetting reservation to 0
INTER-NYC Apr 1, 2026
7bf2366
fix: report inference FPS separately from output FPS when similar ima…
INTER-NYC Apr 1, 2026
9c22342
VRAM reduction: text encoder offloading + max_batch_size 4→2
INTER-NYC Apr 1, 2026
1efb9ef
Revert max_batch_size 4→2: unsafe for cfg_type=full/initialize
INTER-NYC Apr 1, 2026
697b548
perf: pre-allocate image output buffers, replace .clone() with .copy_…
INTER-NYC Apr 2, 2026
818b03d
fix: use opencv-contrib-python 4.9.0.80 and add FP8 deps (modelopt, c…
INTER-NYC Apr 2, 2026
aa21e73
fix: pin onnx 1.17.0, onnxruntime-gpu 1.22.0; remove CPU onnxruntime …
INTER-NYC Apr 2, 2026
2b6c0aa
fix: bump onnx 1.18.0 + onnxruntime-gpu 1.24.3 (modelopt FLOAT4E2M1 +…
INTER-NYC Apr 2, 2026
600c5bf
fix: patch ByteSize() for >2GB ONNX in modelopt FP8 quantization
INTER-NYC Apr 2, 2026
847be93
fix: reduce FP8 calibration batches 128→8 (KVO cache OOM, 281GB→17GB)
INTER-NYC Apr 2, 2026
00cf0c7
fix: export UNet ONNX at opset 19 when FP8 enabled to skip modelopt v…
INTER-NYC Apr 2, 2026
9e22ea9
fix: merge calibration list-of-dicts into stacked dict for modelopt C…
INTER-NYC Apr 2, 2026
519069f
fix: add NVIDIA DLLs to PATH and retry without quantize_mha on ORT EP…
INTER-NYC Apr 2, 2026
cfca95b
fix: use single calibration batch for modelopt (avoid rank mismatch),…
INTER-NYC Apr 2, 2026
ccecf37
fix: resolve 4 FP8 quantization bugs for TRT 10.12 cached attention e…
INTER-NYC Apr 3, 2026
0f50188
perf: add CUDA/PyTorch env var tuning and cudnn.benchmark
INTER-NYC Apr 4, 2026
18fc5ed
fix: prevent FP8 engine build intermediate file bloat on Windows
INTER-NYC Apr 4, 2026
888d20a
fix(l2-cache): add TRT activation caching path to setup_l2_persistence
INTER-NYC Apr 4, 2026
72f6409
perf: clean up deprecated TRT 10.x API usage in engine builder and pr…
INTER-NYC Apr 4, 2026
07093bf
fix: clamp TRT persistent_cache_limit to L2_cache_size//2 to avoid ex…
INTER-NYC Apr 4, 2026
5dc8af4
fix(fp8): remove direct_io_types/simplify, make allocate_buffers FP8-…
INTER-NYC Apr 4, 2026
95d34b8
perf(trt): static spatial shapes + tactic cleanup for engine builder
INTER-NYC Apr 5, 2026
fe85327
perf: Tier 1 hot-path allocation elimination (Phase A-C)
INTER-NYC Apr 5, 2026
cd9b6ec
perf: skip text encoder reload on identical prompt; seed FPS EMA from…
INTER-NYC Apr 5, 2026
1e6b351
fix: guard ControlNet TRT engine compilation behind acceleration check
INTER-NYC Apr 5, 2026
b75d15b
fix: remove empty_cache() from text encoder offload to prevent prompt…
INTER-NYC Apr 5, 2026
e548b40
perf: keep text encoders on GPU during inference; add force_offload f…
INTER-NYC Apr 5, 2026
2ed2996
perf(trt): fully static batch profiles to unlock l2tc on UNet
INTER-NYC Apr 5, 2026
0f9d1d6
feat(trt): add TRT profiling infrastructure gated by STREAMDIFFUSION_…
INTER-NYC Apr 5, 2026
791bd26
perf(trt): reduce builder_optimization_level from 4 to 3 for static s…
INTER-NYC Apr 5, 2026
3a44259
revert(trt): restore builder_optimization_level=4; tactic 0x3e9 is a …
INTER-NYC Apr 5, 2026
f1fc4bf
fix(trt): guard aten::copy behind _use_prealloc to unblock ONNX expor…
INTER-NYC Apr 5, 2026
8d75198
fix(controlnet): pass pipeline resolution to TRT engine builder
INTER-NYC Apr 5, 2026
0454830
perf(controlnet): enable CUDA graphs for ControlNet TRT engine
INTER-NYC Apr 5, 2026
b0eeab2
chore(installer): overhaul install scripts — pins, portability, TRT v…
INTER-NYC Apr 7, 2026
9 changes: 9 additions & 0 deletions .charlie/config.yml
@@ -0,0 +1,9 @@
checkCommands:
  fix: ruff check --fix . && ruff format .
  lint: pip install ruff && ruff check .
  # Type checking not yet configured - add when ready:
  # types: pip install pyrefly && pyrefly check
  # Tests not yet available - add when CPU-compatible tests exist:
  # test: pip install pytest && pytest tests/ -x -q --ignore=tests/gpu/
beta:
  canApprovePullRequests: false
13 changes: 13 additions & 0 deletions .charlie/instructions/code-style.md
@@ -0,0 +1,13 @@
# StreamDiffusion Code Style

Charlie reads CLAUDE.md automatically for project context. These are additional rules.

## Rules

- [R1] Follow existing patterns in the codebase — check surrounding code before suggesting changes
- [R2] Ensure ruff lint and format checks pass: `ruff check . && ruff format --check .` (line-length 119)
- [R3] CUDA kernels and device operations must include error checking — never ignore return codes
- [R4] TensorRT engine building/loading code must handle version compatibility explicitly
- [R5] Use type hints for all new public functions and class methods
- [R6] TouchDesigner extension methods must follow the TD callback pattern (onXxx naming)
- [R7] Do not commit CLAUDE.md, MEMORY.md, or .claude/ — these are local-only files
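
Rule R4 can be met in several ways. As a minimal sketch, assuming the installed version is available as a dotted string (for example from `tensorrt.__version__`), an explicit gate might look like the following. The helper names `parse_version` and `check_trt_version` and the supported range are hypothetical placeholders, not part of this PR.

```python
import re


def parse_version(text: str) -> tuple:
    """Turn a dotted version such as "10.12.0.36" into a tuple of ints."""
    numbers = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component
        numbers.append(int(match.group()))
    return tuple(numbers)


def check_trt_version(installed: str, minimum=(10, 0), max_major=10) -> None:
    """Raise RuntimeError when the version is outside the assumed range.

    The default range is an illustrative assumption only.
    """
    version = parse_version(installed)
    # Pad short versions so that "10" compares equal to (10, 0).
    version += (0,) * (len(minimum) - len(version))
    if version < minimum:
        raise RuntimeError(f"TensorRT {installed} is older than {minimum}")
    if version[0] > max_major:
        raise RuntimeError(f"TensorRT {installed} is newer than major {max_major}")


if __name__ == "__main__":
    check_trt_version("10.12.0.36")  # passes silently
    try:
        check_trt_version("8.6.1")
    except RuntimeError as exc:
        print(f"rejected: {exc}")
```

Failing loudly at load time keeps a version mismatch from surfacing later as an opaque engine deserialization error.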
118 changes: 118 additions & 0 deletions .claude/hooks/git-commit-enforcer.py
@@ -0,0 +1,118 @@
#!/usr/bin/env python3
"""
PreToolUse hook to enforce safe git commits via commit_enhanced.sh
instead of raw 'git commit' commands.

This hook intercepts any 'git commit' bash command and redirects it
to scripts/git/commit_enhanced.sh for validation and safety checks.

Benefits:
- Automatic lint validation (ruff, black, isort)
- Local file exclusion (CLAUDE.md, MEMORY.md, _archive/)
- Commit message format validation
- Branch-specific protection
"""

import json
import re
import shlex
import sys


def main():
    try:
        input_data = json.load(sys.stdin)
    except (json.JSONDecodeError, ValueError):
        sys.exit(0)

    tool_name = input_data.get("tool_name", "")
    tool_input = input_data.get("tool_input", {})
    command = tool_input.get("command", "")

    # Only intercept Bash tool
    if tool_name != "Bash":
        sys.exit(0)

    # Detect git commit patterns
    # Match: git commit, git commit -m, git commit --message, etc.
    git_commit_pattern = r"\bgit\s+commit\b"
    if not re.search(git_commit_pattern, command):
        sys.exit(0)

    # Respect --no-verify flag - user explicitly wants to bypass hooks.
    # Whitespace lookarounds are used instead of \b, because \b never matches
    # between a space and a leading "-".
    no_verify_pattern = r"(?<!\S)(--no-verify|-n)(?!\S)"
    if re.search(no_verify_pattern, command):
        # Allow raw command to pass through
        sys.exit(0)

    # SMART DETECTION: Allow legitimate git commit cases to pass through
    # Only intercept standard new commits (git commit -m "message")
    ALLOWED_PATTERNS = [
        r"--amend",  # Amending previous commit
        r"--no-edit",  # Merge/rebase completion
        r"--allow-empty",  # Empty commits (rare but valid)
        r"--fixup",  # Fixup commits for rebase
        r"--squash",  # Squash commits for rebase
    ]

    # Check if any allowed pattern is present
    for pattern in ALLOWED_PATTERNS:
        if re.search(pattern, command):
            # Allow special commit types to pass through
            sys.exit(0)

    # ONLY intercept commits with a quoted message (standard new commits):
    #   -m "message", -m 'message', --message "message"
    # The backreference \1 requires the closing quote to match the opening
    # one, so an apostrophe inside a double-quoted message is preserved.
    msg_match = re.search(r'(?<!\S)(?:-m|--message)\s+(["\'])(.*?)\1', command, re.DOTALL)
    if not msg_match:
        # No quoted message = likely interactive or special case
        sys.exit(0)
    message = msg_match.group(2)

    # Construct updated command - shell script runs natively in Git Bash.
    # shlex.quote keeps the shell from expanding $(...) or backticks in the message.
    if message:
        updated_command = f"./scripts/git/commit_enhanced.sh {shlex.quote(message)}"
    else:
        # Empty message - script will prompt or use default
        updated_command = "./scripts/git/commit_enhanced.sh"

    # Return hook decision with updated command
    output = {
        "hookSpecificOutput": {
            "hookEventName": "PreToolUse",
            "permissionDecision": "allow",
            "permissionDecisionReason": (
                "Routing through safe commit handler (commit_enhanced.sh) "
                "for validation, lint checks, and local file protection"
            ),
            "updatedInput": {"command": updated_command},
        }
    }

    print(json.dumps(output))
    sys.exit(0)


if __name__ == "__main__":
    main()
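
The decision order of the hook above can be illustrated with a simplified, standalone sketch. The function name `classify` and the sample commands are hypothetical. The sketch detects flags with whitespace lookarounds, since `\b` does not match between a space and a leading hyphen, and it only labels commands rather than rewriting them.

```python
import re

# Flags that mark a commit as "special" and let it bypass the wrapper script.
PASS_THROUGH_FLAGS = ("--amend", "--no-edit", "--allow-empty", "--fixup", "--squash")


def classify(command: str) -> str:
    """Return "reroute" for a plain `git commit -m "..."`, otherwise "pass"."""
    if not re.search(r"\bgit\s+commit\b", command):
        return "pass"  # not a commit at all
    if re.search(r"(?<!\S)(--no-verify|-n)(?!\S)", command):
        return "pass"  # user asked to bypass hooks
    if any(flag in command for flag in PASS_THROUGH_FLAGS):
        return "pass"  # amend, fixup, squash, and similar
    if not re.search(r'(?<!\S)-m\s+["\']', command):
        return "pass"  # no quoted -m message: interactive or special case
    return "reroute"


if __name__ == "__main__":
    samples = [
        "git status",
        'git commit -m "fix: typo"',
        'git commit --no-verify -m "wip"',
        "git commit --amend --no-edit",
        "git commit",
    ]
    for cmd in samples:
        print(f"{classify(cmd):8} {cmd}")
```

Only the second sample is classified as `reroute`; every other command reaches git unchanged.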
45 changes: 45 additions & 0 deletions .gitattributes
@@ -0,0 +1,45 @@
# Line endings
*.sh text eol=lf
*.py text eol=lf
*.md text eol=lf
*.yml text eol=lf
*.yaml text eol=lf
*.json text eol=lf
*.toml text eol=lf
*.txt text eol=lf
*.bat text eol=crlf
*.cmd text eol=crlf

# Merge strategies
*.py merge=diff3
*.json merge=diff3
*.yaml merge=diff3
*.yml merge=diff3
*.toml merge=diff3
*.md merge=diff3
CHANGELOG.md merge=union

# Binary files (ML models, images, compiled artifacts)
*.onnx binary
*.engine binary
*.trt binary
*.pth binary
*.pt binary
*.safetensors binary
*.pkl binary
*.pb binary
*.h5 binary

# Images
*.png binary
*.jpg binary
*.jpeg binary
*.gif binary
*.ico binary
*.webp binary

# Archives
*.zip binary
*.tar binary
*.gz binary
*.whl binary
61 changes: 61 additions & 0 deletions .githooks/pre-commit
@@ -0,0 +1,61 @@
#!/bin/sh
# Pre-commit hook to prevent accidental commit of local-only files
# Prevents commits of CLAUDE.md, MEMORY.md, .claude/ directory
# Allows DELETIONS (D) but blocks additions (A) or modifications (M)

echo "Checking for local-only files..."

# Check if any local-only files are being ADDED or MODIFIED (not deleted)
PROBLEMATIC_FILES=$(git diff --cached --name-status | grep -E "^[AM]\s+(CLAUDE\.md|MEMORY\.md|\.claude/)")

if [ -n "$PROBLEMATIC_FILES" ]; then
    echo "ERROR: Attempting to add or modify local-only files!"
    echo ""
    echo "The following files must remain local only:"
    echo "- CLAUDE.md (development context, project-specific AI instructions)"
    echo "- MEMORY.md (session memory)"
    echo "- .claude/ (Claude Code configuration, hooks, skills)"
    echo ""
    echo "Problematic files:"
    echo "$PROBLEMATIC_FILES"
    echo ""
    echo "DELETIONS are allowed (removing from git tracking)"
    echo "ADDITIONS/MODIFICATIONS are blocked (privacy protection)"
    echo ""
    echo "To fix this, reset the problematic files:"
    echo " git reset HEAD <file>"
    echo ""
    exit 1
fi

# Check for deletions (which are allowed and expected)
DELETED_FILES=$(git diff --cached --name-status | grep -E "^D\s+(CLAUDE\.md|MEMORY\.md|\.claude/)")
if [ -n "$DELETED_FILES" ]; then
    echo " Local-only files being removed from git tracking (as intended)"
fi

echo " No local-only files detected"
echo " Privacy protection active"

# Optional: Check code quality for Python files (non-blocking)
PYTHON_FILES=$(git diff --cached --name-only --diff-filter=ACM | grep '\.py$')

if [ -n "$PYTHON_FILES" ]; then
    echo ""
    echo "Checking code quality (non-blocking)..."

    if command -v ruff > /dev/null 2>&1; then
        if ! ruff check $PYTHON_FILES > /dev/null 2>&1; then
            echo " WARNING: ruff found lint issues in staged Python files"
            echo " Run 'ruff check --fix .' to auto-fix, or 'ruff check .' to see details"
            echo " (Commit will proceed - fix lint issues when ready)"
        else
            echo " Code quality checks passed"
        fi
    else
        echo " ruff not found - skipping lint check (install: pip install ruff)"
    fi
fi

echo ""
exit 0
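
The add/modify versus delete rule in the hook above can be sketched in Python for clarity. This is an illustration under assumptions, not the hook itself: the function name `blocked_entries` is hypothetical, and the input is assumed to be the tab-separated text printed by `git diff --cached --name-status`.

```python
import re

# Paths that must stay local: two root-level files and the .claude/ directory.
LOCAL_ONLY = re.compile(r"^(CLAUDE\.md|MEMORY\.md|\.claude/)")


def blocked_entries(name_status: str) -> list:
    """Return staged local-only paths that are added (A) or modified (M)."""
    blocked = []
    for line in name_status.splitlines():
        parts = line.split("\t")
        if len(parts) < 2 or not parts[0]:
            continue  # skip blank or malformed lines
        status, path = parts[0], parts[-1]
        # Deletions (D) are allowed; only A and M are treated as violations.
        if status[0] in "AM" and LOCAL_ONLY.match(path):
            blocked.append(path)
    return blocked


if __name__ == "__main__":
    sample = "A\tCLAUDE.md\nD\tMEMORY.md\nM\t.claude/hooks/x.py\nM\tsrc/app.py\n"
    for path in blocked_entries(sample):
        print(f"blocked: {path}")
```

Like the shell pattern, this sketch checks only the `A` and `M` status letters, so a rename into a protected path would not be caught by either version.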
14 changes: 14 additions & 0 deletions .github/CODEOWNERS
@@ -0,0 +1,14 @@
# Default owners for all files
* @forkni

# CUDA/TensorRT specific code
*.cu @forkni
*.cuh @forkni
*tensorrt* @forkni
*trt* @forkni

# CI/CD workflows
.github/ @forkni

# Core streaming pipeline
src/streamdiffusion/ @forkni
114 changes: 114 additions & 0 deletions .github/ISSUE_TEMPLATE/bug_report.yml
@@ -0,0 +1,114 @@
name: Bug Report
description: Report a bug or unexpected behavior
labels: ["bug", "status/needs-triage"]
body:
  - type: markdown
    attributes:
      value: |
        Thanks for taking the time to report a bug! Please fill out the form below.

  - type: textarea
    id: description
    attributes:
      label: Bug Description
      description: A clear and concise description of the bug.
      placeholder: What happened? What did you expect to happen?
    validations:
      required: true

  - type: textarea
    id: steps
    attributes:
      label: Steps to Reproduce
      description: Steps to reproduce the behavior.
      placeholder: |
        1. Load model '...'
        2. Set parameters '...'
        3. Run inference '...'
        4. See error
    validations:
      required: true

  - type: textarea
    id: error
    attributes:
      label: Error Output
      description: Paste the full error output or traceback.
      render: shell

  - type: markdown
    attributes:
      value: "## Environment"

  - type: input
    id: gpu
    attributes:
      label: GPU Model
      placeholder: e.g. NVIDIA RTX 3090, A100 80GB
    validations:
      required: true

  - type: input
    id: cuda
    attributes:
      label: CUDA Version
      placeholder: e.g. 11.8
    validations:
      required: true

  - type: input
    id: driver
    attributes:
      label: NVIDIA Driver Version
      placeholder: e.g. 525.85.12
    validations:
      required: true

  - type: input
    id: tensorrt
    attributes:
      label: TensorRT Version (if applicable)
      placeholder: e.g. 8.6.1

  - type: input
    id: python
    attributes:
      label: Python Version
      placeholder: e.g. 3.10.12
    validations:
      required: true

  - type: input
    id: torch
    attributes:
      label: PyTorch Version
      placeholder: e.g. 2.0.1+cu118
    validations:
      required: true

  - type: dropdown
    id: os
    attributes:
      label: Operating System
      options:
        - Ubuntu 22.04
        - Ubuntu 20.04
        - Windows 11
        - Windows 10
        - macOS (MPS)
        - Other Linux
        - Other
    validations:
      required: true

  - type: input
    id: branch
    attributes:
      label: Branch / Commit
      placeholder: e.g. SDTD_031_stable, main, commit hash

  - type: textarea
    id: context
    attributes:
      label: Additional Context
      description: Any other context, config files, or screenshots.
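
Several of the environment fields in this form can be gathered with a short script. The following is a rough sketch, not something shipped in this PR: the function name `collect_environment` is hypothetical, and the distribution names `torch` and `tensorrt` are assumptions about how the packages were installed. GPU model, CUDA version, and driver version are not covered and still need to be filled in by hand.

```python
import platform
from importlib import metadata


def collect_environment() -> dict:
    """Collect the Python, OS, PyTorch, and TensorRT fields of the bug report."""
    info = {
        "Python Version": platform.python_version(),
        "Operating System": f"{platform.system()} {platform.release()}",
    }
    # Read installed package metadata instead of importing the packages,
    # so a broken CUDA setup cannot crash the report script.
    for label, dist_name in (("PyTorch Version", "torch"), ("TensorRT Version", "tensorrt")):
        try:
            info[label] = metadata.version(dist_name)
        except metadata.PackageNotFoundError:
            info[label] = "not installed"
    return info


if __name__ == "__main__":
    for label, value in collect_environment().items():
        print(f"{label}: {value}")
```

The printed lines can be pasted into the matching form fields.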