Releases · Aatricks/LightDiffusion-Next
V2.1.4beta1
What's Changed
- New minimalistic UI and added testing by @Aatricks in #20
- Feat/optimizations stability by @Aatricks in #21
Full Changelog: V2.1.3...V2.1.4beta1
V2.1.3
What's Changed
- NVFP4 (4-bit) Weight-Only Quantization Support by @Aatricks
- Implementation of 4-bit quantization for a ~75% reduction in weight memory usage (see the sketch after this list).
- Integrated support for Flux2 (Transformer + Klein Text Encoder), SDXL, and SD1.5 architectures.
- Optimized runtime dequantization to FP16/BF16 during the forward pass via comfy_cast_weights.
- Automated layer selection targeting weights >4096 elements to balance compression and quality.
- Added `weight_quantization` configuration to API, Context, and Pipeline for granular memory control.
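The mechanism described above can be pictured with a short PyTorch sketch. This is not the project's implementation (its runtime casting goes through `comfy_cast_weights`, and NVFP4 proper is a block-scaled floating-point 4-bit format, not plain int4); `quantize_4bit`, `QuantLinear`, and the threshold constant below are illustrative names built from the bullet points.

```python
import torch

# Per the release notes, only weights with more than 4096 elements are quantized.
ELEMENT_THRESHOLD = 4096

def quantize_4bit(w: torch.Tensor):
    """Symmetric 4-bit weight-only quantization: int4 codes plus one scale.
    Plain symmetric int4 is used here only to illustrate the mechanism."""
    scale = w.abs().amax().clamp(min=1e-8) / 7.0   # int4 symmetric range: [-8, 7]
    q = torch.round(w / scale).clamp(-8, 7).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor, dtype: torch.dtype) -> torch.Tensor:
    """Runtime dequantization back to FP16/BF16 during the forward pass."""
    return q.to(dtype) * scale

class QuantLinear(torch.nn.Module):
    """Holds 4-bit codes and dequantizes on the fly (codes are kept unpacked
    in int8 here; packing two codes per byte gives the real ~75% saving)."""
    def __init__(self, linear: torch.nn.Linear):
        super().__init__()
        q, scale = quantize_4bit(linear.weight.data)
        self.register_buffer("q_weight", q)
        self.register_buffer("scale", scale)
        self.bias = linear.bias

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = dequantize(self.q_weight, self.scale, x.dtype)
        return torch.nn.functional.linear(x, w, self.bias)

def quantize_large_layers(module: torch.nn.Module) -> None:
    """Automated layer selection: swap in QuantLinear only where it pays off."""
    for name, child in module.named_children():
        if isinstance(child, torch.nn.Linear) and child.weight.numel() > ELEMENT_THRESHOLD:
            setattr(module, name, QuantLinear(child))
        else:
            quantize_large_layers(child)
```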
Full Changelog: V2.1.2...V2.1.3
V2.1.2
What's Changed
- Fix: reload base model before HiresFix + stability & UX improvements (Flux2, VAE, ADetailer, settings history) by @Aatricks in #19
- Pipeline Stability & HiresFix Improvements by @Aatricks
- Fixed critical race conditions in VAE encoding during SDXL+Refiner workflows by enforcing blocking transfers (see the sketch after this list).
- Implemented logic to reload the base model before HiresFix when a refiner is used, preventing latent corruption.
- Ensured ADetailer explicitly uses the base model instead of the refiner for text-guided crop enhancements.
- Reverted non-essential SDXL changes to resolve regressions in Attention and conditioning modules.
- Settings Persistence & History Management by @Aatricks
- Implemented backend storage for settings history and last-used seeds.
- Added a collapsible "Settings History" section to the UI for quick restoration of previous configurations.
- Integrated image import functionality directly into the GenerationSettings component.
- Flux.2 & Core Model Optimization by @Aatricks
- Enhanced `torch.compile` integration with support for callables and improved logging.
- Fixed RoPE feature dimension alignment and added padding adjustments for Flux2.
- Improved FP8 quantization fallback logic for models lacking diffusion submodules.
- Added validation for model file existence (safetensors/pt) in the downloader.
- Image Processing & Batch Limits by @Aatricks
- Introduced `LD_MAX_IMAGES_PER_GROUP` to control processing limits and implemented chunking for large pipeline requests (sketched after this list).
- Updated AutoHDR to properly handle RGBA images (preserving alpha) and added fallbacks for missing LCMS or failed ICC transforms.
- Added telemetry for batch limit configuration.
- Testing & Infrastructure by @Aatricks
- Restructured the test suite into clear `e2e`, `integration`, and `unit` categories.
- Fixed frontend runtime crashes by importing missing Button and ImageMetadata types.
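"Enforcing blocking transfers" in the VAE race-condition fix can be read as the following PyTorch pattern; the function names are illustrative, not LightDiffusion-Next's API.

```python
import torch

def to_device_blocking(t: torch.Tensor, device: torch.device) -> torch.Tensor:
    # non_blocking=False makes the copy synchronous, so a VAE encode that
    # runs next can never observe a half-transferred latent tensor.
    return t.to(device, non_blocking=False)

def to_device_synced(t: torch.Tensor, device: torch.device) -> torch.Tensor:
    # Alternative: copy asynchronously, then synchronize the device before
    # any dependent work (useful when other streams consume the tensor).
    out = t.to(device, non_blocking=True)
    if device.type == "cuda":
        torch.cuda.synchronize(device)
    return out
```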
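The batch-limit item reduces to an environment-driven chunking loop. `LD_MAX_IMAGES_PER_GROUP` is the variable named in the notes; the helpers and the default value around it are hypothetical.

```python
import os
from typing import Callable, Iterator, List

# Cap from the release notes; the default value here is a guess.
MAX_IMAGES_PER_GROUP = int(os.environ.get("LD_MAX_IMAGES_PER_GROUP", "4"))

def chunked(items: List, size: int) -> Iterator[List]:
    """Split a list into consecutive groups of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def process_images(images: List, run_pipeline: Callable[[List], List]) -> List:
    """Run the pipeline group by group instead of on one oversized batch."""
    results: List = []
    for group in chunked(images, MAX_IMAGES_PER_GROUP):
        results.extend(run_pipeline(group))
    return results
```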
Full Changelog: V2.1.1...V2.1.2
V2.1.1
What's Changed
- Enhanced Preview and Message Management by @Aatricks
- Added generation ID handling to improve preview message tracking and management.
- Implemented configurable preview fidelity settings (format and quality) in AppInstance.
- Model Compilation and Image Processing Optimizations by @Aatricks
- Updated compile_model to default to 'max-autotune-no-cudagraphs' for better performance and stability.
- Introduced in-memory image byte storage in ImageSaver to reduce disk I/O during API responses.
- Added color utility functions for linear to sRGB conversion and Reinhard tonemapping (see the sketch after this list).
- Improved HiresFix and SDXL Support by @Aatricks
- Enhanced HiresFix with support for size conditioning specific to SDXL models.
- Refined handling of denoise and CFG parameters during the upscaling process.
- Comprehensive Img2Img Enhancements by @Aatricks
- Expanded API to support image uploads via local file paths, data URLs, and raw Base64 strings (see the parsing sketch after this list).
- Implemented robust image saving with automatic format conversion and size limit enforcement.
- Added a request filename prefix feature for improved output file organization.
- Pipeline Robustness and Bug Fixes by @Aatricks
- Enhanced tensor handling in sampling utilities and the multiscale manager for better stability.
- Fixed refiner prompt usage for per-sample HiresFix/ADetailer when using SDXL or Flux models.
- Added missing flux2 tokenizer merges configuration.
- Improved error handling for non-tensor outputs in VAE and Pipeline modules.
- Tooling and Distribution Updates by @Aatricks
- Added a dedicated downloader for Flux models to streamline setup.
- Included the frontend dist folder in the repository.
- Expanded Integration and Unit Testing by @Aatricks
- Added tests for FP8 quantization and torch.compile compatibility.
- Introduced comprehensive integration tests for batched processing and high-payload img2img requests.
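The color utilities mentioned above correspond to standard formulas; this sketch shows the usual definitions (the project's actual signatures may differ).

```python
import torch

def linear_to_srgb(x: torch.Tensor) -> torch.Tensor:
    """Standard sRGB transfer function for linear-light input in [0, 1]."""
    x = x.clamp(0.0, 1.0)
    return torch.where(x <= 0.0031308,
                       12.92 * x,
                       1.055 * x.pow(1.0 / 2.4) - 0.055)

def reinhard_tonemap(x: torch.Tensor) -> torch.Tensor:
    """Classic Reinhard operator: maps HDR values in [0, inf) into [0, 1)."""
    return x / (1.0 + x)
```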
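Similarly, the three img2img upload forms (local path, data URL, raw Base64) come down to a dispatch like the following; `load_image_bytes` is an illustrative name, not the project's API.

```python
import base64
import os

def load_image_bytes(source: str) -> bytes:
    """Accept a local file path, a data URL, or a raw Base64 string."""
    if source.startswith("data:"):          # data URL: strip the header
        _, _, payload = source.partition(",")
        return base64.b64decode(payload)
    if os.path.isfile(source):              # local file path
        with open(source, "rb") as f:
            return f.read()
    return base64.b64decode(source)         # otherwise assume raw Base64
```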
Full Changelog: 2.1.0...V2.1.1
2.1.0
V2.0.0
What's Changed
- Full Flux.2 Klein 4B Distilled Support by @Aatricks in #17
- Implementation of the Flux2 transformer and Qwen-based text encoder.
- Optimized text conditioning with attention masks, text normalization, and vector input support.
- Resolution-dependent timestep scheduling and model sampling shift configurations.
- Aggressive VRAM management and partial model loading for consumer hardware.
- Fixed positional embeddings and VAE decoding logic specific to the Klein architecture.
- SDPA Backend Priority Management (SpargeAttn > SageAttention > Xformers) by @Aatricks (sketched after this list)
- Enhanced SDXL Condition Processing for improved prompt adherence by @Aatricks
- Optimized Model Reuse Logic to prevent redundant device transfers and reloads by @Aatricks
- Streamlined CI Workflow with improved error reporting and expanded unit/integration tests by @Aatricks
- Revamped Streamlit UI for synchronized resolution management and Flux2 presets by @Aatricks
- Fixed random seed generation to comply with PyTorch limits across all modules by @Aatricks
- Improved Img2Img upscale logic and dimension handling for DiT models by @Aatricks
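The backend priority can be sketched as a sequence of import probes. The module names below are the usual packages for these backends, but whether LightDiffusion-Next probes them this way is an assumption.

```python
# Probe optional attention backends in the priority order from the notes:
# SpargeAttn > SageAttention > Xformers, falling back to PyTorch SDPA.
def pick_attention_backend() -> str:
    try:
        import spas_sage_attn  # SpargeAttn (package name is an assumption)
        return "sparge"
    except ImportError:
        pass
    try:
        import sageattention   # SageAttention
        return "sage"
    except ImportError:
        pass
    try:
        import xformers.ops    # Xformers memory-efficient attention
        return "xformers"
    except ImportError:
        pass
    return "torch-sdpa"        # built into PyTorch >= 2.0, always available
```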
Full Changelog: V1.9.1...V2.0.0
V1.9.1
What's Changed
- Smart cfg by @Aatricks in #11
- ROCm and mps by @Aatricks in #13
- Avoid unnecessary GGUF model unloading after patches by @google-labs-jules[bot] in #16
- Optimize mmap release logic in Quantizer by @google-labs-jules[bot] in #15
- Dynamic Width Scaling in Condition Encoding by @google-labs-jules[bot] in #14
- Various optimizations to reduce device transfers by @Aatricks
- Implemented calculations caching and batching for flux and attention by @Aatricks
- Vectorized tensor indexing for schedulers by @Aatricks
- Implemented dynamic VAE tiling based on available VRAM by @Aatricks (sketched after this list)
- Together, these changes yield roughly a 30% inference speed improvement in SD1.5 scenarios
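Dynamic VAE tiling typically means picking the largest tile that fits in free VRAM. A minimal sketch follows; the candidate sizes and the per-pixel memory budget are rough placeholders, not figures from the project.

```python
import torch

def pick_vae_tile_size(candidates=(1024, 768, 512, 256)) -> int:
    """Choose the largest tile size whose decode plausibly fits in free VRAM."""
    if not torch.cuda.is_available():
        return candidates[-1]
    free_bytes, _total = torch.cuda.mem_get_info()
    # Rough activation budget per pixel (channels * fp32 * overhead) -- a guess.
    budget_per_pixel = 3 * 4 * 64
    for tile in candidates:
        if tile * tile * budget_per_pixel <= free_bytes * 0.8:  # keep 20% headroom
            return tile
    return candidates[-1]
```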
New Contributors
- @google-labs-jules[bot] made their first contribution in #16
Full Changelog: V1.9.0...V1.9.1