Releases: Aatricks/LightDiffusion-Next

V2.1.4beta1

13 Apr 15:25

What's Changed

Full Changelog: V2.1.3...V2.1.4beta1

V2.1.3

14 Feb 17:21

What's Changed

  • NVFP4 (4-bit) Weight-Only Quantization Support by @Aatricks
    • Implementation of 4-bit quantization for ~75% reduction in weight memory usage.
    • Integrated support for Flux2 (Transformer + Klein Text Encoder), SDXL, and SD1.5 architectures.
    • Optimized runtime dequantization to FP16/BF16 during the forward pass via comfy_cast_weights.
    • Automated layer selection targeting weights >4096 elements to balance compression and quality.
    • Added weight_quantization configuration to API, Context, and Pipeline for granular memory control.
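
As a rough illustration of the weight-only scheme described above (not the actual NVFP4 implementation, which packs values into a hardware FP4 format and operates on tensors), a symmetric 4-bit round trip with the >4096-element layer-selection rule might look like:

```python
def quantize_4bit(weights, threshold=4096):
    """Symmetric 4-bit weight-only quantization (illustrative sketch).
    Layers at or below `threshold` elements are left in full precision,
    mirroring the size-based layer selection described above."""
    if len(weights) <= threshold:
        return None, weights  # small layer: keep original weights
    scale = max(abs(w) for w in weights) / 7.0  # map to signed 4-bit range [-8, 7]
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return scale, q

def dequantize_4bit(scale, q):
    """Runtime dequantization back to floats, analogous to the FP16/BF16
    cast performed during the forward pass."""
    return [v * scale for v in q]
```

The ~75% memory saving follows from storing 4 bits per weight plus one per-layer scale instead of 16 bits per weight.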

Full Changelog: V2.1.2...V2.1.3

V2.1.2

12 Feb 16:39

What's Changed

  • Fix: reload base model before HiresFix + stability & UX improvements (Flux2, VAE, ADetailer, settings history) by @Aatricks in #19
  • Pipeline Stability & HiresFix Improvements by @Aatricks
    • Fixed critical race conditions in VAE encoding during SDXL+Refiner workflows by enforcing blocking transfers.
    • Implemented logic to reload the base model before HiresFix when a refiner is used, preventing latent corruption.
    • Ensured ADetailer explicitly uses the base model instead of the refiner for text-guided crop enhancements.
    • Reverted non-essential SDXL changes to resolve regressions in Attention and conditioning modules.
  • Settings Persistence & History Management by @Aatricks
    • Implemented backend storage for settings history and last-used seeds.
    • Added a collapsible "Settings History" section to the UI for quick restoration of previous configurations.
    • Integrated image import functionality directly into the GenerationSettings component.
  • Flux.2 & Core Model Optimization by @Aatricks
    • Enhanced torch.compile integration with support for callables and improved logging.
    • Fixed RoPE feature dimension alignment and added padding adjustments for Flux2.
    • Improved FP8 quantization fallback logic for models lacking diffusion submodules.
    • Added validation for model file existence (safetensors/pt) in the downloader.
  • Image Processing & Batch Limits by @Aatricks
    • Introduced LD_MAX_IMAGES_PER_GROUP to control processing limits and implemented chunking for large pipeline requests.
    • Updated AutoHDR to properly handle RGBA images (preserving alpha) and added fallbacks for missing LCMS or failed ICC transforms.
    • Added telemetry for batch limit configuration.
  • Testing & Infrastructure by @Aatricks
    • Restructured the test suite into clear e2e, integration, and unit categories.
    • Fixed frontend runtime crashes by importing the missing Button and ImageMetadata types.
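
The batch-limit mechanism can be sketched as follows. LD_MAX_IMAGES_PER_GROUP is the environment variable named in the notes above; the function name and the default limit of 16 are assumptions for illustration:

```python
import os

def chunk_image_group(images, default_limit=16):
    """Split a large pipeline request into groups no larger than the
    configured limit, so each group is processed in one pass.
    LD_MAX_IMAGES_PER_GROUP overrides the (assumed) default of 16."""
    limit = int(os.environ.get("LD_MAX_IMAGES_PER_GROUP", default_limit))
    return [images[i:i + limit] for i in range(0, len(images), limit)]
```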

Full Changelog: V2.1.1...V2.1.2

V2.1.1

10 Feb 18:02

What's Changed

  • Enhanced Preview and Message Management by @Aatricks
    • Added generation ID handling to improve preview message tracking and management.
    • Implemented configurable preview fidelity settings (format and quality) in AppInstance.
  • Model Compilation and Image Processing Optimizations by @Aatricks
    • Updated compile_model to default to 'max-autotune-no-cudagraphs' for better performance and stability.
    • Introduced in-memory image byte storage in ImageSaver to reduce disk I/O during API responses.
    • Added color utility functions for linear to sRGB conversion and Reinhard tonemapping.
  • Improved HiresFix and SDXL Support by @Aatricks
    • Enhanced HiresFix with support for size conditioning specific to SDXL models.
    • Refined handling of denoise and CFG parameters during the upscaling process.
  • Comprehensive Img2Img Enhancements by @Aatricks
    • Expanded API to support image uploads via local file paths, data URLs, and raw Base64 strings.
    • Implemented robust image saving with automatic format conversion and size limit enforcement.
    • Added a request filename prefix feature for improved output file organization.
  • Pipeline Robustness and Bug Fixes by @Aatricks
    • Enhanced tensor handling in sampling utilities and the multiscale manager for better stability.
    • Fixed refiner prompt usage for per-sample HiresFix/ADetailer when using SDXL or Flux models.
    • Added missing flux2 tokenizer merges configuration.
    • Improved error handling for non-tensor outputs in VAE and Pipeline modules.
  • Tooling and Distribution Updates by @Aatricks
    • Added a dedicated downloader for Flux models to streamline setup.
    • Included the frontend dist folder in the repository.
  • Expanded Integration and Unit Testing by @Aatricks
    • Added tests for FP8 quantization and torch.compile compatibility.
    • Introduced comprehensive integration tests for batched processing and high-payload img2img requests.
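
The color utilities mentioned above correspond to standard formulas; a scalar sketch is below (the real helpers presumably operate on whole tensors):

```python
def linear_to_srgb(c):
    """Standard sRGB transfer function for a linear-light value in [0, 1]:
    a linear segment near black, then a 1/2.4 gamma curve."""
    if c <= 0.0031308:
        return 12.92 * c
    return 1.055 * c ** (1 / 2.4) - 0.055

def reinhard_tonemap(c):
    """Classic Reinhard operator: compresses HDR intensities into [0, 1)."""
    return c / (1.0 + c)
```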

Full Changelog: 2.1.0...V2.1.1

2.1.0

09 Feb 20:07
5285f53

What's Changed

Full Changelog: V2.0.0...2.1.0

V2.0.0

04 Feb 18:01

What's Changed

  • Full Flux.2 Klein 4B Distilled Support by @Aatricks in #17
    • Implementation of the Flux2 transformer and Qwen-based text encoder.
    • Optimized text conditioning with attention masks, text normalization, and vector input support.
    • Resolution-dependent timestep scheduling and model sampling shift configurations.
    • Aggressive VRAM management and partial model loading for consumer hardware.
    • Fixed positional embeddings and VAE decoding logic specific to the Klein architecture.
  • SDPA Backend Priority Management (SpargeAttn > SageAttention > Xformers) by @Aatricks
  • Enhanced SDXL Condition Processing for improved prompt adherence by @Aatricks
  • Optimized Model Reuse Logic to prevent redundant device transfers and reloads by @Aatricks
  • Streamlined CI Workflow with improved error reporting and expanded unit/integration tests by @Aatricks
  • Revamped Streamlit UI for synchronized resolution management and Flux2 presets by @Aatricks
  • Fixed random seed generation to comply with PyTorch limits across all modules by @Aatricks
  • Improved Img2Img upscale logic and dimension handling for DiT models by @Aatricks
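
The backend priority management above amounts to a first-match fallback chain; in this sketch the string identifiers and the final plain-SDPA fallback are assumptions:

```python
def pick_attention_backend(available):
    """Return the highest-priority attention backend that is installed.
    Priority follows the release notes: SpargeAttn > SageAttention >
    Xformers, with plain PyTorch SDPA as an assumed final fallback."""
    for name in ("spargeattn", "sageattention", "xformers"):
        if name in available:
            return name
    return "torch_sdpa"  # assumed default when no accelerated backend exists
```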

Full Changelog: V1.9.1...V2.0.0

V1.9.1

10 Jan 20:16
625ac6f

What's Changed

  • Smart cfg by @Aatricks in #11
  • ROCm and mps by @Aatricks in #13
  • Avoid unnecessary GGUF model unloading after patches by @google-labs-jules[bot] in #16
  • Optimize mmap release logic in Quantizer by @google-labs-jules[bot] in #15
  • Dynamic Width Scaling in Condition Encoding by @google-labs-jules[bot] in #14
  • Various optimizations to reduce device transfers by @Aatricks
  • Implemented calculation caching and batching for Flux and attention by @Aatricks
  • Vectorized tensor indexing for schedulers by @Aatricks
  • Implemented dynamic VAE tiling based on available VRAM by @Aatricks
  • Together, these optimizations yield roughly 30% faster inference in SD1.5 scenarios
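
Dynamic VAE tiling presumably selects a decode tile size from the VRAM headroom at run time; the thresholds and tile sizes below are illustrative assumptions, not the project's actual values:

```python
def choose_vae_tile(free_vram_gb):
    """Pick the largest VAE decode tile that fits the available VRAM.
    Larger tiles mean fewer passes and fewer seams; the mapping here
    is an assumed example."""
    for min_gb, tile in ((8.0, 1024), (4.0, 768), (2.0, 512)):
        if free_vram_gb >= min_gb:
            return tile
    return 256  # minimal fallback tile for very low VRAM
```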

Full Changelog: V1.9.0...V1.9.1

V1.9.0

01 Nov 11:15
1b6ba62

What's Changed

  • Better scheduler (AYS) by @Aatricks in #9
  • More models (Flux quant levels, SDXL, SD2.5) by @Aatricks in #10
  • Implemented deep cache optimization and prompt caching
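
Prompt caching amounts to memoizing the text-encoder output for identical prompts; a sketch with a stand-in encoder (the call counter and the placeholder "embedding" are illustrative only):

```python
from functools import lru_cache

encoder_calls = []  # tracks how often the (slow) encoder actually runs

@lru_cache(maxsize=128)
def encode_prompt(prompt: str):
    """Stand-in for the real text encoder; repeated prompts hit the cache
    instead of re-running the expensive encoding step."""
    encoder_calls.append(prompt)
    return tuple(ord(ch) % 7 for ch in prompt)  # placeholder "embedding"
```

Re-generating with an unchanged prompt then skips the encoder entirely, which is where the speedup comes from.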

Full Changelog: V1.8.0...V1.9.0

V1.8.0

27 Oct 18:06
40ae3db

What's Changed

  • feat: Enable CUDA graphs for faster inference by @Aatricks in #7
  • Feature/deepcache by @Aatricks in #8
  • chore: Completely updated the docs

Full Changelog: V1.7.3...V1.8.0

V1.7.3

16 Oct 08:34
8aeadcc

What's Changed

  • Smart batching by @Aatricks in #6
  • Implemented request buffering and generation batching for similar requests in server mode
  • Fixed unwanted batch behaviour in the Streamlit UI
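
Smart batching groups buffered requests that share generation settings so that one model pass can serve several queued requests; the grouping-key fields below (model, size, steps) are an illustrative choice:

```python
from collections import defaultdict

def group_similar_requests(requests):
    """Batch queued requests whose generation settings match.
    Each returned group can be generated in a single batched pass;
    the key fields here are assumed for illustration."""
    groups = defaultdict(list)
    for req in requests:
        key = (req["model"], req["width"], req["height"], req["steps"])
        groups[key].append(req)
    return list(groups.values())
```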

Full Changelog: V1.7.2...V1.7.3