
[BUG-FIX]: download progress not reported to Python during file transfers #791

Open
tobocop2 wants to merge 1 commit into huggingface:main from tobocop2:fix/item-bridge-suppresses-transfer-progress


@tobocop2 tobocop2 commented Apr 8, 2026

This blocks the companion PR huggingface/huggingface_hub#4059

Related issue: huggingface/huggingface_hub#4058

Problem

Download progress bars in huggingface_hub update in large jumps (e.g. 0% -> 26% -> 63% -> 100% for a 4GB file) instead of smoothly. The root cause is that ItemBridgeState::compute_diff hardcodes all total_transfer_bytes fields to 0, so the fine-grained per-HTTP-chunk progress that xet-core already tracks internally never reaches Python callbacks.

Fix

  • Add transfer_bytes and transfer_bytes_completed to ItemProgressReport
  • Update ItemBridgeState::compute_diff to diff transfer fields, matching the existing GroupBridgeState behavior
  • Update GroupBridgeState::compute_diff item inclusion to also trigger on transfer progress
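The effect of the change can be sketched with a small Python model. This is not the Rust code in callback_bridge.rs: `ItemReport`, `ItemUpdate`, `ItemBridge`, and the `diff_transfer` flag are hypothetical names invented for illustration, and the 64 MB network step is an assumed value (the ~256 MB disk batch comes from the commit message). The model only shows why diffing the transfer counters yields more frequent callbacks than diffing `bytes_completed` alone.

```python
from dataclasses import dataclass
from typing import List, Optional

MB = 1024 * 1024


@dataclass(frozen=True)
class ItemReport:
    """Hypothetical cumulative progress snapshot for one file."""
    bytes_completed: int = 0           # bytes written to disk so far
    transfer_bytes_completed: int = 0  # bytes received over the network so far


@dataclass(frozen=True)
class ItemUpdate:
    """Hypothetical incremental update passed to a Python callback."""
    bytes_delta: int
    transfer_bytes_delta: int


class ItemBridge:
    """Toy model of a per-item diffing bridge (illustrative only)."""

    def __init__(self, diff_transfer: bool) -> None:
        # diff_transfer=False models the old behavior (transfer fields forced to 0).
        self.diff_transfer = diff_transfer
        self.last = ItemReport()

    def compute_diff(self, report: ItemReport) -> Optional[ItemUpdate]:
        bytes_delta = report.bytes_completed - self.last.bytes_completed
        transfer_delta = 0
        if self.diff_transfer:
            transfer_delta = (
                report.transfer_bytes_completed - self.last.transfer_bytes_completed
            )
        self.last = report
        # Emit an update only when something the bridge looks at has changed.
        if bytes_delta == 0 and transfer_delta == 0:
            return None
        return ItemUpdate(bytes_delta, transfer_delta)


def simulate(diff_transfer: bool) -> List[ItemUpdate]:
    """Feed a 512 MB download: network reports every 64 MB, disk flushes every 256 MB."""
    bridge = ItemBridge(diff_transfer)
    updates = []
    for received in range(64 * MB, 512 * MB + 1, 64 * MB):
        flushed = (received // (256 * MB)) * 256 * MB
        update = bridge.compute_diff(ItemReport(flushed, received))
        if update is not None:
            updates.append(update)
    return updates


before = simulate(diff_transfer=False)
after = simulate(diff_transfer=True)
print(len(before), len(after))
```

In this model the old behavior produces an update only at each disk flush, while the new behavior produces one for every network report, and the summed deltas still add up to the full file size in both cases.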

Reproduction

Single file

"""Verify smooth single-file download progress."""
import tempfile
from huggingface_hub import hf_hub_download

with tempfile.TemporaryDirectory() as tmp:
    hf_hub_download(
        "Qwen/Qwen3-4B",
        filename="model-00002-of-00003.safetensors",
        cache_dir=tmp,
        force_download=True,
    )

Multi-file

"""Verify smooth multi-file download progress."""
import tempfile
from huggingface_hub import snapshot_download

with tempfile.TemporaryDirectory() as tmp:
    snapshot_download(
        "Qwen/Qwen3-4B",
        allow_patterns=["model-00003-of-00003.safetensors", "config.json", "*.txt"],
        cache_dir=tmp,
        force_download=True,
    )

Before: tqdm bar jumps 0% -> 26% -> 63% -> 100% with ~4 callbacks for a 4GB file.
After: tqdm bar updates smoothly every 250ms.
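One way to put a number on "jumpy" versus "smooth" is to look at the largest gap between consecutive progress readings. The helper below is a minimal sketch; `largest_jump` is a hypothetical name, and the only data used is the sequence of percentages quoted in the "Before" line.

```python
def largest_jump(percentages):
    """Return the largest gap between consecutive progress readings."""
    return max((b - a for a, b in zip(percentages, percentages[1:])), default=0)


# Readings quoted above for the 4GB file before the fix.
before_readings = [0, 26, 63, 100]
print(largest_jump(before_readings))  # 37
```

A bar that refreshes on a fixed timer should show a much smaller largest jump, although the exact value depends on network throughput and is not measured here.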

See callback_bridge.rs tests for coverage of the new transfer byte diffing.

@tobocop2 tobocop2 changed the title fix: download progress callbacks missing transfer byte data [BUG FIX]: download progress callbacks missing transfer byte data Apr 8, 2026
@tobocop2 tobocop2 force-pushed the fix/item-bridge-suppresses-transfer-progress branch from 8e0dd2d to 066a2de Compare April 8, 2026 23:59
…progress

ItemBridgeState::compute_diff hardcoded all total_transfer_bytes fields
to 0, suppressing the fine-grained network-level progress that xet-core
already tracks per HTTP chunk. This caused Python callbacks to only
receive coarse bytes_completed updates (per disk write, ~256MB batches).

- Add transfer_bytes and transfer_bytes_completed to ItemProgressReport
- Update ItemBridgeState::compute_diff to diff transfer fields (matching
  GroupBridgeState which already does this correctly)
- Fire callbacks when transfer progress changes, even if bytes_completed
  has not changed yet
@tobocop2 tobocop2 force-pushed the fix/item-bridge-suppresses-transfer-progress branch from 066a2de to 13645b7 Compare April 9, 2026 00:07
@tobocop2 tobocop2 changed the title [BUG FIX]: download progress callbacks missing transfer byte data fix: download progress not reported to Python during file transfers Apr 9, 2026
@tobocop2 tobocop2 changed the title fix: download progress not reported to Python during file transfers [BUG-FIX]: download progress not reported to Python during file transfers Apr 9, 2026
Collaborator

@hoytak hoytak left a comment


LGTM -- makes sense.


2 participants