OS-level I/O errors (errno) are lost before reaching Python callers #763

@cquil11

Describe the bug

When xet encounters OS-level I/O errors during downloads, the specific error (errno, ErrorKind) is converted to a generic string before reaching Python. For example, a disk-full condition (ENOSPC) surfaces as:

RuntimeError: Data processing error: File reconstruction error: Internal Writer Error: Background writer channel closed
Segmentation fault (core dumped)

instead of OSError(28, "No space left on device").

This affects all OS errors (ENOSPC, EACCES, EBUSY, etc.) and may reflect a more general pattern across all error types. As a result, Python callers cannot handle specific I/O conditions programmatically (e.g., except OSError with errno checks).
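To make the caller-side impact concrete, here is a minimal sketch of the kind of errno-based handling that the current behavior rules out. The helper name `classify_download_failure` and its return labels are hypothetical, chosen only for illustration; the two exceptions are constructed by hand to stand in for what a download call might raise.

```python
import errno


def classify_download_failure(exc: BaseException) -> str:
    """Map an exception from a download call to a caller-side decision.

    Hypothetical helper: the labels are illustrative, not part of any API.
    """
    if isinstance(exc, OSError):
        if exc.errno == errno.ENOSPC:
            return "disk-full"
        if exc.errno == errno.EACCES:
            return "permission-denied"
        if exc.errno == errno.EBUSY:
            return "busy"
        return "other-os-error"
    # No errno is available, so the caller can only inspect the message text.
    return "opaque"


# What a caller would expect for a disk-full condition.
expected = OSError(errno.ENOSPC, "No space left on device")

# What currently surfaces (message copied from the report above).
actual = RuntimeError(
    "Data processing error: File reconstruction error: "
    "Internal Writer Error: Background writer channel closed"
)

print(classify_download_failure(expected))  # disk-full
print(classify_download_failure(actual))    # opaque
```

With the RuntimeError, the only remaining option is matching on the message string, which is brittle and, in this case, does not mention the disk at all.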

  • XetError::Io(String) in xet_pkg/src/error.rs calls .to_string() on std::io::Error, discarding raw_os_error() and ErrorKind
  • InternalWriterError(String) in xet_data/src/file_reconstruction/error.rs replaces actual I/O errors with generic messages like "Background writer channel closed"
  • Python FFI mapping converts these to PyRuntimeError instead of PyOSError with the correct errno
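The information loss described in these three points can be illustrated with a small Python analogy. This is not the xet code: `flatten` and `preserve` are hypothetical names, and the sketch only assumes that stringifying an error keeps its message while dropping the numeric code, which is what the points above describe for the Rust side.

```python
import errno
import os


def flatten(exc: OSError) -> RuntimeError:
    """Lossy conversion: keep only the message text.

    Analogous to storing the result of .to_string() and later raising a
    generic runtime error from it.
    """
    return RuntimeError(f"I/O error: {exc}")


def preserve(exc: OSError) -> OSError:
    """Lossless conversion: carry the errno through to the caller."""
    return OSError(exc.errno, exc.strerror)


original = OSError(errno.ENOSPC, os.strerror(errno.ENOSPC))

lossy = flatten(original)
kept = preserve(original)

# The flattened error has no errno attribute at all.
print(getattr(lossy, "errno", None))  # None

# The preserved error can still be matched on errno.
print(kept.errno == errno.ENOSPC)  # True
```

A fix along these lines would presumably keep the underlying I/O error (or at least its OS error code) in the Rust error types and map it to an OSError at the FFI boundary, but the exact shape of that change is left to the maintainers.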

Reproduction

```bash
# Setup: Ubuntu 22.04, Python 3.10, huggingface_hub 0.30.x with hf_xet enabled

# Create a constrained filesystem
mkdir -p /tmp/tiny_cache
sudo mount -t tmpfs -o size=185m tmpfs /tmp/tiny_cache

# Fill most of it
HF_HUB_CACHE=/tmp/tiny_cache huggingface-cli download openai/gpt-oss-120b

# Trigger ENOSPC
HF_HUB_CACHE=/tmp/tiny_cache huggingface-cli download meta-llama/Llama-3.1-70B

# Observe: RuntimeError + segfault instead of OSError(28)

# Cleanup
sudo umount /tmp/tiny_cache
```

Disabling xet (HF_HUB_DISABLE_XET=1) and repeating the same test produces the correct OSError(28, "No space left on device"), confirming the error originates in xet's error propagation, not the OS or huggingface_hub.

Logs

The segfault is also potentially concerning...

~$ hf download meta-llama/Llama-3.1-70B
Downloading (incomplete total...): 0.00B [00:00, ?B/s]                                                                                                                                                /home/cam/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py:722: UserWarning: Not enough free disk space to download the file. The expected file size is: 4664.17 MB. The target location /tmp/tiny_cache/models--meta-llama--Llama-3.1-70B/blobs only has 2878.12 MB free disk space.
  warnings.warn(
Downloading (incomplete total...):   0%|                                                                                                                         | 50.2k/4.66G [00:00<2:40:34, 484kB/s]/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py:722: UserWarning: Not enough free disk space to download the file. The expected file size is: 4584.41 MB. The target location /tmp/tiny_cache/models--meta-llama--Llama-3.1-70B/blobs only has 2878.05 MB free disk space.
  warnings.warn(
Downloading (incomplete total...):   0%|                                                                                                                         | 54.9k/9.25G [00:00<5:18:24, 484kB/s]/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py:722: UserWarning: Not enough free disk space to download the file. The expected file size is: 4664.13 MB. The target location /tmp/tiny_cache/models--meta-llama--Llama-3.1-70B/blobs only has 2878.05 MB free disk space.
  warnings.warn(
Downloading (incomplete total...):   0%|                                                                                                                         | 55.7k/13.9G [00:00<7:58:59, 484kB/s]/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py:722: UserWarning: Not enough free disk space to download the file. The expected file size is: 4664.17 MB. The target location /tmp/tiny_cache/models--meta-llama--Llama-3.1-70B/blobs only has 2878.05 MB free disk space.
  warnings.warn(
Downloading (incomplete total...):   0%|                                                                                                                        | 55.7k/18.6G [00:00<10:39:34, 484kB/s]/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py:722: UserWarning: Not enough free disk space to download the file. The expected file size is: 4999.71 MB. The target location /tmp/tiny_cache/models--meta-llama--Llama-3.1-70B/blobs only has 2878.05 MB free disk space.
  warnings.warn(
Downloading (incomplete total...):   0%|                                                                                                                        | 55.7k/23.2G [00:00<13:20:09, 484kB/s]/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py:722: UserWarning: Not enough free disk space to download the file. The expected file size is: 4966.16 MB. The target location /tmp/tiny_cache/models--meta-llama--Llama-3.1-70B/blobs only has 2878.05 MB free disk space.
  warnings.warn(
Downloading (incomplete total...):   7%|████████▉                                                                                                                 | 2.80G/38.2G [00:02<00:16, 2.13GB/s]/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py:722: UserWarning: Not enough free disk space to download the file. The expected file size is: 4664.13 MB. The target location /tmp/tiny_cache/models--meta-llama--Llama-3.1-70B/blobs only has 0.00 MB free disk space.
  warnings.warn(
Fetching 50 files:  12%|█████████████████▎                                                                                                                              | 6/50 [00:02<00:18,  2.41it/s]
Downloading (incomplete total...):   7%|████████▏                                                                                                                 | 2.88G/42.9G [00:02<00:18, 2.13GB/s]/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py:722: UserWarning: Not enough free disk space to download the file. The expected file size is: 4664.17 MB. The target location /tmp/tiny_cache/models--meta-llama--Llama-3.1-70B/blobs only has 0.00 MB free disk space.
  warnings.warn(
Downloading (incomplete total...):   6%|███████▍                                                                                                                  | 2.88G/47.5G [00:02<00:20, 2.13GB/s]/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py:722: UserWarning: Not enough free disk space to download the file. The expected file size is: 4966.16 MB. The target location /tmp/tiny_cache/models--meta-llama--Llama-3.1-70B/blobs only has 0.00 MB free disk space.
  warnings.warn(
Downloading (incomplete total...):   4%|████▉                                                                                                                     | 2.88G/71.5G [00:02<00:32, 2.13GB/s]/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py:722: UserWarning: Not enough free disk space to download the file. The expected file size is: 4999.71 MB. The target location /tmp/tiny_cache/models--meta-llama--Llama-3.1-70B/blobs only has 0.00 MB free disk space.
  warnings.warn(
Downloading (incomplete total...):   4%|████▌                                                                                                                     | 2.88G/76.5G [00:02<00:34, 2.13GB/s]Traceback (most recent call last):
  File "/home/cam/.local/bin/hf", line 8, in <module>
    sys.exit(main())
  File "/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/cli/hf.py", line 113, in main
    app()
  File "/home/cam/.local/lib/python3.10/site-packages/typer/main.py", line 1152, in __call__
    raise e
  File "/home/cam/.local/lib/python3.10/site-packages/typer/main.py", line 1135, in __call__
    return get_command(self)(*args, **kwargs)
  File "/home/cam/.local/lib/python3.10/site-packages/click/core.py", line 1485, in __call__
    return self.main(*args, **kwargs)
  File "/home/cam/.local/lib/python3.10/site-packages/typer/core.py", line 795, in main
    return _main(
  File "/home/cam/.local/lib/python3.10/site-packages/typer/core.py", line 188, in _main
    rv = self.invoke(ctx)
  File "/home/cam/.local/lib/python3.10/site-packages/click/core.py", line 1873, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/cam/.local/lib/python3.10/site-packages/click/core.py", line 1269, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/cam/.local/lib/python3.10/site-packages/click/core.py", line 824, in invoke
    return callback(*args, **kwargs)
  File "/home/cam/.local/lib/python3.10/site-packages/typer/main.py", line 1514, in wrapper
    return callback(**use_params)
  File "/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/cli/download.py", line 224, in download
    _print_result(run_download())
  File "/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/cli/download.py", line 185, in run_download
    return snapshot_download(
  File "/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 89, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/_snapshot_download.py", line 450, in snapshot_download
    thread_map(
  File "/home/cam/.local/lib/python3.10/site-packages/tqdm/contrib/concurrent.py", line 69, in thread_map
    return _executor_map(ThreadPoolExecutor, fn, *iterables, **tqdm_kwargs)
  File "/home/cam/.local/lib/python3.10/site-packages/tqdm/contrib/concurrent.py", line 51, in _executor_map
    return list(tqdm_class(ex.map(fn, *iterables, chunksize=chunksize), **kwargs))
  File "/home/cam/.local/lib/python3.10/site-packages/tqdm/std.py", line 1181, in __iter__
    for obj in iterable:
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 621, in result_iterator
    yield _result_or_cancel(fs.pop())
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 319, in _result_or_cancel
    return fut.result(timeout)
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/_snapshot_download.py", line 430, in _inner_hf_hub_download
    hf_hub_download(  # type: ignore
  File "/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 89, in _inner_fn
    return fn(*args, **kwargs)
  File "/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 986, in hf_hub_download
    return _hf_hub_download_to_cache_dir(
  File "/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1202, in _hf_hub_download_to_cache_dir
    _download_to_tmp_and_move(
  File "/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1826, in _download_to_tmp_and_move
    xet_get(
  File "/home/cam/.local/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 549, in xet_get
    download_files(
RuntimeError: Data processing error: File reconstruction error: Internal Writer Error: Background writer channel closed
Segmentation fault (core dumped)

System info

above

Labels: bug
