Draft: xet CLI utility for file upload, download, and inspection #759
Resolve Cargo.lock, hf_xet/Cargo.lock, and xet_pkg/Cargo.toml (keep clap/serde_json/walkdir + ulid). Adapt xet CLI to session API: per-operation auth on UploadCommitBuilder and DownloadStreamGroupBuilder; new_upload_commit().build().await; download via new_download_stream_group(). Made-with: Cursor
This PR adds a new xet command-line binary to xet_pkg for directly uploading, downloading, and inspecting files against a CAS endpoint — useful for development, debugging, and scripting without going through git-xet or huggingface_hub.
The binary exposes four subcommands under xet file:
upload — upload one or more files (or stdin) and emit file metadata (hash, size, sha256).
download — download by xet hash to a file or stdout, with optional source and write byte ranges.
scan — dry-run dedup/compression analysis without uploading data.
dump-reconstruction — fetch and display reconstruction metadata as JSON.
Endpoint resolution currently follows the same conventions as the session API, but uses the arguments passed in. --endpoint overrides HF_ENDPOINT, which defaults to https://huggingface.co; the endpoint can also be a local directory, in which case a LocalClient is used. Token resolution uses --token, then HF_TOKEN.
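The precedence described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the `Endpoint` enum, `resolve_endpoint`, and `resolve_token` are hypothetical names, and the rule "anything without an http(s) scheme is treated as a local directory" is an assumption about how the local case might be detected.

```rust
use std::path::PathBuf;

/// Hypothetical type for this sketch: where requests are sent.
#[derive(Debug, PartialEq)]
enum Endpoint {
    /// A remote server URL.
    Remote(String),
    /// A local directory (the case where the PR uses a LocalClient).
    LocalDir(PathBuf),
}

/// Default stated in the PR description.
const DEFAULT_ENDPOINT: &str = "https://huggingface.co";

/// `flag` is the value of --endpoint, `env` the value of HF_ENDPOINT.
/// The flag wins over the environment, which wins over the default.
fn resolve_endpoint(flag: Option<&str>, env: Option<&str>) -> Endpoint {
    let raw = flag.or(env).unwrap_or(DEFAULT_ENDPOINT);
    // Assumption: a value without an http(s) scheme is a local directory.
    if raw.starts_with("http://") || raw.starts_with("https://") {
        Endpoint::Remote(raw.to_string())
    } else {
        Endpoint::LocalDir(PathBuf::from(raw))
    }
}

/// `flag` is the value of --token, `env` the value of HF_TOKEN.
/// Returns None when neither is set.
fn resolve_token(flag: Option<&str>, env: Option<&str>) -> Option<String> {
    flag.or(env).map(str::to_string)
}
```

A caller would pass `std::env::var("HF_ENDPOINT").ok().as_deref()` and `std::env::var("HF_TOKEN").ok().as_deref()` as the `env` arguments. Keeping the environment lookup outside these functions makes the precedence easy to test.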
All config values can be overridden with -c KEY=VALUE.
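As a rough sketch of how a `-c KEY=VALUE` argument might be split into a key and a value: the function name `parse_override` and its error message are hypothetical, and the choice to split on the first `=` (so that values may themselves contain `=`) is an assumption rather than confirmed behavior of this PR.

```rust
/// Split one `KEY=VALUE` argument at the first `=`.
/// Returns an error string if there is no `=` or the key is empty.
fn parse_override(arg: &str) -> Result<(String, String), String> {
    match arg.split_once('=') {
        // Trim the key; keep the value exactly as given.
        Some((key, value)) if !key.trim().is_empty() => {
            Ok((key.trim().to_string(), value.to_string()))
        }
        _ => Err(format!("expected KEY=VALUE, got `{arg}`")),
    }
}
```

Under these assumptions, `parse_override("some.key=1=2")` yields the key `some.key` with the value `1=2`, while `parse_override("novalue")` is rejected.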