Commit 6288a37
feat(patch): add package- and diff-level patch sources (#67)
* feat(patch): add package- and diff-level patch sources
Adds two new optional pathways to the socket-patch CLI alongside the
existing per-file blob path:
- Per-package archives at `.socket/packages/<uuid>.tar.gz` — a tarball
of patched files for a single patch, extracted in one shot.
- Per-file bsdiff archives at `.socket/diffs/<uuid>.tar.gz` — bsdiff
deltas that transform `before_hash` content into `after_hash` content.
The apply pipeline now tries sources in the order package → diff →
blob, falling through to the next on any failure. Every strategy
post-write-verifies the file's git-sha256 against `after_hash`, so the
existing safety invariant is unchanged.
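The fallback order described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: only the `AppliedVia` name comes from the message, while its variants, the `Strategy` alias, `apply_with_fallback`, and the `verify` callback (standing in for the git-sha256 check against `after_hash`) are hypothetical.

```rust
/// Which source produced the patched bytes. The enum name comes from the
/// commit message; the variant names are assumed for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AppliedVia {
    Package,
    Diff,
    Blob,
}

/// Hypothetical strategy shape: produce candidate bytes or an error message.
pub type Strategy<'a> = &'a dyn Fn() -> Result<Vec<u8>, String>;

/// Try each strategy in the given order and accept the first one whose
/// output passes `verify`. Any failure, including a failed verification,
/// falls through to the next strategy.
pub fn apply_with_fallback(
    strategies: &[(AppliedVia, Strategy<'_>)],
    verify: &dyn Fn(&[u8]) -> bool,
) -> Result<(AppliedVia, Vec<u8>), Vec<String>> {
    let mut errors = Vec::new();
    for (via, run) in strategies {
        match run() {
            Ok(bytes) if verify(&bytes) => return Ok((*via, bytes)),
            Ok(_) => errors.push(format!("{via:?}: hash mismatch")),
            Err(e) => errors.push(format!("{via:?}: {e}")),
        }
    }
    // Every source failed; report why each one was rejected.
    Err(errors)
}
```

Collecting the per-strategy errors instead of returning only the last one keeps the reason for each fall-through visible to the caller.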
A new `--download-mode {diff,package,file}` flag (default: `diff`)
controls what `apply`, `get`, `scan`, and `repair` fetch when local
artifacts are missing. The manifest schema is intentionally unchanged:
archives are keyed by patch UUID (already present in `PatchRecord`),
so legacy manifests keep working with no migration.
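As an illustration of the flag and the UUID-keyed layout, the sketch below parses the three documented values and maps a mode to an archive path. The `DownloadMode` name, the three values, the `diff` default, and the `.socket/...` paths come from the message. The `FromStr` impl and the `archive_path` helper are assumptions made for this example; the real CLI may parse the flag differently.

```rust
use std::path::{Path, PathBuf};
use std::str::FromStr;

/// The three values accepted by `--download-mode`; `diff` is the default.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum DownloadMode {
    #[default]
    Diff,
    Package,
    File,
}

impl FromStr for DownloadMode {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "diff" => Ok(Self::Diff),
            "package" => Ok(Self::Package),
            "file" => Ok(Self::File),
            other => Err(format!(
                "invalid download mode `{other}` (expected diff, package, or file)"
            )),
        }
    }
}

/// Hypothetical helper: where the archive for a patch UUID would live.
/// Returns `None` for `File`, since blobs are fetched by hash, not by UUID.
pub fn archive_path(mode: DownloadMode, uuid: &str) -> Option<PathBuf> {
    let dir = match mode {
        DownloadMode::Diff => "diffs",
        DownloadMode::Package => "packages",
        DownloadMode::File => return None,
    };
    Some(Path::new(".socket").join(dir).join(format!("{uuid}.tar.gz")))
}
```

Because the path is derived from the UUID alone, nothing extra needs to be stored in the manifest, which is consistent with the "no migration" claim above.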
Highlights:
- New core modules `patch/diff.rs` (qbsdiff bspatch wrapper) and
`patch/package.rs` (tar+flate2 reader with path-traversal guards,
whitelist filtering against `expected_files`, and hard caps on
decompressed bytes / per-entry size / entry count to defuse
gzip-bomb and `Vec::with_capacity` allocation attacks).
- New `PatchSources` struct and `AppliedVia` enum in `patch/apply.rs`;
`apply_package_patch` takes a `PatchSources` and an optional UUID.
Passing `uuid = None` restores pre-2.2 blob-only behavior.
- `try_apply_from_diff` gates on the captured pre-apply `current_hash`
rather than `VerifyStatus`, so `--force` cannot drive a diff against
garbage content.
- `apply`'s offline guard now reports per-patch source availability
instead of a global blobs/diffs/packages bucket count.
- `ApiClient::fetch_diff(uuid)` and `fetch_package(uuid)` mirror
`fetch_blob(hash)`; a private `fetch_binary` helper deduplicates the
proxy/auth client split and 200/404/error handling.
- `DownloadMode` enum + `fetch_missing_sources` in `api/blob_fetcher.rs`
dispatch downloads by kind. `cleanup_unused_archives` in
`utils/cleanup_blobs.rs` reaps orphaned `.socket/packages/` and
`.socket/diffs/` files via `repair`.
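The package-extraction guards listed above (path-traversal checks, whitelist filtering, and size/count caps) might look something like the following std-only sketch. All names and limit values here are hypothetical and chosen for illustration; the actual `patch/package.rs` reader is built on tar and flate2 and is not reproduced here.

```rust
use std::collections::HashSet;
use std::io::{self, Read};
use std::path::{Component, Path};

// Assumed limits for illustration only; the real caps are not stated.
pub const MAX_ENTRIES: usize = 10_000;
pub const MAX_ENTRY_BYTES: u64 = 64 * 1024 * 1024;

#[derive(Debug, PartialEq, Eq)]
pub enum EntryError {
    TooManyEntries,
    Traversal,
    NotExpected,
    TooLarge,
}

/// Accept only non-empty paths made purely of normal components, which
/// rejects `..`, absolute paths, and a leading `./`.
pub fn is_safe_relative(path: &Path) -> bool {
    !path.as_os_str().is_empty()
        && path.components().all(|c| matches!(c, Component::Normal(_)))
}

/// Validate one archive entry header before any bytes are extracted.
pub fn check_entry(
    index: usize,
    path: &str,
    declared_size: u64,
    expected: &HashSet<&str>,
) -> Result<(), EntryError> {
    if index >= MAX_ENTRIES {
        return Err(EntryError::TooManyEntries);
    }
    if !is_safe_relative(Path::new(path)) {
        return Err(EntryError::Traversal);
    }
    if !expected.contains(path) {
        return Err(EntryError::NotExpected);
    }
    if declared_size > MAX_ENTRY_BYTES {
        return Err(EntryError::TooLarge);
    }
    Ok(())
}

/// Read at most `cap` bytes. The buffer grows as data arrives rather than
/// being pre-allocated from an attacker-controlled header size.
pub fn read_capped<R: Read>(reader: R, cap: u64) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    reader.take(cap.saturating_add(1)).read_to_end(&mut buf)?;
    if buf.len() as u64 > cap {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "entry exceeds cap"));
    }
    Ok(buf)
}
```

Reading one byte past the cap is what lets `read_capped` distinguish "exactly at the limit" from "over the limit" without trusting the declared size.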
Tests: 307 unit + 2 e2e gem (was 263 + 2 before this change). New
coverage spans diff round-trips, package extraction safety (traversal,
oversize-header, too-many-entries, decompression-bomb truncation),
fallback chain ordering (`via package`/`diff`/`blob`), force-mode +
diff regression, dry-run safety, UUID validation, and archive
download/cleanup helpers. All existing tests pass unchanged.
Server-side `/patch/diff/<uuid>` and `/patch/package/<uuid>` endpoints
are not live yet — 404 responses fall through gracefully to the file
blob path, so this PR ships safely ahead of server support.
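One way to picture the 404 fall-through is to treat "not found" as an absent source rather than an error. The sketch below is an assumption about the shape of that logic, not the actual `fetch_binary` helper: `FetchError`, `classify`, and `fetch_with_blob_fallback` are hypothetical names, and no real HTTP client is involved.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum FetchError {
    /// Any status other than 200 or 404.
    Http(u16),
}

/// Map an HTTP status to the 200/404/error split: 200 yields the body,
/// 404 means "this source does not exist", anything else is an error.
pub fn classify(status: u16, body: Vec<u8>) -> Result<Option<Vec<u8>>, FetchError> {
    match status {
        200 => Ok(Some(body)),
        404 => Ok(None),
        other => Err(FetchError::Http(other)),
    }
}

/// If the archive endpoint answered 404, fall back to the blob fetch.
/// The blob closure is only invoked when it is actually needed.
pub fn fetch_with_blob_fallback(
    archive: Result<Option<Vec<u8>>, FetchError>,
    fetch_blob: impl FnOnce() -> Result<Option<Vec<u8>>, FetchError>,
) -> Result<Option<Vec<u8>>, FetchError> {
    match archive {
        Ok(Some(bytes)) => Ok(Some(bytes)),
        Ok(None) => fetch_blob(),
        Err(e) => Err(e),
    }
}
```

In this sketch only a 404 triggers the fallback; whether other HTTP errors should also fall through at download time is a design choice the message does not spell out.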
Assisted-by: Claude Code:claude-opus-4-7
* chore: clean up stray dead-code markers
- e2e_npm.rs: NPM_PURL is actually used by 5 assertions; drop the
stale `#[allow(dead_code)]`.
- maven_crawler.rs: remove `read_pom_in_dir`, an async helper that
was never called and only existed under `#[allow(dead_code)]`.
No behavior change. 307 tests still pass; cargo build clean.
Assisted-by: Claude Code:opus-4-7
1 parent 5550100 commit 6288a37
17 files changed
Lines changed: 2280 additions & 91 deletions
File tree
- crates
  - socket-patch-cli
    - src/commands
    - tests
  - socket-patch-core
    - src
      - api
      - crawlers
      - patch
      - utils