Lock/pin upstream versions for deterministic syncing #103
Description
Summary
Pin upstream configs to a specific commit SHA for deterministic, reproducible syncing. Track the resolved version in lock metadata so that check verifies against a known state, not just "latest".
Parent: #100 (Tier 1)
Motivation
Without pinning, ruff-sync pull in CI could silently pick up breaking upstream changes. Most major package ecosystems have a lock or pinning mechanism:
| Ecosystem | Lock Mechanism |
|---|---|
| npm | package-lock.json records exact resolved versions |
| Go | go.sum records content hashes |
| pre-commit | rev: <sha/tag> pins exact hook versions |
| pip / uv | pip freeze and uv lock record exact package versions |
ruff-sync should let users opt into deterministic syncing while keeping the default behavior unchanged.
Proposed Design
Lock storage
Store lock metadata in pyproject.toml under [tool.ruff-sync.lock] rather than a separate file. This keeps everything in one place and avoids adding a new file to every project.
# Written by ruff-sync after a successful pull
[tool.ruff-sync.lock]
upstream = "https://raw.githubusercontent.com/my-org/standards/main/pyproject.toml"
commit = "abc1234def5678..." # resolved commit SHA (if git-resolvable)
content-hash = "sha256:e3b0c44..." # hash of the upstream ruff config section
pulled-at = "2026-03-15T15:30:00Z" # timestamp of last pull
New CLI commands / flags
# Pull and update lock (default behavior when lock section exists)
ruff-sync pull # fetches latest, updates lock
# Pull but skip lock update (useful for testing)
ruff-sync pull --no-lock
# Check against locked version specifically
ruff-sync check # uses lock hash if available
# Explicitly update lock without applying changes
ruff-sync lock # new subcommand: fetch, resolve, write lock only
Implementation Plan
1. Define lock schema in Config TypedDict (core.py)
from typing import TypedDict

class LockInfo(TypedDict, total=False):
    """Lock metadata written after a successful pull."""
    # Field names use underscores; the TOML keys are hyphenated
    # (content-hash, pulled-at).
    upstream: str  # resolved raw URL
    commit: str  # git commit SHA if resolvable
    content_hash: str  # sha256 of the upstream ruff config text
    pulled_at: str  # ISO 8601 timestamp
2. Compute content hash (core.py)
import hashlib
import tomlkit

def compute_config_hash(config_text: str) -> str:
    """Compute a deterministic hash of the upstream ruff config."""
    # Parse and re-serialize so that invalid TOML fails here. Note that
    # tomlkit preserves whitespace and comments on round-trip, so this step
    # alone does not remove formatting variance.
    normalized = tomlkit.dumps(tomlkit.parse(config_text))
    return f"sha256:{hashlib.sha256(normalized.encode()).hexdigest()}"
3. Resolve commit SHA (core.py)
For GitHub/GitLab URLs, resolve the current commit SHA via API or from the git clone:
async def resolve_commit_sha(
    url: URL, branch: str, client: httpx.AsyncClient
) -> str | None:
    """Resolve the current commit SHA for a GitHub/GitLab branch."""
    # GitHub API: GET /repos/{owner}/{repo}/commits/{branch}
    if url.host in _GITHUB_HOSTS or url.host == _GITHUB_RAW_HOST:
        # Extract org/repo from URL
        ...
        api_url = f"https://api.github.com/repos/{org}/{repo}/commits/{branch}"
        resp = await client.get(
            api_url, headers={"Accept": "application/vnd.github.sha"}
        )
        if resp.status_code == 200:
            return resp.text.strip()
    return None  # fallback: no SHA available
For git clone fetches, extract SHA from the cloned repo.
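The "extract org/repo from URL" step is left open in the plan. One possible shape for it, shown only as a sketch: parse_github_raw_url and RawGitHubRef are hypothetical names, and the helper assumes the raw.githubusercontent.com layout /{org}/{repo}/{ref}/{path} with a ref that contains no slashes.

```python
from __future__ import annotations

from typing import NamedTuple
from urllib.parse import urlsplit


class RawGitHubRef(NamedTuple):
    """Pieces of a raw GitHub URL (hypothetical container for this sketch)."""

    org: str
    repo: str
    ref: str
    path: str


def parse_github_raw_url(url: str) -> RawGitHubRef | None:
    """Split a raw.githubusercontent.com URL into org, repo, ref and path.

    Assumes the layout /{org}/{repo}/{ref}/{path} and a ref without
    slashes. Returns None for any other host or a shorter path.
    """
    parts = urlsplit(url)
    if parts.hostname != "raw.githubusercontent.com":
        return None
    # Split into at most four pieces so the file path keeps its own slashes
    segments = parts.path.strip("/").split("/", 3)
    if len(segments) < 4 or not all(segments):
        return None
    org, repo, ref, path = segments
    return RawGitHubRef(org, repo, ref, path)


ref = parse_github_raw_url(
    "https://raw.githubusercontent.com/my-org/standards/main/pyproject.toml"
)
print(ref)
```

Branch names that contain slashes (for example release/1.x) cannot be told apart from path segments this way, so a real implementation would need the branch passed in separately, as resolve_commit_sha already does.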
4. Write lock after pull() (core.py)
After a successful merge, write lock metadata to pyproject.toml:
async def pull(args: Arguments) -> int:
    # ... existing fetch + merge logic ...
    # Write lock metadata (only for pyproject.toml targets)
    if not args.no_lock and _source_toml_path.name == "pyproject.toml":
        lock_info = {
            "upstream": str(fetch_result.resolved_upstream),
            "content-hash": compute_config_hash(upstream_config_text),
            "pulled-at": dt.datetime.now(dt.timezone.utc).isoformat(),
        }
        if commit_sha:
            lock_info["commit"] = commit_sha
        # Write into [tool.ruff-sync.lock] using tomlkit
        _write_lock(merged_toml, lock_info)
    source_toml_file.write(merged_toml)
5. Check against lock in check() (core.py)
If lock metadata exists, compare the upstream content hash against the locked hash for a fast "has upstream changed?" check:
async def check(args: Arguments) -> int:
    # ... existing logic ...
    # If lock exists, also verify upstream hasn't changed since last pull
    config = get_config(args.to)
    if "lock" in config:
        lock = config["lock"]
        upstream_hash = compute_config_hash(upstream_config_text)
        if upstream_hash != lock.get("content-hash"):
            print("Warning: Upstream has changed since last pull "
                  f"(locked at {lock.get('pulled-at', 'unknown')})")
6. Add lock subcommand (cli.py)
A lightweight subcommand that fetches upstream, resolves the commit SHA and content hash, and writes the lock section without modifying the ruff config:
lock_parser = subparsers.add_parser(
    "lock",
    parents=[common_parser],
    help="Fetch upstream and update lock metadata without changing ruff config",
)
7. Add --no-lock flag to pull (cli.py)
pull_parser.add_argument(
    "--no-lock",
    action="store_true",
    help="Skip updating the lock metadata after pull.",
)
Backward Compatibility
- Lock is opt-in: if no [tool.ruff-sync.lock] section exists, behavior is unchanged.
- The lock section is written automatically on the first pull (can be disabled with --no-lock).
- check gracefully handles missing lock sections.
- Lock metadata is stored using tomlkit to preserve formatting of the rest of pyproject.toml.
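A minimal sketch of how check could tolerate a missing lock section, assuming the config is available as a plain mapping. The helper name lock_warning is hypothetical; the key names follow the [tool.ruff-sync.lock] example. Unlike the inline comparison in the plan, this version treats a lock without a content-hash as "nothing to compare" rather than as a mismatch, which is a design choice and not something the issue specifies.

```python
from __future__ import annotations

from typing import Any, Mapping


def lock_warning(config: Mapping[str, Any], upstream_hash: str) -> str | None:
    """Return a warning if upstream drifted from the lock, else None.

    Hypothetical helper; key names follow the [tool.ruff-sync.lock] example.
    """
    lock = config.get("lock")
    if not lock:
        return None  # no lock section: behave exactly as before
    locked_hash = lock.get("content-hash")
    if locked_hash is None or locked_hash == upstream_hash:
        return None  # nothing to compare, or upstream unchanged
    pulled_at = lock.get("pulled-at", "unknown")
    return f"Upstream has changed since last pull (locked at {pulled_at})"


print(lock_warning({}, "sha256:aa"))
print(lock_warning({"lock": {"content-hash": "sha256:aa"}}, "sha256:bb"))
```

Returning the message instead of printing it keeps the decision about exit codes and output format in check itself.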
Edge Cases
- ruff.toml targets: Lock metadata can't go in ruff.toml (no [tool] section). Options: (a) skip locking, (b) store in a nearby pyproject.toml, or (c) use a standalone .ruff-sync.lock file. Recommend (a) for MVP.
- Git clone upstreams: SHA is directly available from the clone; no API call needed.
- Non-GitHub/GitLab hosts: Content hash still works; commit SHA may not be resolvable (logged as info, not an error).
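The optional commit field can be sketched as a small builder that leaves the key out when no SHA was resolved. build_lock_info is a hypothetical name, and the now parameter exists only to make the timestamp testable. The sketch formats the timestamp with a trailing Z to match the example lock section, whereas isoformat() on a UTC datetime would produce a +00:00 suffix.

```python
from __future__ import annotations

import datetime as dt


def build_lock_info(
    upstream: str,
    content_hash: str,
    commit: str | None = None,
    now: dt.datetime | None = None,
) -> dict[str, str]:
    """Assemble lock metadata; the commit key is omitted when unresolved."""
    # Use the injected time if given, otherwise the current time, in UTC
    moment = (now or dt.datetime.now(dt.timezone.utc)).astimezone(dt.timezone.utc)
    lock_info = {
        "upstream": upstream,
        "content-hash": content_hash,
        # Seconds precision with a trailing "Z", as in the example lock section
        "pulled-at": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    if commit:
        lock_info["commit"] = commit
    return lock_info


fixed = dt.datetime(2026, 3, 15, 15, 30, tzinfo=dt.timezone.utc)
print(build_lock_info("https://example.com/pyproject.toml", "sha256:aa", now=fixed))
```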
Test Plan
- Unit test for compute_config_hash(): verify deterministic hashing.
- Unit test for _write_lock(): verify tomlkit preserves formatting when adding lock section.
- E2E test: pull writes lock, subsequent check passes, modify upstream, check reports both hash mismatch and content drift.
- --no-lock test: verify lock section is not written.
- lock subcommand test: verify it writes lock without modifying ruff config.
- Missing lock test: verify check works normally without lock section.
Files Changed
| File | Change |
|---|---|
| src/ruff_sync/core.py | LockInfo TypedDict, compute_config_hash(), resolve_commit_sha(), _write_lock(), update pull() and check() |
| src/ruff_sync/cli.py | lock subcommand, --no-lock flag, Arguments.no_lock field |
| src/ruff_sync/__init__.py | Export LockInfo |
| tests/test_basic.py | Hash and lock-write unit tests |
| tests/test_e2e.py | E2E lock lifecycle tests |
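The cli.py changes in the table could be wired together roughly as follows. This is a stand-alone sketch with argparse, not the project's actual parser: build_parser is a hypothetical name, and the --to option with its default is an assumption based on check() reading args.to.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a stand-in parser with pull, check and lock subcommands."""
    # Assumption: shared options live on a parent parser, as common_parser suggests
    common_parser = argparse.ArgumentParser(add_help=False)
    common_parser.add_argument("--to", default=".", help="Target project directory.")

    parser = argparse.ArgumentParser(prog="ruff-sync")
    subparsers = parser.add_subparsers(dest="command", required=True)

    pull_parser = subparsers.add_parser("pull", parents=[common_parser])
    pull_parser.add_argument(
        "--no-lock",
        action="store_true",
        help="Skip updating the lock metadata after pull.",
    )
    subparsers.add_parser("check", parents=[common_parser])
    subparsers.add_parser(
        "lock",
        parents=[common_parser],
        help="Fetch upstream and update lock metadata without changing ruff config",
    )
    return parser


args = build_parser().parse_args(["pull", "--no-lock"])
print(args.command, args.no_lock)
```

argparse exposes --no-lock as the attribute no_lock, which lines up with the Arguments.no_lock field listed for cli.py.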