Skip to content

fix: disable git auto-gc during dependency temp-dir clone+checkout#313

Closed
mchen-sentry wants to merge 1 commit intomainfrom
mchen/disable-gc-during-temp-clone
Closed

fix: disable git auto-gc during dependency temp-dir clone+checkout#313
mchen-sentry wants to merge 1 commit intomainfrom
mchen/disable-gc-during-temp-clone

Conversation

@mchen-sentry
Copy link
Copy Markdown
Member

We're seeing intermittent acceptance failures on getsentry/sentry where devservices up blows up with shutil.Error from _checkout_dependency. Attempt 1 of run 25064611924 / job 73428191911:

File ".../devservices/utils/dependencies.py", line 662, in _checkout_dependency
    shutil.copytree(temp_dir, dst=dependency_repo_dir)
shutil.Error: [('/tmp/tmpoydmzhs4/.git/objects/info/commit-graphs/tmp_graph_8REK9p',
                '.../chartcuterie/.git/objects/info/commit-graphs/tmp_graph_8REK9p',
                "[Errno 2] No such file or directory: '/tmp/tmpoydmzhs4/.git/objects/info/commit-graphs/tmp_graph_8REK9p'")]

The same job logs git version 2.54.0, and 2.54.0's release notes call out that git maintenance now defaults to the geometric strategy. The new default writes more tmp_graph_* and commit-graph-chain.lock fragments via the create-then-rename pattern git uses for atomic commit-graph chain updates. After our git checkout returns, gc.auto fires in the background and starts writing those fragments while copytree is still walking the temp tree: os.scandir enumerates a tmp file, then shutil.copy2 hits ENOENT because git has already renamed it.

Fundamentally none of this background bookkeeping is useful inside a temp dir we're about to bulk-copy: any work either gets carried over verbatim during copytree or redone later when the cache is actually used. #312 patched a related race in TemporaryDirectory's implicit cleanup (the rmtree when the with block exits), but copytree doesn't go through that cleanup machinery so ignore_cleanup_errors=True doesn't apply here. We disable auto-gc on the two git invocations inside the temp dir (the partial clone and the checkout) by passing -c gc.auto=0 -c maintenance.auto=false -c fetch.writeCommitGraph=false. That stops the background gc from ever running against the temp dir, killing both this copytree race and the rmtree race #312 was patching around.

(Considered also adding ignore=shutil.ignore_patterns('tmp_*', '*.lock') to the copytree as belt-and-suspenders, but disabling gc at the source is the cleaner fix; the patterns approach would still have a residual race for any non-tmp file that maintenance might touch.)

We're seeing intermittent `shutil.Error` from `_checkout_dependency`'s
copytree on getsentry/sentry CI now that runners ship git 2.54.0, whose
geometric maintenance default writes more commit-graph chain fragments
in the background. The fragments are written via create-then-rename and
race with copytree's scandir-then-copy2 walk.

#312 patched the related rmtree-on-cleanup race but doesn't help here
since copytree doesn't go through TemporaryDirectory's cleanup. We
disable auto-gc/maintenance/commit-graph writes on the two git calls
that run inside the temp dir, which stops the background bookkeeping
from ever touching files we're about to copy out.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant