Skip to content

fix: make Windows unit tests pass#244

Open
Eishaan-Khatri wants to merge 2 commits into
huggingface:mainfrom
Eishaan-Khatri:fix-windows-unit-suite
Open

fix: make Windows unit tests pass#244
Eishaan-Khatri wants to merge 2 commits into
huggingface:mainfrom
Eishaan-Khatri:fix-windows-unit-suite

Conversation

@Eishaan-Khatri
Copy link
Copy Markdown

fixed errors that occurred due to Windows-specific path and file-locking behavior:

  • Updated the backlog prioritization test so it checks path parts instead of assuming Unix-style / separators.
  • Replaced the POSIX-only fcntl session-file lock with a cross-platform lock helper that uses msvcrt on Windows and keeps fcntl on Unix.
  • Verified with uv run pytest tests/unit: 329 passed, 14 warnings

@Eishaan-Khatri Eishaan-Khatri force-pushed the fix-windows-unit-suite branch from 4e7a272 to 8d9e528 Compare May 7, 2026 15:01
@Eishaan-Khatri
Copy link
Copy Markdown
Author

The Windows unit suite passes locally with uv run pytest tests/unit (329 passed, 14 warnings). The failing Claude PR Review check appears to be a workflow permission issue on fork PRs rather than a code failure.

@Eishaan-Khatri Eishaan-Khatri force-pushed the fix-windows-unit-suite branch from 8d9e528 to 218f7ba Compare May 7, 2026 15:53
@Eishaan-Khatri
Copy link
Copy Markdown
Author

I have re-signed the latest commit, so the verified-signature requirement should now be resolved. The remaining failing check appears to be the Claude PR Review workflow permission issue for fork PRs.

I have used Codex for assistance in resolving the issue and creating pr.

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 11, 2026

Claude finished @lewtun's task in 2m 0s —— View job


Claude PR Review

No blocking issues — 1 P1, 2 P2.

The two fixes are small and targeted, and the test change is the right approach (Path.parts is the canonical way to compare path segments cross-platform). The lock helper works, but the Windows path has a couple of semantic differences from fcntl worth knowing about.


P1 — Windows path silently degrades shared lock to exclusive

agent/core/session_uploader.py:304-327

The exclusive keyword is honored on Unix (LOCK_SH vs LOCK_EX) but ignored on Windows: msvcrt.locking only supports exclusive locks (LK_LOCK / LK_NBLCK). There is no LK_SH equivalent. That means _read_session_file (exclusive=False) takes an exclusive lock on Windows, so two concurrent readers serialize where they would not on Unix.

Concrete consequence: the org uploader and the personal uploader subprocesses both call _read_session_file at the start of upload_session_as_file (line 485). On Windows they'll now block each other on that read. Each lock window is short (a single json.load), so the practical risk of LK_LOCK's ~10-second retry budget being exhausted and raising OSError is low — but it is nonzero, and it would surface as a hard failure in the outer except Exception at line 571 with no retry. On Unix the same scenario would just wait.

Two reasonable resolutions, either is fine:

  1. Document the divergence in the helper's docstring and accept it (the lock windows really are short).
  2. Catch OSError from msvcrt.locking in the Windows branch and degrade to a short bounded retry loop, so contention manifests the same way it does on Unix.

Fix this →


P2 — Dead local on Windows branch

agent/core/session_uploader.py:311mode = msvcrt.LK_LOCK is just a rename; inlining or dropping it would make the branch a couple of lines shorter. Pre-existing pattern was no intermediate, so this is cosmetic.

P2 — Lock-helper has no Windows-specific test

Existing tests/unit/test_session_uploader.py::test_update_upload_status_preserves_other_uploader_fields does exercise the new helper on every platform, which is enough to catch the obvious regression. A dedicated Windows-only contention test isn't worth the complexity, but it's worth noting the helper's Windows semantics are only covered by the happy path. Informational.


Confirmed correct

  • tests/unit/test_prioritize_backlog.py:721out.parts[-3:-1] == ("scratch", "backlog-prioritization") is the right pattern. Holds across ("/", "scratch", "backlog-prioritization", "20260504T123000Z") on POSIX and ("C:\\", "scratch", "backlog-prioritization", "20260504T123000Z") on Windows.
  • The _file_lock finally block correctly re-seeks before LK_UNLCK so a raised exception inside the with doesn't leave the file pointer past the locked byte.
  • Locking a single byte at offset 0 is a standard convention for advisory whole-file coordination; both reader and writer use the same byte, so they coordinate correctly.

  • Read changed files
  • Inspect callers of lock helpers (_read_session_file, _update_upload_status are the only call sites; both within upload_session_as_file and retry_failed_uploads)
  • Check for prior Claude review (none — first review on this PR)
  • Verify cross-platform lock semantics
  • Post review
    · fix-windows-unit-suite

@lewtun
Copy link
Copy Markdown
Member

lewtun commented May 11, 2026

Hi @Eishaan-Khatri thank you for the PR! Please address the Claude review and we can merge :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants