GH-120: Cancel in-progress learning when a new button is selected#199
Conversation
…ne place that can be updated.
… stdout, to minimize log agent noise
- _doc_*.md (17 files): Add a one-line Summary: to every architecture doc so agents can discover and triage docs via grep rather than a hardcoded index. - researcher-plan.md: Replace the hardcoded area→file table with a grep instruction that finds _doc_*.md files dynamically using Summary: lines. - researcher.md: Broaden the scope language beyond .NET-specific technologies to cover any domain the task touches (cloud, ML, backend, etc.). Add a TODO for the Microsoft Learn MCP tool names once they can be discovered. - developer-fix.md: Revamp the review comment workflow — developer fetches PR threads directly via GitHub MCP, always replies to each thread (agree or disagree), and posts a rationale rather than silently applying a disputed change. Add dotnet test --filter by class name to narrow per-fix test runs. - developer-implement.md: Add dotnet test --filter by class name to narrow per-layer test runs during TDD implementation. - researcher-validate.md: Add $TASK_BRIEF and $WORK_SUMMARIES template placeholders so the pipeline can inject context when calling the skill. - reviewer-review.md: Remove Priority 1 (requirements check) — this moves to researcher-validate in the signoff step. Renumber priorities 1–5. Update doc discovery to use grep instead of the CLAUDE.md table reference. - reviewer-sign-off.md: Add explicit thread resolution rules for the disagree case (accept rationale → resolve; reject → reply and leave unresolved). Base the approved/changes_requested decision on unresolved threads. Remove requirements from the new-issue scan. Remove redundant push step (pipeline now pushes before signoff runs). - implementation-pipeline.md: Consolidate validating-pr + reviewing-signoff into a single signoff state. Document that signoff runs reviewer-sign-off, researcher-validate, and script validation in parallel. - dev_team.py: Replace SignOffStep with SignoffStep that runs the three signoff tasks concurrently via ThreadPoolExecutor. Remove validating-pr from step_handlers. Push to remote before launching parallel tasks so the reviewer agent sees the latest commits on the PR. - delete old `team-task.md` skill and related subagent skills, since they are superseded by the new skills. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the WaitAsync(0) early-return guard with a cancel-and-wait pattern. A per-operation CancellationTokenSource (_currentLearnCts) is exchanged via Interlocked.Exchange before waiting on the semaphore, so a new button click cancels the previous cycle and the device re-enters learning mode for the new button. Duplicate clicks on the same button also cancel-and-restart. Key decisions: - learnCts.Token is linked into the inner linkedCts so preemption propagates through EnterLearningModeAsync and the polling loop. - Timeout-catch condition widens to exclude learnCts cancellation, preventing preemption from surfacing as a TimeoutException. - Interlocked.CompareExchange in finally ensures the shared field is only cleared by the holder that set it. - EventId 1009 (LearningAlreadyInProgress) removed; EventId 1010 (CancellingLearningForNewCommand) added. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Directory.Exists() returns false when .git is a file (git worktree layout). Path.Exists() accepts both files and directories, so it correctly finds the repo root in worktree checkouts, fixing all 29 AdaptiveRemote.Backend.ApiTests that were failing with "Could not find repository root (no .git directory found)".
Test Results448 tests 439 ✅ 2m 24s ⏱️ For more details on these failures, see this check. Results for commit 4be913d. ♻️ This comment has been updated with latest results. |
jodavis-claude
left a comment
There was a problem hiding this comment.
GH-120 Code Review
Overall the cancel-and-wait design is correct and well-structured. One correctness issue was found (latent, currently unreachable) and one style nit.
| .Verifiable(Times.Exactly(2)); | ||
|
|
||
| // Both calls use the same message; expect it to be shown twice | ||
| MockModalMessageService |
There was a problem hiding this comment.
nit: This test bypasses the Expect_ModalMessageService_ShowMessage helper and calls MockModalMessageService.Setup() directly. CONTRIBUTING.md requires wrapping all mock setup chains in Expect_* helpers.
The direct setup is necessary here because Expect_ModalMessageService_ShowMessage hard-codes Times.Once, and this scenario needs Times.Exactly(2) for the same message. Consider adding a times parameter overload to the helper:
private void Expect_ModalMessageService_ShowMessage(string expectedMessage, Times? times = null)
=> MockModalMessageService
.Setup(x => x.ShowMessageAsync(expectedMessage, It.IsAny<Func<CancellationToken, Task>>(), It.IsAny<bool>(), It.IsAny<CancellationToken>()))
.Returns(delegate (string message, Func<CancellationToken, Task> action, bool keepAlive, CancellationToken ct)
{
Assert.AreEqual(expectedMessage, message);
return action(ct);
})
.Verifiable(times ?? Times.Once);Both new tests could then use the helper consistently.
There was a problem hiding this comment.
Fixed. Added an optional Times? times parameter (defaulting to Times.Once()) to the Expect_ModalMessageService_ShowMessage helper. Updated the DuplicateClick test to call Expect_ModalMessageService_ShowMessage(message, Times.Exactly(2)) and removed the raw MockModalMessageService.Setup() block.
There was a problem hiding this comment.
Confirmed fixed. The helper now accepts (line 669) and the duplicate-click test calls (line 516). No raw calls remain. Nit resolved.
Moved WaitAsync inside the try block with a bool acquired guard so Interlocked.CompareExchange always runs in finally on every exit path. Without this, an OperationCanceledException from WaitAsync (e.g. outer CT fires before semaphore acquired) would leave _currentLearnCts pointing to a disposed CTS, causing ObjectDisposedException on the next ProgramAsync call.
Added an optional Times? parameter (defaulting to Times.Once()) so callers needing a different verification count don't have to bypass the helper. Updated DuplicateClick test to use Expect_ModalMessageService_ShowMessage with Times.Exactly(2) instead of raw MockModalMessageService.Setup().
jodavis-claude
left a comment
There was a problem hiding this comment.
GH-120 Sign-off Review
All prior review threads are resolved. No new issues were found in the three modified files (BroadlinkCommandService.cs, BroadlinkCommandServiceTests.cs, ServiceFixture.cs).
Thread 1 — P1 correctness (BroadlinkCommandService.cs): Resolved. WaitAsync is now inside the try block with an acquired boolean guard. CompareExchange fires in finally on every exit path, including OperationCanceledException from WaitAsync before semaphore acquisition. Release() is correctly conditioned on acquired.
Thread 2 — Style nit (BroadlinkCommandServiceTests.cs): Resolved. Expect_ModalMessageService_ShowMessage now accepts an optional Times? times parameter (defaulting to Times.Once()). The duplicate-click test uses the helper with Times.Exactly(2); no raw Setup calls remain.
New file — ServiceFixture.cs: Directory.Exists changed to Path.Exists for .git detection, correctly handling git worktrees where .git is a file rather than a directory. No issues.
Sign-off: approved.
The modal appears before EnterLearningModeAsync completes, so the test could call ProvideLearnedData before the device's HandleEnterLearningAsync lock ran. If the lock then ran after, it cleared the queued data and all subsequent polls returned "no data available", leaving the modal open indefinitely. Removing the clear is correct: HandleCheckLearnedDataAsync already clears _pendingLearnedData when it is consumed, so the clear in HandleEnterLearningAsync served no purpose except to create this race.
jodavis-claude
left a comment
There was a problem hiding this comment.
GH-120 Sign-off (commit 4be913d)
One commit was pushed after the previous sign-off (2026-05-23T02:49Z). This review covers that commit only.
New commit — 4be913d Fix race: HandleEnterLearningAsync cleared data queued by test
File: test/AdaptiveRemote.EndtoEndTests.TestServices/SimulatedBroadlink/SimulatedBroadlinkDevice.cs
The removed line (_pendingLearnedData = null) inside the HandleEnterLearningAsync lock was redundant: HandleCheckLearnedDataAsync already clears _pendingLearnedData when it is consumed (line 438). The only effect of the cleared line was a race window where a test calling ProvideLearnedData before HandleEnterLearningAsync acquired the lock would have its queued data silently discarded, causing the poll loop to hang indefinitely. The fix is correct and the commit message accurately explains the reasoning.
No new issues found. All prior review threads remain resolved.
Sign-off: approved.
Work item
GH-120 — When a user clicks a programmable button while another button's learning cycle is already running, the running cycle should be cancelled and the new button's learning should start immediately.
Changes
BroadlinkCommandService.cs— ReplacedWaitAsync(0)early-return guard with a cancel-and-wait pattern: added_currentLearnCtsfield; on eachProgramAsyncinvocation, the previous CTS is exchanged viaInterlocked.Exchangeand cancelled before waiting on the semaphore;learnCts.Tokenis linked into the body'slinkedCts; timeout-catch condition widened to exclude cancellation fromlearnCts;_currentLearnCtscleared atomically infinallyviaInterlocked.CompareExchange.MessageLogger.cs— Removed EventId 1009 (LearningAlreadyInProgress, which described the old silently-drop behaviour); added EventId 1010 (CancellingLearningForNewCommand) atLogLevel.Information.BroadlinkCommandServiceTests.cs— ReplacedWhileLearningAlreadyInProgress_ReturnsImmediately(which verified the wrong behaviour) withWhileOtherLearningInProgress_CancelsPreviousAndStartsNewandDuplicateClick_CancelsPreviousAndStartsSelf.ProgrammingMode.feature(WPF host) — Added scenario "Programming a command is preempted by clicking a different programmable button": clicks Mute, verifies modal, clicks Power, verifies Power modal replaces it, sends IR data, verifies Power is programmed and Mute is not.ServiceFixture.cs— ChangedDirectory.ExiststoPath.Existswhen walking for the repo root, fixing a test-infrastructure failure in git worktree checkouts.Design decisions
_currentLearnCtsset before semaphore wait (not after): This ensures a concurrent second request can always find and cancel the first request's CTS, even if the first request is still waiting for the semaphore. Three rapid clicks (A→B→C) result in A and B both cancelled and C completing — acceptable "last click wins" semantics.learnCtsis ausing-scoped CTS owned by the operation that created it.Interlocked.CompareExchangeinfinallyclears_currentLearnCtsonly if it still points to our CTS, so disposal fires exactly once regardless of how many preemptions occurred.WaitForLearningMode) to verify preemption:IsInLearningModestaystrueafter the first command is cancelled (it's a one-way latch in the simulated device), so checking the modal message is the reliable signal that Power's learning actually started.