Skip to content

GH-120: Cancel in-progress learning when a new button is selected#199

Closed
jodavis-claude wants to merge 13 commits into
feature/ADR-161-cusrtomization-servicefrom
dev/claude/fix-120-learning-mode-multiple-buttons
Closed

GH-120: Cancel in-progress learning when a new button is selected#199
jodavis-claude wants to merge 13 commits into
feature/ADR-161-cusrtomization-servicefrom
dev/claude/fix-120-learning-mode-multiple-buttons

Conversation

@jodavis-claude
Copy link
Copy Markdown
Collaborator

Work item

GH-120 — When a user clicks a programmable button while another button's learning cycle is already running, the running cycle should be cancelled and the new button's learning should start immediately.

Changes

  • BroadlinkCommandService.cs — Replaced WaitAsync(0) early-return guard with a cancel-and-wait pattern: added _currentLearnCts field; on each ProgramAsync invocation, the previous CTS is exchanged via Interlocked.Exchange and cancelled before waiting on the semaphore; learnCts.Token is linked into the body's linkedCts; timeout-catch condition widened to exclude cancellation from learnCts; _currentLearnCts cleared atomically in finally via Interlocked.CompareExchange.
  • MessageLogger.cs — Removed EventId 1009 (LearningAlreadyInProgress, which described the old silently-drop behaviour); added EventId 1010 (CancellingLearningForNewCommand) at LogLevel.Information.
  • BroadlinkCommandServiceTests.cs — Replaced WhileLearningAlreadyInProgress_ReturnsImmediately (which verified the wrong behaviour) with WhileOtherLearningInProgress_CancelsPreviousAndStartsNew and DuplicateClick_CancelsPreviousAndStartsSelf.
  • ProgrammingMode.feature (WPF host) — Added scenario "Programming a command is preempted by clicking a different programmable button": clicks Mute, verifies modal, clicks Power, verifies Power modal replaces it, sends IR data, verifies Power is programmed and Mute is not.
  • ServiceFixture.cs — Changed Directory.Exists to Path.Exists when walking for the repo root, fixing a test-infrastructure failure in git worktree checkouts.

Design decisions

  • _currentLearnCts set before semaphore wait (not after): This ensures a concurrent second request can always find and cancel the first request's CTS, even if the first request is still waiting for the semaphore. Three rapid clicks (A→B→C) result in A and B both cancelled and C completing — acceptable "last click wins" semantics.
  • Disposal ownership: learnCts is a using-scoped CTS owned by the operation that created it. Interlocked.CompareExchange in finally clears _currentLearnCts only if it still points to our CTS, so disposal fires exactly once regardless of how many preemptions occurred.
  • E2E modal transitions (not WaitForLearningMode) to verify preemption: IsInLearningMode stays true after the first command is cancelled (it's a one-way latch in the simulated device), so checking the modal message is the reliable signal that Power's learning actually started.

ElwoodMoves and others added 10 commits May 17, 2026 19:12
- _doc_*.md (17 files): Add a one-line Summary: to every architecture doc so
  agents can discover and triage docs via grep rather than a hardcoded index.

- researcher-plan.md: Replace the hardcoded area→file table with a grep
  instruction that finds _doc_*.md files dynamically using Summary: lines.

- researcher.md: Broaden the scope language beyond .NET-specific technologies
  to cover any domain the task touches (cloud, ML, backend, etc.). Add a TODO
  for the Microsoft Learn MCP tool names once they can be discovered.

- developer-fix.md: Revamp the review comment workflow — developer fetches PR
  threads directly via GitHub MCP, always replies to each thread (agree or
  disagree), and posts a rationale rather than silently applying a disputed
  change. Add dotnet test --filter by class name to narrow per-fix test runs.

- developer-implement.md: Add dotnet test --filter by class name to narrow
  per-layer test runs during TDD implementation.

- researcher-validate.md: Add $TASK_BRIEF and $WORK_SUMMARIES template
  placeholders so the pipeline can inject context when calling the skill.

- reviewer-review.md: Remove Priority 1 (requirements check) — this moves to
  researcher-validate in the signoff step. Renumber priorities 1–5. Update doc
  discovery to use grep instead of the CLAUDE.md table reference.

- reviewer-sign-off.md: Add explicit thread resolution rules for the disagree
  case (accept rationale → resolve; reject → reply and leave unresolved). Base
  the approved/changes_requested decision on unresolved threads. Remove
  requirements from the new-issue scan. Remove redundant push step (pipeline
  now pushes before signoff runs).

- implementation-pipeline.md: Consolidate validating-pr + reviewing-signoff
  into a single signoff state. Document that signoff runs reviewer-sign-off,
  researcher-validate, and script validation in parallel.

- dev_team.py: Replace SignOffStep with SignoffStep that runs the three signoff
  tasks concurrently via ThreadPoolExecutor. Remove validating-pr from
  step_handlers. Push to remote before launching parallel tasks so the reviewer
  agent sees the latest commits on the PR.

- delete old `team-task.md` skill and related subagent skills, since they are
  superseded by the new skills.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace the WaitAsync(0) early-return guard with a cancel-and-wait pattern.
A per-operation CancellationTokenSource (_currentLearnCts) is exchanged via
Interlocked.Exchange before waiting on the semaphore, so a new button click
cancels the previous cycle and the device re-enters learning mode for the
new button. Duplicate clicks on the same button also cancel-and-restart.

Key decisions:
- learnCts.Token is linked into the inner linkedCts so preemption propagates
  through EnterLearningModeAsync and the polling loop.
- Timeout-catch condition widens to exclude learnCts cancellation, preventing
  preemption from surfacing as a TimeoutException.
- Interlocked.CompareExchange in finally ensures the shared field is only
  cleared by the holder that set it.
- EventId 1009 (LearningAlreadyInProgress) removed; EventId 1010
  (CancellingLearningForNewCommand) added.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Directory.Exists() returns false when .git is a file (git worktree layout).
Path.Exists() accepts both files and directories, so it correctly finds the
repo root in worktree checkouts, fixing all 29 AdaptiveRemote.Backend.ApiTests
that were failing with "Could not find repository root (no .git directory found)".
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 23, 2026

Test Results

448 tests   439 ✅  2m 24s ⏱️
  8 suites    0 💤
  8 files      9 ❌

For more details on these failures, see this check.

Results for commit 4be913d.

♻️ This comment has been updated with latest results.

Copy link
Copy Markdown
Collaborator Author

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GH-120 Code Review

Overall the cancel-and-wait design is correct and well-structured. One correctness issue was found (latent, currently unreachable) and one style nit.

Comment thread src/AdaptiveRemote.App/Services/Broadlink/BroadlinkCommandService.cs Outdated
.Verifiable(Times.Exactly(2));

// Both calls use the same message; expect it to be shown twice
MockModalMessageService
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: This test bypasses the Expect_ModalMessageService_ShowMessage helper and calls MockModalMessageService.Setup() directly. CONTRIBUTING.md requires wrapping all mock setup chains in Expect_* helpers.

The direct setup is necessary here because Expect_ModalMessageService_ShowMessage hard-codes Times.Once, and this scenario needs Times.Exactly(2) for the same message. Consider adding a times parameter overload to the helper:

private void Expect_ModalMessageService_ShowMessage(string expectedMessage, Times? times = null)
    => MockModalMessageService
        .Setup(x => x.ShowMessageAsync(expectedMessage, It.IsAny<Func<CancellationToken, Task>>(), It.IsAny<bool>(), It.IsAny<CancellationToken>()))
        .Returns(delegate (string message, Func<CancellationToken, Task> action, bool keepAlive, CancellationToken ct)
        {
            Assert.AreEqual(expectedMessage, message);
            return action(ct);
        })
        .Verifiable(times ?? Times.Once);

Both new tests could then use the helper consistently.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. Added an optional Times? times parameter (defaulting to Times.Once()) to the Expect_ModalMessageService_ShowMessage helper. Updated the DuplicateClick test to call Expect_ModalMessageService_ShowMessage(message, Times.Exactly(2)) and removed the raw MockModalMessageService.Setup() block.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed fixed. The helper now accepts (line 669) and the duplicate-click test calls (line 516). No raw calls remain. Nit resolved.

Moved WaitAsync inside the try block with a bool acquired guard so
Interlocked.CompareExchange always runs in finally on every exit path.
Without this, an OperationCanceledException from WaitAsync (e.g. outer
CT fires before semaphore acquired) would leave _currentLearnCts
pointing to a disposed CTS, causing ObjectDisposedException on the next
ProgramAsync call.
Added an optional Times? parameter (defaulting to Times.Once()) so callers
needing a different verification count don't have to bypass the helper.
Updated DuplicateClick test to use Expect_ModalMessageService_ShowMessage
with Times.Exactly(2) instead of raw MockModalMessageService.Setup().
Copy link
Copy Markdown
Collaborator Author

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GH-120 Sign-off Review

All prior review threads are resolved. No new issues were found in the three modified files (BroadlinkCommandService.cs, BroadlinkCommandServiceTests.cs, ServiceFixture.cs).

Thread 1 — P1 correctness (BroadlinkCommandService.cs): Resolved. WaitAsync is now inside the try block with an acquired boolean guard. CompareExchange fires in finally on every exit path, including OperationCanceledException from WaitAsync before semaphore acquisition. Release() is correctly conditioned on acquired.

Thread 2 — Style nit (BroadlinkCommandServiceTests.cs): Resolved. Expect_ModalMessageService_ShowMessage now accepts an optional Times? times parameter (defaulting to Times.Once()). The duplicate-click test uses the helper with Times.Exactly(2); no raw Setup calls remain.

New file — ServiceFixture.cs: Directory.Exists changed to Path.Exists for .git detection, correctly handling git worktrees where .git is a file rather than a directory. No issues.

Sign-off: approved.

@jodavis-claude jodavis-claude marked this pull request as ready for review May 23, 2026 02:49
@jodavis-claude jodavis-claude requested a review from jodavis May 23, 2026 02:49
The modal appears before EnterLearningModeAsync completes, so the test could
call ProvideLearnedData before the device's HandleEnterLearningAsync lock ran.
If the lock then ran after, it cleared the queued data and all subsequent polls
returned "no data available", leaving the modal open indefinitely.

Removing the clear is correct: HandleCheckLearnedDataAsync already clears
_pendingLearnedData when it is consumed, so the clear in HandleEnterLearningAsync
served no purpose except to create this race.
Copy link
Copy Markdown
Collaborator Author

@jodavis-claude jodavis-claude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GH-120 Sign-off (commit 4be913d)

One commit was pushed after the previous sign-off (2026-05-23T02:49Z). This review covers that commit only.

New commit — 4be913d Fix race: HandleEnterLearningAsync cleared data queued by test

File: test/AdaptiveRemote.EndtoEndTests.TestServices/SimulatedBroadlink/SimulatedBroadlinkDevice.cs

The removed line (_pendingLearnedData = null) inside the HandleEnterLearningAsync lock was redundant: HandleCheckLearnedDataAsync already clears _pendingLearnedData when it is consumed (line 438). The only effect of the cleared line was a race window where a test calling ProvideLearnedData before HandleEnterLearningAsync acquired the lock would have its queued data silently discarded, causing the poll loop to hang indefinitely. The fix is correct and the commit message accurately explains the reasoning.

No new issues found. All prior review threads remain resolved.

Sign-off: approved.

@jodavis jodavis closed this May 24, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

It is possible to put two buttons into learning mode at the same time.

3 participants