Harden Doghouse trust, correctness, and character #5
Conversation
… --version and review --format-list
…ent with non-JSON handling
…ude/gemini/other)
…n; llm builder reads from config when env absent (failing)
…cmd when env missing
…nd flatten_review_threads service
…--enable-anna flag for now; tests for adapter and config write
…abbit_only; tests; update sample config
…I with secret code; wire cli tui command; tests for gh adapter
…; repo util for owner/repo; wire PR selection to comments
…adapter implements list_open_prs; author wired
…prs; comments viewer logs counts; gh adapter author+list_open_prs
…progress callback logs
…; TUI option to apply suggestion with auto commit; extend GitPort for add_and_commit; sample config reply flag
…d; tests; GitPort head_sha
…estion, replies, summary/help/batch send)
… gh CLI; TUI uses selector
Three narrative scene-endings, 5 variations each, rendered as dim italic paragraphs at natural exit points:
- Merge-ready (snapshot + watch): the officers' club moment. PhiedBach climbs down from the doghouse, the Sopwith Camel cools, BunBun waits with Red Bulls and crushed cans.
- Watch exit (Ctrl+C): the patrol ends. Stars, moonlight, the radar dish retracting, silence returning to the aerodrome.
- Export complete: the black box sealed. Wax stamps, oilcloth bundles, the evidence cabinet locked.

All scenes are grounded in the Snoopy doghouse / WWI aerodrome setting — not the LED Bike Shed Dungeon (which is the README's intro scene, not the CLI's world). Also fixed one remaining dungeon reference: "wingback chair" → "flying goggles on ze nail by ze doghouse door" in the officers' club spectacles variations.
Actionable comments posted: 27
♻️ Duplicate comments (5)
CHANGELOG.md (1)
7-8: ⚠️ Potential issue | 🟡 Minor
MD022 still failing: add blank lines after subsection headings.
Line 7, Line 16, and Line 31 need a blank line before the bullet list to keep the changelog lint-clean.
Patch
```diff
 ### Added
+
 - **Doghouse Flight Recorder**: A new agent-native engine for PR state reconstruction.
@@
 ### Fixed
+
 - **Merge-Readiness Semantics**: Formal approval state (`CHANGES_REQUESTED`, `REVIEW_REQUIRED`) is now separated from unresolved thread state. Stale `CHANGES_REQUESTED` no longer masquerades as active unresolved work when all threads are resolved.
@@
 ### Tests
+
 - Added blocker-semantics tests (review/thread interaction, verdict priority chain).
```

Also applies to: 16-17, 31-32
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@CHANGELOG.md` around lines 7 - 8, Add a blank line after each subsection heading so the bullet lists are separated from the headings (fix MD022). Specifically, update the "### Added" subsection instances (and any other "### ..." subsection headings around the Doghouse Flight Recorder entry and the other sections mentioned) to insert an empty line between the heading and the subsequent "-" bullet list so the changelog passes the linter.
src/doghouse/core/services/recorder_service.py (1)
45-45: ⚠️ Potential issue | 🔴 Critical
Severity comparison bug STILL PRESENT — lexicographic string ordering is WRONG.
Past review flagged this. `BlockerSeverity` enum values are strings like `"blocker"`, `"warning"`, `"info"`. String comparison: `"warning" > "info" > "blocker"`. This means if remote has `BLOCKER` severity and local has `WARNING` severity, the merged blocker gets `WARNING` because `"warning" > "blocker"` lexicographically. That's backwards. Use explicit ranking:
```diff
+SEVERITY_RANK = {"info": 0, "warning": 1, "blocker": 2}
 blocker_map = {b.id: b for b in remote_blockers}
 for b in local_blockers:
     if b.id in blocker_map:
         existing = blocker_map[b.id]
+        b_rank = SEVERITY_RANK.get(b.severity.value, 0)
+        existing_rank = SEVERITY_RANK.get(existing.severity.value, 0)
         blocker_map[b.id] = Blocker(
             id=b.id,
             type=b.type,
             message=b.message,
-            severity=b.severity if b.severity.value > existing.severity.value else existing.severity,
+            severity=b.severity if b_rank > existing_rank else existing.severity,
             is_primary=b.is_primary or existing.is_primary,
             metadata={**existing.metadata, **b.metadata}
         )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/doghouse/core/services/recorder_service.py` at line 45, The severity comparison is wrong because it compares string values lexicographically; instead, compare by explicit ranking: introduce a severity ranking (or use a numeric priority on the BlockerSeverity enum) and replace the expression in the merge (the line using severity=b.severity if b.severity.value > existing.severity.value else existing.severity) to pick the higher-priority severity by comparing their ranks (e.g., rank[b.severity.value] vs rank[existing.severity.value]) so that "blocker" < "warning" < "info" (or inverse depending on your priority) and the correct, more severe value is chosen; reference BlockerSeverity and the merge/assignment expression to locate the change.
src/doghouse/core/domain/delta.py (1)
96-166: ⚠️ Potential issue | 🟠 Major
`verdict` and `verdict_display` still ignore `is_primary`.
Both properties rank blockers purely by `BlockerType`, so the Blocking Matrix never affects the next-action text. The `is_primary` signal preserved from `src/doghouse/adapters/github/gh_cli_adapter.py:1-50` is currently display-only, which means primary/secondary splits can still narrate the wrong blocker. Mirror the same scoping in `verdict_display`, and add a regression test for a primary blocker losing to a secondary one.
Patch
```diff
-    all_current = self.added_blockers + self.still_open_blockers
-    if not all_current:
+    all_current = self.added_blockers + self.still_open_blockers
+    blockers_for_verdict = tuple(b for b in all_current if b.is_primary) or all_current
+    if not blockers_for_verdict:
         return "Merge ready! All blockers resolved. 🎉"
@@
-    if any(b.type == BlockerType.DIRTY_MERGE_STATE for b in all_current):
+    if any(b.type == BlockerType.DIRTY_MERGE_STATE for b in blockers_for_verdict):
         return "Resolve merge conflicts first! ⚔️"
@@
-    failing = [b for b in all_current if b.type == BlockerType.FAILING_CHECK]
+    failing = [b for b in blockers_for_verdict if b.type == BlockerType.FAILING_CHECK]
@@
-    threads = [b for b in all_current if b.type == BlockerType.UNRESOLVED_THREAD]
+    threads = [b for b in blockers_for_verdict if b.type == BlockerType.UNRESOLVED_THREAD]
@@
-    if any(b.type == BlockerType.PENDING_CHECK for b in all_current):
+    if any(b.type == BlockerType.PENDING_CHECK for b in blockers_for_verdict):
         return "Wait for CI to complete. ⏳"
@@
-    if any(b.type == BlockerType.NOT_APPROVED for b in all_current):
+    if any(b.type == BlockerType.NOT_APPROVED for b in blockers_for_verdict):
         return "Approval needed before merge. 📋"
@@
-    return f"Resolve remaining blockers: {len(all_current)} items. 🚧"
+    return f"Resolve remaining blockers: {len(blockers_for_verdict)} items. 🚧"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/doghouse/core/domain/delta.py` around lines 96 - 166, verdict and verdict_display ignore the Blocker.is_primary flag and only consider Blocker.type, so primary blockers can be shadowed by secondary ones; update both verdict and verdict_display to first filter all_current into primary_blockers = [b for b in all_current if getattr(b, "is_primary", False)] and use primary_blockers for the same priority checks when non-empty (fall back to all_current if primary_blockers is empty), referencing the existing attributes added_blockers, still_open_blockers, BlockerType, verdict, and verdict_display; also add a regression test that constructs a primary blocker and a secondary blocker of differing types to assert the primary blocker wins the verdict when present.
src/doghouse/cli/main.py (2)
607-615: ⚠️ Potential issue | 🟠 Major
`playback` is still tied to a source checkout.
The fallback walks from `__file__` back into `tests/...`, which helps from a subdirectory but still fails once this is installed as a console script without the repository tree. Ship the seeded playbacks as package data and resolve them via `importlib.resources` instead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/doghouse/cli/main.py` around lines 607 - 615, The current playback_path logic (variable playback_path in src/doghouse/cli/main.py) relies on a repository checkout; change it to load packaged playbacks via importlib.resources: add the tests/doghouse/fixtures/playbacks directory as package data in your packaging config (pyproject/setup) and then resolve the playback directory using importlib.resources.files("doghouse").joinpath("fixtures", "playbacks", name) (with a fallback to a local Path if desired). Replace the manual __file__ walk with this resource-based resolution and update the existence/error handling to use the resolved resource path or raise the existing console error if not found. Ensure the unique symbol playback_path and the surrounding playback resolution code are updated accordingly.
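A minimal sketch of the resource-based resolution the prompt describes. The package name and fixture layout are assumptions taken from the review text, not verified against the actual project:

```python
from importlib import resources
from pathlib import Path

def resolve_playback_dir(package: str = "doghouse") -> Path:
    # Hypothetical resolver: prefer packaged data so the console script
    # works without a repository checkout.
    try:
        candidate = resources.files(package) / "fixtures" / "playbacks"
        if candidate.is_dir():
            return Path(str(candidate))
    except ModuleNotFoundError:
        pass
    # Fallback for running from a source tree.
    return Path.cwd() / "tests" / "doghouse" / "fixtures" / "playbacks"
```

The key property is that `importlib.resources.files()` resolves data shipped inside the installed package, so the `__file__` walk (and its dependency on the repo tree) disappears.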
696-705: ⚠️ Potential issue | 🟠 Major
Reject non-positive watch intervals before the first poll.
`--interval 0` becomes a tight loop against GitHub and the JSONL store, and negative values fail only after the first cycle when `sleep()` raises. Guard this at command entry.
Patch
```diff
 def watch(
     pr: Optional[int] = typer.Option(None, "--pr", help="PR number"),
     repo: Optional[str] = typer.Option(None, "--repo", help="Repository (owner/name)"),
     interval: int = typer.Option(180, "--interval", help="Polling interval in seconds")
 ):
     """PhiedBach's Radar: Live monitoring of PR state."""
+    if interval < 1:
+        raise typer.BadParameter("--interval must be >= 1")
+
     repo, repo_owner, repo_name, pr = resolve_repo_context(repo, pr)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/doghouse/cli/main.py` around lines 696 - 705, The watch command accepts non-positive intervals which can cause tight loops or delayed errors; add an early validation in the watch(...) function to reject interval <= 0 before any polling or printing occurs. Inside watch (the function that calls resolve_repo_context and prints _WATCH_OPENING/_WATCH_INTERVAL), check if interval <= 0 and raise typer.BadParameter or call typer.echo + raise typer.Exit with a clear message like "--interval must be a positive integer" so the command aborts immediately; keep all existing messages and behavior otherwise.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pyproject.toml`:
- Line 7: Update the requests version constraint in the dependencies list to
require a minimum of 2.32.4 (replace "requests>=2.31" with "requests>=2.32.4" in
the dependencies entry) so known CVEs are excluded; after changing the
dependencies entry, regenerate your lockfile / reinstall deps to ensure the new
minimum takes effect.
In `@src/doghouse/adapters/github/gh_cli_adapter.py`:
- Around line 89-93: In the Blocker construction where you use
first_comment['id'] (creating id=f"thread-{first_comment['id']}"), avoid
KeyError by using first_comment.get('id') with a safe fallback (e.g. the thread
index, a generated uuid, or another stable identifier) and format that into the
f"thread-{...}" string; update the code in the function that builds blockers
(the Blocker(...) call in gh_cli_adapter.py) to use first_comment.get('id',
fallback) instead of direct indexing so malformed/missing id fields won't raise.
- Around line 94-100: Replace the broad "except Exception as e" around the call
to _run_gh_json that appends the Blocker with a narrow except that only catches
the expected failures (e.g., subprocess.CalledProcessError,
subprocess.TimeoutExpired, json.JSONDecodeError, KeyError) and logs/appends the
Blocker there; remove the bare except so unexpected exceptions propagate. Ensure
the except clause references those exception classes (importing subprocess and
json if needed) and keep the Blocker creation using the same
blockers.append(Blocker(...)) call and message formatting when handling these
specific errors.
- Around line 103-121: The current loop in gh_cli_adapter.py builds blocker IDs
using check_name which can be None, producing non-unique IDs like "check-None"
and breaking deduplication; update the logic that computes check_name (or the
blocker id) inside the loop over statusCheckRollup so that if both
check.get("context") and check.get("name") are missing you either skip that
check entirely or derive a unique fallback (e.g., use check.get("id") or append
the loop index) and apply the same fallback in both Blocker(...) constructions
(the f"check-{check_name}" id creation) to ensure IDs are meaningful and unique.
- Around line 114-121: The nested if can be flattened: replace the outer
elif/state + inner status check with a single condition that preserves current
logic — e.g. trigger the Blocker when state is one of ["PENDING","IN_PROGRESS"]
OR when state is one of ["QUEUED", None] and check.get("status") != "COMPLETED".
Update the conditional around variables state, check.get("status"), and
check_name in the gh_cli_adapter logic so the Blocker(...) creation (using
Blocker, BlockerType.PENDING_CHECK, BlockerSeverity.INFO) remains unchanged but
the nested if is removed.
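The exception-narrowing suggestion above can be sketched as follows. The wrapper name and the `gh` invocation details are assumptions for illustration, not the adapter's actual code:

```python
import json
import subprocess

def run_gh_json(args: list[str]) -> "dict | None":
    # Hypothetical wrapper: catch only the failures we expect from `gh`
    # (process errors, timeouts, bad JSON, missing binary) and let
    # anything unexpected propagate to the caller.
    try:
        out = subprocess.run(
            ["gh", *args], capture_output=True, text=True, check=True, timeout=30
        )
        return json.loads(out.stdout)
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired,
            json.JSONDecodeError, FileNotFoundError):
        return None
```

A bare `except Exception` here would also swallow programming errors (e.g. a `TypeError` in the call itself), which is exactly what the review asks to avoid.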
In `@src/doghouse/cli/main.py`:
- Around line 673-680: The repro bundle currently mixes local git history with
remote PR metadata by calling subprocess.run(["git","log"...]) and adding
git_log_recent to repro_bundle; change this so the bundle either includes PR
commits fetched from the GitHub adapter (use the adapter method that returns PR
commits/commits list for the given repo and pr and populate git_log_recent from
that), or omit git_log_recent unless you first verify the current working
directory matches the target repo and branch; update code around git_log /
repro_bundle to use the GitHub adapter's commits method (or remove the field)
and ensure metadata/snapshots remain sourced from the selected repo/pr (refer to
repro_bundle, git_log, and wherever the GitHub adapter/API calls are
implemented).
- Around line 480-489: The helper currently auto-detects the missing half when
only repo or pr is provided, which can mismatch PR numbers; change the logic so
_auto_detect_repo_and_pr() is only used when neither repo nor pr is supplied,
and if exactly one of repo or pr is provided raise a clear error (or
click.BadParameter) instructing the user to provide both flags together (or omit
both to auto-detect). Update the control flow around the variables repo and pr
and preserve the existing repo parsing (owner,name = repo.split("/",1) or
fallback) in the same function where _auto_detect_repo_and_pr is called.
In `@src/doghouse/core/domain/delta.py`:
- Around line 83-85: The head_changed property incorrectly returns True when
baseline_sha is None; update the head_changed implementation in the Delta class
(property head_changed) to return False if baseline_sha is None (i.e., only
compare baseline_sha and current_sha when a baseline exists) so that first-run
deltas do not report a spurious change; ensure you reference baseline_sha and
current_sha in the check and preserve the boolean semantics for normal
comparisons.
- Around line 73-81: The Delta dataclass currently uses mutable lists for
added_blockers, removed_blockers, and still_open_blockers which breaks the
intent of frozen=True; change their type annotations to tuple[Blocker, ...] and
use field(default_factory=tuple) for each, and then update
DeltaEngine.compute_delta() (the function that currently builds and returns
Delta instances) to construct tuples (e.g., tuple(...) around comprehensions or
use tuple(...) conversions) instead of lists when creating the Delta; also
consider mirroring this tuple-based immutability in Snapshot where list() is
used to keep domain classes consistent.
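The tuple-based immutability described above can be sketched with simplified stand-ins for the domain types (field names taken from the review, everything else assumed):

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Blocker:
    # Simplified stand-in for the real domain type.
    id: str
    message: str

@dataclass(frozen=True)
class Delta:
    # Tuples keep a frozen dataclass immutable at the container level;
    # lists would still be mutable through the frozen instance.
    added_blockers: tuple[Blocker, ...] = field(default_factory=tuple)
    removed_blockers: tuple[Blocker, ...] = field(default_factory=tuple)
    still_open_blockers: tuple[Blocker, ...] = field(default_factory=tuple)

# Builders convert comprehensions to tuples at construction time.
d = Delta(added_blockers=tuple(Blocker(id=f"b{i}", message="x") for i in range(2)))
```

With list fields, `d.added_blockers.append(...)` would silently mutate a "frozen" value; with tuples that operation simply does not exist.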
In `@src/doghouse/core/domain/snapshot.py`:
- Line 2: Remove the unused import by deleting "asdict" from the import
statement in snapshot.py (the dataclass import line: from dataclasses import
dataclass, field, asdict) so it becomes just "from dataclasses import dataclass,
field"; this removes dead code and resolves the unused-import warning.
- Around line 13-16: The Snapshot currently copies its top-level containers but
individual Blocker.metadata dicts remain shared; add a defensive shallow copy in
Blocker.__post_init__ so the frozen dataclass does not hold a reference to the
caller's mutable dict—inside Blocker.__post_init__ call object.__setattr__(self,
'metadata', dict(self.metadata)) to replace the reference with a copy
(preserving immutability while avoiding direct mutation from external
references).
- Around line 18-27: blocker_signature currently omits b.message which is
inconsistent with other code paths that treat message as meaningful (see
to_dict/from_dict and still_open_blockers); either include b.message in the
frozenset tuple returned by blocker_signature (update the tuple in def
blocker_signature to add b.message) so message changes affect is_equivalent_to
comparisons, or leave the tuple as-is but add a clear docstring to
blocker_signature stating message is intentionally ignored and add a unit test
(e.g., test_equivalent_ignores_message_changes in test_delta_engine.py) that
asserts snapshots differing only by blocker.message are considered equivalent;
update references to is_equivalent_to tests accordingly to ensure behavior is
covered.
In `@src/doghouse/core/services/recorder_service.py`:
- Line 10: RecorderService currently imports the concrete GitAdapter
(GitAdapter) from adapters, violating hexagonal boundaries; replace that
dependency by introducing an abstract GitPort interface under core/ports (e.g.,
GitPort) and have the existing GitAdapter implement GitPort, then update
RecorderService to depend on GitPort instead of GitAdapter and accept it via
constructor or setter injection; ensure the actual GitAdapter instance is wired
into RecorderService at the composition root (CLI layer) when composing objects.
- Line 56: Replace the naive timestamp construction at the call site using
timestamp=datetime.datetime.now() with a timezone-aware timestamp (e.g.,
timestamp=datetime.datetime.now(datetime.timezone.utc) or
timestamp=datetime.datetime.now(tz=datetime.timezone.utc)); update imports if
needed (use datetime.timezone or from datetime import timezone) and ensure the
change is applied where the timestamp argument is set (the
timestamp=datetime.datetime.now() expression in recorder_service.py).
- Around line 20-25: The constructor currently hides a concrete dependency by
doing self.git = git or GitAdapter(); change this to require an explicit
GitAdapter to be passed or explicitly fail when git is None to avoid implicit
instantiation: update the constructor signature (the git parameter) and the
initialization in the class (where self.git is assigned) so that if git is None
the constructor raises a clear exception (or make the parameter non-Optional),
and document the behavior; reference the GitAdapter type and the constructor
assignment location (self.git) when applying the change so callers/composition
root provide the concrete adapter instead of creating one here.
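The port-and-injection shape described in the three recorder_service.py items above can be sketched like this. `GitPort`, `RecorderService`, and the fail-fast constructor follow the prompt's wording; the method names and the fake adapter are assumptions:

```python
from abc import ABC, abstractmethod

class GitPort(ABC):
    # Abstract port living under core/ports; the core never imports adapters.
    @abstractmethod
    def head_sha(self) -> str: ...

class FakeGit(GitPort):
    # Stand-in for the concrete GitAdapter wired in at the composition root.
    def head_sha(self) -> str:
        return "abc123"

class RecorderService:
    def __init__(self, git: GitPort):
        # Fail fast instead of implicitly instantiating a concrete adapter.
        if git is None:
            raise ValueError("RecorderService requires an explicit GitPort")
        self.git = git

svc = RecorderService(FakeGit())
```

The CLI layer (composition root) constructs the real adapter and passes it in, so the service stays testable with fakes and the hexagonal boundary holds.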
In `@tests/doghouse/test_blocker_semantics.py`:
- Around line 66-75: The test test_review_required_is_warning_not_blocker is
tautological because it constructs a Blocker with
severity=BlockerSeverity.WARNING and then asserts that same value; instead,
remove or change this test so it does not verify a constructor round-trip and
move the REVIEW_REQUIRED -> severity assertion into the adapter-layer test that
exercises the mapping in gh_cli_adapter (the code that produces Blocker
instances from a REVIEW_REQUIRED payload). Specifically, delete the redundant
assertion in test_review_required_is_warning_not_blocker (or convert the test to
only validate Blocker construction behavior without assuming REVIEW_REQUIRED
semantics) and add a new adapter test that sends a REVIEW_REQUIRED payload
through the gh_cli_adapter mapping code and asserts the resulting
Blocker.severity == BlockerSeverity.WARNING.
In `@tests/doghouse/test_packaging.py`:
- Around line 55-73: The test currently named test_entry_point_module_importable
only checks for file existence; update it to actually import the module after
the existing file/existence checks: for each script entry (using scripts and
module_path = entry.split(":")[0]) call importlib.import_module(module_path)
(ensuring the src directory is on sys.path, e.g. by temporarily prepending
PROJECT_ROOT / "src" if needed), and on any Exception call pytest.fail with a
clear message including the entry name, module_path and the exception text; keep
the existing file existence assertions before attempting the import.
- Line 13: PROJECT_ROOT is computed by assuming three parent levels
(Path(__file__).parent.parent.parent), which is fragile; replace it with a
robust finder: implement a helper (e.g., find_project_root) that resolves
Path(__file__).resolve() and walks up parents looking for a repo marker
(pyproject.toml, setup.cfg, .git, or another agreed file), and set PROJECT_ROOT
= find_project_root(); update any test code that references PROJECT_ROOT to use
this finder so moving the test file won't break path resolution.
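A minimal version of the `find_project_root` helper the prompt sketches, assuming the marker files it names:

```python
from pathlib import Path

def find_project_root(start: Path,
                      markers: tuple = ("pyproject.toml", ".git")) -> Path:
    # Walk upward from `start` until a directory contains a repo marker.
    for candidate in [start, *start.parents]:
        if any((candidate / m).exists() for m in markers):
            return candidate
    raise FileNotFoundError(f"no project root found above {start}")
```

Unlike a fixed `parent.parent.parent` chain, this keeps working if the test file is moved deeper or shallower in the tree.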
In `@tests/doghouse/test_repo_context.py`:
- Around line 49-64: The test test_all_commands_share_resolve_repo_context is
brittle because it inspects source; instead mock resolve_repo_context and call
each command to assert the helper is invoked. Replace the inspect-based check
with a patch of doghouse.cli.resolve_repo_context (or the exact import used by
main) using unittest.mock.patch or pytest's monkeypatch, then call
main.snapshot, main.watch, and main.export with minimal required args/context
and assert the mock was called for each command; keep the test name and loop
over cmd_name to locate functions via getattr(main, cmd_name).
- Around line 19-23: In test_resolve_parses_owner_name_from_repo_string rename
the unused tuple elements returned by resolve_repo_context so static analysis
doesn't flag them — e.g. assign the first and fourth values to _repo and _pr (or
use single underscores _) instead of repo and pr, leaving owner and name as-is;
update the assignment to match resolve_repo_context(...) -> _repo, owner, name,
_pr.
- Around line 38-46: The test test_resolve_auto_detects_pr_only should assert
that the mocked _auto_detect_repo_and_pr is called with the provided repo string
(not None) when resolve_repo_context("my/repo", None) is invoked; update the
test to verify mock_detect was called once with the repo "my/repo" (using
mock_detect.assert_called_with or equivalent) so that _auto_detect_repo_and_pr
receives the repo context for PR inference while continuing to mock its return
value and assert returned repo/owner/name/pr from resolve_repo_context.
In `@tests/doghouse/test_snapshot.py`:
- Around line 70-84: Add a new unit test that documents the intentional behavior
of ignoring Blocker.message when computing equivalence: create two Blocker
instances with the same id/type/severity/is_primary but different message text,
wrap each in a Snapshot (use same head_sha and differing timestamps/metadata as
needed) and assert Snapshot.is_equivalent_to returns True; reference Blocker,
BlockerType, Snapshot, blocker_signature(), and is_equivalent_to so the test
clearly demonstrates message-only changes are considered equivalent.
- Around line 52-67: Add a new test in tests/doghouse/test_snapshot.py that
mirrors the severity-change test but flips the Blocker.is_primary flag to ensure
Snapshot.is_equivalent_to detects the change: create two Blocker instances with
the same id, type (BlockerType.NOT_APPROVED) and message but differing
is_primary (True vs False), build two Snapshots (using Snapshot with same
head_sha and different timestamps) each containing one blocker, and assert that
s1.is_equivalent_to(s2) is False; this exercises blocker_signature() and
validates that changes to is_primary break equivalence.
- Around line 1-100: Add a unit test that verifies Snapshot serialization
roundtrip by calling Snapshot.to_dict() and Snapshot.from_dict() and asserting
all data fields survive; specifically construct a Blocker with non-default
fields (use Blocker(..., severity=BlockerSeverity.WARNING, is_primary=False,
metadata={...})), build a Snapshot with a timezone-aware datetime, head_sha,
blockers list and metadata, then do roundtripped =
Snapshot.from_dict(original.to_dict()) and assert roundtripped.timestamp ==
original.timestamp, roundtripped.head_sha == original.head_sha, len(blockers)
matches, and every Blocker attribute (id, type, message, severity, is_primary,
metadata) plus Snapshot.metadata match the originals; place the test alongside
the other tests in tests/doghouse/test_snapshot.py and name it
test_roundtrip_serialization.
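The roundtrip pattern that test would exercise can be sketched with a simplified `Snapshot` (the real class has more fields; the ISO-8601 (de)serialization shown here is an assumption about how to_dict/from_dict behave):

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class Snapshot:
    timestamp: datetime
    head_sha: str

    def to_dict(self) -> dict:
        # Timezone-aware datetimes survive isoformat()/fromisoformat().
        return {"timestamp": self.timestamp.isoformat(), "head_sha": self.head_sha}

    @classmethod
    def from_dict(cls, d: dict) -> "Snapshot":
        return cls(timestamp=datetime.fromisoformat(d["timestamp"]),
                   head_sha=d["head_sha"])

original = Snapshot(timestamp=datetime(2024, 1, 1, tzinfo=timezone.utc),
                    head_sha="abc123")
roundtripped = Snapshot.from_dict(original.to_dict())
```

The test then asserts field-by-field equality between `original` and `roundtripped`, which is exactly what catches lossy serialization of enums, flags, or metadata dicts.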
In `@tests/doghouse/test_watch_persistence.py`:
- Around line 37-53: Add a new unit test named
test_message_only_change_does_not_persist in
tests/doghouse/test_watch_persistence.py that creates two Blocker instances with
the same id and type but different message text (e.g., b_v1 and b_v2),
constructs a Snapshot baseline using b_v1, calls _make_service with
head_sha="abc123", remote_blockers=[b_v2], and stored_baseline=baseline, then
invokes service.record_sortie("owner/repo", 1) and asserts
storage.save_snapshot.assert_not_called(); this verifies blocker_signature()
ignores message-only changes and prevents persisting an identical logical state.
- Around line 56-70: The test test_head_sha_change_persists currently only
asserts storage.save_snapshot was called; update it to verify the saved Snapshot
contents so we ensure RecorderService persisted the correct data: after
service.record_sortie("owner/repo", 1) inspect storage.save_snapshot.call_args
(or use assert_called_once_with) to extract the Snapshot argument (third
positional arg) and assert its head_sha == "new_sha" and any other important
fields (e.g., timestamp/blockers) as needed to guarantee the correct snapshot
was saved.
- Around line 14-34: Add a return type annotation to _make_service to satisfy
ANN202: annotate it as returning a tuple of the RecorderService and the storage
mock (e.g., -> tuple[RecorderService, MagicMock] or -> tuple[RecorderService,
Any] if you prefer a looser type), and ensure the needed names are imported
(from typing import Any if used, and MagicMock from unittest.mock) so static
analysis recognizes the types; reference the function _make_service, and the
returned values RecorderService and storage (currently a MagicMock).
---
Duplicate comments:
In `@CHANGELOG.md`:
- Around line 7-8: Add a blank line after each subsection heading so the bullet
lists are separated from the headings (fix MD022). Specifically, update the "###
Added" subsection instances (and any other "### ..." subsection headings around
the Doghouse Flight Recorder entry and the other sections mentioned) to insert
an empty line between the heading and the subsequent "-" bullet list so the
changelog passes the linter.
In `@src/doghouse/cli/main.py`:
- Around line 607-615: The current playback_path logic (variable playback_path
in src/doghouse/cli/main.py) relies on a repository checkout; change it to load
packaged playbacks via importlib.resources: add the
tests/doghouse/fixtures/playbacks directory as package data in your packaging
config (pyproject/setup) and then resolve the playback directory using
importlib.resources.files("doghouse").joinpath("fixtures", "playbacks", name)
(with a fallback to a local Path if desired). Replace the manual __file__ walk
with this resource-based resolution and update the existence/error handling to
use the resolved resource path or raise the existing console error if not found.
Ensure the unique symbol playback_path and the surrounding playback resolution
code are updated accordingly.
- Around line 696-705: The watch command accepts non-positive intervals which
can cause tight loops or delayed errors; add an early validation in the
watch(...) function to reject interval <= 0 before any polling or printing
occurs. Inside watch (the function that calls resolve_repo_context and prints
_WATCH_OPENING/_WATCH_INTERVAL), check if interval <= 0 and raise
typer.BadParameter or call typer.echo + raise typer.Exit with a clear message
like "--interval must be a positive integer" so the command aborts immediately;
keep all existing messages and behavior otherwise.
In `@src/doghouse/core/domain/delta.py`:
- Around line 96-166: verdict and verdict_display ignore the Blocker.is_primary
flag and only consider Blocker.type, so primary blockers can be shadowed by
secondary ones; update both verdict and verdict_display to first filter
all_current into primary_blockers = [b for b in all_current if getattr(b,
"is_primary", False)] and use primary_blockers for the same priority checks when
non-empty (fall back to all_current if primary_blockers is empty), referencing
the existing attributes added_blockers, still_open_blockers, BlockerType,
verdict, and verdict_display; also add a regression test that constructs a
primary blocker and a secondary blocker of differing types to assert the primary
blocker wins the verdict when present.
In `@src/doghouse/core/services/recorder_service.py`:
- Line 45: The severity comparison is wrong because it compares string values
lexicographically; instead, compare by explicit ranking: introduce a severity
ranking (or use a numeric priority on the BlockerSeverity enum) and replace the
expression in the merge (the line using severity=b.severity if b.severity.value
> existing.severity.value else existing.severity) to pick the higher-priority
severity by comparing their ranks (e.g., rank[b.severity.value] vs
rank[existing.severity.value]) so that "blocker" < "warning" < "info" (or
inverse depending on your priority) and the correct, more severe value is
chosen; reference BlockerSeverity and the merge/assignment expression to locate
the change.
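A minimal, self-contained sketch of the ranked-severity fix described above. The enum members mirror the string values quoted in the review; the helper name is assumed for illustration:

```python
from enum import Enum

class BlockerSeverity(Enum):
    INFO = "info"
    WARNING = "warning"
    BLOCKER = "blocker"

    @property
    def rank(self) -> int:
        # Explicit numeric priority: blocker > warning > info.
        return {"info": 0, "warning": 1, "blocker": 2}[self.value]

def more_severe(a: BlockerSeverity, b: BlockerSeverity) -> BlockerSeverity:
    # Compare by rank, never by string value, so BLOCKER beats WARNING
    # even though "warning" > "blocker" lexicographically.
    return a if a.rank > b.rank else b
```

The merge then picks `more_severe(b.severity, existing.severity)` instead of comparing `.value` strings, which is the bug the review flags.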
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 572da6cf-9aa4-469e-9a4d-5c09a79d2fc0
📒 Files selected for processing (13)
CHANGELOG.md
docs/code-reviews/PR5/55095b07e382e97bbf3a1e695ebffa01017c9bd9.md
pyproject.toml
src/doghouse/adapters/github/gh_cli_adapter.py
src/doghouse/cli/main.py
src/doghouse/core/domain/delta.py
src/doghouse/core/domain/snapshot.py
src/doghouse/core/services/recorder_service.py
tests/doghouse/test_blocker_semantics.py
tests/doghouse/test_packaging.py
tests/doghouse/test_repo_context.py
tests/doghouse/test_snapshot.py
tests/doghouse/test_watch_persistence.py
CRITICAL:
- Fix severity comparison bug: blocker merge used alphabetical string comparison on enum values (BLOCKER ranked below WARNING). Added explicit numeric rank property to BlockerSeverity.
- Fix architecture violation: RecorderService no longer imports from adapter layer. New GitPort ABC; GitAdapter implements it; callers provide the concrete adapter via constructor injection.

HIGH:
- Remove dead `history` Makefile target (command never existed)
- Fix empty-string pr_id args in gh CLI calls (conditional arg build)
- Guard against None check names in status check parsing
- Fix variable shadowing (local `snapshot` in `snapshot()` function)
- Move mid-module imports to top of file (PEP 8)

MEDIUM:
- Add timeouts to all subprocess.run calls in GitAdapter and export
- Narrow bare `except Exception` to specific exception types
- Validate repo names in storage adapter path construction
- Use `is None` checks in resolve_repo_context (not truthiness)
- Print absolute path in export output
- Defensive copy of Blocker.metadata in __post_init__
- Remove unused deps (requests, textual) from pyproject.toml
- Fix leading whitespace in pyproject.toml TOML keys
- Reduce CI permissions (pull-requests: write → read)
- Remove feature branch from CI push trigger
- Clean up unused imports across domain and adapter modules
- Modernize type annotations (list/dict/X|None) in all modified files

LOW:
- Move verdict_display + _V_* variation lists from domain to CLI layer as _theatrical_verdict() function (domain stays pure/deterministic)
- Remove trivial _pick() wrapper, use random.choice() directly
- Document structural source-inspection test intent

42 tests green.
Actionable comments posted: 3
♻️ Duplicate comments (16)
tests/doghouse/test_repo_context.py (2)
19-23: 🛠️ Refactor suggestion | 🟠 Major

Unused unpacked variables `repo` and `pr`.

Static analysis flagged this. You only assert on `owner` and `name`. Either assert on all four or underscore-prefix the unused ones.

```diff
 def test_resolve_parses_owner_name_from_repo_string():
     """The repo string should be split into owner and name."""
-    repo, owner, name, pr = resolve_repo_context("acme/widgets", 7)
+    _repo, owner, name, _pr = resolve_repo_context("acme/widgets", 7)
     assert owner == "acme"
     assert name == "widgets"
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/doghouse/test_repo_context.py` around lines 19 - 23, The test unpacks four values from resolve_repo_context into repo, owner, name, pr but only asserts owner and name; rename the unused variables to underscore-prefixed names (e.g., _repo and _pr) or include assertions for repo and pr to avoid unused-variable warnings; update the unpacking in test_resolve_parses_owner_name_from_repo_string accordingly to use _repo and _pr or add explicit asserts for repo and pr while keeping the checks for owner and name.
38-46: ⚠️ Potential issue | 🟡 Minor

Test doesn't verify `_auto_detect_repo_and_pr` was called.

You mock it, you return a value, but you never assert the mock was invoked. If the implementation silently skips detection, this test passes anyway. Add the assertion.

```diff
     assert repo == "my/repo"
     assert owner == "my"
     assert name == "repo"
     assert pr == 55
+    mock_detect.assert_called_once()
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/doghouse/test_repo_context.py` around lines 38 - 46, The test test_resolve_auto_detects_pr_only currently mocks _auto_detect_repo_and_pr but never asserts it was invoked; update the test to assert the mock was called (e.g., mock_detect.assert_called_once()) after calling resolve_repo_context("my/repo", None) to ensure _auto_detect_repo_and_pr was actually used during PR detection.

src/doghouse/core/services/recorder_service.py (1)
55-60: ⚠️ Potential issue | 🟡 Minor

`datetime.datetime.now()` without timezone — still naive.

Past review flagged this. Static analysis is still flagging it. Naive datetimes are ambiguous and cause comparison issues when serialized/deserialized across timezones.

```diff
 current_snapshot = Snapshot(
-    timestamp=datetime.datetime.now(),
+    timestamp=datetime.datetime.now(tz=datetime.timezone.utc),
     head_sha=head_sha,
     blockers=blockers,
     metadata=metadata
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/doghouse/core/services/recorder_service.py` around lines 55 - 60, The Snapshot timestamp is created with a naive datetime (datetime.datetime.now()) which should be timezone-aware; update the timestamp assignment when constructing current_snapshot to use a tz-aware UTC datetime (e.g. datetime.datetime.now(datetime.timezone.utc) or equivalent) so Snapshot.timestamp is unambiguous across timezones, and adjust imports if necessary to reference datetime.timezone.

src/doghouse/adapters/git/git_adapter.py (1)
42-49: ⚠️ Potential issue | 🟡 Minor

Still mapping all `git rev-list` failures to "no upstream configured."

Past review explicitly called this out. You're treating permission errors, corrupt repos, and actual git failures identically to "upstream not set." The fix was to check `stderr` for upstream-specific messaging.

🔧 Distinguish failure modes

```diff
 elif unpushed_res.returncode != 0:
-    # Upstream might be missing
-    blockers.append(Blocker(
-        id="local-no-upstream",
-        type=BlockerType.LOCAL_UNPUSHED,
-        message="Local branch has no upstream configured",
-        severity=BlockerSeverity.WARNING
-    ))
+    stderr_lower = unpushed_res.stderr.lower()
+    if "no upstream" in stderr_lower or "unknown revision" in stderr_lower:
+        blockers.append(Blocker(
+            id="local-no-upstream",
+            type=BlockerType.LOCAL_UNPUSHED,
+            message="Local branch has no upstream configured",
+            severity=BlockerSeverity.WARNING
+        ))
+    # Other failures silently ignored — git state indeterminate
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/doghouse/adapters/git/git_adapter.py` around lines 42 - 49, The current code maps any nonzero result from the git rev-list call (variable unpushed_res) to a LOCAL_UNPUSHED blocker; change this to inspect the command stderr for upstream-specific text (e.g., substrings like "no upstream", "no upstream configured for branch", or the exact git wording) and only append the Blocker with id "local-no-upstream" / type BlockerType.LOCAL_UNPUSHED when that message is present; for other failures (permission issues, corrupt repo, other git errors) create a distinct blocker (e.g., id "git-rev-list-failed" / type a suitable error type) with severity ERROR and include the stderr content in the message so diagnostics are preserved; ensure you read/compare unpushed_res.stderr as text (decode if bytes) rather than relying solely on returncode.

Makefile (1)
1-7: 🧹 Nitpick | 🔵 Trivial

No explicit `all` target, but `help` as default is reasonable.

Checkmake wants an `all` target, but your first recipe is `help`, which is a sensible default. If you want to silence the linter and be explicit:

♻️ Add explicit default

```diff
-.PHONY: dev-venv test snapshot playback watch export clean help
+.PHONY: all dev-venv test snapshot playback watch export clean help
+
+all: help

 VENV = .venv
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Makefile` around lines 1 - 7, Add an explicit all target to the Makefile so checkmake stops warning: create a top-level target named all that depends on help (or set it to .DEFAULT_GOAL = help), referencing the existing help target; update the Makefile so the first declared target is all (or set .DEFAULT_GOAL) and leave help as-is to preserve current behavior.

.github/workflows/ci.yml (1)
5-7: 🧹 Nitpick | 🔵 Trivial

YAML bracket spacing inconsistency.

YAMLlint is screaming at you. `[ main ]` has gratuitous whitespace padding. Either remove the spaces or quote `'on'` to silence the truthy warning while you're at it.

♻️ Cleaner YAML

```diff
-on:
+'on':
   push:
-    branches: [ main ]
+    branches: [main]
   pull_request:
-    branches: [ main ]
+    branches: [main]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/ci.yml around lines 5 - 7, Fix YAML lint errors by removing the extra spaces inside the array entries for both the branches key and the pull_request->branches key (change `[ main ]` to `[main]`) and, if present, quote the top-level on key as 'on' to avoid truthy warnings; update the branches and pull_request definitions accordingly so the YAML linter stops flagging spacing/truthy issues.

src/doghouse/core/services/delta_engine.py (1)
11-20: ⚠️ Potential issue | 🔴 Critical

Line 17 still leaks mutable state to Delta — commit claims "addressed" but the issue persists.

The `Snapshot` is frozen with a defensive copy in `__post_init__`, but that copy is then shared directly with Delta via `added_blockers=current.blockers`.

While `Snapshot` and `Delta` are both frozen against reassignment, their internal `list` objects remain mutable. Any caller mutating `delta.added_blockers` also mutates `snapshot.blockers`. Since `Delta` outlives the compute call and `Snapshot` can live longer than the baseline check, this shared mutable reference is a real footgun.

The past review requested `list(current.blockers)`. The commit notes claim "Addressed," but line 17 still passes the list directly. Either copy it here, or document that Delta must never mutate its blocker lists (and enforce that everywhere Delta is used).

Fix: Defensive copy

```diff
 return Delta(
     baseline_timestamp=None,
     current_timestamp=current.timestamp.isoformat(),
     baseline_sha=None,
     current_sha=current.head_sha,
-    added_blockers=current.blockers,
+    added_blockers=list(current.blockers),
     removed_blockers=[],
     still_open_blockers=[]
 )
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/doghouse/core/services/delta_engine.py` around lines 11 - 20, The Delta constructor is receiving a direct reference to Snapshot.blockers, leaking mutable state; change the baseline-none branch in compute to pass a defensive copy (e.g., use list(current.blockers)) into the Delta.added_blockers field (and similarly ensure any other Delta fields that receive existing lists are copied) so mutating Delta.added_blockers cannot mutate the Snapshot.blockers; locate the instantiation of Delta (symbol: Delta(... added_blockers=current.blockers ...)) and replace the passed list with a shallow copy.

CHANGELOG.md (1)
7-7: ⚠️ Potential issue | 🟡 Minor

Markdownlint is still going to fail under these headings.

MD022 wants a blank line after `### Added`, `### Fixed`, and `### Tests`. Right now each heading runs straight into its list.

Patch

```diff
 ### Added
+
 - **Doghouse Flight Recorder**: A new agent-native engine for PR state reconstruction.
@@
 ### Fixed
+
 - **Merge-Readiness Semantics**: Formal approval state (`CHANGES_REQUESTED`, `REVIEW_REQUIRED`) is now separated from unresolved thread state. Stale `CHANGES_REQUESTED` no longer masquerades as active unresolved work when all threads are resolved.
@@
 ### Tests
+
 - Added blocker-semantics tests (review/thread interaction, verdict priority chain).
```

Also applies to: 16-16, 49-49
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@CHANGELOG.md` at line 7, MD022 failures are caused by missing blank lines after the heading tokens; insert a single blank line immediately after each "### Added", "### Fixed", and "### Tests" heading so the heading is followed by an empty line before the list content (ensure the same fix is applied to the other occurrences noted by the reviewer).src/doghouse/core/domain/delta.py (1)
16-18: ⚠️ Potential issue | 🟠 Major

`head_changed` is wrong on first capture.

When `baseline_sha` is `None`, this still returns `True`, so first-run JSON/output reports a SHA change that never happened. Only compare once a baseline exists.

Patch

```diff
 @property
 def head_changed(self) -> bool:
-    return self.baseline_sha != self.current_sha
+    return self.baseline_sha is not None and self.baseline_sha != self.current_sha
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/doghouse/core/domain/delta.py` around lines 16 - 18, The head_changed property currently returns True when baseline_sha is None, causing a false-positive on first capture; update the head_changed getter in the Delta class (head_changed, referencing baseline_sha and current_sha) to only compare SHAs if baseline_sha is not None — i.e., return False when baseline_sha is None, otherwise return baseline_sha != current_sha.

src/doghouse/adapters/github/gh_cli_adapter.py (2)
112-129: ⚠️ Potential issue | 🟠 Major

`check-unknown` still collapses distinct checks.

If multiple rollup entries arrive without both `context` and `name`, they all become `check-unknown`, and `RecorderService` dedups them by `b.id`. Skip unnamed entries or derive a stable unique fallback instead of the plain `"unknown"` sentinel.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/doghouse/adapters/github/gh_cli_adapter.py` around lines 112 - 129, In the loop over data.get("statusCheckRollup", []) (inside gh_cli_adapter.py) avoid collapsing distinct unnamed checks by not using the literal "unknown" for check_name in the Blocker id; instead either skip entries that lack both context and name or generate a stable unique fallback (e.g., incorporate a hash or index) so the Blocker id (f"check-{check_name}") is unique per rollup entry; update the logic that sets check_name and the Blocker id creation accordingly to ensure RecorderService doesn't dedupe distinct checks into "check-unknown".
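One way to keep unnamed rollup entries distinct, sketched under stated assumptions: the `context` and `name` fields come from the review above, while the index-based fallback and the `check_id` helper name are illustrative, not the adapter's actual API.

```python
def check_id(entry: dict, index: int) -> str:
    """Derive a stable, unique blocker id for one status-check rollup entry.

    Named entries keep their human-readable id; unnamed entries fall back
    to their position in the rollup so two of them can never collide on
    a shared "check-unknown" id and get deduped away.
    """
    name = entry.get("context") or entry.get("name")
    if name:
        return f"check-{name}"
    # Positional fallback keeps distinct unnamed entries distinct.
    # (Skipping them entirely is the other option the review mentions.)
    return f"check-unnamed-{index}"


# Two unnamed entries now yield two different ids instead of one.
rollup = [{}, {"name": "lint"}, {}]
ids = [check_id(entry, i) for i, entry in enumerate(rollup)]
```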
60-87: ⚠️ Potential issue | 🟠 Major

The review-thread fetch still hard-stops at 100.

This query never asks for `pageInfo` or follows an `after` cursor, so large PRs silently lose unresolved threads after page one. That can undercount blockers and hand the delta engine a false merge-ready verdict.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/doghouse/adapters/github/gh_cli_adapter.py` around lines 60 - 87, The GraphQL query in gql_query and the call to _run_gh_json only fetch the first 100 reviewThreads and never follows cursors, so threads can be silently dropped for large PRs; modify gql_query to request reviewThreads(first: $first, after: $after) including pageInfo { endCursor hasNextPage } and the nodes, then implement a loop around the _run_gh_json call that supplies variables for first (e.g., 100) and after (initially null), accumulates nodes into threads, updates after with endCursor, and breaks when hasNextPage is false; ensure the subsequent calls set the -F param for after and preserve with_repo=False and actual_pr_id/name/owner variables used today.tests/doghouse/test_blocker_semantics.py (1)
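The cursor loop the prompt describes can be sketched against an injected page-fetching function. The connection shape (`nodes`, `pageInfo.endCursor`, `pageInfo.hasNextPage`) follows GitHub's GraphQL pagination convention; `fetch_page` here is a stand-in for the adapter's `_run_gh_json` call, not its real signature.

```python
def fetch_all_threads(fetch_page, page_size: int = 100) -> list[dict]:
    """Accumulate reviewThreads nodes across pages until hasNextPage is false."""
    threads: list[dict] = []
    after = None  # GraphQL convention: null cursor fetches the first page
    while True:
        page = fetch_page(first=page_size, after=after)
        conn = page["data"]["repository"]["pullRequest"]["reviewThreads"]
        threads.extend(conn["nodes"])
        info = conn["pageInfo"]
        if not info["hasNextPage"]:
            return threads
        after = info["endCursor"]  # resume from where this page ended


# Fake a two-page response to exercise the loop end to end.
pages = {
    None: {"data": {"repository": {"pullRequest": {"reviewThreads": {
        "nodes": [{"id": 1}],
        "pageInfo": {"hasNextPage": True, "endCursor": "c1"}}}}}},
    "c1": {"data": {"repository": {"pullRequest": {"reviewThreads": {
        "nodes": [{"id": 2}],
        "pageInfo": {"hasNextPage": False, "endCursor": None}}}}}},
}
all_threads = fetch_all_threads(lambda first, after: pages[after])
```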
85-94: ⚠️ Potential issue | 🟠 Major

This test is still a tautology.

It hardcodes `severity=BlockerSeverity.WARNING` in the constructor and then asserts that same value back, so a regression in `GhCliAdapter`'s `REVIEW_REQUIRED` mapping stays green. Put this assertion in an adapter-layer test that builds blockers from the real review-decision payload.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/doghouse/test_blocker_semantics.py` around lines 85 - 94, The test test_review_required_is_warning_not_blocker currently asserts a value it constructed and is therefore a tautology; replace it with an adapter-layer test that exercises the actual mapping in GhCliAdapter (the method that converts a review decision/payload into a Blocker) by feeding a real review-decision payload fixture and asserting the produced Blocker has id "review-required" (or BlockerType.NOT_APPROVED as applicable) and severity BlockerSeverity.WARNING; locate the adapter mapping method in GhCliAdapter (the function that returns Blocker instances from GitHub review payloads) and call that to build the Blocker instead of constructing Blocker(...) directly, then assert on the resulting Blocker.severity.src/doghouse/cli/main.py (3)
791-798: ⚠️ Potential issue | 🟠 Major

Reject non-positive watch intervals.

`--interval 0` becomes a tight poll loop, and negative values die on the first `sleep()`. Fail fast before starting the patrol.

Patch

```diff
 def watch(
     pr: Optional[int] = typer.Option(None, "--pr", help="PR number"),
     repo: Optional[str] = typer.Option(None, "--repo", help="Repository (owner/name)"),
     interval: int = typer.Option(180, "--interval", help="Polling interval in seconds")
 ):
     """PhiedBach's Radar: Live monitoring of PR state."""
+    if interval < 1:
+        raise typer.BadParameter("--interval must be >= 1")
+
     repo, repo_owner, repo_name, pr = resolve_repo_context(repo, pr)
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/doghouse/cli/main.py` around lines 791 - 798, The watch command accepts an interval that can be zero or negative; add an explicit validation in the watch function (after resolve_repo_context and before starting the polling loop) to reject non-positive values and fail fast—for example, if interval <= 0 then raise typer.BadParameter("--interval must be a positive integer") or call typer.echo(...) and raise typer.Exit(1); update the watch function (and any startup path that consumes interval) to ensure the check runs before any time.sleep or polling begins.
601-606: ⚠️ Potential issue | 🟠 Major

`--repo` can still import the wrong local state.

`GhCliAdapter` is repo-scoped now, but `RecorderService` still merges `GitAdapter().get_local_blockers()` from the current checkout, and `export()` still shells out to local `git log`. Running `doghouse ... --repo other/repo --pr 42` can still persist/report remote PR state mixed with unrelated local blockers and commit history.

Also applies to: 771-779, 803-806
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/doghouse/cli/main.py` around lines 601 - 606, RecorderService is inadvertently mixing local checkout data into remote repo work because you instantiate a plain GitAdapter() and let RecorderService call GitAdapter.get_local_blockers() and export() which shell out to the current working tree; fix by passing a repo-scoped GitAdapter into RecorderService (or have GhCliAdapter produce a repo-bound git adapter) so that RecorderService.record_sortie and export use that repo-scoped adapter instead of a default GitAdapter(); update the instantiation (where GhCliAdapter(repo_owner..., repo_name...), RecorderService(github, storage, engine, git=GitAdapter()) is created) and ensure the code paths at record_sortie and export consume the injected git adapter rather than calling a global or default GitAdapter methods.
581-590: ⚠️ Potential issue | 🟠 Major

Stop inventing repo context.

This resolver still splices a missing `--repo`/`--pr` from the current checkout, and it still rewrites a slashless `--repo foo` into `foo/foo`. Both cases silently point the command at the wrong PR.

Patch

```diff
 def resolve_repo_context(
     repo: Optional[str],
     pr: Optional[int]
 ) -> tuple[str, str, str, int]:
@@
-    if repo is None or pr is None:
+    if (repo is None) ^ (pr is None):
+        raise typer.BadParameter(
+            "Pass both --repo and --pr, or omit both for auto-detection."
+        )
+
+    if repo is None and pr is None:
         detected_repo, detected_pr = _auto_detect_repo_and_pr()
-        repo = repo if repo is not None else detected_repo
-        pr = pr if pr is not None else detected_pr
+        repo, pr = detected_repo, detected_pr

-    if "/" in repo:
-        owner, name = repo.split("/", 1)
-    else:
-        owner, name = repo, repo
+    if "/" not in repo:
+        raise typer.BadParameter("--repo must be in owner/name form.")
+
+    owner, name = repo.split("/", 1)

     return repo, owner, name, pr
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/doghouse/cli/main.py` around lines 581 - 590, The code in main.py currently auto-detects missing repo/pr via _auto_detect_repo_and_pr and silently converts a slashless repo ("foo") into "foo/foo"; instead, stop inventing context: if repo or pr is None, raise a clear usage error asking the user to pass --repo and --pr (do not call _auto_detect_repo_and_pr to fill them), and validate that repo contains a "/" (if not, raise an error instead of splitting into owner/name); update the block around repo/pr handling (the repo, pr variables and the repo "/" split logic) to enforce these checks and error messages.

src/doghouse/adapters/storage/jsonl_adapter.py (1)
30-45: ⚠️ Potential issue | 🟠 Major

One torn JSONL append still poisons the whole ledger.

`save_snapshot()` writes straight into the live file, and `list_snapshots()` still trusts every nonblank line. One truncated tail record will make `json.loads()` raise on the next read and take out `get_latest_snapshot()` plus any caller that lists/exports snapshots for that PR, instead of recovering from the bad tail record.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/doghouse/adapters/storage/jsonl_adapter.py` around lines 30 - 45, save_snapshot currently appends directly and can leave a truncated JSON line; make the append resilient by flushing and syncing the file after write (call f.flush() and os.fsync(f.fileno())) in save_snapshot (function save_snapshot) so the record is durably written; also make list_snapshots robust by catching json.JSONDecodeError when parsing lines in list_snapshots and ignoring a single trailing malformed line (log or silently break) instead of failing the entire load so Snapshot.from_dict(json.loads(...)) won't crash the whole listing when the last record is partial.
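The durable-append plus tolerant-read pattern the prompt asks for can be sketched like this. The function names are illustrative, not the adapter's actual `save_snapshot`/`list_snapshots` signatures; the assumption is that corruption only ever appears as a truncated final line.

```python
import json
import os
import tempfile


def append_record(path: str, record: dict) -> None:
    # Flush + fsync so a crash cannot leave a half-written line buffered.
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())


def read_records(path: str) -> list[dict]:
    # Tolerate a torn trailing line instead of failing the whole ledger.
    records: list[dict] = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                records.append(json.loads(line))
            except json.JSONDecodeError:
                break  # truncated tail record: stop, keep what parsed
    return records


# Simulate a crash mid-append and confirm earlier records survive.
path = os.path.join(tempfile.mkdtemp(), "snapshots.jsonl")
append_record(path, {"sha": "abc"})
append_record(path, {"sha": "def"})
with open(path, "a", encoding="utf-8") as f:
    f.write('{"sha": "tr')  # torn write: no closing brace, no newline
recs = read_records(path)
```

Note the `break` only shields against a damaged tail; mid-file corruption would still drop everything after it, which matches the prompt's "ignore a single trailing malformed line" scope.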
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/ci.yml:
- Around line 18-25: Add a pip cache step before the "Install" step in the CI
job that uses actions/cache (e.g., actions/cache@v4) to cache pip's
download/cache directory (path ~/.cache/pip) and use a stable cache key tied to
the repo's dependency lockfile(s) (for example using python-${{ runner.os
}}-pip-${{ hashFiles('**/poetry.lock','**/requirements.txt','**/pyproject.toml')
}}) with an appropriate restore-keys fallback; place this step immediately after
the actions/setup-python@v5 step so the "Install" step reuses the cached
packages and avoids re-downloading every run.
- Around line 19-21: CI currently pins actions/setup-python@v5 to
python-version: '3.12' while pyproject.toml declares support for Python 3.11;
update the workflow to test the claimed versions by replacing the single
python-version entry with a matrix (e.g., python-version: [ '3.11', '3.12' ]) in
the actions/setup-python step, or alternatively change the pyproject.toml
classifiers to only list 3.12 if you intend to drop 3.11 support; ensure the
step using actions/setup-python@v5 references the matrix variable
(matrix.python-version) so both versions run in CI.
In `@tests/doghouse/test_repo_context.py`:
- Line 6: Remove the unused MagicMock import from the import statement in the
test file: update the import line that currently reads "from unittest.mock
import patch, MagicMock" to only import the symbols actually used (e.g.,
"patch"), ensuring there are no unused imports remaining; search for references
to MagicMock in the file (e.g., in test functions) and if none exist, safely
delete MagicMock from the import list.
---
Duplicate comments:
In @.github/workflows/ci.yml:
- Around line 5-7: Fix YAML lint errors by removing the extra spaces inside the
array entries for both the branches key and the pull_request->branches key
(change `[ main ]` to `[main]`) and, if present, quote the top-level on key as
'on' to avoid truthy warnings; update the branches and pull_request definitions
accordingly so the YAML linter stops flagging spacing/truthy issues.
In `@CHANGELOG.md`:
- Line 7: MD022 failures are caused by missing blank lines after the heading
tokens; insert a single blank line immediately after each "### Added", "###
Fixed", and "### Tests" heading so the heading is followed by an empty line
before the list content (ensure the same fix is applied to the other occurrences
noted by the reviewer).
In `@Makefile`:
- Around line 1-7: Add an explicit all target to the Makefile so checkmake stops
warning: create a top-level target named all that depends on help (or set it to
.DEFAULT_GOAL = help), referencing the existing help target; update the Makefile
so the first declared target is all (or set .DEFAULT_GOAL) and leave help as-is
to preserve current behavior.
In `@src/doghouse/adapters/git/git_adapter.py`:
- Around line 42-49: The current code maps any nonzero result from the git
rev-list call (variable unpushed_res) to a LOCAL_UNPUSHED blocker; change this
to inspect the command stderr for upstream-specific text (e.g., substrings like
"no upstream", "no upstream configured for branch", or the exact git wording)
and only append the Blocker with id "local-no-upstream" / type
BlockerType.LOCAL_UNPUSHED when that message is present; for other failures
(permission issues, corrupt repo, other git errors) create a distinct blocker
(e.g., id "git-rev-list-failed" / type a suitable error type) with severity
ERROR and include the stderr content in the message so diagnostics are
preserved; ensure you read/compare unpushed_res.stderr as text (decode if bytes)
rather than relying solely on returncode.
In `@src/doghouse/adapters/github/gh_cli_adapter.py`:
- Around line 112-129: In the loop over data.get("statusCheckRollup", [])
(inside gh_cli_adapter.py) avoid collapsing distinct unnamed checks by not using
the literal "unknown" for check_name in the Blocker id; instead either skip
entries that lack both context and name or generate a stable unique fallback
(e.g., incorporate a hash or index) so the Blocker id (f"check-{check_name}") is
unique per rollup entry; update the logic that sets check_name and the Blocker
id creation accordingly to ensure RecorderService doesn't dedupe distinct checks
into "check-unknown".
- Around line 60-87: The GraphQL query in gql_query and the call to _run_gh_json
only fetch the first 100 reviewThreads and never follows cursors, so threads can
be silently dropped for large PRs; modify gql_query to request
reviewThreads(first: $first, after: $after) including pageInfo { endCursor
hasNextPage } and the nodes, then implement a loop around the _run_gh_json call
that supplies variables for first (e.g., 100) and after (initially null),
accumulates nodes into threads, updates after with endCursor, and breaks when
hasNextPage is false; ensure the subsequent calls set the -F param for after and
preserve with_repo=False and actual_pr_id/name/owner variables used today.
In `@src/doghouse/adapters/storage/jsonl_adapter.py`:
- Around line 30-45: save_snapshot currently appends directly and can leave a
truncated JSON line; make the append resilient by flushing and syncing the file
after write (call f.flush() and os.fsync(f.fileno())) in save_snapshot (function
save_snapshot) so the record is durably written; also make list_snapshots robust
by catching json.JSONDecodeError when parsing lines in list_snapshots and
ignoring a single trailing malformed line (log or silently break) instead of
failing the entire load so Snapshot.from_dict(json.loads(...)) won't crash the
whole listing when the last record is partial.
In `@src/doghouse/cli/main.py`:
- Around line 791-798: The watch command accepts an interval that can be zero or
negative; add an explicit validation in the watch function (after
resolve_repo_context and before starting the polling loop) to reject
non-positive values and fail fast—for example, if interval <= 0 then raise
typer.BadParameter("--interval must be a positive integer") or call
typer.echo(...) and raise typer.Exit(1); update the watch function (and any
startup path that consumes interval) to ensure the check runs before any
time.sleep or polling begins.
- Around line 601-606: RecorderService is inadvertently mixing local checkout
data into remote repo work because you instantiate a plain GitAdapter() and let
RecorderService call GitAdapter.get_local_blockers() and export() which shell
out to the current working tree; fix by passing a repo-scoped GitAdapter into
RecorderService (or have GhCliAdapter produce a repo-bound git adapter) so that
RecorderService.record_sortie and export use that repo-scoped adapter instead of
a default GitAdapter(); update the instantiation (where
GhCliAdapter(repo_owner..., repo_name...), RecorderService(github, storage,
engine, git=GitAdapter()) is created) and ensure the code paths at record_sortie
and export consume the injected git adapter rather than calling a global or
default GitAdapter methods.
- Around line 581-590: The code in main.py currently auto-detects missing
repo/pr via _auto_detect_repo_and_pr and silently converts a slashless repo
("foo") into "foo/foo"; instead, stop inventing context: if repo or pr is None,
raise a clear usage error asking the user to pass --repo and --pr (do not call
_auto_detect_repo_and_pr to fill them), and validate that repo contains a "/"
(if not, raise an error instead of splitting into owner/name); update the block
around repo/pr handling (the repo, pr variables and the repo "/" split logic) to
enforce these checks and error messages.
In `@src/doghouse/core/domain/delta.py`:
- Around line 16-18: The head_changed property currently returns True when
baseline_sha is None, causing a false-positive on first capture; update the
head_changed getter in the Delta class (head_changed, referencing baseline_sha
and current_sha) to only compare SHAs if baseline_sha is not None — i.e., return
False when baseline_sha is None, otherwise return baseline_sha != current_sha.
In `@src/doghouse/core/services/delta_engine.py`:
- Around line 11-20: The Delta constructor is receiving a direct reference to
Snapshot.blockers, leaking mutable state; change the baseline-none branch in
compute to pass a defensive copy (e.g., use list(current.blockers)) into the
Delta.added_blockers field (and similarly ensure any other Delta fields that
receive existing lists are copied) so mutating Delta.added_blockers cannot
mutate the Snapshot.blockers; locate the instantiation of Delta (symbol:
Delta(... added_blockers=current.blockers ...)) and replace the passed list with
a shallow copy.
In `@src/doghouse/core/services/recorder_service.py`:
- Around line 55-60: The Snapshot timestamp is created with a naive datetime
(datetime.datetime.now()) which should be timezone-aware; update the timestamp
assignment when constructing current_snapshot to use a tz-aware UTC datetime
(e.g. datetime.datetime.now(datetime.timezone.utc) or equivalent) so
Snapshot.timestamp is unambiguous across timezones, and adjust imports if
necessary to reference datetime.timezone.
In `@tests/doghouse/test_blocker_semantics.py`:
- Around line 85-94: The test test_review_required_is_warning_not_blocker
currently asserts a value it constructed and is therefore a tautology; replace
it with an adapter-layer test that exercises the actual mapping in GhCliAdapter
(the method that converts a review decision/payload into a Blocker) by feeding a
real review-decision payload fixture and asserting the produced Blocker has id
"review-required" (or BlockerType.NOT_APPROVED as applicable) and severity
BlockerSeverity.WARNING; locate the adapter mapping method in GhCliAdapter (the
function that returns Blocker instances from GitHub review payloads) and call
that to build the Blocker instead of constructing Blocker(...) directly, then
assert on the resulting Blocker.severity.
In `@tests/doghouse/test_repo_context.py`:
- Around line 19-23: The test unpacks four values from resolve_repo_context into
repo, owner, name, pr but only asserts owner and name; rename the unused
variables to underscore-prefixed names (e.g., _repo and _pr) or include
assertions for repo and pr to avoid unused-variable warnings; update the
unpacking in test_resolve_parses_owner_name_from_repo_string accordingly to use
_repo and _pr or add explicit asserts for repo and pr while keeping the checks
for owner and name.
- Around line 38-46: The test test_resolve_auto_detects_pr_only currently mocks
_auto_detect_repo_and_pr but never asserts it was invoked; update the test to
assert the mock was called (e.g., mock_detect.assert_called_once()) after
calling resolve_repo_context("my/repo", None) to ensure _auto_detect_repo_and_pr
was actually used during PR detection.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8f0a8be7-29b8-41e2-a09d-c5fb6639d833
📒 Files selected for processing (17)
- .github/workflows/ci.yml
- CHANGELOG.md
- Makefile
- docs/code-reviews/PR5/03e8896e0554bc4c5f54a2f68a17fdc1b183d55b.md
- pyproject.toml
- src/doghouse/adapters/git/git_adapter.py
- src/doghouse/adapters/github/gh_cli_adapter.py
- src/doghouse/adapters/storage/jsonl_adapter.py
- src/doghouse/cli/main.py
- src/doghouse/core/domain/blocker.py
- src/doghouse/core/domain/delta.py
- src/doghouse/core/domain/snapshot.py
- src/doghouse/core/ports/git_port.py
- src/doghouse/core/services/delta_engine.py
- src/doghouse/core/services/recorder_service.py
- tests/doghouse/test_blocker_semantics.py
- tests/doghouse/test_repo_context.py
- CI: add pip caching and Python 3.11+3.12 matrix
- CI: tighten publish.yml tag pattern (v[0-9]+ not v[0-9]*)
- Tests: remove unused MagicMock import, underscore unused vars
- Docs: remove contradictory "Initial Entries: (none yet)" from PRODUCTION_LOG.mg
- Docs: add MD022-required blank lines after CHANGELOG subsection headings
@coderabbitai review please 🥕

🧠 Learnings used. ✅ Actions performed: review triggered.
There was a problem hiding this comment.
Actionable comments posted: 10
♻️ Duplicate comments (1)
.github/workflows/publish.yml (1)
8-10: ⚠️ Potential issue | 🟠 Major

`id-token: write` at workflow level exposes OIDC to the build job. Still not addressed. This was flagged previously and remains unfixed. The `build` job executes `pip install hatch` and `hatch build` — arbitrary code from PyPI. Granting it `id-token: write` means a compromised build dependency could mint OIDC tokens. The `publish` job is the only legitimate consumer of this permission.

Move `id-token: write` to job-level permissions on `publish` only. The `build` job should inherit only `contents: read`.
🔒 Scope OIDC token to publish job only
```diff
 permissions:
   contents: read
-  id-token: write

 jobs:
   build:
     runs-on: ubuntu-latest
+    permissions:
+      contents: read
     steps: ...

   publish:
     needs: build
     runs-on: ubuntu-latest
     environment: pypi
+    permissions:
+      contents: read
+      id-token: write
     steps:
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/publish.yml around lines 8 - 10, Top-level workflow permissions currently include "id-token: write" which exposes OIDC to every job; remove "id-token: write" from the global permissions block and add it only to the publish job's permissions. Keep the build job's permissions limited to "contents: read" (so it inherits only read access) and in the publish job add both "contents: read" and "id-token: write" (or explicitly add id-token: write if contents is already inherited) to ensure only the publish job can mint OIDC tokens.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/ci.yml:
- Around line 17-19: The CI matrix currently uses the default fail-fast
behavior; add an explicit fail-fast setting to the strategy block to avoid
cancelling remaining matrix jobs on the first failure. Under the existing
strategy: matrix: python-version: ['3.11','3.12'] add fail-fast: false so the
matrix entries (python-version) will all run and provide per-version failure
visibility.
- Around line 15-16: The CI workflow job "test" currently has no timeout and can
run indefinitely; add a "timeout-minutes" key under the "test" job definition
(next to "runs-on") with a reasonable limit (e.g., 10 or 30) to prevent runaway
tests from consuming CI minutes; ensure the key is placed at the same
indentation level as "runs-on" in the "test" job block so GitHub Actions
recognizes it.
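Taken together, the two ci.yml suggestions above could look like the following sketch; the 15-minute cap is illustrative, not a value from the actual workflow:

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 15        # cap runaway runs instead of GitHub's 6-hour default
    strategy:
      fail-fast: false         # let every matrix entry finish for per-version visibility
      matrix:
        python-version: ['3.11', '3.12']
```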
In @.github/workflows/publish.yml:
- Around line 13-14: The workflow's jobs (e.g., the build job named "build" and
the publish job referenced around lines 29-31) are missing explicit timeouts;
add a timeout-minutes setting to each job declaration (for example under the job
keys "build" and "publish") to limit how long the job can run and avoid relying
on GitHub's 6-hour default—choose an appropriate minute value for each job based
on expected run-time and add timeout-minutes: <number> directly under the job
definition in the publish.yml file.
- Around line 20-21: The GitHub Actions step that currently runs "pip install
hatch" (the "Install hatch" run step) must be changed to pin the hatch package
to a fixed stable release; update that run step so it installs hatch version
1.16.5 instead of unpinned latest to mitigate supply-chain risk.
In `@CHANGELOG.md`:
- Around line 53-59: The seven consecutive bullets all start with "Added", which
reads monotonously; edit the listed bullets so a few use varied lead-ins (for
example replace some "Added ..." with "Covers ...", "Includes ...", "Verifies
...", or "Adds tests for ...") while preserving the exact meaning and test scope
for each item (the blocker-semantics, repo-context consistency, watch
persistence, snapshot equivalence, packaging smoke, severity rank ordering, and
theatrical verdict bullets); keep punctuation and tone consistent with the rest
of the changelog.
- Around line 39-49: The changelog contains overlapping bullets that repeat the
same fix areas; consolidate duplicate entries into one canonical bullet per area
by merging related lines such as "CI Permissions" and "CI/CD Security" into a
single CI/security bullet, combine "Publishing Hygiene" with "Refined tag
patterns and split build/publish steps" into one publishing bullet, merge "Error
Handling" with "Hardened subprocess calls..." into one error-handling bullet,
and merge "Import Paths" with "Fixed packaging bugs..." into a single
import/packaging bullet; update the remaining bullets (e.g., "Unused Imports",
"Modern Type Syntax", "Missing Import", "Core Immutability", "Deterministic
Delta", "Docs Drift") to be concise and non-overlapping so each fix area appears
once and preserves the original intent.
In `@PRODUCTION_LOG.mg`:
- Line 44: Replace the suspicious placeholder line "Timestamp: 2025-11-08
00:00:00" in PRODUCTION_LOG.mg with either the accurate recorded timestamp or an
explicit estimated marker; for example update that line to "Timestamp:
2025-11-08 ~00:00:00 (estimated; exact time not recorded)" or the correct ISO
timestamp, and ensure future entries follow the same convention so the
"Timestamp:" field never masquerades as precise when it is not.
- Line 31: Normalize the task identifier format across the production log by
converting all task entries to the canonical "DP-F-XX / Description" form;
update the existing entries "DP-F-20 / Sprint 0 planning", "DP-F-30 / Thread
verbs + Debug LLM (tests-first)", "DP-F-21" and "Refinement & CodeRabbit
Feedback" so each includes the DP-F-## ID followed by a concise description
(e.g., change "DP-F-21" to "DP-F-21 / <Description>" and "Refinement &
CodeRabbit Feedback" to "DP-F-?? / Refinement & CodeRabbit Feedback"), and
enforce this template for future entries in the PRODUCTION_LOG.mg task field.
- Around line 57-68: Add the missing mandatory "What could we have done
differently" section to the "Incident: Doghouse Reboot (The Great Pivot)" entry
and populate it with concrete, short learnings and action items; also update the
Timestamp under that incident from "2026-03-27" to the full required format
"YYYY-MM-DD HH:MM:SS local" (e.g., 2026-03-27 14:23:00 local) so the entry
complies with the template contract and includes precise timing information.
In `@tests/doghouse/test_repo_context.py`:
- Around line 10-65: Add a unit test to tests/doghouse/test_repo_context.py that
covers the edge case where the repo string has no slash: create a test function
(e.g., test_resolve_handles_repo_without_slash) that calls
resolve_repo_context("widgets", 1) and asserts repo == "widgets", owner ==
"widgets", name == "widgets", and pr == 1 so the fallback branch in
resolve_repo_context is exercised and protected from regressions.
---
Duplicate comments:
In @.github/workflows/publish.yml:
- Around line 8-10: Top-level workflow permissions currently include "id-token:
write" which exposes OIDC to every job; remove "id-token: write" from the global
permissions block and add it only to the publish job's permissions. Keep the
build job's permissions limited to "contents: read" (so it inherits only read
access) and in the publish job add both "contents: read" and "id-token: write"
(or explicitly add id-token: write if contents is already inherited) to ensure
only the publish job can mint OIDC tokens.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 17fc907b-e106-45ab-9999-f900ea3f7738
📒 Files selected for processing (6)
- .github/workflows/ci.yml
- .github/workflows/publish.yml
- CHANGELOG.md
- PRODUCTION_LOG.mg
- docs/code-reviews/PR5/60d0717b54c26fda363c9294750a9eb68f9d2820.md
- tests/doghouse/test_repo_context.py
- Scope id-token:write to publish job only (build job no longer has OIDC access — supply chain hardening)
- Pin hatch dependency to >=1.21,<2
- Add timeout-minutes to all CI/publish jobs
- Add fail-fast: false to test matrix for full visibility
- Add missing "What could we have done differently" sections to incidents 3 and 4 (mandatory per template)
- Add estimated timestamps where only dates were recorded
- Mark approximate midnight timestamp as estimated
- Normalize task ID format (DP-F-XX / context) across all incidents
- CHANGELOG: deduplicate overlapping Fixed bullets; vary test lead-ins
- Tests: add edge-case test for repo string without slash
@coderabbitai review please 🥕

🧠 Learnings used. ✅ Actions performed: review triggered.
- Fix head_changed returning True on initial snapshot (baseline_sha is None)
- Use timezone-aware datetime.now(utc) in recorder_service
- Reject non-positive watch intervals (guard interval >= 1)
- Distinguish "no upstream" from other git rev-list failures
- Lock fixture decoding to UTF-8 in playback_service
- Skip corrupt JSONL lines instead of crashing the reader
- Use .get() for thread comment ID (defensive against malformed API responses)
- Flatten nested pending-check condition (SIM102)
- Document export git-log limitation as known behavior
- Add tests: no-baseline delta, roundtrip serialization, is_primary equivalence, message-only equivalence, message-only persistence skip, entry-point rename
- Add return type annotation to test helper

48 tests green.
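The corrupt-line skip mentioned above can be illustrated with a minimal reader; this is a sketch of the idea, not the actual jsonl_adapter, which reads from files rather than strings:

```python
import json

def read_jsonl(text: str) -> list[dict]:
    # Skip corrupt or partially-written lines instead of crashing the reader.
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # tolerate corruption from interrupted writes
    return records

assert read_jsonl('{"a": 1}\nnot-json\n{"b": 2}') == [{"a": 1}, {"b": 2}]
```

The trade-off is silent data loss on corruption; logging a warning per skipped line would preserve observability.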
- MD022: Add blank lines around headings in archived docs, README,
SECURITY.md, SPRINTS.md, FEATURES.md, and mind/* docs
- MD029: Normalize ordered list numbering in DRIFT_REPORT.md
- MD031: Add blank lines around fenced code blocks in TECH-SPEC.md
- MD040: Add language tags to bare fences in SPEC.md
- Replace literal {response} placeholders in review artifact
- Note DP-US-0201 cross-reference as intentional in FEATURES.md
@coderabbitai review please 🥕

🧠 Learnings used. ✅ Actions performed: review triggered.
Summary
- Formal approval state (`CHANGES_REQUESTED`) separated from unresolved thread state. Stale review decisions no longer masquerade as active blockers. Verdict priority chain fixed (was dead code due to `is_primary` default). Approval-needed verdict added at Priority 4.
- `watch` and `export` now honor `--repo owner/name` via centralized `resolve_repo_context()`. Previously they silently queried the wrong repository.
- Fixed `pyproject.toml` readme path (`cli/README.md` → `README.md`). Editable install works.
- `record_sortie()` skips persistence when nothing meaningful changed (same head SHA + same blocker set). Sorties are episodes, not heartbeats.
- Added missing `Blocker` import to `recorder_service.py` (blocker merge would crash at runtime).

Test plan
- `pytest tests/doghouse -v` — 40 passed
- `pip install -e .` — editable install succeeds
- `doghouse --help` — CLI entry point works