Harden Doghouse trust, correctness, and character #5
Merged
+48,843
−7,405
69 commits
137e316
tests: scaffold failing CLI tests (version and PR list formatting)
flyingrobots 8a524f9
feat(cli): minimal Python entrypoint with fake GH adapter; implements…
flyingrobots e8c6eb2
tests: enforce no absolute path literals within repo (mac/home/window…
flyingrobots 25531f6
tests(core): add LoggingPort contract and non-JSON LLM handling tests…
flyingrobots 501c8db
core+ports: add LoggingPort, LlmPort, GitPort; implement process_comm…
flyingrobots 79751f4
tests(adapters): add failing tests for LLM command builder (codex/cla…
flyingrobots 69998cc
adapters(llm): implement env-driven command builder + runner with cla…
flyingrobots 4de0bec
tests(config+llm): config path under ~/.draft-punks/{repo}/config.jso…
flyingrobots e894fb3
adapters(config): read ~/.draft-punks/{repo}/config.json for llm/llm_…
flyingrobots cabae85
tests(github): add failing test to flatten paged review threads and c…
flyingrobots ece094f
core+ports+fake: add GitHubPort, domain types, fake paging adapter, a…
flyingrobots d12c5e4
docs: add examples/config.sample.json with llm, llm_cmd, force_json, …
flyingrobots 1ab66a5
feat(voice): add OSX say adapter + bonus mode enable service; hidden …
flyingrobots 56fb62f
voice(scope): add speak_comment_if_allowed enforcing read_scope=coder…
flyingrobots a84772e
feat: add gh CLI adapter with paging, Rich logger, minimal Textual TU…
flyingrobots 36d60dc
tui: add comment viewer with BunBun voice; gh adapter includes author…
flyingrobots 2eea504
tui+gh: PR picker lists open PRs; selection opens comment viewer; gh …
flyingrobots f4463b9
tui: add Textual log panel and progress; PR picker uses gh list_open_…
flyingrobots 65fee43
tui: add summary/push, help, rewrite via ; git port push methods; gh …
flyingrobots b5f54f0
build: add textual runtime dep; dev extras with pytest
flyingrobots 9e1ccac
tests(git): temp-repo test for is_commit/current/upstream/push
flyingrobots d283519
tests(gh): ensure adapters handle invalid/empty JSON without crashing
flyingrobots 1c59181
tui: header counters with %; autos [AUTO] badges in list
flyingrobots 193fa7d
feat(suggest): parse & apply CodeRabbit suggested replacements; tests…
flyingrobots b02ce02
feat(reply): gh post_reply; TUI thread replies on success when enable…
flyingrobots fef61aa
docs(cli): TUI README in PhiedBach voice (quickstart, keys, config)
flyingrobots 5aa7f98
tui: rewrite comments viewer stable (header, modal, autos, apply sugg…
flyingrobots 979229d
build: add requests for HTTP GitHub adapter
flyingrobots e280fc6
feat(gh-http): HTTP GitHub adapter using GH_TOKEN; selector tests for…
flyingrobots 9d04e22
feat(gh): auto-select HTTP adapter when GH_TOKEN present; fallback to…
flyingrobots 01697c8
gatos/git-mind: ref-native snapshot engine + JSONL API (v0.1)\n\n- Ad…
flyingrobots a4157a9
draft-punks: TUI compatibility & CLI polish\n\n- Textual OptionList c…
flyingrobots e0fe1fd
tools: bootstrap-git-mind exporter + Makefile target\n\n- Add tools/b…
flyingrobots 9971333
docs: add STORY.md — origin, rationale, and forward vision for GATOS …
flyingrobots e49f88b
docs: add IDEAS.md — git-message-bus, git chat, consensus/grants, CRD…
flyingrobots f93ed92
tests(git_mind): failing tests for JSONL thread verbs (list/select/sh…
flyingrobots 3d91ef3
feat(git_mind): add thread.list/select/show and llm.send debug path (…
flyingrobots 39af47b
docs: document thread.* and llm.send debug in CLI-STATE; log pytest e…
flyingrobots 6d59539
docs(ideas): add Git‑backed Redis concept — ref layout, semantics, TT…
flyingrobots 60f5b36
docs: integration plan — align GATOS git mind with git-kv (Stargate):…
flyingrobots fe78107
docs(doghouse): seed flight recorder design brief
flyingrobots d1308c7
feat(doghouse): reboot project into doghouse flight recorder engine\n…
flyingrobots 5145bd9
docs(doghouse): embellish identity and lore\n\n- Restore and enhance …
flyingrobots 6f07c95
docs(readme): add physical humor and finishing touches\n\n- Add Phied…
flyingrobots aee587e
feat(doghouse): implement radar, local state, and blocking matrix\n\n…
flyingrobots 4f339fb
opus(PR#5): seed rehearsal score (aee587e7aad9af37f73dd997dfbdef8dcbb…
56964e6
fix(doghouse): correct GitAdapter import path and add missing __init_…
flyingrobots cfcc3ee
opus(PR#5): seed rehearsal score (56964e6b72bbe7639f9c725c6e9f2327f75…
6d8640d
Fix: address CodeRabbit feedback and Doghouse dogfooding issues\n\n- …
flyingrobots d4def97
opus(PR#5): seed rehearsal score (6d8640d23be73ee61c9b962f90a4141768a…
939dfd6
fix(doghouse): harden trust and correctness across recorder stack
flyingrobots 0d09a5a
feat(doghouse): give PhiedBach and BunBun their missing moments
flyingrobots 7c1c88a
feat(doghouse): randomize all character dialog with 5 variations each
flyingrobots 6eb5f85
feat(doghouse): add closing scenes set at the doghouse aerodrome
flyingrobots 55095b0
Merge remote-tracking branch 'origin/feat/doghouse-reboot' into feat/…
flyingrobots 13388de
opus(PR#5): seed rehearsal score (55095b07e382e97bbf3a1e695ebffa01017…
03e8896
fix(doghouse): resolve 28 code review findings from self-review
flyingrobots c24784f
opus(PR#5): seed rehearsal score (03e8896e0554bc4c5f54a2f68a17fdc1b18…
ee55503
fix: address remaining CodeRabbit review findings
flyingrobots 60d0717
Merge remote-tracking branch 'origin/feat/doghouse-reboot' into feat/…
flyingrobots 95f450a
opus(PR#5): seed rehearsal score (60d0717b54c26fda363c9294750a9eb68f9…
199c784
fix(ci): harden workflow security and reliability
flyingrobots 542b760
fix(docs): bring PRODUCTION_LOG into template compliance
flyingrobots b556fb3
fix: address CodeRabbit round 2 nits
flyingrobots f95479f
Merge remote-tracking branch 'origin/feat/doghouse-reboot' into feat/…
flyingrobots a2d87d5
opus(PR#5): seed rehearsal score (f95479fe64543984c4151e40dbf3b880004…
6264881
fix(doghouse): resolve remaining CodeRabbit code review findings
flyingrobots e8d97fa
fix(docs): normalize markdown lint across all docs
flyingrobots 9a094ac
opus(PR#5): seed rehearsal score (e8d97fa14bf033ecf3ef3a85603c8816936…
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,34 @@ | ||
| name: CI | ||
|
|
||
| on: | ||
| push: | ||
| branches: [main] | ||
| pull_request: | ||
| branches: [main] | ||
|
|
||
| permissions: | ||
| contents: read | ||
| pull-requests: read | ||
|
|
||
| jobs: | ||
|
|
||
| test: | ||
| runs-on: ubuntu-latest | ||
| timeout-minutes: 10 | ||
| strategy: | ||
| fail-fast: false | ||
| matrix: | ||
| python-version: ['3.11', '3.12'] | ||
|
||
| steps: | ||
| - uses: actions/checkout@v4 | ||
| - uses: actions/setup-python@v5 | ||
| with: | ||
| python-version: ${{ matrix.python-version }} | ||
| cache: 'pip' | ||
| - name: Install | ||
| run: | | ||
| python -m pip install --upgrade pip | ||
| pip install -e .[dev] | ||
|
||
| - name: Run tests | ||
| run: | | ||
| pytest -q | ||
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,43 @@ | ||
| name: Publish | ||
|
|
||
| on: | ||
| push: | ||
| tags: | ||
| - 'v[0-9]+.[0-9]+.[0-9]+' | ||
|
|
||
| permissions: | ||
| contents: read | ||
|
|
||
| jobs: | ||
| build: | ||
| runs-on: ubuntu-latest | ||
|
||
| timeout-minutes: 10 | ||
| steps: | ||
| - uses: actions/checkout@v4 | ||
| - uses: actions/setup-python@v5 | ||
| with: | ||
| python-version: "3.12" | ||
| - name: Install hatch | ||
| run: pip install 'hatch>=1.21,<2' | ||
| - name: Build package | ||
| run: hatch build | ||
| - uses: actions/upload-artifact@v4 | ||
| with: | ||
| name: dist | ||
| path: dist/ | ||
|
|
||
| publish: | ||
| needs: build | ||
| runs-on: ubuntu-latest | ||
| timeout-minutes: 5 | ||
| environment: pypi | ||
| permissions: | ||
| contents: read | ||
| id-token: write | ||
| steps: | ||
| - uses: actions/download-artifact@v4 | ||
| with: | ||
| name: dist | ||
| path: dist/ | ||
| - name: Publish to PyPI | ||
| uses: pypa/gh-action-pypi-publish@release/v1 | ||
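To check locally which tags the tightened pattern in the Publish workflow is meant to accept, here is a minimal Python sketch. It assumes the workflow's filter reads `+` as "one or more of the preceding character", treats `.` as a literal dot, and must match the whole tag name. `TAG_RE` and `is_release_tag` are hypothetical names, not part of the repository.

```python
import re

# Assumption: regex approximation of the workflow tag filter
# 'v[0-9]+.[0-9]+.[0-9]+', with the dots escaped so they match literally.
TAG_RE = re.compile(r"v[0-9]+\.[0-9]+\.[0-9]+")


def is_release_tag(tag: str) -> bool:
    """Return True when the whole tag looks like vMAJOR.MINOR.PATCH."""
    return TAG_RE.fullmatch(tag) is not None
```

Under these assumptions a tag such as `v1.2.3` is accepted, while `v1.2`, `1.2.3`, and `v1.2.3-rc1` are rejected.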
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,52 @@ | ||
| # Changelog | ||
|
|
||
| All notable changes to this project will be documented in this file. | ||
|
|
||
| ## [Unreleased] | ||
|
|
||
| ### Added | ||
|
|
||
| - **Doghouse Flight Recorder**: A new agent-native engine for PR state reconstruction. | ||
| - **CLI Subcommands**: `snapshot`, `watch`, `playback`, `export`. | ||
| - **Blocking Matrix**: Logic to distinguish merge conflicts from secondary blockers. | ||
| - **Local Awareness**: Detection of uncommitted/unpushed local repository state. | ||
| - **Machine-Readable Output**: `--json` flag on `snapshot` for Thinking Automatons. | ||
| - **Repro Bundles**: `export` command to create "Manuscript Fragments" for debugging. | ||
| - **Snapshot Equivalence**: `Snapshot.is_equivalent_to()` for meaningful-change detection. | ||
|
|
||
| ### Fixed | ||
|
|
||
| - **Merge-Readiness Semantics**: Formal approval state (`CHANGES_REQUESTED`, `REVIEW_REQUIRED`) is now separated from unresolved thread state. Stale `CHANGES_REQUESTED` no longer masquerades as active unresolved work when all threads are resolved. | ||
| - **Verdict Priority Chain**: Fixed dead-code bug where `is_primary` default caused Priority 0 to swallow all BLOCKER-severity items. Merge-conflict check now uses explicit type match. Added approval-needed verdict at Priority 4. | ||
| - **Repo-Context Consistency**: `watch` and `export` now honor `--repo owner/name` via centralized `resolve_repo_context()`. Previously they silently ignored `--repo` and queried the wrong repository. | ||
| - **Packaging**: Fixed `pyproject.toml` readme path (`cli/README.md` → `README.md`). Editable install now works. | ||
| - **Watch Snapshot Spam**: `record_sortie()` no longer persists duplicate snapshots on identical polls. Only meaningful state transitions (head SHA change, blocker set change) create new ledger entries. | ||
| - **Severity Comparison Bug**: Blocker merge logic used alphabetical string comparison on enum values, causing BLOCKER to rank below WARNING. Now uses explicit numeric `rank` property. | ||
| - **Architecture Violation**: `RecorderService` no longer imports from the adapter layer. New `GitPort` ABC in `core/ports/`; `GitAdapter` implements it; callers provide the concrete adapter. | ||
| - **Dead Makefile Target**: Removed non-existent `history` command from Makefile. | ||
| - **Empty PR ID Args**: `gh pr view ""` replaced with conditional arg construction (omit pr_id when None). | ||
| - **Fragile Check Names**: Status checks with no `context` or `name` now default to `"unknown"` instead of producing `check-None` collisions. | ||
| - **Variable Shadowing**: Local `snapshot` variable in the `snapshot()` function no longer shadows the function name. | ||
| - **Mid-Module Imports**: `PlaybackService`, `Path`, `time` moved to top-of-file imports. | ||
| - **Missing Timeouts**: All `subprocess.run` calls in `GitAdapter` and `export` now have timeouts. | ||
| - **Broad Except**: GraphQL thread fetch now catches specific exceptions instead of a blanket `Exception`. | ||
| - **Repo Name Validation**: Storage adapter validates repo names against `[\w.-]+` pattern. | ||
| - **Resolve Truthiness**: `resolve_repo_context` uses `is None` checks instead of falsy checks. | ||
| - **Export Absolute Path**: Export now prints the absolute path of the repro bundle. | ||
| - **Blocker Metadata Copy**: `Blocker.__post_init__` now defensively copies `metadata` dict. | ||
| - **Domain Purity**: `verdict_display` and all randomized variation lists moved from domain layer to CLI presentation layer. | ||
| - **Unused Dependencies**: Removed `requests` and `textual` from `pyproject.toml`. | ||
| - **CI/CD Hardening**: Scoped `id-token:write` to publish job only; added job timeouts and `fail-fast: false`; pinned hatch; reduced `pull-requests` to read; tightened tag pattern. | ||
| - **Code Hygiene**: Removed unused imports across domain and adapter modules; modernized type annotations to `list`/`dict`/`X | None` syntax; added `Blocker` import to `recorder_service.py`. | ||
| - **Core Immutability**: Snapshot and Blocker objects own defensive copies of mutable data. | ||
| - **Deterministic Delta**: Sorted blocker IDs for stable output across runs. | ||
| - **Docs Drift**: Archived legacy TUI documentation; brought PRODUCTION_LOG incidents into template compliance. | ||
|
|
||
| ### Tests | ||
|
|
||
| - Covers blocker-semantics interactions (review/thread, verdict priority chain, severity ranking). | ||
| - Verifies repo-context consistency (all commands use `resolve_repo_context`). | ||
| - Pins watch persistence behavior (dedup on identical polls, persist on meaningful change). | ||
| - Validates snapshot equivalence and blocker signature. | ||
| - Includes packaging smoke tests (readme path, metadata, entry point). | ||
| - Exercises theatrical verdict variations from CLI presentation layer. |
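The Severity Comparison Bug entry in the changelog above is easy to reproduce in isolation. The sketch below is illustrative only. The `Severity` members, their string values, and the `more_severe` helper are assumptions; only the idea of an explicit numeric `rank` property comes from the changelog.

```python
from enum import Enum


class Severity(Enum):
    """Hypothetical severity levels; the real enum may differ."""

    INFO = "INFO"
    WARNING = "WARNING"
    BLOCKER = "BLOCKER"

    @property
    def rank(self) -> int:
        # Explicit numeric ordering, independent of the string values.
        return {"INFO": 0, "WARNING": 1, "BLOCKER": 2}[self.value]


def more_severe(a: Severity, b: Severity) -> Severity:
    """Return whichever severity ranks higher (hypothetical helper)."""
    return a if a.rank >= b.rank else b


# Comparing the raw string values sorts alphabetically,
# so "BLOCKER" < "WARNING" and WARNING wins the comparison.
buggy = max(Severity.BLOCKER, Severity.WARNING, key=lambda s: s.value)

# Comparing by rank keeps BLOCKER on top.
fixed = more_severe(Severity.BLOCKER, Severity.WARNING)
```

Here `buggy` ends up as `Severity.WARNING` and `fixed` as `Severity.BLOCKER`, which is the inversion the changelog describes.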
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| .PHONY: dev-venv test snapshot playback watch export clean help | ||
|
|
||
| VENV = .venv | ||
| PYTHON = $(VENV)/bin/python3 | ||
| PIP = $(VENV)/bin/pip | ||
|
|
||
| help: | ||
| @echo "Doghouse Makefile" | ||
| @echo " dev-venv: Create venv and install dependencies" | ||
| @echo " test: Run unit tests" | ||
| @echo " snapshot [PR=id]: Capture PR state" | ||
| @echo " playback NAME=name: Run a playback fixture" | ||
| @echo " watch [PR=id]: Monitor PR live" | ||
| @echo " export [PR=id]: Create repro bundle" | ||
|
|
||
| dev-venv: | ||
| python3 -m venv $(VENV) | ||
| $(PIP) install --upgrade pip | ||
| $(PIP) install -e .[dev] | ||
|
|
||
| test: | ||
| PYTHONPATH=src $(PYTHON) -m pytest tests/doghouse | ||
|
|
||
| snapshot: | ||
| @if [ -z "$(PR)" ]; then PYTHONPATH=src $(PYTHON) -m doghouse.cli.main snapshot; \ | ||
| else PYTHONPATH=src $(PYTHON) -m doghouse.cli.main snapshot --pr $(PR); fi | ||
|
|
||
| playback: | ||
| @if [ -z "$(NAME)" ]; then echo "Usage: make playback NAME=pb1_push_delta"; exit 1; fi | ||
| PYTHONPATH=src $(PYTHON) -m doghouse.cli.main playback $(NAME) | ||
|
|
||
| watch: | ||
| @if [ -z "$(PR)" ]; then PYTHONPATH=src $(PYTHON) -m doghouse.cli.main watch; \ | ||
| else PYTHONPATH=src $(PYTHON) -m doghouse.cli.main watch --pr $(PR); fi | ||
|
|
||
| export: | ||
| @if [ -z "$(PR)" ]; then PYTHONPATH=src $(PYTHON) -m doghouse.cli.main export; \ | ||
| else PYTHONPATH=src $(PYTHON) -m doghouse.cli.main export --pr $(PR); fi | ||
|
|
||
| clean: | ||
| rm -rf build/ dist/ *.egg-info | ||
| find . -type d -name "__pycache__" -exec rm -rf {} + |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,85 @@ | ||
| # Draft Punks — Production Log | ||
|
|
||
| Guideline: Append an entry for any unexpected/unanticipated work, dependency, requirement, or risk we discover during implementation and testing. | ||
|
|
||
| Template | ||
|
|
||
| ````markdown | ||
| ## Incident: <title> | ||
|
|
||
| Timestamp: <YYYY-MM-DD HH:MM:SS local> | ||
|
|
||
| Task: <current task id> | ||
|
|
||
| ### Problem | ||
|
|
||
| <problem description> | ||
|
|
||
| ### Resolution | ||
|
|
||
| <resolution> | ||
|
|
||
| ### What could we have done differently | ||
|
|
||
| <how could this have been anticipated? how should we have planned for this? what can we do better next time to avoid this sort of issue again?> | ||
| ```` | ||
|
|
||
| ## Incident: Product Pivot to CLI-Only (Git-backed State) | ||
|
|
||
| Timestamp: 2025-11-07 19:07:32 | ||
|
|
||
| Task: DP-F-20 / Sprint 0 planning | ||
|
||
|
|
||
| ### Problem | ||
| TUI cannot be driven programmatically in our harness and is slower to iterate for both humans and LLMs. | ||
|
|
||
| ### Resolution | ||
| Pivot to a CLI-only experience with a Git-backed state repo and JSONL stdio server. Update SPRINTS.md, add CLI-STATE.md, and refocus FEATURES/TASKLIST over time. | ||
|
|
||
| ### What could we have done differently | ||
| Call out environment constraints earlier and consider dual-mode from day one. Favor CLI-first for automation-heavy tools; treat TUI as an optional skin over the same state engine. | ||
|
|
||
| ## Incident: Local test runner missing (pytest not installed) | ||
|
|
||
| Timestamp: 2025-11-08 ~00:00:00 (estimated; exact time not recorded) | ||
|
|
||
| Task: DP-F-30 / Thread verbs + Debug LLM (tests-first) | ||
|
|
||
| ### Problem | ||
| The environment lacks `pytest`, so tests could not be executed immediately after adding failing tests. | ||
|
|
||
| ### Resolution | ||
| Committed failing tests first, then implemented the features. Left tests in place for local/CI execution. Next dev step is `make dev-venv && . .venv/bin/activate && pip install -e .[dev] && pytest`. | ||
|
|
||
| ### What could we have done differently | ||
| Include a lightweight script or Makefile target that ensures a dev venv with pytest is provisioned before test steps, or run tests inside CI where the toolchain is guaranteed. | ||
|
|
||
| ## Incident: Doghouse Reboot (The Great Pivot) | ||
|
|
||
| Timestamp: 2026-03-27 14:00:00 (estimated) | ||
|
|
||
| Task: DP-F-21 / Doghouse flight recorder reboot | ||
|
|
||
| ### Problem | ||
| Project had drifted into "GATOS" and "git-mind" concepts that strayed from the original PhiedBach vision and immediate needs. | ||
|
|
||
| ### Resolution | ||
| Rebooted the project to focus on **DOGHOUSE**, the PR flight recorder. Deleted legacy TUI/kernel, implemented hexagonal core, and restored the original lore. | ||
|
|
||
|
||
| ### What could we have done differently | ||
| Establish clearer scope boundaries earlier. The pivot from TUI to CLI to git-mind to Doghouse reflects successive scope corrections that could have been a single decision with a tighter product brief up front. | ||
|
|
||
| ## Incident: Doghouse Refinement (Ze Radar) | ||
|
|
||
| Timestamp: 2026-03-28 15:00:00 (estimated) | ||
|
|
||
| Task: DP-F-21 / Refinement & CodeRabbit feedback | ||
|
|
||
| ### Problem | ||
| The initial Doghouse cut lacked live monitoring, repro capabilities, and sensitivity to merge conflicts vs. secondary check failures. | ||
|
|
||
| ### Resolution | ||
| Implemented `doghouse watch`, `doghouse export`, and the Blocking Matrix. Hardened adapters with timeouts and deduplication. Addressed 54 threads of feedback. | ||
|
|
||
| ### What could we have done differently | ||
| Include watch/export in the initial cut. The design brief (flight-recorder-brief.md) already described these use cases but they were deferred to a second pass, creating churn when the first review surfaced them as gaps. | ||