Skip to content

Bob runner: strip ANSI escapes on live-terminal write to stop flicker#381

Merged
xukai92 merged 3 commits into
mainfrom
factory/run-1154149c
May 28, 2026
Merged

Bob runner: strip ANSI escapes on live-terminal write to stop flicker#381
xukai92 merged 3 commits into
mainfrom
factory/run-1154149c

Conversation

@colehurwitz
Copy link
Copy Markdown
Collaborator

Closes #379

Changes

Bob Shell is a TUI that emits cursor/clear/alt-screen/spinner escape sequences even in headless -p mode. tee_stream wrote those raw to the parent terminal, so every Bob redraw repainted the parent screen — the rapid "epileptic" flicker reported in #379. The claude and codex runners stream line-oriented output and are unaffected.

The fix sanitizes only the bytes written to the live terminal, gated behind a sanitize: bool = False flag so claude/codex stay byte-identical and the captured buffer the CEO consumes stays fully raw.

  • factory/runners/_stream.py: add module-level strip_ansi() + robust multi-branch byte regex covering CSI (ECMA-48 [0-?] params — colon-delimited truecolor & DEC private modes like \x1b[?1049h/\x1b[?25l), OSC (BEL/ST terminator), Fe/C1, and Fp (DECSC/DECRC/keypad). Preserves \r/\n and UTF-8 multibyte (8-bit C1 ST \x9C deliberately not matched). Thread sanitize through stream_subprocessboth tee_stream calls (stdout + stderr). tee_stream sanitizes only dest writes (buffer always raw) and skips empty-after-strip lines so redraw-only output doesn't flood the terminal with bare [bob:role] prefixes.
  • factory/runners/bob.py: pass sanitize=True at the Bob call site only. claude.py/codex.py untouched.
  • Tests (tests/test_runners.py, tests/test_codex_runner.py): ~12 tests mirroring TestStreamingOutputstrip_ansi units, sanitize-keeps-buffer-raw, prefix/redraw-skip, no-sanitize byte-identical, stream_subprocess threads to both streams, and runner wiring (bob sanitizes; claude/codex do not).

Out of scope (per #379)

  • Captured buffer / saved review file stay raw (the CEO consumes them).
  • cli.py's own colored output/spinner escapes are untouched.

Verification

  • pytest tests/test_runners.py tests/test_codex_runner.py — 86 passed
  • Smoke suite (test_models, test_guards, test_cli) — 298 passed
  • ruff check and mypy factory/runners/ — clean

🤖 Generated with Claude Code

Bob Shell is a TUI that emits cursor/clear/alt-screen/spinner escapes even
in headless -p mode. tee_stream wrote them raw to the parent terminal, so
every Bob redraw repainted the parent screen ("epileptic" flicker). claude
and codex run line-oriented and are unaffected.

Fix sanitizes only the bytes written to the live terminal, gated by a
default-False flag so claude/codex are byte-identical and the captured
buffer the CEO consumes stays fully raw.

- _stream.py: add strip_ansi() + robust multi-branch byte regex (CSI with
  ECMA-48 [0-?] params, OSC with BEL/ST terminator, Fe/C1, Fp); preserves
  \r/\n and UTF-8 multibyte. Thread sanitize: bool = False through
  stream_subprocess -> both tee_stream calls. tee_stream sanitizes only
  dest writes (buffer always raw) and skips empty-after-strip lines so
  redraw-only output doesn't flood the terminal with bare prefixes.
- bob.py: pass sanitize=True at the Bob call site only.
- tests: ~12 tests covering strip_ansi units, sanitize-keeps-buffer-raw,
  prefix/redraw-skip, no-sanitize byte-identical, and runner wiring
  (bob sanitizes; claude/codex do not).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented May 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 87.32%. Comparing base (8097461) to head (3d54c37).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #381   +/-   ##
=======================================
  Coverage   87.31%   87.32%           
=======================================
  Files          61       61           
  Lines        9339     9346    +7     
=======================================
+ Hits         8154     8161    +7     
  Misses       1185     1185           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

colehurwitz and others added 2 commits May 25, 2026 22:17
The strip_ansi regex only matched the 2-byte introducer of DCS (ESC P),
SOS (ESC X), PM (ESC ^), and APC (ESC _) via the Fe/C1 branch, leaking
their payloads as visible text. Add a dedicated string-introducer branch
(mirroring OSC) before the Fe/C1 branch so the full payload through the
BEL/ST terminator is consumed. Keep \x9C unmatched (UTF-8 safety).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Refines the redraw-skip guard in tee_stream so it only drops lines that
were redraw-only — i.e. lines that actually contained an escape sequence
stripped to empty — rather than ANY line empty after \r/\n stripping. A
genuine blank line (plain b'\n') is now preserved because out == line.

Also documents the known stateless/line-based limitation of strip_ansi:
unterminated or boundary-split escape sequences may leak their payload.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@colehurwitz colehurwitz marked this pull request as ready for review May 26, 2026 02:28
@colehurwitz
Copy link
Copy Markdown
Collaborator Author

Independent review: LGTM — the fix is correct, well-scoped, and the regex holds up under empirical testing (CSI/OSC/DCS-SOS-PM-APC/Fe/Fp branches, branch ordering, ReDoS, UTF-8 preservation all verified). Local run: pytest tests/test_runners.py tests/test_codex_runner.py88 passed.

Strengths

  • Tight scope. claude.py and codex.py call stream_subprocess(...) without sanitize, so they default to False and stay byte-identical. Only the Bob call site opts in. Verified by reading all three runners plus the two negative wiring tests.
  • Raw-buffer invariant is solid. buffer.append(line) happens before any sanitization and is never touched again; strip_ansi only ever feeds dest. The dedicated test_tee_stream_sanitize_strips_dest_keeps_buffer_raw test pins this. The CEO consumes the raw buffer, so behavior is preserved.
  • Branch ordering is correct and intentional. The DCS/SOS/PM/APC branch ([PX^_]...) precedes the Fe branch ([@-Z\\-_]), which is necessary because P/X/^/_ fall inside the Fe char class — without the earlier branch, Fe would greedily match just the 2-byte introducer and leak the payload. The inline comment calls this out. Confirmed empirically: \x1bP1$r0m\x1b\\afterafter.
  • UTF-8 safety. The deliberate omission of 8-bit C1 ST (\x9C) avoids clipping multibyte continuation bytes. strip_ansi("café — 日本語".encode()) round-trips unchanged.
  • Redraw-skip logic is precise. out != line and not out.strip(b"\r\n") drops lines that became empty because escapes were stripped (incl. spinner frames like \x1b[2K\r\r → dropped) while preserving genuine blank lines (\nout == line, not dropped) and progress lines with text. All three cases are covered by tests.
  • No ReDoS. Every alternative is linear with no nested quantifiers; negated char classes ([^\x07\x1B]*) bound the string branches. Stress tests (200k-byte unterminated OSC, 100k CSI params, 100k bare ESC) all complete in <2ms.
  • Honest, accurate documentation. The module comment correctly documents the stateless/line-oriented limitation (unterminated or readline-split sequences may leak) rather than overclaiming. No secrets handling concerns — sanitization is purely subtractive on display bytes.

Minor observations (non-blocking)

  • Known limitation — split/unterminated sequences (_stream.py:25-30, severity: low/informational). Because stripping is per-readline() chunk, a sequence split across a line boundary, or an unterminated OSC/DCS, leaks its payload to the terminal (verified: \x1b]0;title-no-term0;title-no-term; split CSI hello\x1b[ + 2Jworld leaks 2Jworld). This is correctly documented as out of scope and is low-probability for Bob, whose escapes normally arrive intact per line. Worth keeping in mind if a future TUI emits chunkier output. No change requested.
  • RIS (\x1bc, full reset) is not stripped (severity: low). \x1bc (ESC c) isn't in any branch — c (0x63) is above the Fe range (@-Z, \-_). Bob is unlikely to emit a full terminal reset in headless mode, and adding it risks little, but if reset-flicker ever shows up this is the gap. Informational only.

Verdict

The implementation matches the issue's proposed fix, the invariants are enforced and tested, and the blast radius is contained to the Bob runner. Test coverage is thorough — units for each escape class, the buffer-raw invariant, redraw-skip vs. genuine-blank distinction, byte-identical no-sanitize path, threading to both stdout/stderr, and positive/negative runner wiring. No blocking issues. Ship it.

— Independent reviewer (fresh-context subagent)

@colehurwitz
Copy link
Copy Markdown
Collaborator Author

colehurwitz commented May 26, 2026

@akashgit made some bob fixes. should be ready to merge.

@colehurwitz
Copy link
Copy Markdown
Collaborator Author

@akashgit can you take a look at this. want to merge and test bob more

Copy link
Copy Markdown
Collaborator

@xukai92 xukai92 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM — narrowly scoped sanitization on the live-terminal write path only; raw bytes preserved in the captured buffer so downstream parsing is unaffected. Regex is compiled once at module level, terminator-aware, linear-time, no catastrophic backtracking, and gated behind sanitize: bool = False so Claude/Codex stay byte-identical. Test coverage spans the regex unit cases, tee_stream behavior, and per-runner wiring.

Approving.

@xukai92 xukai92 merged commit d0a7504 into main May 28, 2026
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bob runner: TUI escape sequences cause terminal to flash/flicker during streaming

2 participants