Skip to content

Cache View on ConversationState, update incrementally#3324

Closed
csmith49 wants to merge 9 commits into
mainfrom
incremental-view-state
Closed

Cache View on ConversationState, update incrementally#3324
csmith49 wants to merge 9 commits into
mainfrom
incremental-view-state

Conversation

@csmith49
Copy link
Copy Markdown
Collaborator

@csmith49 csmith49 commented May 20, 2026

Refs: #3053

Summary

Lays the groundwork for the incremental View proposed in #3053. ConversationState now owns a cached View that is updated in lockstep with the event log via a new append_event write path. The expensive View.from_events rebuild (which runs enforce_properties) is reserved for three explicit entry points: cold load, fork, and error recovery.

This PR is intentionally scoped to plumbing onlyprepare_llm_messages is untouched and Agent.step still calls View.from_events(state.events) on every iteration. A tiny follow-up will swap that single call site to state.view.events to harvest the perf win, but keeping it out of this PR means there is zero behavioral change for existing callers and the diff stays easy to review.

Changes

openhands-sdk/openhands/sdk/conversation/state.py

  • New _view: View = PrivateAttr(default_factory=View) — derived, not serialized.
  • New view read-only property.
  • New append_event(event) — the only sanctioned write path; appends to _events and incrementally updates _view. O(1) for LLMConvertibleEvents, O(view) only when applying a Condensation. Does not run enforce_properties.
  • New rebuild_view() — full View.from_events re-derivation; invoked only on cold load, fork, and error recovery.
  • ConversationState.create() resume path calls rebuild_view() after attaching the EventLog.

openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py

  • _default_callback calls self._state.append_event(e) instead of self._state.events.append(e), so the cached view stays current on every step.
  • Fork path uses append_event per copied event, then a single rebuild_view() at the end (same posture as cold load).

openhands-sdk/openhands/sdk/context/condenser/base.py

  • CondenserBase.condense() docstring now explicitly says implementations must treat the input view as read-only, since it may now be a cached projection owned by ConversationState. Existing condensers in this repo already honor this — none mutate the input view in place — but the contract is now documented for future implementers.

openhands-sdk/openhands/sdk/agent/agent.py

  • Agent.step()'s LLMMalformedConversationHistoryError handler calls state.rebuild_view() before emitting CondensationRequest, so the condenser operates on a freshly enforced view if the incremental one is the source of the malformed history.

Where rebuild_view() is wired

Three explicit entry points, matching the "only run enforce when..." list in #3053:

  1. Cold loadConversationState.create() resume path.
  2. ForkLocalConversation.fork(...) after deep-copying events.
  3. Error recoveryAgent.step() malformed-history handler.

Normal agent steps stay on the O(1) incremental path with zero enforce_properties invocations.

Tests

  • tests/sdk/conversation/test_state_view_cache.py (new, 7 tests):
    • Fresh state → empty view.
    • append_event updates both the log and the view.
    • Parity with View.from_events(state.events) after many appends.
    • Condensation events are applied incrementally.
    • Hot-path invariant: append_event calls enforce_properties 0 times (verified by monkey-patch counter).
    • rebuild_view does run enforce_properties.
    • rebuild_view replaces the cached instance with one derived from the log.
  • tests/sdk/agent/test_agent_context_window_condensation.py (one new test):
    • test_agent_rebuilds_view_on_malformed_history_recovery — asserts rebuild_view is called exactly once before the condensation retry on the malformed-history path.

Verification run locally

  • uv run pre-commit run on all changed files: clean.
  • tests/sdk/agent/ + tests/sdk/conversation/ + tests/sdk/context/: 1228 passed, 1 deselected (pre-existing unrelated test_acp_agent failure on main).

Backward compatibility

  • All public APIs preserved: View, View.from_events, View.append_event, View.enforce_properties, CondenserBase.condense(view, ...), ConversationState.events.
  • New additive surface only: ConversationState.view, ConversationState.append_event, ConversationState.rebuild_view.
  • No behavior change for Agent.step: it still calls prepare_llm_messages(state.events, ...) as today. The cached view is maintained but unused on the hot path; consumers can opt in later.

Follow-ups (separate PRs)

  1. Swap Agent.step to read from state.view.events, harvesting the perf win this PR sets up.
  2. Optional debug-mode OPENHANDS_VIEW_VERIFY parity assertion that re-runs enforce_properties on a copy after each step and logs any divergence without modifying the view.

This PR was created by an AI agent (OpenHands) on behalf of the user.

@csmith49 can click here to continue refining the PR


Agent Server images for this PR

GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant Architectures Base Image Docs / Tags
java amd64, arm64 eclipse-temurin:17-jdk Link
python amd64, arm64 nikolaik/python-nodejs:python3.13-nodejs22-slim Link
golang amd64, arm64 golang:1.21-bookworm Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:1e1d4f4-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-1e1d4f4-python \
  ghcr.io/openhands/agent-server:1e1d4f4-python

All tags pushed for this build

ghcr.io/openhands/agent-server:1e1d4f4-golang-amd64
ghcr.io/openhands/agent-server:1e1d4f4837b76bac9a9a1a77997c49b13ff0fdd0-golang-amd64
ghcr.io/openhands/agent-server:incremental-view-state-golang-amd64
ghcr.io/openhands/agent-server:1e1d4f4-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:1e1d4f4-golang-arm64
ghcr.io/openhands/agent-server:1e1d4f4837b76bac9a9a1a77997c49b13ff0fdd0-golang-arm64
ghcr.io/openhands/agent-server:incremental-view-state-golang-arm64
ghcr.io/openhands/agent-server:1e1d4f4-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:1e1d4f4-java-amd64
ghcr.io/openhands/agent-server:1e1d4f4837b76bac9a9a1a77997c49b13ff0fdd0-java-amd64
ghcr.io/openhands/agent-server:incremental-view-state-java-amd64
ghcr.io/openhands/agent-server:1e1d4f4-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:1e1d4f4-java-arm64
ghcr.io/openhands/agent-server:1e1d4f4837b76bac9a9a1a77997c49b13ff0fdd0-java-arm64
ghcr.io/openhands/agent-server:incremental-view-state-java-arm64
ghcr.io/openhands/agent-server:1e1d4f4-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:1e1d4f4-python-amd64
ghcr.io/openhands/agent-server:1e1d4f4837b76bac9a9a1a77997c49b13ff0fdd0-python-amd64
ghcr.io/openhands/agent-server:incremental-view-state-python-amd64
ghcr.io/openhands/agent-server:1e1d4f4-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:1e1d4f4-python-arm64
ghcr.io/openhands/agent-server:1e1d4f4837b76bac9a9a1a77997c49b13ff0fdd0-python-arm64
ghcr.io/openhands/agent-server:incremental-view-state-python-arm64
ghcr.io/openhands/agent-server:1e1d4f4-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:1e1d4f4-golang
ghcr.io/openhands/agent-server:1e1d4f4837b76bac9a9a1a77997c49b13ff0fdd0-golang
ghcr.io/openhands/agent-server:incremental-view-state-golang
ghcr.io/openhands/agent-server:1e1d4f4-golang_tag_1.21-bookworm
ghcr.io/openhands/agent-server:1e1d4f4-java
ghcr.io/openhands/agent-server:1e1d4f4837b76bac9a9a1a77997c49b13ff0fdd0-java
ghcr.io/openhands/agent-server:incremental-view-state-java
ghcr.io/openhands/agent-server:1e1d4f4-eclipse-temurin_tag_17-jdk
ghcr.io/openhands/agent-server:1e1d4f4-python
ghcr.io/openhands/agent-server:1e1d4f4837b76bac9a9a1a77997c49b13ff0fdd0-python
ghcr.io/openhands/agent-server:incremental-view-state-python
ghcr.io/openhands/agent-server:1e1d4f4-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim

About Multi-Architecture Support

  • Each variant tag (e.g., 1e1d4f4-python) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., 1e1d4f4-python-amd64) are also available if needed

Introduces `ConversationState.view` as a derived-but-cached projection of
the event log, updated in lockstep with appends via the new
`ConversationState.append_event` write path. The view is fully re-derived
(running `View.enforce_properties`) only via
`ConversationState.rebuild_view`, which is invoked on cold load (resuming
from persistence) and after fork event-copy. Hot-path agent steps no
longer need to pay the `View.from_events` cost on every iteration once
call sites are migrated in a follow-up PR.

The default callback in `LocalConversation` and the fork event-copy now
go through `state.append_event`, keeping the cached view in sync without
introducing any behavioral change for existing callers (who read events
via `state.events`).

Also documents on `CondenserBase.condense` that implementations must
treat the passed view as read-only, since it may now be a cached
projection owned by `ConversationState`.

Refs: #3053

Co-authored-by: openhands <openhands@all-hands.dev>
When the LLM rejects the current message history as structurally malformed
(broken tool_use/tool_result pairing, etc.), the incremental View we just
sent may itself be the source of the problem — for example, a
manipulation-indices bug that let a property slip past the inductive
maintenance path. Re-derive the cached view from the event log with full
`enforce_properties` before triggering the condensation retry, so the
condenser operates on a clean, enforced view.

This is one of the two explicit `rebuild_view` entry points described in
the original design (the other being cold-load on `ConversationState.create`,
which already exists in the previous commit on this branch).

Refs: #3053

Co-authored-by: openhands <openhands@all-hands.dev>
@github-actions github-actions Bot added the release-note-required PR requires explicit release-note coverage for behavioral or default changes label May 20, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 20, 2026

Python API breakage checks — ✅ PASSED

Result:PASSED

Behavioral default changes detected

These public Field(default=...) changes were auto-marked with the release-note-required label:

  • openhands.sdk.llm.llm.LLM.model: 'claude-sonnet-4-20250514''gpt-5.5'

Action log

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 20, 2026

REST API breakage checks (OpenAPI) — ✅ PASSED

Result:PASSED

Action log

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 20, 2026

Coverage

Coverage Report •
FileStmtsMissCoverMissing
openhands-sdk/openhands/sdk/agent
   agent.py3653989%101, 215, 340, 344, 612–613, 620–621, 710, 714–715, 720–722, 724, 734–735, 752–753, 760–761, 785, 792, 796, 800, 803–804, 806–807, 821–822, 1109–1110, 1112, 1140, 1148–1149, 1183, 1190
openhands-sdk/openhands/sdk/context/condenser
   base.py692268%78, 192–193, 211–214, 216–217, 219–220, 222–224, 227–230, 232, 234, 243, 254
openhands-sdk/openhands/sdk/conversation
   event_store.py1611789%127, 134, 138–139, 184, 188, 194, 196–199, 219–222, 265, 278
   state.py229597%480, 545–547, 688
openhands-sdk/openhands/sdk/conversation/impl
   local_conversation.py5394591%311, 316, 471, 517, 554, 570, 635, 855–856, 859, 975, 986–989, 996–997, 1000, 1006–1007, 1010, 1016, 1031, 1034, 1038–1039, 1043–1045, 1052, 1138, 1143, 1259, 1262, 1264, 1268–1269, 1280–1281, 1306, 1500, 1504, 1574, 1581–1582
TOTAL27726821570% 

@csmith49 csmith49 marked this pull request as ready for review May 20, 2026 14:31
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

Functional QA verified the cached ConversationState.view plumbing through real SDK usage; the PR achieves its stated plumbing goal with no QA issues found.

Does this PR achieve its stated goal?

Yes. The goal was to add and maintain a cached View on ConversationState, update it incrementally through the sanctioned write path, and rebuild it at explicit recovery/load/fork entry points without changing Agent.step behavior. I exercised the SDK as a user would: sending messages through Conversation, appending events/condensations through ConversationState, forking conversations, and resuming a persisted conversation. In each case the cached view stayed in sync with the event log, hot-path appends did not call enforce_properties, and explicit rebuild/cold-load/fork paths produced views matching the full event log.

Phase Result
Environment Setup make build completed successfully and created the uv environment.
CI Status 🟡 At time checked: 18 successful, 7 pending, 1 cancelled, 1 skipped; no failing checks reported.
Functional Verification ✅ Real SDK scripts verified conversation append, condensation, rebuild, fork, and persistence resume behavior.
Functional Verification

Test 1: Baseline on main versus PR branch API/user path

Step 1 — Establish baseline without the PR:
Ran git switch --detach origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache.py using a script that creates a real Conversation, sends a user message, and probes the state APIs:

api.has_view=False
api.has_append_event=False
api.has_rebuild_view=False
conversation.events_after_send=2
conversation.cached_view_available=False
append.skipped=no_append_event_api
fork.skipped=no_view_api

This shows the base branch can record conversation events, but it has no cached view API or incremental append/rebuild API to exercise.

Step 2 — Apply the PR's changes:
Checked out incremental-view-state at 566db5dc43d426a95c12289fdb62f989ff9b4848.

Step 3 — Re-run with the PR in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache.py:

api.has_view=True
api.has_append_event=True
api.has_rebuild_view=True
conversation.initial_events=0
conversation.initial_view=0
conversation.events_after_send=2
conversation.view_after_send=2
conversation.ids_match=True
conversation.last_view_source=user

This confirms the new surface exists and the real Conversation.send_message("hello from qa") path updates both the event log and cached view in lockstep.

Test 2: Incremental append, condensation, and explicit rebuild behavior

Step 1 — Baseline without the PR:
The same base-branch script reported append.skipped=no_append_event_api, so there was no incremental append/rebuild path to use.

Step 2 — Apply the PR's changes:
Used the PR branch and appended three real MessageEvents plus a real Condensation through state.append_event(...), while instrumenting View.enforce_properties to count hot-path calls.

Step 3 — Re-run with the PR in place:
Output from OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache.py:

append.enforce_calls_after_messages=0
append.event_log_len=4
append.view_len_after_condensation=1
append.remaining_is_middle_message=True
append.enforce_calls_after_condensation=0
rebuild.enforce_calls_after_rebuild_and_compare=2
rebuild.replaced_cached_instance=True
rebuild.matches_full_from_events=True

This confirms normal incremental appends/condensation do not run enforcement, condensation is reflected in the view, and rebuild_view() replaces the cache with a view equivalent to View.from_events(state.events).

Test 3: Fork and cold-load/reload entry points

Fork path:
The PR-branch script used a real conversation, sent a message, and called convo.fork():

fork.source_events=2
fork.events=2
fork.view=2
fork.ids_match_source=True
fork.view_matches_events=True

This confirms the forked conversation gets a cached view matching its copied event log.

Cold-load / persistence resume path:
Ran a real persisted conversation, resumed it by conversation_id, and compared event IDs to cached view IDs:

resume.before_events=2
resume.before_view=2
resume.after_events=2
resume.after_view=2
resume.ids_preserved=True
resume.view_matches_events=True

This confirms the cold-load resume path rebuilds/initializes state.view from persisted events.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Taste Rating: Acceptable — good direction, but cache ownership is not airtight yet.

I found three correctness issues around keeping the derived View synchronized with every event write. Because this touches conversation/condenser plumbing, I’m leaving a COMMENT for maintainer follow-up rather than approval.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM
    Derived state caching in core conversation flow can silently diverge under existing write/concurrency paths; no dependency or security changes.

VERDICT:Needs follow-up before merge.

KEY INSIGHT: The cache needs a single enforced writer, not just a documented preferred writer.

This review was generated by an AI agent (OpenHands) on behalf of the user.


Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:

  1. Add a .agents/skills/custom-codereview-guide.md file to your branch (or edit it if one already exists) with the /codereview trigger and the context the reviewer is missing (e.g., "Security concerns about X do not apply here because Y"). See the customization docs for the required frontmatter format.
  2. Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
  3. When your PR is merged, the guideline file goes through normal code review by repository maintainers.

Resolve with AI? Install the iterate skill in your agent and run /iterate to automatically drive this PR through CI, review, and QA until it's merge-ready.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26169383121

Comment thread openhands-sdk/openhands/sdk/conversation/state.py Outdated
Comment thread openhands-sdk/openhands/sdk/conversation/state.py
Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py
csmith49 and others added 2 commits May 20, 2026 12:21
…ondense() race

Thread 1 – Guard against direct EventLog.append bypassing the view cache:
  Add _on_append callback to EventLog, invoked after every successful append
  with (event, synced_count). ConversationState._wire_view_sync() wires this
  callback so that both state.append_event() and direct state.events.append()
  keep the cached _view in sync. append_event() no longer calls
  _view.append_event() directly — the callback handles it uniformly.

Thread 2 – Handle EventLog syncing extra events from disk:
  EventLog.append() now tracks how many events were synced from disk before
  writing the new event and passes this count to _on_append. When synced_count
  > 0, the callback performs a full View.from_events rebuild instead of an
  incremental update, so disk-synced events are not missed by the view.

Thread 3 – Fix condense() race condition:
  Move _on_event(condensation_request) inside the `with self._state:` block
  in condense(), so the view mutation cannot race with a concurrent run().

Tests:
  - test_direct_eventlog_append_also_updates_view: verifies the on_append
    callback keeps the view in sync for direct EventLog.append calls.
  - test_view_rebuilds_when_eventlog_syncs_extra_events: simulates another
    writer and verifies the full rebuild path.

Co-authored-by: openhands <openhands@all-hands.dev>
@csmith49 csmith49 requested a review from all-hands-bot May 20, 2026 18:28
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The three issues raised in the previous review have all been addressed and resolved — the _on_append callback on EventLog, the synced_count tracking for disk-synced events, and the condense() race are all fixed. The plumbing is clean: the single _sync closure in _wire_view_sync handles both the incremental O(1) path and the full-rebuild fallback for disk-sync scenarios, and the test suite covers the hot-path invariant (zero enforce_properties calls), parity with View.from_events, condensation application, fork, persistence resume, and the direct-eventlog-append bypass.

Two observations below worth considering before merge.

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.

Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py Outdated
Comment thread openhands-sdk/openhands/sdk/conversation/event_store.py Outdated
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

Verified the new ConversationState cached-view plumbing by running SDK conversation flows; the cache stayed aligned with the event log across normal appends, condensation, cold reload, fork, stale log sync, and malformed-history recovery.

Does this PR achieve its stated goal?

Yes. The PR set out to add incremental cached View plumbing with full rebuilds reserved for cold load, fork, and error recovery while preserving existing event-log behavior. The base branch showed the existing send_message path produced the same event-log behavior but had no state.view / append_event API; on the PR branch, the same path kept state.view in sync. Additional runtime checks showed hot-path appends made 0 enforce_properties calls, explicit rebuild_view() did run enforcement, reload/fork views matched logs, stale event-log sync rebuilt correctly, and malformed-history recovery rebuilt once before emitting CondensationRequest.

Phase Result
Environment Setup make build completed; SDK imports and scripts ran via uv run python.
CI Status ⏳ Most checks green; only image build jobs plus qa-changes/pr-review were still in progress when checked.
Functional Verification ✅ SDK scenarios exercised the changed behavior directly; no functional issues found.
Functional Verification

Test 1: Existing Conversation.send_message behavior plus new cached view

Step 1 — Establish baseline without the PR:
Ran git checkout origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_view_scenario.py:

initial_event_count=0
has_state_view=False
has_append_event=False
after_send_event_count=2
last_event_type=MessageEvent
last_event_source=user
view_event_count=UNAVAILABLE

This confirms the previous SDK path appended the user message to the event log, but no cached view API existed.

Step 2 — Apply the PR's changes:
Checked out incremental-view-state at 65f3f9976e987bb06ae7f788ed8c4d8074495d1e.

Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_view_scenario.py:

initial_event_count=0
has_state_view=True
has_append_event=True
after_send_event_count=2
last_event_type=MessageEvent
last_event_source=user
view_event_count=2
view_last_event_type=MessageEvent
view_matches_log=True

This shows existing event-log behavior remained intact and the new cached view tracked the same events through the real SDK send_message entry point.

Test 2: Hot-path appends avoid full enforcement; explicit rebuild enforces

Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_hot_path_scenario.py:

event_count_after_sends=3
view_count_after_sends=3
enforce_calls_after_sends=0
enforce_calls_after_rebuild=1
rebuild_replaced_view=True
view_matches_log_after_rebuild=True

This confirms the PR's core performance claim: normal SDK appends maintained the cached view without running enforce_properties, while rebuild_view() did run enforcement and produced a fresh cache matching the log.

Test 3: Condensation updates the cached view but remains persisted in the log

Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_condensation_scenario.py:

view_count_before_condensation=3
log_count_after_condensation=4
view_count_after_condensation=2
forgotten_event_removed_from_view=True
condensation_persisted_in_log=True

This shows a Condensation event is retained in the event log while the cached view reflects the forgotten event removal.

Test 4: Cold reload and fork rebuild the cached view from persisted events

Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_persist_fork_scenario.py:

original_event_count=2
original_view_count=2
original_view_matches_log=True
reloaded_event_count=2
reloaded_view_count=2
reloaded_view_matches_log=True
fork_event_count=2
fork_view_count=2
fork_view_matches_log=True
fork_matches_source=True

This verifies the documented cold-load and fork entry points: after resume and fork, the cached view matched the persisted/copied event log.

Test 5: A stale live handle resyncs when another handle appended events

Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_two_handle_sync_scenario.py:

after_conv1_initial_log=2
after_conv1_initial_view=2
after_conv2_append_log=3
after_conv2_append_view=3
after_conv1_second_append_log=4
after_conv1_second_append_view=4
conv1_view_matches_log=True

This exercises the event-log sync path a user could hit with two live conversation handles sharing persisted storage; the stale handle rebuilt enough state to keep the cached view aligned.

Test 6: Malformed-history recovery rebuilds before requesting condensation

Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery_scenario.py:

rebuild_calls=1
emitted_condensation_request=True
emitted_event_types=['CondensationRequest']

This confirms the error-recovery entry point calls rebuild_view() exactly once before emitting the condensation retry request.

Issues Found

None.

This review was generated by an AI agent (OpenHands) on behalf of the user.

Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Taste Rating: Acceptable — good direction, but cache synchronization still has lock-order holes.

I found two important concurrency issues that can let ConversationState.view diverge from the event log under existing write paths.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM
    Derived mutable cache in core conversation state can silently affect future message preparation/condenser behavior if synchronization is not airtight.

VERDICT: ❌ Needs follow-up before approval.

This review was generated by an AI agent (OpenHands) on behalf of the user.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26181992831

Comment thread openhands-sdk/openhands/sdk/conversation/event_store.py Outdated
Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py
…optimization

Thread 6 – Move _on_append inside EventLog file lock:
  The callback now fires while the file lock is still held, so concurrent
  appends execute their callbacks in the same order as their writes. This
  prevents the view from diverging from the persisted event order.

Thread 7 – Move run() error emission under state lock:
  The except handler in run() now wraps both the execution_status
  mutation and the ConversationErrorEvent emission in `with self._state:`,
  so the view update in _on_event cannot race with a concurrent
  send_message()/run().

Thread 4 – Suppress callback during fork bulk copy:
  The fork loop now suppresses the _on_append callback during the n-event
  bulk copy (set_on_append(None)), since rebuild_view() on the next line
  discards the incremental work. _wire_view_sync() is called after
  rebuild_view() to restore the callback.

Thread 5 – Thread-safety contract documentation:
  Added docstring notes to set_on_append() and _wire_view_sync() making
  the locking contract explicit: the callback is serialized by the
  EventLog file lock, but callers must additionally hold the
  ConversationState lock to prevent races with non-append readers.

Co-authored-by: openhands <openhands@all-hands.dev>
@csmith49 csmith49 requested a review from all-hands-bot May 20, 2026 19:46
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The three concurrency correctness issues from the previous rounds — event-log write bypass, disk-sync missed events, and the condense() / run() race — are all cleanly resolved in this version. The _on_append callback firing inside the file lock, synced_count-driven rebuild, and the state-lock expansions in condense() and the run() exception path are exactly the right fixes.

Two points worth addressing before merge, plus one small test gap:

This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.

Comment thread openhands-sdk/openhands/sdk/conversation/state.py Outdated
Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py Outdated
Comment thread tests/sdk/conversation/test_state_view_cache.py
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ QA Report: PASS WITH ISSUES

Functional QA confirms the PR’s cached ConversationState.view plumbing works through real SDK usage; CI is not fully green at review time.

Does this PR achieve its stated goal?

Yes. I exercised the SDK as a user would by creating conversations, sending messages, appending events, resuming persisted conversations, forking, and triggering malformed-history recovery. On the PR branch, state.view, append_event, and rebuild_view are present; the cached view stays synchronized with the event log, condensation is reflected in the view, cold resume/fork preserve the rebuilt view, hot-path appends did not call enforce_properties, and malformed-history recovery called rebuild_view() once before emitting CondensationRequest.

Phase Result
Environment Setup make build completed successfully; no test suite, linters, or pre-commit runs were executed.
CI Status ⚠️ Most checks pass, but Merge Multi-Arch Manifests (java) and unresolved-review-threads are failing; python/golang manifest merge checks are cancelled; QA/review workflows are still in progress.
Functional Verification ✅ SDK behavior matched the PR goal across baseline comparison, cached view operations, hot-path enforcement behavior, resume/fork, and recovery handling.
Functional Verification

Test 1: Baseline vs PR cached-view surface and conversation behavior

Step 1 — Establish baseline on origin/main:
Ran git checkout origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache_exercise.py:

{
  "after_resume": {"events_len": 2},
  "after_send_message": {"events_len": 2},
  "has_append_event": false,
  "has_rebuild_view": false,
  "has_view_property": false,
  "sdk_import": "ok"
}

This shows the existing SDK could create/resume a conversation and enqueue a real user message, but had no cached-view API surface.

Step 2 — Apply the PR changes:
Checked out incremental-view-state at 42f382a9c77bd99d8c6c72e8cd57fdecb438ab7a.

Step 3 — Re-run the same SDK exercise on the PR branch:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache_exercise.py:

{
  "has_append_event": true,
  "has_rebuild_view": true,
  "has_view_property": true,
  "after_send_message": {
    "events_len": 2,
    "view_len": 2,
    "event_ids_equal_view_ids": true
  },
  "after_manual_appends_and_condensation": {
    "events_len": 5,
    "view_len": 3,
    "manual_1_in_view": false,
    "manual_2_in_view": true
  },
  "after_resume": {
    "events_len": 5,
    "view_len": 3,
    "view_ids_equal_fresh_log_ids": true
  },
  "after_fork": {
    "fork_events_len": 5,
    "fork_view_len": 3
  }
}

This confirms the new cached-view surface exists, user send_message() keeps events and view aligned, direct append_event/state.events.append keep the cache updated, condensation removes the forgotten event from the view while leaving it in the log, and persisted resume/fork both produce a synchronized cached view.

Test 2: Hot-path append avoids full enforcement, explicit rebuild enforces

Step 1 — Establish baseline:
The baseline branch has no ConversationState.append_event/rebuild_view API to exercise, so this behavior is new in the PR.

Step 2 — Apply the PR changes:
Used incremental-view-state at 42f382a9c77bd99d8c6c72e8cd57fdecb438ab7a.

Step 3 — Exercise the new write path with instrumentation:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_hot_path_enforce.py:

{
  "counts_after_create": 0,
  "counts_after_appends": 0,
  "counts_after_rebuild": 1,
  "events_len": 5,
  "view_len_after_appends": 5,
  "view_len_after_rebuild": 5
}

This confirms five real SDK event appends updated the cached view without invoking View.enforce_properties, while an explicit rebuild_view() did invoke enforcement once.

Test 3: Malformed-history recovery rebuilds before condensation retry

Step 1 — Establish baseline on origin/main:
Ran git checkout origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery.py:

{
  "emitted_condensation_request": true,
  "emitted_event_types": ["CondensationRequest"],
  "events_len_after_recovery": 2,
  "has_rebuild_view": false,
  "rebuild_calls": 0,
  "view_len_after_recovery": null
}

This shows the existing recovery path emitted a condensation request, but there was no cached view or rebuild hook to refresh first.

Step 2 — Apply the PR changes:
Checked out incremental-view-state at 42f382a9c77bd99d8c6c72e8cd57fdecb438ab7a.

Step 3 — Re-run with the PR behavior:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery.py:

{
  "emitted_condensation_request": true,
  "emitted_event_types": ["CondensationRequest"],
  "events_len_after_recovery": 2,
  "has_rebuild_view": true,
  "rebuild_calls": 1,
  "view_len_after_recovery": 2
}

This confirms the malformed-history path now rebuilds the cached view exactly once before emitting the condensation retry request.

Issues Found

  • 🟡 CI status: Functional QA found no product behavior issues, but CI is not fully green: Merge Multi-Arch Manifests (java) and unresolved-review-threads are failing, with python/golang manifest merge checks cancelled and QA/review workflows still in progress.

Final verdict: PASS WITH ISSUES — the PR achieves its stated SDK behavior goal, but merge readiness still depends on resolving/triaging the non-green CI checks.

This QA review was created by an AI agent (OpenHands) on behalf of the user.

Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Taste Rating: Acceptable — the cache plumbing is close, but I found two remaining synchronization holes that can leave state.view divergent.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM — derived mutable cache in core conversation state needs airtight initialization and lock discipline before future hot-path use.

VERDICT: ❌ Needs follow-up before approval. This review was generated by an AI agent (OpenHands) on behalf of the user.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26185966291

Comment thread openhands-sdk/openhands/sdk/conversation/state.py
Comment thread openhands-sdk/openhands/sdk/conversation/state.py
csmith49 and others added 2 commits May 20, 2026 14:25
… fork safety

Thread A – Replace O(N) rebuild with incremental sync:
  The _sync callback now applies only the synced_count missed events
  incrementally rather than rebuilding the entire view via
  View.from_events.  Lock hold is now O(synced_count) instead of O(N).

Thread B – try/finally in fork bulk copy:
  rebuild_view() + _wire_view_sync() now run in a finally block so the
  _on_append callback is unconditionally restored even if a deep-copy
  raises during the bulk event copy.

Thread C – CondensationRequest flag test:
  Added test_condensation_request_sets_flag_incrementally to verify the
  incremental path sets unhandled_condensation_request without adding
  the event to the view (not LLMConvertible).

Thread D – Orphaned events in fresh-create path:
  If events/ has files but base_state.json is missing (crash / partial
  cleanup), the fresh-create path now calls rebuild_view() to populate
  the cached view from pre-existing events.  Corrupted orphans degrade
  gracefully to an empty view with a warning.  Added regression test.

Thread E – close() session-end hooks under state lock:
  run_session_end() in close() now runs inside `with self._state:` so
  hook-emitted events cannot race with concurrent run()/send_message().
  Updated _wire_view_sync docstring to list close() and run() error
  handler as call sites that satisfy the lock contract.

Co-authored-by: openhands <openhands@all-hands.dev>
# Conflicts:
#	openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py
@csmith49 csmith49 requested a review from all-hands-bot May 20, 2026 20:31
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

Verified the PR’s SDK plumbing by exercising cached ConversationState.view behavior through real Conversation usage, persistence resume, fork, direct event appends, condensation, and malformed-history recovery.

Does this PR achieve its stated goal?

Yes. The PR set out to add a cached ConversationState.view that stays synchronized with the event log, add sanctioned append/rebuild paths, and reserve full rebuilds for cold load/fork/error recovery while preserving existing conversation behavior. In real SDK scripts, main had no state.view/append_event; the PR branch exposed both, kept state.view synchronized after send_message, append_event, direct state.events.append, and Condensation, rebuilt correctly after persistence resume and fork, and rebuilt a deliberately corrupted cached view during malformed-history recovery before emitting CondensationRequest.

Phase Result
Environment Setup make build completed successfully; no test suite, linters, or pre-commit runs were executed.
CI Status 🟡 Queried via gh pr checks: 15 success, 9 in progress, 3 queued, 2 skipped at the time of QA.
Functional Verification ✅ SDK behavior matched the PR’s stated plumbing goals.
Functional Verification Evidence

Test 1: Cached view surface and incremental updates

Step 1 — Establish baseline on origin/main:
Ran git checkout origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache_behavior.py:

HEAD is now at 3996252c feat: add async LLM completion support across the SDK (#3284)
state_has_view=False
state_has_append_event=False
after_send_message_event_count=2
after_send_message_view_ids=<no state.view>

This confirms the base branch does not expose the new cached view or append API, so there is no cached projection to inspect after a real Conversation.send_message(...) operation.

Step 2 — Apply the PR changes:
Checked out incremental-view-state at 4410b99f7f115731cd213c00c41ae705e8238d70.

Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache_behavior.py:

state_has_view=True
state_has_append_event=True
after_send_message_event_count=2
after_send_message_view_ids=['1cb7f2a5-d979-4c95-b24a-bf18d9be62c8', 'ae6fa3ad-c263-4e67-888d-63d0d564a56f']
after_append_event_count=3
after_append_event_view_ids=['1cb7f2a5-d979-4c95-b24a-bf18d9be62c8', 'ae6fa3ad-c263-4e67-888d-63d0d564a56f', '9f8bd7b1-e9cf-4e0f-ab4e-c7ffa3339825']
after_direct_append_count=4
after_direct_append_view_ids=['1cb7f2a5-d979-4c95-b24a-bf18d9be62c8', 'ae6fa3ad-c263-4e67-888d-63d0d564a56f', '9f8bd7b1-e9cf-4e0f-ab4e-c7ffa3339825', '4f6c226c-94b7-4aa0-8c6a-1c943802acb7']
after_condensation_event_count=5
after_condensation_view_ids=['ae6fa3ad-c263-4e67-888d-63d0d564a56f', '9f8bd7b1-e9cf-4e0f-ab4e-c7ffa3339825', '4f6c226c-94b7-4aa0-8c6a-1c943802acb7']
rebuild_preserves_view_ids=True

This shows the PR adds the intended surface and keeps the cached view updated through the real conversation callback path, the new append_event path, direct EventLog.append, condensation application, and explicit rebuild.

Test 2: Cold-load resume and fork rebuild paths

Step 1 — Baseline:
As shown above, main has no state.view, so cold-load/fork cache rebuilding is not available before this PR.

Step 2 — Apply the PR changes:
Used the PR checkout at 4410b99f7f115731cd213c00c41ae705e8238d70.

Step 3 — Exercise persisted resume and fork with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_view_cache_lifecycle.py:

created_conversation_id=c3f62895-2d16-44a0-aa3d-780779f39728
original_event_count=2
original_view_matches_events=True
resumed_event_count=2
resumed_view_matches_original=True
resumed_view_matches_events=True
fork_event_count=2
fork_view_matches_source=True
fork_view_matches_events=True

This confirms the cached view is rebuilt correctly on persisted conversation resume and on fork, matching the underlying event log and source view.

Test 3: Malformed-history recovery rebuild path

Step 1 — Baseline:
On main, there is no cached state.view to rebuild during malformed-history recovery. The PR’s new behavior is the added rebuild before emitting a condensation retry request.

Step 2 — Apply the PR changes:
Used the PR checkout at 4410b99f7f115731cd213c00c41ae705e8238d70.

Step 3 — Exercise recovery with a request-handling condenser:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery.py:

view_corrupted_before_step=True
LLM raised malformed conversation history error, triggering condensation retry with condensed history: qa malformed history
condensation_request_emitted=True
view_rebuilt_after_malformed=True
rebuilt_event_count=2

This intentionally corrupted the cached view after a real send_message(...); when the LLM raised LLMMalformedConversationHistoryError, Agent.step(...) rebuilt the cached view from the event log and emitted CondensationRequest, matching the PR’s recovery goal.

Issues Found

None.

This QA review was created by an AI agent (OpenHands) on behalf of the user.

Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean, well-scoped plumbing PR. The incremental-update invariant (_sync callback math, lock ordering, enforce_properties reserved for three explicit entry points) is correct and thoroughly documented. Test coverage is solid. Two items below worth addressing before merge.


This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.

Comment thread openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py Outdated
Comment thread tests/sdk/conversation/test_state_view_cache.py Outdated
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Taste Rating: Acceptable — sync cache plumbing is coherent, but async malformed-history recovery missed the same repair path.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM — derived conversation-history cache touches condenser/error-recovery behavior and future eval-sensitive hot paths.

VERDICT: ❌ Needs follow-up before approval. One important async parity issue inline.

This review was generated by an AI agent (OpenHands) on behalf of the user.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26188223827

Comment thread openhands-sdk/openhands/sdk/agent/agent.py
… astep() parity

Thread A – Nested try/finally in fork bulk copy:
  If rebuild_view() raises, _wire_view_sync() was silently skipped
  because a single finally block aborts on its first exception.
  Nested try/finally ensures the callback is unconditionally restored.

Thread B – Orphan test exercises ConversationState.create():
  Replaced the manual mimicry of the orphan guard with a real
  ConversationState.create(persistence_dir=tmpdir) call against
  a temp directory containing orphaned event files on disk.
  Any regression in the production code path is now caught.

Thread C – rebuild_view() parity in astep():
  The malformed-history handler in astep() now calls
  state.rebuild_view() before emitting CondensationRequest,
  mirroring the sync step() path. Added async regression test
  with acompletion/aresponses overrides on the test LLM.

Co-authored-by: openhands <openhands@all-hands.dev>
@csmith49 csmith49 requested a review from all-hands-bot May 20, 2026 20:53
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ QA Report: PASS

The PR achieves its plumbing goal: ConversationState now exposes a cached view that stays synchronized through real SDK operations, while rebuild remains an explicit recovery/cold-load path.

Does this PR achieve its stated goal?

Yes. I exercised the SDK as a user would by creating conversations, sending messages, appending events, condensing history, resuming persisted conversations, and triggering malformed-history recovery with a deterministic LLM. Compared with origin/main, the new cached view API is present on the PR branch, updates through both public send_message() and event append paths, avoids enforce_properties on the hot append path, rebuilds explicitly, and is repopulated on resume.

Phase Result
Environment Setup make build completed successfully; no tests/lint/pre-commit run locally.
CI Status 🟡 Most checks are green; Review Thread Gate/unresolved-review-threads is failing and agent-server-tests/cross-tests are still queued, with QA/PR-review jobs in progress.
Functional Verification ✅ SDK scripts verified cached view sync, public send-message behavior, persistence resume, condensation handling, direct EventLog append sync, and malformed-history recovery.
Functional Verification

Test 1: Cached view API and incremental updates

Step 1 — Establish baseline on origin/main:
Ran git checkout --detach origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_state_view_feature.py:

baseline_head=3996252c
state_type=ConversationState
has_view_property=False
has_append_event=False
has_rebuild_view=False
cached_view_api_available=False
baseline_exit_status=2

This confirms the baseline has no ConversationState.view, append_event, or rebuild_view plumbing.

Step 2 — Apply PR changes:
Checked out 1e1d4f4837b76bac9a9a1a77997c49b13ff0fdd0.

Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_state_view_feature.py:

has_view_property=True
has_append_event=True
has_rebuild_view=True
after_append_log_len=3
after_append_view_len=3
hot_path_enforce_calls=0
after_request_unhandled=True
after_condensation_view_len=1
after_direct_append_log_len=6
after_direct_append_view_len=2
enforce_calls_before_rebuild=0
rebuild_replaced_instance=True
enforce_calls_after_rebuild=1
persisted_before_close_view_len=1
resumed_view_len=1
resumed_event_log_len=1
resumed_after_append_view_len=2
resumed_after_append_log_len=2
cached_view_api_available=True

This shows the PR delivers the intended incremental cache behavior: appends update both log and view, condensation is reflected in the view, direct state.events.append(...) also keeps the cache current, hot-path appends did not invoke enforce_properties, explicit rebuild_view() did invoke it and replaced the cached instance, and a persisted conversation resumed with its cached view populated.

Test 2: Public Conversation.send_message() path

Step 1 — Establish baseline on origin/main:
Ran git checkout --detach origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_send_message_view.py:

baseline_head=3996252c
has_state_view=False
events_after_send_message=2
send_message_baseline_exit_status=0

This confirms the existing public message path still records events, but had no cached view to inspect on main.

Step 2 — Apply PR changes:
Checked out 1e1d4f4837b76bac9a9a1a77997c49b13ff0fdd0.

Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_send_message_view.py:

has_state_view=True
events_after_send_message=2
view_after_send_message=2
view_last_role=user
view_last_text=QA hello through public send_message

This verifies the real user-facing conversation callback path now keeps state.view synchronized when a user sends a message, matching the PR’s claim that LocalConversation._default_callback uses the new append path.

Test 3: Malformed-history recovery rebuild path

Step 1 — Baseline context:
On origin/main, Test 1 shows there is no cached view or rebuild_view() method, so there is no recovery cache to rebuild.

Step 2 — Apply PR changes:
Checked out 1e1d4f4837b76bac9a9a1a77997c49b13ff0fdd0.

Step 3 — Exercise malformed-history recovery:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery.py, using a deterministic SDK LLM subclass that raises LLMMalformedConversationHistoryError and a condenser that handles condensation requests:

has_view_property=True
has_rebuild_view=True
emitted_condensation_request=True
view_rebuilt_on_malformed_recovery=True
view_len_after_recovery=2
recovery_status=0

This confirms the malformed-history path emits a CondensationRequest and the cached view instance is rebuilt before recovery continues.

Issues Found

None.

This review was created by an AI agent (OpenHands) on behalf of the user.

Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The synchronization work from prior rounds is complete and correct. The _on_append/_wire_view_sync design cleanly serializes view mutations inside the EventLog file lock; all three explicit rebuild entry points (cold load, fork, malformed-history recovery) are properly wired and tested, and the async astep malformed-history parity test is now present. All 15 previously raised threads are resolved.

Two small points worth considering before merge.


This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.

new_len = len(self._events)
start = new_len - synced_count - 1
for i in range(start, new_len - 1):
self._view.append_event(self._events[i])
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: self._events[i] is called here while the EventLog file lock is held (the _on_append callback fires inside with self._fs.lock(...)). This is safe today because EventLog._get_single_item reads from the FileStore directly without re-acquiring the lock. But the safety is an implicit structural assumption — a future maintainer who adds locking inside __getitem__ (e.g., for a thread-safe read variant) would silently introduce a deadlock.

A short inline comment would make this assumption explicit:

Suggested change
self._view.append_event(self._events[i])
# Safe inside the EventLog file lock: _get_single_item
# reads via FileStore directly, no re-entrant lock.
self._view.append_event(self._events[i])

Comment on lines +495 to +511
state._wire_view_sync()
# Guard against orphaned event files from a previous incomplete
# cleanup or crash: base_state.json is missing (fresh path) but
# events/ may already contain files. Without this rebuild the
# cached view would start empty while the log is non-empty, and
# future appends would never pick up the pre-existing events.
# If the orphaned files are corrupted we fall back to an empty
# view — they will still surface errors on explicit iteration.
if len(state._events) > 0:
try:
state.rebuild_view()
except Exception:
logger.warning(
"Failed to rebuild view from orphaned events for "
"conversation %s; starting with empty view",
state.id,
)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: The fresh-create path wires the callback before building from orphaned events (wire → rebuild), while the resume path above does the opposite (rebuild → wire). Both are correct — no concurrent access can occur during construction — but the inconsistency can mislead a future reader.

Moving _wire_view_sync() to after the orphan-rebuild block would match the resume path's "build then wire" discipline:

Suggested change
state._wire_view_sync()
# Guard against orphaned event files from a previous incomplete
# cleanup or crash: base_state.json is missing (fresh path) but
# events/ may already contain files. Without this rebuild the
# cached view would start empty while the log is non-empty, and
# future appends would never pick up the pre-existing events.
# If the orphaned files are corrupted we fall back to an empty
# view — they will still surface errors on explicit iteration.
if len(state._events) > 0:
try:
state.rebuild_view()
except Exception:
logger.warning(
"Failed to rebuild view from orphaned events for "
"conversation %s; starting with empty view",
state.id,
)
# Guard against orphaned event files from a previous incomplete
# cleanup or crash: base_state.json is missing (fresh path) but
# events/ may already contain files. Without this rebuild the
# cached view would start empty while the log is non-empty, and
# future appends would never pick up the pre-existing events.
# If the orphaned files are corrupted we fall back to an empty
# view — they will still surface errors on explicit iteration.
if len(state._events) > 0:
try:
state.rebuild_view()
except Exception:
logger.warning(
"Failed to rebuild view from orphaned events for "
"conversation %s; starting with empty view",
state.id,
)
state._wire_view_sync()

Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Taste Rating: Acceptable — the incremental view plumbing is close, but I found two remaining cache/ordering correctness issues.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM — derived conversation-history cache touches condenser/error-recovery paths and future eval-sensitive hot paths.

VERDICT: ❌ Needs follow-up before approval. This review was generated by an AI agent (OpenHands) on behalf of the user.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26189342817

disk_length = self._count_events_on_disk()
synced_count = 0
if disk_length > self._length:
synced_count = disk_length - self._length
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important: synced_count is based on _count_events_on_disk(), but that helper counts any event-*.json file while _sync_from_disk() only indexes contiguous files matching EVENT_NAME_RE. A stale/corrupt file or index gap can therefore report missed events that are not actually readable; _wire_view_sync() then indexes those missing entries after the new event has already been written, so append() raises and leaves the persisted log and cached view inconsistent. Please derive synced_count from the actual indexed delta after syncing (and avoid inflating _length from the raw file count), or pass the synced events/indices explicitly.


# Force the agent to take a single step to process the condensation request
# This will trigger the condenser if it handles condensation requests
with self._state:
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important: This lock does not make condense() wait for an in-flight arun() on the same event-loop thread. FIFOLock is reentrant by thread, and arun() holds it across await self.agent.astep(...), so another asyncio task can re-enter here, append a CondensationRequest, and run a synchronous agent.step() while the async step is still awaiting. That violates the method's wait guarantee and can order condensation events ahead of the in-flight step's response in both the event log and cached view. Please guard/coordinate with _arun_task before emitting the request (reject, interrupt/wait, or provide a task-aware async path).

@csmith49
Copy link
Copy Markdown
Collaborator Author

Superseded by #3335, which takes a different approach to the same problem.

The push-based callback design here (eagerly syncing the cached View via EventLog._on_append) kept surfacing synchronization issues across review rounds — lock ordering between the file lock and the view cache, synced_count divergence, FIFOLock reentrancy, callback wiring order, and write-path coupling through close() and condense().

#3335 replaces the callback mechanism with a pull-based watermark: the View is lazily brought up to date when state.view is read. This eliminates all EventLog changes, all lock-wrapping changes in close()/condense(), and leaves _default_callback untouched. Same public API surface (state.view, state.rebuild_view()), same three explicit rebuild entry points (cold load, fork, error recovery), 123 lines instead of 557.

#3328 (the follow-up that swaps prepare_llm_messages to consume the cached view) will be rebased onto the new branch.

This comment was posted by an AI agent (OpenHands) on behalf of @csmith49.

@csmith49 csmith49 closed this May 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

release-note-required PR requires explicit release-note coverage for behavioral or default changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants