Cache View on ConversationState, update incrementally#3324
Conversation
Introduces `ConversationState.view` as a derived-but-cached projection of the event log, updated in lockstep with appends via the new `ConversationState.append_event` write path. The view is fully re-derived (running `View.enforce_properties`) only via `ConversationState.rebuild_view`, which is invoked on cold load (resuming from persistence) and after fork event-copy. Hot-path agent steps no longer need to pay the `View.from_events` cost on every iteration once call sites are migrated in a follow-up PR. The default callback in `LocalConversation` and the fork event-copy now go through `state.append_event`, keeping the cached view in sync without introducing any behavioral change for existing callers (who read events via `state.events`). Also documents on `CondenserBase.condense` that implementations must treat the passed view as read-only, since it may now be a cached projection owned by `ConversationState`. Refs: #3053 Co-authored-by: openhands <openhands@all-hands.dev>
When the LLM rejects the current message history as structurally malformed (broken tool_use/tool_result pairing, etc.), the incremental View we just sent may itself be the source of the problem — for example, a manipulation-indices bug that let a property slip past the inductive maintenance path. Re-derive the cached view from the event log with full `enforce_properties` before triggering the condensation retry, so the condenser operates on a clean, enforced view. This is one of the two explicit `rebuild_view` entry points described in the original design (the other being cold-load on `ConversationState.create`, which already exists in the previous commit on this branch). Refs: #3053 Co-authored-by: openhands <openhands@all-hands.dev>
Python API breakage checks — ✅ PASSEDResult: ✅ PASSED Behavioral default changes detectedThese public
|
REST API breakage checks (OpenAPI) — ✅ PASSEDResult: ✅ PASSED |
Coverage Report •
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Functional QA verified the cached ConversationState.view plumbing through real SDK usage; the PR achieves its stated plumbing goal with no QA issues found.
Does this PR achieve its stated goal?
Yes. The goal was to add and maintain a cached View on ConversationState, update it incrementally through the sanctioned write path, and rebuild it at explicit recovery/load/fork entry points without changing Agent.step behavior. I exercised the SDK as a user would: sending messages through Conversation, appending events/condensations through ConversationState, forking conversations, and resuming a persisted conversation. In each case the cached view stayed in sync with the event log, hot-path appends did not call enforce_properties, and explicit rebuild/cold-load/fork paths produced views matching the full event log.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully and created the uv environment. |
| CI Status | 🟡 At time checked: 18 successful, 7 pending, 1 cancelled, 1 skipped; no failing checks reported. |
| Functional Verification | ✅ Real SDK scripts verified conversation append, condensation, rebuild, fork, and persistence resume behavior. |
Functional Verification
Test 1: Baseline on main versus PR branch API/user path
Step 1 — Establish baseline without the PR:
Ran git switch --detach origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache.py using a script that creates a real Conversation, sends a user message, and probes the state APIs:
api.has_view=False
api.has_append_event=False
api.has_rebuild_view=False
conversation.events_after_send=2
conversation.cached_view_available=False
append.skipped=no_append_event_api
fork.skipped=no_view_api
This shows the base branch can record conversation events, but it has no cached view API or incremental append/rebuild API to exercise.
Step 2 — Apply the PR's changes:
Checked out incremental-view-state at 566db5dc43d426a95c12289fdb62f989ff9b4848.
Step 3 — Re-run with the PR in place:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache.py:
api.has_view=True
api.has_append_event=True
api.has_rebuild_view=True
conversation.initial_events=0
conversation.initial_view=0
conversation.events_after_send=2
conversation.view_after_send=2
conversation.ids_match=True
conversation.last_view_source=user
This confirms the new surface exists and the real Conversation.send_message("hello from qa") path updates both the event log and cached view in lockstep.
Test 2: Incremental append, condensation, and explicit rebuild behavior
Step 1 — Baseline without the PR:
The same base-branch script reported append.skipped=no_append_event_api, so there was no incremental append/rebuild path to use.
Step 2 — Apply the PR's changes:
Used the PR branch and appended three real MessageEvents plus a real Condensation through state.append_event(...), while instrumenting View.enforce_properties to count hot-path calls.
Step 3 — Re-run with the PR in place:
Output from OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache.py:
append.enforce_calls_after_messages=0
append.event_log_len=4
append.view_len_after_condensation=1
append.remaining_is_middle_message=True
append.enforce_calls_after_condensation=0
rebuild.enforce_calls_after_rebuild_and_compare=2
rebuild.replaced_cached_instance=True
rebuild.matches_full_from_events=True
This confirms normal incremental appends/condensation do not run enforcement, condensation is reflected in the view, and rebuild_view() replaces the cache with a view equivalent to View.from_events(state.events).
Test 3: Fork and cold-load/reload entry points
Fork path:
The PR-branch script used a real conversation, sent a message, and called convo.fork():
fork.source_events=2
fork.events=2
fork.view=2
fork.ids_match_source=True
fork.view_matches_events=True
This confirms the forked conversation gets a cached view matching its copied event log.
Cold-load / persistence resume path:
Ran a real persisted conversation, resumed it by conversation_id, and compared event IDs to cached view IDs:
resume.before_events=2
resume.before_view=2
resume.after_events=2
resume.after_view=2
resume.ids_preserved=True
resume.view_matches_events=True
This confirms the cold-load resume path rebuilds/initializes state.view from persisted events.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Taste Rating: Acceptable — good direction, but cache ownership is not airtight yet.
I found three correctness issues around keeping the derived View synchronized with every event write. Because this touches conversation/condenser plumbing, I’m leaving a COMMENT for maintainer follow-up rather than approval.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟡 MEDIUM
Derived state caching in core conversation flow can silently diverge under existing write/concurrency paths; no dependency or security changes.
VERDICT: ❌ Needs follow-up before merge.
KEY INSIGHT: The cache needs a single enforced writer, not just a documented preferred writer.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:
- Add a
.agents/skills/custom-codereview-guide.mdfile to your branch (or edit it if one already exists) with the/codereviewtrigger and the context the reviewer is missing (e.g., "Security concerns about X do not apply here because Y"). See the customization docs for the required frontmatter format.- Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.
- When your PR is merged, the guideline file goes through normal code review by repository maintainers.
Resolve with AI? Install the iterate skill in your agent and run
/iterateto automatically drive this PR through CI, review, and QA until it's merge-ready.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26169383121
…ondense() race
Thread 1 – Guard against direct EventLog.append bypassing the view cache:
Add _on_append callback to EventLog, invoked after every successful append
with (event, synced_count). ConversationState._wire_view_sync() wires this
callback so that both state.append_event() and direct state.events.append()
keep the cached _view in sync. append_event() no longer calls
_view.append_event() directly — the callback handles it uniformly.
Thread 2 – Handle EventLog syncing extra events from disk:
EventLog.append() now tracks how many events were synced from disk before
writing the new event and passes this count to _on_append. When synced_count
> 0, the callback performs a full View.from_events rebuild instead of an
incremental update, so disk-synced events are not missed by the view.
Thread 3 – Fix condense() race condition:
Move _on_event(condensation_request) inside the `with self._state:` block
in condense(), so the view mutation cannot race with a concurrent run().
Tests:
- test_direct_eventlog_append_also_updates_view: verifies the on_append
callback keeps the view in sync for direct EventLog.append calls.
- test_view_rebuilds_when_eventlog_syncs_extra_events: simulates another
writer and verifies the full rebuild path.
Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
The three issues raised in the previous review have all been addressed and resolved — the _on_append callback on EventLog, the synced_count tracking for disk-synced events, and the condense() race are all fixed. The plumbing is clean: the single _sync closure in _wire_view_sync handles both the incremental O(1) path and the full-rebuild fallback for disk-sync scenarios, and the test suite covers the hot-path invariant (zero enforce_properties calls), parity with View.from_events, condensation application, fork, persistence resume, and the direct-eventlog-append bypass.
Two observations below worth considering before merge.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the new ConversationState cached-view plumbing by running SDK conversation flows; the cache stayed aligned with the event log across normal appends, condensation, cold reload, fork, stale log sync, and malformed-history recovery.
Does this PR achieve its stated goal?
Yes. The PR set out to add incremental cached View plumbing with full rebuilds reserved for cold load, fork, and error recovery while preserving existing event-log behavior. The base branch showed the existing send_message path produced the same event-log behavior but had no state.view / append_event API; on the PR branch, the same path kept state.view in sync. Additional runtime checks showed hot-path appends made 0 enforce_properties calls, explicit rebuild_view() did run enforcement, reload/fork views matched logs, stale event-log sync rebuilt correctly, and malformed-history recovery rebuilt once before emitting CondensationRequest.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed; SDK imports and scripts ran via uv run python. |
| CI Status | ⏳ Most checks green; only image build jobs plus qa-changes/pr-review were still in progress when checked. |
| Functional Verification | ✅ SDK scenarios exercised the changed behavior directly; no functional issues found. |
Functional Verification
Test 1: Existing Conversation.send_message behavior plus new cached view
Step 1 — Establish baseline without the PR:
Ran git checkout origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_view_scenario.py:
initial_event_count=0
has_state_view=False
has_append_event=False
after_send_event_count=2
last_event_type=MessageEvent
last_event_source=user
view_event_count=UNAVAILABLE
This confirms the previous SDK path appended the user message to the event log, but no cached view API existed.
Step 2 — Apply the PR's changes:
Checked out incremental-view-state at 65f3f9976e987bb06ae7f788ed8c4d8074495d1e.
Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_state_view_scenario.py:
initial_event_count=0
has_state_view=True
has_append_event=True
after_send_event_count=2
last_event_type=MessageEvent
last_event_source=user
view_event_count=2
view_last_event_type=MessageEvent
view_matches_log=True
This shows existing event-log behavior remained intact and the new cached view tracked the same events through the real SDK send_message entry point.
Test 2: Hot-path appends avoid full enforcement; explicit rebuild enforces
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_hot_path_scenario.py:
event_count_after_sends=3
view_count_after_sends=3
enforce_calls_after_sends=0
enforce_calls_after_rebuild=1
rebuild_replaced_view=True
view_matches_log_after_rebuild=True
This confirms the PR's core performance claim: normal SDK appends maintained the cached view without running enforce_properties, while rebuild_view() did run enforcement and produced a fresh cache matching the log.
Test 3: Condensation updates the cached view but remains persisted in the log
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_condensation_scenario.py:
view_count_before_condensation=3
log_count_after_condensation=4
view_count_after_condensation=2
forgotten_event_removed_from_view=True
condensation_persisted_in_log=True
This shows a Condensation event is retained in the event log while the cached view reflects the forgotten event removal.
Test 4: Cold reload and fork rebuild the cached view from persisted events
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_persist_fork_scenario.py:
original_event_count=2
original_view_count=2
original_view_matches_log=True
reloaded_event_count=2
reloaded_view_count=2
reloaded_view_matches_log=True
fork_event_count=2
fork_view_count=2
fork_view_matches_log=True
fork_matches_source=True
This verifies the documented cold-load and fork entry points: after resume and fork, the cached view matched the persisted/copied event log.
Test 5: A stale live handle resyncs when another handle appended events
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_two_handle_sync_scenario.py:
after_conv1_initial_log=2
after_conv1_initial_view=2
after_conv2_append_log=3
after_conv2_append_view=3
after_conv1_second_append_log=4
after_conv1_second_append_view=4
conv1_view_matches_log=True
This exercises the event-log sync path a user could hit with two live conversation handles sharing persisted storage; the stale handle rebuilt enough state to keep the cached view aligned.
Test 6: Malformed-history recovery rebuilds before requesting condensation
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery_scenario.py:
rebuild_calls=1
emitted_condensation_request=True
emitted_event_types=['CondensationRequest']
This confirms the error-recovery entry point calls rebuild_view() exactly once before emitting the condensation retry request.
Issues Found
None.
This review was generated by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Taste Rating: Acceptable — good direction, but cache synchronization still has lock-order holes.
I found two important concurrency issues that can let ConversationState.view diverge from the event log under existing write paths.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟡 MEDIUM
Derived mutable cache in core conversation state can silently affect future message preparation/condenser behavior if synchronization is not airtight.
VERDICT: ❌ Needs follow-up before approval.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26181992831
…optimization Thread 6 – Move _on_append inside EventLog file lock: The callback now fires while the file lock is still held, so concurrent appends execute their callbacks in the same order as their writes. This prevents the view from diverging from the persisted event order. Thread 7 – Move run() error emission under state lock: The except handler in run() now wraps both the execution_status mutation and the ConversationErrorEvent emission in `with self._state:`, so the view update in _on_event cannot race with a concurrent send_message()/run(). Thread 4 – Suppress callback during fork bulk copy: The fork loop now suppresses the _on_append callback during the n-event bulk copy (set_on_append(None)), since rebuild_view() on the next line discards the incremental work. _wire_view_sync() is called after rebuild_view() to restore the callback. Thread 5 – Thread-safety contract documentation: Added docstring notes to set_on_append() and _wire_view_sync() making the locking contract explicit: the callback is serialized by the EventLog file lock, but callers must additionally hold the ConversationState lock to prevent races with non-append readers. Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
The three concurrency correctness issues from the previous rounds — event-log write bypass, disk-sync missed events, and the condense() / run() race — are all cleanly resolved in this version. The _on_append callback firing inside the file lock, synced_count-driven rebuild, and the state-lock expansions in condense() and the run() exception path are exactly the right fixes.
Two points worth addressing before merge, plus one small test gap:
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.
all-hands-bot
left a comment
There was a problem hiding this comment.
⚠️ QA Report: PASS WITH ISSUES
Functional QA confirms the PR’s cached ConversationState.view plumbing works through real SDK usage; CI is not fully green at review time.
Does this PR achieve its stated goal?
Yes. I exercised the SDK as a user would by creating conversations, sending messages, appending events, resuming persisted conversations, forking, and triggering malformed-history recovery. On the PR branch, state.view, append_event, and rebuild_view are present; the cached view stays synchronized with the event log, condensation is reflected in the view, cold resume/fork preserve the rebuilt view, hot-path appends did not call enforce_properties, and malformed-history recovery called rebuild_view() once before emitting CondensationRequest.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully; no test suite, linters, or pre-commit runs were executed. |
| CI Status | Merge Multi-Arch Manifests (java) and unresolved-review-threads are failing; python/golang manifest merge checks are cancelled; QA/review workflows are still in progress. |
| Functional Verification | ✅ SDK behavior matched the PR goal across baseline comparison, cached view operations, hot-path enforcement behavior, resume/fork, and recovery handling. |
Functional Verification
Test 1: Baseline vs PR cached-view surface and conversation behavior
Step 1 — Establish baseline on origin/main:
Ran git checkout origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache_exercise.py:
{
"after_resume": {"events_len": 2},
"after_send_message": {"events_len": 2},
"has_append_event": false,
"has_rebuild_view": false,
"has_view_property": false,
"sdk_import": "ok"
}This shows the existing SDK could create/resume a conversation and enqueue a real user message, but had no cached-view API surface.
Step 2 — Apply the PR changes:
Checked out incremental-view-state at 42f382a9c77bd99d8c6c72e8cd57fdecb438ab7a.
Step 3 — Re-run the same SDK exercise on the PR branch:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache_exercise.py:
{
"has_append_event": true,
"has_rebuild_view": true,
"has_view_property": true,
"after_send_message": {
"events_len": 2,
"view_len": 2,
"event_ids_equal_view_ids": true
},
"after_manual_appends_and_condensation": {
"events_len": 5,
"view_len": 3,
"manual_1_in_view": false,
"manual_2_in_view": true
},
"after_resume": {
"events_len": 5,
"view_len": 3,
"view_ids_equal_fresh_log_ids": true
},
"after_fork": {
"fork_events_len": 5,
"fork_view_len": 3
}
}This confirms the new cached-view surface exists, user send_message() keeps events and view aligned, direct append_event/state.events.append keep the cache updated, condensation removes the forgotten event from the view while leaving it in the log, and persisted resume/fork both produce a synchronized cached view.
Test 2: Hot-path append avoids full enforcement, explicit rebuild enforces
Step 1 — Establish baseline:
The baseline branch has no ConversationState.append_event/rebuild_view API to exercise, so this behavior is new in the PR.
Step 2 — Apply the PR changes:
Used incremental-view-state at 42f382a9c77bd99d8c6c72e8cd57fdecb438ab7a.
Step 3 — Exercise the new write path with instrumentation:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_hot_path_enforce.py:
{
"counts_after_create": 0,
"counts_after_appends": 0,
"counts_after_rebuild": 1,
"events_len": 5,
"view_len_after_appends": 5,
"view_len_after_rebuild": 5
}This confirms five real SDK event appends updated the cached view without invoking View.enforce_properties, while an explicit rebuild_view() did invoke enforcement once.
Test 3: Malformed-history recovery rebuilds before condensation retry
Step 1 — Establish baseline on origin/main:
Ran git checkout origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery.py:
{
"emitted_condensation_request": true,
"emitted_event_types": ["CondensationRequest"],
"events_len_after_recovery": 2,
"has_rebuild_view": false,
"rebuild_calls": 0,
"view_len_after_recovery": null
}This shows the existing recovery path emitted a condensation request, but there was no cached view or rebuild hook to refresh first.
Step 2 — Apply the PR changes:
Checked out incremental-view-state at 42f382a9c77bd99d8c6c72e8cd57fdecb438ab7a.
Step 3 — Re-run with the PR behavior:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery.py:
{
"emitted_condensation_request": true,
"emitted_event_types": ["CondensationRequest"],
"events_len_after_recovery": 2,
"has_rebuild_view": true,
"rebuild_calls": 1,
"view_len_after_recovery": 2
}This confirms the malformed-history path now rebuilds the cached view exactly once before emitting the condensation retry request.
Issues Found
- 🟡 CI status: Functional QA found no product behavior issues, but CI is not fully green:
Merge Multi-Arch Manifests (java)andunresolved-review-threadsare failing, with python/golang manifest merge checks cancelled and QA/review workflows still in progress.
Final verdict: PASS WITH ISSUES — the PR achieves its stated SDK behavior goal, but merge readiness still depends on resolving/triaging the non-green CI checks.
This QA review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Taste Rating: Acceptable — the cache plumbing is close, but I found two remaining synchronization holes that can leave state.view divergent.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟡 MEDIUM — derived mutable cache in core conversation state needs airtight initialization and lock discipline before future hot-path use.
VERDICT: ❌ Needs follow-up before approval. This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26185966291
… fork safety Thread A – Replace O(N) rebuild with incremental sync: The _sync callback now applies only the synced_count missed events incrementally rather than rebuilding the entire view via View.from_events. Lock hold is now O(synced_count) instead of O(N). Thread B – try/finally in fork bulk copy: rebuild_view() + _wire_view_sync() now run in a finally block so the _on_append callback is unconditionally restored even if a deep-copy raises during the bulk event copy. Thread C – CondensationRequest flag test: Added test_condensation_request_sets_flag_incrementally to verify the incremental path sets unhandled_condensation_request without adding the event to the view (not LLMConvertible). Thread D – Orphaned events in fresh-create path: If events/ has files but base_state.json is missing (crash / partial cleanup), the fresh-create path now calls rebuild_view() to populate the cached view from pre-existing events. Corrupted orphans degrade gracefully to an empty view with a warning. Added regression test. Thread E – close() session-end hooks under state lock: run_session_end() in close() now runs inside `with self._state:` so hook-emitted events cannot race with concurrent run()/send_message(). Updated _wire_view_sync docstring to list close() and run() error handler as call sites that satisfy the lock contract. Co-authored-by: openhands <openhands@all-hands.dev>
# Conflicts: # openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Verified the PR’s SDK plumbing by exercising cached ConversationState.view behavior through real Conversation usage, persistence resume, fork, direct event appends, condensation, and malformed-history recovery.
Does this PR achieve its stated goal?
Yes. The PR set out to add a cached ConversationState.view that stays synchronized with the event log, add sanctioned append/rebuild paths, and reserve full rebuilds for cold load/fork/error recovery while preserving existing conversation behavior. In real SDK scripts, main had no state.view/append_event; the PR branch exposed both, kept state.view synchronized after send_message, append_event, direct state.events.append, and Condensation, rebuilt correctly after persistence resume and fork, and rebuilt a deliberately corrupted cached view during malformed-history recovery before emitting CondensationRequest.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully; no test suite, linters, or pre-commit runs were executed. |
| CI Status | 🟡 Queried via gh pr checks: 15 success, 9 in progress, 3 queued, 2 skipped at the time of QA. |
| Functional Verification | ✅ SDK behavior matched the PR’s stated plumbing goals. |
Functional Verification Evidence
Test 1: Cached view surface and incremental updates
Step 1 — Establish baseline on origin/main:
Ran git checkout origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache_behavior.py:
HEAD is now at 3996252c feat: add async LLM completion support across the SDK (#3284)
state_has_view=False
state_has_append_event=False
after_send_message_event_count=2
after_send_message_view_ids=<no state.view>
This confirms the base branch does not expose the new cached view or append API, so there is no cached projection to inspect after a real Conversation.send_message(...) operation.
Step 2 — Apply the PR changes:
Checked out incremental-view-state at 4410b99f7f115731cd213c00c41ae705e8238d70.
Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_view_cache_behavior.py:
state_has_view=True
state_has_append_event=True
after_send_message_event_count=2
after_send_message_view_ids=['1cb7f2a5-d979-4c95-b24a-bf18d9be62c8', 'ae6fa3ad-c263-4e67-888d-63d0d564a56f']
after_append_event_count=3
after_append_event_view_ids=['1cb7f2a5-d979-4c95-b24a-bf18d9be62c8', 'ae6fa3ad-c263-4e67-888d-63d0d564a56f', '9f8bd7b1-e9cf-4e0f-ab4e-c7ffa3339825']
after_direct_append_count=4
after_direct_append_view_ids=['1cb7f2a5-d979-4c95-b24a-bf18d9be62c8', 'ae6fa3ad-c263-4e67-888d-63d0d564a56f', '9f8bd7b1-e9cf-4e0f-ab4e-c7ffa3339825', '4f6c226c-94b7-4aa0-8c6a-1c943802acb7']
after_condensation_event_count=5
after_condensation_view_ids=['ae6fa3ad-c263-4e67-888d-63d0d564a56f', '9f8bd7b1-e9cf-4e0f-ab4e-c7ffa3339825', '4f6c226c-94b7-4aa0-8c6a-1c943802acb7']
rebuild_preserves_view_ids=True
This shows the PR adds the intended surface and keeps the cached view updated through the real conversation callback path, the new append_event path, direct EventLog.append, condensation application, and explicit rebuild.
Test 2: Cold-load resume and fork rebuild paths
Step 1 — Baseline:
As shown above, main has no state.view, so cold-load/fork cache rebuilding is not available before this PR.
Step 2 — Apply the PR changes:
Used the PR checkout at 4410b99f7f115731cd213c00c41ae705e8238d70.
Step 3 — Exercise persisted resume and fork with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_view_cache_lifecycle.py:
created_conversation_id=c3f62895-2d16-44a0-aa3d-780779f39728
original_event_count=2
original_view_matches_events=True
resumed_event_count=2
resumed_view_matches_original=True
resumed_view_matches_events=True
fork_event_count=2
fork_view_matches_source=True
fork_view_matches_events=True
This confirms the cached view is rebuilt correctly on persisted conversation resume and on fork, matching the underlying event log and source view.
Test 3: Malformed-history recovery rebuild path
Step 1 — Baseline:
On main, there is no cached state.view to rebuild during malformed-history recovery. The PR’s new behavior is the added rebuild before emitting a condensation retry request.
Step 2 — Apply the PR changes:
Used the PR checkout at 4410b99f7f115731cd213c00c41ae705e8238d70.
Step 3 — Exercise recovery with a request-handling condenser:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python -u /tmp/qa_malformed_recovery.py:
view_corrupted_before_step=True
LLM raised malformed conversation history error, triggering condensation retry with condensed history: qa malformed history
condensation_request_emitted=True
view_rebuilt_after_malformed=True
rebuilt_event_count=2
This intentionally corrupted the cached view after a real send_message(...); when the LLM raised LLMMalformedConversationHistoryError, Agent.step(...) rebuilt the cached view from the event log and emitted CondensationRequest, matching the PR’s recovery goal.
Issues Found
None.
This QA review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
Clean, well-scoped plumbing PR. The incremental-update invariant (_sync callback math, lock ordering, enforce_properties reserved for three explicit entry points) is correct and thoroughly documented. Test coverage is solid. Two items below worth addressing before merge.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Taste Rating: Acceptable — sync cache plumbing is coherent, but async malformed-history recovery missed the same repair path.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟡 MEDIUM — derived conversation-history cache touches condenser/error-recovery behavior and future eval-sensitive hot paths.
VERDICT: ❌ Needs follow-up before approval. One important async parity issue inline.
This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26188223827
… astep() parity Thread A – Nested try/finally in fork bulk copy: If rebuild_view() raises, _wire_view_sync() was silently skipped because a single finally block aborts on its first exception. Nested try/finally ensures the callback is unconditionally restored. Thread B – Orphan test exercises ConversationState.create(): Replaced the manual mimicry of the orphan guard with a real ConversationState.create(persistence_dir=tmpdir) call against a temp directory containing orphaned event files on disk. Any regression in the production code path is now caught. Thread C – rebuild_view() parity in astep(): The malformed-history handler in astep() now calls state.rebuild_view() before emitting CondensationRequest, mirroring the sync step() path. Added async regression test with acompletion/aresponses overrides on the test LLM. Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
The PR achieves its plumbing goal: ConversationState now exposes a cached view that stays synchronized through real SDK operations, while rebuild remains an explicit recovery/cold-load path.
Does this PR achieve its stated goal?
Yes. I exercised the SDK as a user would by creating conversations, sending messages, appending events, condensing history, resuming persisted conversations, and triggering malformed-history recovery with a deterministic LLM. Compared with origin/main, the new cached view API is present on the PR branch, updates through both public send_message() and event append paths, avoids enforce_properties on the hot append path, rebuilds explicitly, and is repopulated on resume.
| Phase | Result |
|---|---|
| Environment Setup | ✅ make build completed successfully; no tests/lint/pre-commit run locally. |
| CI Status | 🟡 Most checks are green; Review Thread Gate/unresolved-review-threads is failing and agent-server-tests/cross-tests are still queued, with QA/PR-review jobs in progress. |
| Functional Verification | ✅ SDK scripts verified cached view sync, public send-message behavior, persistence resume, condensation handling, direct EventLog append sync, and malformed-history recovery. |
Functional Verification
Test 1: Cached view API and incremental updates
Step 1 — Establish baseline on origin/main:
Ran git checkout --detach origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_state_view_feature.py:
baseline_head=3996252c
state_type=ConversationState
has_view_property=False
has_append_event=False
has_rebuild_view=False
cached_view_api_available=False
baseline_exit_status=2
This confirms the baseline has no ConversationState.view, append_event, or rebuild_view plumbing.
Step 2 — Apply PR changes:
Checked out 1e1d4f4837b76bac9a9a1a77997c49b13ff0fdd0.
Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_state_view_feature.py:
has_view_property=True
has_append_event=True
has_rebuild_view=True
after_append_log_len=3
after_append_view_len=3
hot_path_enforce_calls=0
after_request_unhandled=True
after_condensation_view_len=1
after_direct_append_log_len=6
after_direct_append_view_len=2
enforce_calls_before_rebuild=0
rebuild_replaced_instance=True
enforce_calls_after_rebuild=1
persisted_before_close_view_len=1
resumed_view_len=1
resumed_event_log_len=1
resumed_after_append_view_len=2
resumed_after_append_log_len=2
cached_view_api_available=True
This shows the PR delivers the intended incremental cache behavior: appends update both log and view, condensation is reflected in the view, direct state.events.append(...) also keeps the cache current, hot-path appends did not invoke enforce_properties, explicit rebuild_view() did invoke it and replaced the cached instance, and a persisted conversation resumed with its cached view populated.
Test 2: Public Conversation.send_message() path
Step 1 — Establish baseline on origin/main:
Ran git checkout --detach origin/main && OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_send_message_view.py:
baseline_head=3996252c
has_state_view=False
events_after_send_message=2
send_message_baseline_exit_status=0
This confirms the existing public message path still records events, but had no cached view to inspect on main.
Step 2 — Apply PR changes:
Checked out 1e1d4f4837b76bac9a9a1a77997c49b13ff0fdd0.
Step 3 — Re-run with the PR:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_send_message_view.py:
has_state_view=True
events_after_send_message=2
view_after_send_message=2
view_last_role=user
view_last_text=QA hello through public send_message
This verifies the real user-facing conversation callback path now keeps state.view synchronized when a user sends a message, matching the PR’s claim that LocalConversation._default_callback uses the new append path.
Test 3: Malformed-history recovery rebuild path
Step 1 — Baseline context:
On origin/main, Test 1 shows there is no cached view or rebuild_view() method, so there is no recovery cache to rebuild.
Step 2 — Apply PR changes:
Checked out 1e1d4f4837b76bac9a9a1a77997c49b13ff0fdd0.
Step 3 — Exercise malformed-history recovery:
Ran OPENHANDS_SUPPRESS_BANNER=1 uv run python /tmp/qa_malformed_recovery.py, using a deterministic SDK LLM subclass that raises LLMMalformedConversationHistoryError and a condenser that handles condensation requests:
has_view_property=True
has_rebuild_view=True
emitted_condensation_request=True
view_rebuilt_on_malformed_recovery=True
view_len_after_recovery=2
recovery_status=0
This confirms the malformed-history path emits a CondensationRequest and the cached view instance is rebuilt before recovery continues.
Issues Found
None.
This review was created by an AI agent (OpenHands) on behalf of the user.
all-hands-bot
left a comment
There was a problem hiding this comment.
The synchronization work from prior rounds is complete and correct. The _on_append/_wire_view_sync design cleanly serializes view mutations inside the EventLog file lock; all three explicit rebuild entry points (cold load, fork, malformed-history recovery) are properly wired and tested, and the async astep malformed-history parity test is now present. All 15 previously raised threads are resolved.
Two small points worth considering before merge.
This review was generated by an AI agent (OpenHands) on behalf of the user through OpenHands Automation.
| new_len = len(self._events) | ||
| start = new_len - synced_count - 1 | ||
| for i in range(start, new_len - 1): | ||
| self._view.append_event(self._events[i]) |
There was a problem hiding this comment.
🟡 Suggestion: self._events[i] is called here while the EventLog file lock is held (the _on_append callback fires inside with self._fs.lock(...)). This is safe today because EventLog._get_single_item reads from the FileStore directly without re-acquiring the lock. But the safety is an implicit structural assumption — a future maintainer who adds locking inside __getitem__ (e.g., for a thread-safe read variant) would silently introduce a deadlock.
A short inline comment would make this assumption explicit:
| self._view.append_event(self._events[i]) | |
| # Safe inside the EventLog file lock: _get_single_item | |
| # reads via FileStore directly, no re-entrant lock. | |
| self._view.append_event(self._events[i]) |
| state._wire_view_sync() | ||
| # Guard against orphaned event files from a previous incomplete | ||
| # cleanup or crash: base_state.json is missing (fresh path) but | ||
| # events/ may already contain files. Without this rebuild the | ||
| # cached view would start empty while the log is non-empty, and | ||
| # future appends would never pick up the pre-existing events. | ||
| # If the orphaned files are corrupted we fall back to an empty | ||
| # view — they will still surface errors on explicit iteration. | ||
| if len(state._events) > 0: | ||
| try: | ||
| state.rebuild_view() | ||
| except Exception: | ||
| logger.warning( | ||
| "Failed to rebuild view from orphaned events for " | ||
| "conversation %s; starting with empty view", | ||
| state.id, | ||
| ) |
There was a problem hiding this comment.
🟡 Suggestion: The fresh-create path wires the callback before building from orphaned events (wire → rebuild), while the resume path above does the opposite (rebuild → wire). Both are correct — no concurrent access can occur during construction — but the inconsistency can mislead a future reader.
Moving _wire_view_sync() to after the orphan-rebuild block would match the resume path's "build then wire" discipline:
| state._wire_view_sync() | |
| # Guard against orphaned event files from a previous incomplete | |
| # cleanup or crash: base_state.json is missing (fresh path) but | |
| # events/ may already contain files. Without this rebuild the | |
| # cached view would start empty while the log is non-empty, and | |
| # future appends would never pick up the pre-existing events. | |
| # If the orphaned files are corrupted we fall back to an empty | |
| # view — they will still surface errors on explicit iteration. | |
| if len(state._events) > 0: | |
| try: | |
| state.rebuild_view() | |
| except Exception: | |
| logger.warning( | |
| "Failed to rebuild view from orphaned events for " | |
| "conversation %s; starting with empty view", | |
| state.id, | |
| ) | |
| # Guard against orphaned event files from a previous incomplete | |
| # cleanup or crash: base_state.json is missing (fresh path) but | |
| # events/ may already contain files. Without this rebuild the | |
| # cached view would start empty while the log is non-empty, and | |
| # future appends would never pick up the pre-existing events. | |
| # If the orphaned files are corrupted we fall back to an empty | |
| # view — they will still surface errors on explicit iteration. | |
| if len(state._events) > 0: | |
| try: | |
| state.rebuild_view() | |
| except Exception: | |
| logger.warning( | |
| "Failed to rebuild view from orphaned events for " | |
| "conversation %s; starting with empty view", | |
| state.id, | |
| ) | |
| state._wire_view_sync() |
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Taste Rating: Acceptable — the incremental view plumbing is close, but I found two remaining cache/ordering correctness issues.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟡 MEDIUM — derived conversation-history cache touches condenser/error-recovery paths and future eval-sensitive hot paths.
VERDICT: ❌ Needs follow-up before approval. This review was generated by an AI agent (OpenHands) on behalf of the user.
Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/software-agent-sdk/actions/runs/26189342817
| disk_length = self._count_events_on_disk() | ||
| synced_count = 0 | ||
| if disk_length > self._length: | ||
| synced_count = disk_length - self._length |
There was a problem hiding this comment.
🟠 Important: synced_count is based on _count_events_on_disk(), but that helper counts any event-*.json file while _sync_from_disk() only indexes contiguous files matching EVENT_NAME_RE. A stale/corrupt file or index gap can therefore report missed events that are not actually readable; _wire_view_sync() then indexes those missing entries after the new event has already been written, so append() raises and leaves the persisted log and cached view inconsistent. Please derive synced_count from the actual indexed delta after syncing (and avoid inflating _length from the raw file count), or pass the synced events/indices explicitly.
|
|
||
| # Force the agent to take a single step to process the condensation request | ||
| # This will trigger the condenser if it handles condensation requests | ||
| with self._state: |
There was a problem hiding this comment.
🟠 Important: This lock does not make condense() wait for an in-flight arun() on the same event-loop thread. FIFOLock is reentrant by thread, and arun() holds it across await self.agent.astep(...), so another asyncio task can re-enter here, append a CondensationRequest, and run a synchronous agent.step() while the async step is still awaiting. That violates the method's wait guarantee and can order condensation events ahead of the in-flight step's response in both the event log and cached view. Please guard/coordinate with _arun_task before emitting the request (reject, interrupt/wait, or provide a task-aware async path).
|
Superseded by #3335, which takes a different approach to the same problem. The push-based callback design here (eagerly syncing the cached View via #3335 replaces the callback mechanism with a pull-based watermark: the View is lazily brought up to date when #3328 (the follow-up that swaps This comment was posted by an AI agent (OpenHands) on behalf of @csmith49. |
Refs: #3053
Summary
Lays the groundwork for the incremental
Viewproposed in #3053.ConversationStatenow owns a cachedViewthat is updated in lockstep with the event log via a newappend_eventwrite path. The expensiveView.from_eventsrebuild (which runsenforce_properties) is reserved for three explicit entry points: cold load, fork, and error recovery.This PR is intentionally scoped to plumbing only —
prepare_llm_messagesis untouched andAgent.stepstill callsView.from_events(state.events)on every iteration. A tiny follow-up will swap that single call site tostate.view.eventsto harvest the perf win, but keeping it out of this PR means there is zero behavioral change for existing callers and the diff stays easy to review.Changes
openhands-sdk/openhands/sdk/conversation/state.py_view: View = PrivateAttr(default_factory=View)— derived, not serialized.viewread-only property.append_event(event)— the only sanctioned write path; appends to_eventsand incrementally updates_view. O(1) forLLMConvertibleEvents, O(view) only when applying aCondensation. Does not runenforce_properties.rebuild_view()— fullView.from_eventsre-derivation; invoked only on cold load, fork, and error recovery.ConversationState.create()resume path callsrebuild_view()after attaching theEventLog.openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py_default_callbackcallsself._state.append_event(e)instead ofself._state.events.append(e), so the cached view stays current on every step.append_eventper copied event, then a singlerebuild_view()at the end (same posture as cold load).openhands-sdk/openhands/sdk/context/condenser/base.pyCondenserBase.condense()docstring now explicitly says implementations must treat the input view as read-only, since it may now be a cached projection owned byConversationState. Existing condensers in this repo already honor this — none mutate the input view in place — but the contract is now documented for future implementers.openhands-sdk/openhands/sdk/agent/agent.pyAgent.step()'sLLMMalformedConversationHistoryErrorhandler callsstate.rebuild_view()before emittingCondensationRequest, so the condenser operates on a freshly enforced view if the incremental one is the source of the malformed history.Where
rebuild_view()is wiredThree explicit entry points, matching the "only run enforce when..." list in #3053:
ConversationState.create()resume path.LocalConversation.fork(...)after deep-copying events.Agent.step()malformed-history handler.Normal agent steps stay on the O(1) incremental path with zero
enforce_propertiesinvocations.Tests
tests/sdk/conversation/test_state_view_cache.py(new, 7 tests):append_eventupdates both the log and the view.View.from_events(state.events)after many appends.Condensationevents are applied incrementally.append_eventcallsenforce_properties0 times (verified by monkey-patch counter).rebuild_viewdoes runenforce_properties.rebuild_viewreplaces the cached instance with one derived from the log.tests/sdk/agent/test_agent_context_window_condensation.py(one new test):test_agent_rebuilds_view_on_malformed_history_recovery— assertsrebuild_viewis called exactly once before the condensation retry on the malformed-history path.Verification run locally
uv run pre-commit runon all changed files: clean.tests/sdk/agent/+tests/sdk/conversation/+tests/sdk/context/: 1228 passed, 1 deselected (pre-existing unrelatedtest_acp_agentfailure onmain).Backward compatibility
View,View.from_events,View.append_event,View.enforce_properties,CondenserBase.condense(view, ...),ConversationState.events.ConversationState.view,ConversationState.append_event,ConversationState.rebuild_view.Agent.step: it still callsprepare_llm_messages(state.events, ...)as today. The cached view is maintained but unused on the hot path; consumers can opt in later.Follow-ups (separate PRs)
Agent.stepto read fromstate.view.events, harvesting the perf win this PR sets up.OPENHANDS_VIEW_VERIFYparity assertion that re-runsenforce_propertieson a copy after each step and logs any divergence without modifying the view.This PR was created by an AI agent (OpenHands) on behalf of the user.
@csmith49 can click here to continue refining the PR
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:1e1d4f4-pythonRun
All tags pushed for this build
About Multi-Architecture Support
1e1d4f4-python) is a multi-arch manifest supporting both amd64 and arm641e1d4f4-python-amd64) are also available if needed