Skip to content

feat(acp): save concrete model defaults in Canvas#730

Open
simonrosenberg wants to merge 2 commits into
mainfrom
simon/acp-model-defaults
Open

feat(acp): save concrete model defaults in Canvas#730
simonrosenberg wants to merge 2 commits into
mainfrom
simon/acp-model-defaults

Conversation

@simonrosenberg
Copy link
Copy Markdown
Contributor

@simonrosenberg simonrosenberg commented May 22, 2026

Summary

  • add Canvas-local suggested ACP model lists and default models for Claude Code, Codex, and Gemini CLI
  • seed ACP settings/onboarding saves with concrete acp_model values instead of leaving the model blank
  • display ACP model names from runtime fields or configured acp_model while keeping ACP conversations out of the OpenHands LLM profile switcher
  • add tests covering provider defaults, settings saves, conversation adapter fallback behavior, and chat input gating

Validation

  • npm ci
  • npm run typecheck
  • npm test -- __tests__/constants/acp-providers.test.ts __tests__/routes/agent-settings.test.tsx __tests__/components/onboarding/choose-agent-step.test.tsx __tests__/api/agent-server-adapter.test.ts __tests__/components/features/chat/components/chat-input-model.test.tsx __tests__/components/features/chat/components/chat-input-actions.test.tsx __tests__/components/features/chat/switch-profile-button.test.tsx
  • npm run build
  • git diff --check && git diff --cached --check

Notes

This avoids the common Canvas-created ACP path showing default (recommended) by never saving an empty model for built-in providers. SDK runtime model fields are still preferred when present, then Canvas falls back to the configured acp_model.


🐳 Docker images for this PR

GHCR package: https://github.com/OpenHands/agent-canvas/pkgs/container/agent-canvas

Component Value
Image ghcr.io/openhands/agent-canvas
Architectures amd64, arm64
Agent Server ghcr.io/openhands/agent-server:1.23.0-python
Automation openhands-automation==1.0.0a3
Commit 9d1b7c1700f5172ed8f57bbec6e712be4f2f4a4d

Pull (multi-arch manifest)

# Multi-arch manifest — Docker automatically pulls the correct architecture
docker pull ghcr.io/openhands/agent-canvas:sha-9d1b7c1

Run

docker run -it --rm \
  -p 8000:8000 \
  ghcr.io/openhands/agent-canvas:sha-9d1b7c1

All tags pushed for this build

ghcr.io/openhands/agent-canvas:sha-9d1b7c1-amd64
ghcr.io/openhands/agent-canvas:simon-acp-model-defaults-amd64
ghcr.io/openhands/agent-canvas:pr-730-amd64
ghcr.io/openhands/agent-canvas:sha-9d1b7c1-arm64
ghcr.io/openhands/agent-canvas:simon-acp-model-defaults-arm64
ghcr.io/openhands/agent-canvas:pr-730-arm64
ghcr.io/openhands/agent-canvas:sha-9d1b7c1
ghcr.io/openhands/agent-canvas:simon-acp-model-defaults
ghcr.io/openhands/agent-canvas:pr-730

About Multi-Architecture Support

  • Each tag (e.g., sha-9d1b7c1) is a multi-arch manifest supporting both amd64 and arm64
  • Docker automatically pulls the correct architecture for your platform
  • Individual architecture tags (e.g., sha-9d1b7c1-amd64) are also available if needed

@vercel
Copy link
Copy Markdown

vercel Bot commented May 22, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
agent-canvas Ready Ready Preview, Comment May 22, 2026 3:39pm

Request Review

Copy link
Copy Markdown
Contributor

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taste Rating: 🟡 Acceptable - Solves a real UX problem with clean implementation, but requires verification of model defaults

[EVAL-RISK CONCERN]

This PR changes what model is saved for new ACP conversations:

  • Before: acp_model: null → falls back to provider's default
  • After: acp_model: "claude-sonnet-4-6" (or other Canvas default)

Question: Do the Canvas defaults match the actual provider defaults?

The hardcoded defaults are:

  • Claude Code: claude-sonnet-4-6
  • Codex: gpt-5.5/medium
  • Gemini CLI: gemini-2.5-pro

If these don't match what the providers would use when acp_model is null, this PR changes agent behavior and could affect benchmarks. The test name suggests they should match ("seeds built-in ACP diffs with the provider default model"), but the test only verifies that buildAcpAgentSettingsDiff uses provider.default_model - it doesn't verify that value actually matches the real provider's default.

Recommendation: Before merging, verify that these Canvas defaults match the actual provider defaults (what model would run if acp_model were null). If they don't match, this needs eval validation.

[IMPROVEMENT OPPORTUNITIES]

The implementation is solid, but a few suggestions for future work:

  1. Model list maintenance: The hardcoded model lists in acp-providers.ts will need manual updates when providers change their offerings. Consider documenting this maintenance requirement or adding a script to detect drift.

  2. Evidence for UI changes: This PR modifies three user-facing surfaces (agent settings dropdown, onboarding, chat input display) but includes no screenshots or videos. While the test coverage is good, visual verification would help confirm the UI renders correctly.

  3. Agent settings complexity: The agent-settings.tsx file grew significantly with the dropdown/custom model logic. The state management (switching between dropdown and text input, syncing with provider selection) is clear but could be extracted into a separate component in the future.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM

Changes user-facing model display and what model is saved for new ACP conversations. Well-tested frontend implementation with comprehensive coverage. Main risk is the uncertainty about whether Canvas defaults match provider defaults, which could affect agent behavior. No security, dependency, or breaking change concerns. Recommend verifying model defaults before merge.

VERDICT:
Needs verification: Confirm Canvas model defaults match provider defaults to ensure no unintended behavior change

KEY INSIGHT:
The fallback chain (runtime fields → configured model → sentinel check) is elegant and future-proof, but the initial seed values need verification to avoid silent behavior changes.


Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/agent-canvas/actions/runs/26296373486

@github-actions
Copy link
Copy Markdown
Contributor

📸 Snapshot Test Report

Warning

Snapshot comparison step crashed (timeout, OOM, or runner error) — diff results below may be incomplete or absent.
Check the CI logs for the full error output (look for the "Run snapshot comparison" step).

❌ 1 snapshot differ from the main branch baseline. Add the update-snapshots label to acknowledge intentional changes.

Category Count
🔴 Changed 1
🆕 New 0
✅ Unchanged 72
Total 73

How to resolve:

  • Unintentional diffs — the baselines on main may have moved since this branch was created. Merge the latest main into this branch and re-run CI.
  • Intentional changes — add the update-snapshots label. CI will pass and the new screenshots become the baseline when this PR merges.
🔴 Changed snapshots (1)

backends-extended

backend-after-switch

Expected (main) Actual (PR) Diff
expected actual diff
✅ Unchanged snapshots (72)

archived-conversation

  • conversation-panel-with-archived-badges
  • conversation-view-archived
  • conversation-view-sandbox-error

automations

  • automations-delete-modal
  • automations-list-active-inactive
  • automations-no-automations
  • automations-search-no-results

backends-extended

  • backend-add-blank-disabled
  • backend-add-cloud-advanced-open
  • backend-add-cloud-no-key-disabled
  • backend-add-cloud-with-key-enabled
  • backend-add-form-partially-filled
  • backend-add-invalid-url-disabled
  • backend-add-local-ready
  • backend-add-name-only-disabled
  • backend-add-two-column-layout
  • backend-add-whitespace-host-disabled
  • backend-cancel-nothing-saved
  • backend-dropdown-two-backends
  • backend-edit-prefilled
  • backend-manage-after-removal
  • backend-manage-two-listed
  • backend-remove-cancelled
  • backend-remove-confirmation
  • backend-switch-overlay

backends

  • backend-add-modal
  • backend-manage-modal
  • backend-selector-open

changes-tab

  • changes-deleted-file
  • changes-diff-viewer
  • changes-empty

collapsible-thinking

  • reasoning-content-collapsed
  • reasoning-content-expanded
  • think-action-collapsed
  • think-action-expanded

mcp-page

  • mcp-custom-server-1-editor-open
  • mcp-custom-server-2-url-filled
  • mcp-custom-server-3-all-filled
  • mcp-custom-server-4-installed
  • mcp-custom-server-editor
  • mcp-empty-installed
  • mcp-search-filtered
  • mcp-slack-install-1-marketplace
  • mcp-slack-install-2-modal
  • mcp-slack-install-3-filled
  • mcp-slack-install-4-installed

onboarding

  • onboarding-step-0-choose-agent
  • onboarding-step-1-check-backend
  • onboarding-step-2-setup-llm
  • onboarding-step-3-say-hello

projects-workspace-browser

  • projects-workspace-browser

settings-page

  • add-backend-modal
  • analytics-consent-modal
  • home-screen
  • settings-app-page
  • settings-page

settings-secrets

  • secrets-add-form-filled
  • secrets-add-form
  • secrets-after-save
  • secrets-delete-confirm
  • secrets-list

settings-verification

  • condenser-settings
  • verification-settings-off
  • verification-settings-on

sidebar

  • sidebar-collapsed
  • sidebar-conversation-panel
  • sidebar-filter-menu

skills-page

  • skills-empty
  • skills-loaded
  • skills-no-match
  • skills-search-filtered
  • skills-type-filter

Generated by the Snapshot Tests workflow. This comment was created by an AI agent (OpenHands) on behalf of the repo maintainers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants