fix(starter): route /api/models to report-worker instead of orchestrator WS #37

Open
borisnieuwenhuis wants to merge 4 commits into master from vanta-node-starter


@borisnieuwenhuis borisnieuwenhuis commented Feb 18, 2026

The /api/models rewrite in next.config.ts pointed to MODEL_ORCHESTRATOR_URL which is a WebSocket endpoint, causing Upgrade Required errors on REST requests from the frontend.

Routes to the report-worker's /reports/models REST API instead.
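
A minimal sketch of the rewrite rule this change describes, assuming the report-worker base URL comes from an environment variable. The name `REPORT_WORKER_URL` and the helper `buildModelRewrites` are hypothetical; the PR text does not show the actual variable or the exact contents of next.config.ts.

```typescript
// Shape of a Next.js rewrite rule: a source path and the destination URL it proxies to.
type Rewrite = { source: string; destination: string };

// REPORT_WORKER_URL is a hypothetical variable name used for illustration only.
function buildModelRewrites(env: Record<string, string | undefined>): Rewrite[] {
  const base = env.REPORT_WORKER_URL;
  if (!base) throw new Error("REPORT_WORKER_URL is not set");
  // Drop trailing slashes so the joined URL has exactly one separator.
  const trimmed = base.replace(/\/+$/, "");
  // Point /api/models at the report-worker REST endpoint, not the WS server.
  return [{ source: "/api/models", destination: `${trimmed}/reports/models` }];
}
```

In next.config.ts this could be returned from the `rewrites()` function, for example `async rewrites() { return buildModelRewrites(process.env); }`.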

Analysis

Here's what I found after reading both the PR and the full codebase:

Architecture

The model-orchestrator runs two separate servers:

┌───────────────────────┬──────────────────────────┬──────────┬─────────────────────────────────────────────────┐
│ Server                │ Port                     │ Protocol │ Purpose                                         │
├───────────────────────┼──────────────────────────┼──────────┼─────────────────────────────────────────────────┤
│ WebSocket server      │ configurable (e.g. 8000) │ WS       │ Streams model state changes in real time        │
├───────────────────────┼──────────────────────────┼──────────┼─────────────────────────────────────────────────┤
│ Local Deploy REST API │ 8001 (hardcoded)         │ HTTP     │ CRUD for models; only started in local/dev mode │
└───────────────────────┴──────────────────────────┴──────────┴─────────────────────────────────────────────────┘

The next.config.ts rewrite /api/models → MODEL_ORCHESTRATOR_URL/models currently hits the WS
server, which explicitly rejects non-upgrade requests with 426 Upgrade Required.

What the WS server provides

Connect to ws://<host>:<port>/<crunch_id> and you get:

  • init event on connect — all running models for that crunch
  • update event on state change — the changed models
  • Data shape per model:
      { deployment_id, model_id, infos: { model_name, cruncher_id, cruncher_name, ... }, state, ip, port }
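
To make the init/update semantics concrete, here is a rough TypeScript sketch of the payload and of how a client could fold events into local state. The field names come from the data shape above; the value types, the `{ event, data }` envelope, and keying by `deployment_id` are assumptions, since the wire format is not shown here.

```typescript
// Field names follow the data shape listed above; the value types are assumed.
interface ModelState {
  deployment_id: string;
  model_id: string;
  infos: { model_name: string; cruncher_id: string; cruncher_name: string; [key: string]: unknown };
  state: string;
  ip: string;
  port: number;
}

// Assumed envelope: the analysis names the two events but not their exact format.
type ModelEvent =
  | { event: "init"; data: ModelState[] }
  | { event: "update"; data: ModelState[] };

// init replaces the whole snapshot; update overwrites only the models it carries.
function applyModelEvent(
  current: Map<string, ModelState>,
  msg: ModelEvent,
): Map<string, ModelState> {
  const next = msg.event === "init" ? new Map<string, ModelState>() : new Map(current);
  for (const model of msg.data) {
    next.set(model.deployment_id, model); // keying by deployment_id is an assumption
  }
  return next;
}
```

Whether a stopped model should be removed from the map or kept with its new state depends on what the orchestrator actually sends, which this sketch does not try to guess.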
    

What the models table needs

The ModelsTable currently renders: ID, model_name, cruncher_name, status, desired_state (with a
toggle switch), plus log URIs for the logs dialog.

The WS messages are missing several fields the table depends on:

  • desired_state (needed for the running/stopped toggle)
  • statusMessage (shown in tooltip)
  • builder_log_uri / runner_log_uri (needed for log viewer)
  • Granular status (the WS only sends the runner state, not build statuses like BUILDER_BUILDING,
    BUILDER_FAILED, etc.)
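
The gap can be expressed as a small check. This is only an illustrative sketch: the field list mirrors the bullets above (plus `status` for the granular build statuses), and `missingTableFields` is a hypothetical helper, not something in the codebase.

```typescript
// Fields the table relies on that the WS payload does not carry today.
const TABLE_FIELDS = [
  "desired_state",
  "status",
  "statusMessage",
  "builder_log_uri",
  "runner_log_uri",
] as const;

// Returns the table fields absent from a given payload (hypothetical helper).
function missingTableFields(payload: Record<string, unknown>): string[] {
  return TABLE_FIELDS.filter((field) => !(field in payload));
}
```

Run against a payload that has only the current WS fields, this reports all five names; run against a REST-shaped row that includes them, it reports none.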

The crunch_id problem

The WS requires /<crunch_id> in the connection path. The starter app currently has no concept of
which crunch it belongs to — it just calls GET /models and gets everything back.

My recommendation

A pure WS replacement won't work today without backend changes, because the WS messages don't
carry enough data for the table. There are two realistic paths:

Option A — Hybrid (best UX, more work)

  1. Enrich the WS init/update messages in the orchestrator to include desired_state, status,
    statusMessage, and log URIs (same shape as the REST GET /models response)
  2. Create a useModelsWebSocket hook in the frontend that connects to
    ws://<MODEL_ORCHESTRATOR_URL>/<crunch_id>, populates state from init, and merges update events
  3. Keep REST mutations (add/update/delete) routed to the local deploy REST API on port 8001 via
    the Next.js rewrite
  4. Add a NEXT_PUBLIC_CRUNCH_ID env var (or fetch it from a new endpoint) so the WS knows which
    crunch to subscribe to

This gives you instant real-time updates instead of the current 30-second polling, and the table
stays fully functional.
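
For steps 2 and 4, the connection URL could be derived roughly as follows. This is a sketch under assumptions: `buildModelsWsUrl` is a hypothetical helper, and it assumes the orchestrator URL may be configured with an http(s) or ws(s) scheme.

```typescript
// Builds ws://<orchestrator>/<crunch_id> from a base URL and a crunch id.
function buildModelsWsUrl(orchestratorUrl: string, crunchId: string): string {
  if (crunchId.trim() === "") throw new Error("crunch id is required");
  const base = orchestratorUrl
    .replace(/^http(s?):\/\//, "ws$1://") // http -> ws, https -> wss; ws(s) left as-is
    .replace(/\/+$/, ""); // drop trailing slashes before appending the path
  return `${base}/${encodeURIComponent(crunchId)}`;
}
```

In the hook, the crunch id would come from `process.env.NEXT_PUBLIC_CRUNCH_ID` (or the new endpoint), and failing fast on an empty value makes the missing-configuration case obvious instead of silently connecting to the wrong path.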

Option B — Accept PR #37 (quick fix, no real-time)

Route /api/models to the report-worker's REST endpoint. The table keeps polling every 30 seconds
like it does now. Simple, works, but no streaming.
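
For reference, the polling behaviour Option B keeps can be sketched like this. It is not the starter app's actual code; `startPolling` is a hypothetical helper, and only the 30-second interval is taken from the description above.

```typescript
// Calls fetchModels immediately, then every intervalMs; returns a stop function.
function startPolling<T>(
  fetchModels: () => Promise<T>,
  onData: (data: T) => void,
  intervalMs: number = 30000,
): () => void {
  let stopped = false;
  const tick = (): void => {
    fetchModels()
      .then((data) => {
        if (!stopped) onData(data); // ignore responses that arrive after stop()
      })
      .catch((err) => console.error("model poll failed", err));
  };
  tick(); // first load happens right away
  const timer = setInterval(tick, intervalMs);
  return () => {
    stopped = true;
    clearInterval(timer);
  };
}
```

With the rewrite in place, `fetchModels` would be something like `() => fetch("/api/models").then((r) => r.json())`.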

My suggestion

Option A is the right direction, but it requires a small backend change first: enrich the WS
_create_model_state_message in websocket_server.py to include the missing fields (desired_state,
status, statusMessage, log URIs). That's ~15 lines of Python. After that, the frontend hook is
straightforward.

…tor WS

The /api/models rewrite pointed to MODEL_ORCHESTRATOR_URL which is a
WebSocket endpoint, causing 'Upgrade Required' errors on REST requests.
Route to the report-worker's /reports/models REST API instead.

The Next.js container doesn't have Docker access. Proxy SSE log
streams to the report-worker's /logs/{container} endpoint instead.

The model orchestrator serves the full model list (with builder_log_uri
and runner_log_uri) as well as the log proxy endpoints. The report-worker's
model list is still accessible via /reports/models through the catch-all
API rewrite.