
Wire ExecuteCode code mode in ChatKit runtime #160

Open
jlewi wants to merge 10 commits into main from codex/dev/jlewi/codemode


jlewi commented Apr 1, 2026

Summary

  • wire code-mode tool execution into the ChatKit runtime path
  • add a dedicated codeModeExecutor for ExecuteCode tool calls
  • update responsesDirectChatkitFetch to advertise and handle ExecuteCode function calls
  • integrate the execution flow in ChatKitPanel
  • add and extend runtime tests for ExecuteCode schema and tool-call chaining behavior

Why

  • enables ChatKit/web to execute code-mode tool calls generated by the model
  • keeps function-call handling explicit and test-covered in the direct responses runtime

Validation

  • pnpm exec vitest run src/lib/runtime/responsesDirectChatkitFetch.test.ts src/lib/runtime/codeModeExecutor.test.ts
  • pnpm exec vitest run src/components/ChatKit/ChatKitPanel.test.tsx

@jlewi jlewi marked this pull request as ready for review April 3, 2026 22:39
@jlewi jlewi requested a review from sourishkrout as a code owner April 3, 2026 22:39

jlewi commented Apr 3, 2026

This should be mergeable.
We should merge https://github.com/runmedev/runme#1095 at the same time; otherwise Codex will stop working.
That PR is blocked on upgrading Go in CI.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f108d70e92

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +163 to +167
  await Promise.race([
    kernelRun,
    new Promise<void>((_resolve, reject) => {
      timer = setTimeout(() => {
        reject(new Error(`ExecuteCode timed out after ${timeoutMs}ms`))

P1: Terminate timed-out ExecuteCode runs

This timeout branch only rejects Promise.race and never stops the underlying kernelRun, so code can keep executing after the tool already returned a timeout error. Any script that outlives timeoutMs can still perform notebook mutations in the background, which makes the conversation state diverge from the reported failure.
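
One way to address this is to pair the timeout with a cancellation signal, so that the run is told to stop at the moment the timeout is reported. The following is a minimal sketch, not the PR's implementation: it assumes the kernel run can accept an `AbortSignal` and checks it cooperatively. `KernelRun` and `runWithTimeout` are hypothetical names introduced here for illustration.

```typescript
// Hypothetical shape: a kernel run that observes an AbortSignal and stops
// doing work once the signal fires. The real kernelRun may differ.
type KernelRun = (signal: AbortSignal) => Promise<void>;

async function runWithTimeout(run: KernelRun, timeoutMs: number): Promise<void> {
  const controller = new AbortController();

  // Rejects as soon as the controller is aborted, so the same event both
  // cancels the run and produces the timeout error.
  const timedOut = new Promise<never>((_resolve, reject) => {
    controller.signal.addEventListener(
      "abort",
      () => reject(new Error(`ExecuteCode timed out after ${timeoutMs}ms`)),
      { once: true },
    );
  });

  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    await Promise.race([run(controller.signal), timedOut]);
  } finally {
    // Always clear the timer so a fast run does not leave it pending.
    clearTimeout(timer);
  }
}
```

Cancellation here is cooperative: a script that never checks the signal would still keep running, so a kernel that executes arbitrary user code may need a harder stop (for example, terminating the worker that hosts it).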


Comment on lines +269 to +271
  clientError.length > 0
    ? `Tool execution failed: ${clientError}`
    : toOutputString(outputValue)

P2: Preserve tool output when clientError is set

When clientError is present, this code replaces the tool result with a generic failure string and drops result.output. The new ExecuteCode flow returns partial stdout/stderr on failures (for example, timeout after initial logs), so discarding output here hides useful diagnostics from the follow-up model turn.
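
A possible fix is to append whatever output exists to the failure message instead of replacing it. The sketch below is an illustration under assumptions: `ToolResult` and `formatToolOutput` are hypothetical names, and this `toOutputString` is a simple stand-in for the helper referenced in the diff, whose real behavior is not shown here.

```typescript
// Hypothetical result shape; the real fields in the PR may differ.
interface ToolResult {
  clientError: string;
  output: unknown;
}

// Stand-in for the PR's toOutputString helper (actual implementation unknown).
function toOutputString(value: unknown): string {
  if (typeof value === "string") return value;
  if (value === undefined || value === null) return "";
  return JSON.stringify(value);
}

// Keeps partial stdout/stderr alongside the error so the follow-up model
// turn can still see the diagnostics.
function formatToolOutput(result: ToolResult): string {
  const output = toOutputString(result.output);
  if (result.clientError.length === 0) return output;

  const failure = `Tool execution failed: ${result.clientError}`;
  return output.length > 0
    ? `${failure}\n\nPartial output:\n${output}`
    : failure;
}
```

With this shape, a timeout that occurs after some initial logs would report both the failure and the logs, while a failure with no output stays a single-line error.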

