Improve Codex Manipulation Of Notebooks #154

@jlewi

Some of the issues I've observed

  • Codex ends up adding to or modifying the wrong notebook

    • I think this happens if you switch tabs
    • I think our tool calls might implicitly assume you are working with the current notebook, so if you switch tabs while Codex is working it messes things up
  • Codex gets cell syntax wrong

    • I think there are enums for cellType that it frequently gets wrong (uses an int instead of the string, or vice versa)
    • This makes modifications very slow because it takes multiple attempts to get it right
  • Codex does a poor job searching Google Drive
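
To make the first two issues concrete, here is a minimal sketch of a tool call that names its target notebook explicitly and accepts cellType as either a string or a number. Everything in it is hypothetical: `AppendCellRequest`, `notebookUri`, `normalizeCellType`, the `"markup"`/`"code"` names and the numeric mapping are placeholders, not the existing tool call schema.

```typescript
// Hypothetical names throughout; none of this is the current tool call schema.
type CellType = "markup" | "code";

// Assumed numeric encoding, for illustration only.
const CELL_TYPE_BY_NUMBER: Record<number, CellType> = { 1: "markup", 2: "code" };

// Accept either the string or the numeric form so that one wrong guess
// by the agent does not cost a retry.
function normalizeCellType(value: string | number): CellType {
  if (typeof value === "number") {
    const mapped = CELL_TYPE_BY_NUMBER[value];
    if (mapped === undefined) throw new Error(`unknown cellType number: ${value}`);
    return mapped;
  }
  const lowered = value.trim().toLowerCase();
  if (lowered === "markup") return "markup";
  if (lowered === "code") return "code";
  // Also accept numeric strings such as "2".
  if (/^\d+$/.test(lowered)) return normalizeCellType(Number(lowered));
  throw new Error(`unknown cellType: ${value}`);
}

interface Cell {
  cellType: CellType;
  source: string;
}

interface AppendCellRequest {
  notebookUri: string; // explicit target, never "whichever tab is active"
  cellType: string | number;
  source: string;
}

// Appends a cell to the notebook named in the request and fails loudly
// if that notebook is not open, instead of falling back to the current tab.
function appendCell(notebooks: Map<string, Cell[]>, req: AppendCellRequest): Cell {
  const cells = notebooks.get(req.notebookUri);
  if (cells === undefined) throw new Error(`notebook not open: ${req.notebookUri}`);
  const cell: Cell = { cellType: normalizeCellType(req.cellType), source: req.source };
  cells.push(cell);
  return cell;
}
```

Because the request carries `notebookUri`, switching tabs mid-run would not change which notebook gets modified.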

I think we'd like to move away from multiple tool calls and have a single tool call that lets Codex execute code. We could then build out suitable libraries for that code to call. These libraries could also be used by users.
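
A rough sketch of what that single tool call could look like: one `executeCode` entry point that runs agent-written code with a library object in scope. `NotebookLibrary`, `executeCode` and `ExecuteResult` are made-up names for illustration, and the in-memory `Map` stands in for real notebook storage. Note that `new Function` is only used here to show the shape of the call; it provides no isolation by itself.

```typescript
// Hypothetical library surface; the real one would need its own design.
interface NotebookCell {
  cellType: "markup" | "code";
  source: string;
}

class NotebookLibrary {
  // In-memory stand-in for the notebooks that are currently open.
  private readonly notebooks = new Map<string, NotebookCell[]>();

  open(uri: string): void {
    if (!this.notebooks.has(uri)) this.notebooks.set(uri, []);
  }

  list(): string[] {
    return Array.from(this.notebooks.keys());
  }

  cells(uri: string): readonly NotebookCell[] {
    return this.mustGet(uri);
  }

  // Returns the index of the newly appended cell.
  appendCell(uri: string, cell: NotebookCell): number {
    return this.mustGet(uri).push(cell) - 1;
  }

  private mustGet(uri: string): NotebookCell[] {
    const cells = this.notebooks.get(uri);
    if (cells === undefined) throw new Error(`notebook not open: ${uri}`);
    return cells;
  }
}

interface ExecuteResult {
  ok: boolean;
  value?: unknown;
  error?: string;
}

// The single tool call: run agent-written code with the library bound to
// the name `notebooks`. Errors are returned to the agent, not thrown.
// NOTE: `new Function` is NOT a sandbox; a real implementation would run
// the code somewhere isolated.
function executeCode(source: string, notebooks: NotebookLibrary): ExecuteResult {
  try {
    const fn = new Function("notebooks", `"use strict";\n${source}`);
    return { ok: true, value: fn(notebooks) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

The agent would then send one snippet per step, for example `notebooks.appendCell("nb1", { cellType: "code", source: "echo hi" });`, and users could call the same `NotebookLibrary` methods directly.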

How could we build a suitable sandbox to safely execute agentic code?

  • Google Drive - restrict to readonly access via scopes
  • Data exfiltration - Block network access except to Google
  • Allow read/write to open notebooks - these are just reads/writes to IndexedDB
    • We could always make them undoable
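
The restrictions above could be sketched roughly as follows. `drive.readonly` is the actual Google Drive read-only OAuth scope; the host allowlist, `isAllowedRequest` and `UndoableStore` are assumptions for illustration, and the `Map` stands in for IndexedDB.

```typescript
// Google Drive's read-only OAuth scope.
const DRIVE_READONLY_SCOPE = "https://www.googleapis.com/auth/drive.readonly";

// Hypothetical allowlist; which Google hosts to permit is an open question.
const ALLOWED_HOST_SUFFIXES = ["googleapis.com", "google.com"];

// Returns true only for HTTPS requests to an allowlisted host or subdomain.
function isAllowedRequest(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // unparseable URLs are blocked
  }
  if (url.protocol !== "https:") return false;
  const host = url.hostname.toLowerCase();
  return ALLOWED_HOST_SUFFIXES.some((s) => host === s || host.endsWith("." + s));
}

// Key-value store that records the previous value of every write so that
// agent edits can be rolled back. A Map stands in for IndexedDB here.
class UndoableStore<V> {
  private readonly data = new Map<string, V>();
  private readonly undoLog: Array<{ key: string; had: boolean; previous: V | undefined }> = [];

  get(key: string): V | undefined {
    return this.data.get(key);
  }

  put(key: string, value: V): void {
    this.undoLog.push({ key, had: this.data.has(key), previous: this.data.get(key) });
    this.data.set(key, value);
  }

  // Reverts the most recent write; returns false if there is nothing to undo.
  undo(): boolean {
    const entry = this.undoLog.pop();
    if (entry === undefined) return false;
    if (entry.had) this.data.set(entry.key, entry.previous as V);
    else this.data.delete(entry.key);
    return true;
  }
}
```

Matching on a dot-prefixed suffix rather than a plain substring keeps hosts such as `evilgoogle.com` or `google.com.example.org` off the allowlist.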

https://github.com/runmedev/web/blob/main/docs-dev/design/0310_appkernel_sandbox.md
