Conversation
🦋 Changeset detectedLatest commit: ef258fa The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request establishes a new Model Context Protocol (MCP) server, enabling robust programmatic interaction with the Platforma Desktop application. It introduces a suite of tools that allow external clients, such as AI assistants, to seamlessly manage projects, control block execution, access internal states, retrieve logs, and directly manipulate the desktop user interface. This enhancement significantly boosts the automation capabilities and extensibility of the Platforma Desktop platform. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces a comprehensive MCP server for Platforma Desktop, including a new package pl-mcp-server, extensive integration tests, and detailed implementation documentation. The changes are well-structured and cover a wide range of functionalities from project and block management to UI interaction. My review focuses on improving robustness, correctness in edge cases, and configuration accuracy. I've identified a potential build issue in package.json and a few areas in the server implementation where error handling and Unicode support could be enhanced.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1518 +/- ##
==========================================
- Coverage 54.14% 54.08% -0.06%
==========================================
Files 254 254
Lines 14567 14567
Branches 3028 3028
==========================================
- Hits 7887 7879 -8
- Misses 5663 5676 +13
+ Partials 1017 1012 -5 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
> "implement step 1 and commit"
- New package @milaboratories/pl-mcp-server in lib/node/pl-mcp-server/
- PlMcpServer class: Streamable HTTP transport on localhost:{port}/{secret}/mcp
- Secret path validation (404 for wrong paths), Origin header validation
- Dummy ping tool proving end-to-end MCP connectivity
- Integration test package in tests/mcp-server/ with withMcpServer helper
- Tests: ping tool, tool listing, wrong secret 404, wrong path 404
- Added @modelcontextprotocol/sdk to workspace catalog
…mpat "fix MCP server tests failing due to zod version incompatibility" SDK v1.27.1 requires zod@^3.25 (imports zod/v3 subpath), but workspace pins zod@~3.23.8. Downgraded to v1.22.0 (last version with zod@^3.23.8 as regular dep). Added zod-to-json-schema@~3.24.1 override to prevent transitive dep from resolving to 3.25.x.
…mpat "upgrade zod instead of downgrading MCP SDK" Reverts SDK downgrade from previous commit. Upgrades zod from ~3.23.8 to ~3.25.76 which provides the /v3 subpath export required by MCP SDK v1.27.1. Removes zod-to-json-schema override. Full project build and type check passes (165/165 tasks).
"avoid version mismatch when used in desktop app" Desktop app has its own pl-middle-layer version. Using peer dep ensures the MCP server uses the host's version, avoiding incompatible private property declarations in TypeScript.
"fix type check failure in mcp-server-tests" pnpm doesn't resolve peer deps to workspace packages automatically. Adding pl-middle-layer as both peer dep (for external consumers) and dev dep (workspace:* for monorepo) ensures TypeScript sees a single MiddleLayer type.
"fix session rejection when Claude Code reconnects" Single-transport design bound to one session — reconnecting clients were rejected. Now creates a new McpServer+transport per session, tracked by session ID. Each client gets its own session. Transports are cleaned up on close.
"implement step 3: project management tools for MCP server" Adds 5 project management tools to the MCP server: - list_projects: lists all projects with IDs, labels, status - create_project: creates a new project, returns projectId - open_project: opens a project for editing - close_project: closes an opened project - delete_project: permanently deletes a project Uses resourceIdToString for project IDs consistent with desktop app. All tools use Zod input schema validation via MCP SDK. Integration tests cover full lifecycle and multi-project scenarios.
"wire project lifecycle events to desktop app UI" Added PlMcpServerCallbacks interface with onProjectCreated, onProjectOpened, onProjectClosed, onProjectDeleted hooks. Desktop app wires these to worker events that refresh the project list and navigate the UI when MCP tools modify projects.
"enable AI assistants to see the desktop app screen" Adds captureScreenshot callback to PlMcpServerCallbacks and a capture_screenshot MCP tool that returns the current window as a base64 PNG image. Desktop app wires this via invokeParentMethod → webContents.capturePage().
"implement step 4: block management tools for MCP server" Adds 4 block management tools: - add_block: add a block from registry (from-registry-v2) or local dev (dev-v2) - remove_block: delete a block from a project - run_block: start block execution (auto-starts stale upstreams) - stop_block: stop a running block Uses getOpenedProject helper to resolve projectId → Project. Integration tests cover add/remove lifecycle and tool listing.
"implement step 5: block state read/write tools for MCP server" Adds 3 block state tools: - get_project_overview: returns all blocks with status, canRun, stale, errors - get_block_state: returns block's user data (__data) and outputs - set_block_data: updates block data via mutateBlockStorage (triggers args derivation) Integration tests verify overview retrieval, data roundtrip (set → get).
"enable AI assistants to interact with desktop app UI directly" Adds 5 UI interaction tools via Electron's sendInputEvent and executeJavaScript APIs: - click: click at x,y coordinates (single or double) - type_text: type text into focused element - press_key: press keyboard keys with optional modifiers - scroll: scroll at a given position - execute_js: run JavaScript in renderer for DOM queries
"document coordinate scaling, execute_js pattern, and hybrid testing workflow" Covers the critical coordinate system issue (device pixels vs CSS pixels), the getBoundingClientRect pattern for precise element targeting, and the practical workflow for testing real bioinformatics blocks via UI tools.
"list blocks from configured registries with optional name filter"
"navigate desktop UI to display a specific block"
"simplify Step 6: two-phase wait in one tool"
"add answers to all 5 open questions"
| export type LiveLogHandle = `${typeof LIVE_LOG_PREFIX}${string}`; | ||
|
|
||
| /** Handle of the ready logs of a program. */ | ||
| export type ReadyLogHandle = `log+ready://log/${string}`; | ||
| export type ReadyLogHandle = `${typeof READY_LOG_PREFIX}${string}`; | ||
|
|
| const overview = await project.overview.getValue(); | ||
| if (!overview) continue; |
There was a problem hiding this comment.
are you sure that we don't add some delay?
| import type { PlMcpServerCallbacks } from "../server"; | ||
|
|
||
| export interface AuthorMarker { | ||
| authorId: string; |
| * Returns token estimate, or a string like ">1234 (truncated at 10000 nodes)" | ||
| * if the object graph is too large to fully traverse. | ||
| */ | ||
| export function estimateTokens(value: unknown, nodeLimit = NODE_LIMIT): number | string { |
There was a problem hiding this comment.
why types.ts include some logic?
| * Returns token estimate, or a string like ">1234 (truncated at 10000 nodes)" | ||
| * if the object graph is too large to fully traverse. | ||
| */ | ||
| export function estimateTokens(value: unknown, nodeLimit = NODE_LIMIT): number | string { |
There was a problem hiding this comment.
let's test functions with more or less complex logic
| context: Record<string, unknown>, | ||
| timeout: number, | ||
| ): unknown { | ||
| return runInNewContext(`(${expression})`, context, { |
There was a problem hiding this comment.
can you explain this concern? what do you want to achieve?
There was a problem hiding this comment.
Forgot to move this to the QuickJS VM. Thank you!
Fixed in ef258fa
There was a problem hiding this comment.
Block data could be huge, no HUGE. We don't want to load full json into the context. Thats why we have transform argument. It allows claude inject JS to query exact data.
This https://github.com/milaboratory/platforma/blob/feature/mcp-server/lib/node/pl-mcp-server/src/tools/block-state.ts#L43 will help you.
…Json - Rename isAnyLogHandle → isLogHandle (drop "Any" from new names) - Call getSpec before getShape (getShape triggers join calculation) - Simplify vectorToJson: push pTableValue() directly, no if-else - Add requiresCreatePTable: 2 to test MiddleLayer capabilities
| organization: string, | ||
| name: string, | ||
| version: string, | ||
| ) => Promise<unknown>; |
| login?: string; | ||
| }>; | ||
| /** Disconnect from current server. */ | ||
| disconnect?: () => Promise<void>; |
There was a problem hiding this comment.
you mean onDisconnect ?
| } | ||
|
|
||
| /** Set or update the MiddleLayer instance (e.g. after connecting to a server). */ | ||
| setMiddleLayer(ml: MiddleLayer | null | undefined) { |
There was a problem hiding this comment.
do we really need this method? mb better have readonly ml?
| this.ml = ml ?? null; | ||
| } | ||
|
|
||
| get url(): string { |
…kens logic - Brand LiveLogHandle/ReadyLogHandle with Branded<> (#2) - Add 500ms minimum delay between polls in await_block_done (#3) - Remove duplicate AuthorMarker, import from pl-middle-layer (#4) - Move estimateTokens/summarizeOutputs from types.ts to tokens.ts (#6) - Tests already exist and pass from tokens.ts (#7)
node:vm is not a security sandbox — expressions can escape and access Node APIs. QuickJS (via quickjs-emscripten, already in the monorepo) runs in a WASM sandbox with no access to Node APIs, filesystem, or process. Memory limited to 16MB, execution interrupted via deadline. Data is marshaled via JSON in/out. safeEval is now async (QuickJS init is lazy). Shared runtime across evaluations, fresh context per call. Moved safeEval from types.ts to sandbox.ts.
Summary
lib/node/pl-mcp-server/) with Streamable HTTP transportTest plan
Greptile Summary
This PR introduces a new
@milaboratories/pl-mcp-serverpackage that exposes Platforma Desktop's project/block management capabilities as an MCP (Model Context Protocol) server over Streamable HTTP transport, enabling AI agents to drive the desktop application programmatically. The implementation is well-structured — tools are cleanly separated by domain, the multi-session transport lifecycle is correctly handled (including the race-condition guard added in the latest commit), and the test suite covers the key CRUD/pipeline/state paths with proper setup/teardown.A few issues were found:
data-query.ts— dead pFrame detection condition:lowerPath.endsWith(".pFrame")is applied aftertoLowerCase(), so the uppercaseFensures the condition is alwaysfalse. Harmless today becauselowerPath.includes("pframe")already covers the common case, but the dead branch is misleading.data-query.ts— O(n²)vectorToJsoninquery_table:vectorToJsonis called once per cell (rows × columns) rather than once per column. At the 1000-row limit with a wide table this materialises the full column array for every single cell, which is wasteful and should be fixed before large tables are queried.logs.ts— silent data loss whenshouldUpdateHandleistrue: Live log handles can signal that the handle is stale viashouldUpdateHandle. The current code silently omits the entry from results rather than still decodingresponse.dataor surfacing a staleness hint, making log reads silently return nothing for live handles.data-query.ts— identical ternary branches: Two locations uses.type === "column" ? s.spec.name : s.spec.name, which is a no-op ternary likely left from a copy-paste.types.ts—getOpenedProjecttyped asPromise<any>: This disables TypeScript's checks on all opened-project method calls across the block/state/log/await tools.Confidence Score: 3/5
lib/node/pl-mcp-server/src/tools/data-query.tsandlib/node/pl-mcp-server/src/tools/logs.tsneed the most attention before these tools are used against real data.Important Files Changed
transports.set) and top-level error handler are solid. Port retry logic correctly preserves the request handler across recreated servers.lowerPath.endsWith(".pFrame")never fires because the path is already lowercased; (2)vectorToJsonis called once per cell rather than once per column causing O(n²) work at max 1000-row limit; (3) two ternary branches with identicals.spec.nameexpressions indicate a likely copy-paste error.shouldUpdateHandleistrue, log data available inresponse.datais discarded and the key is silently omitted from results, giving callers no indication that logs exist but the handle is stale.Promise.race+AbortSignal.timeout+setTimeouttriple is slightly redundant but functionally correct.getOpenedProjectis typed asPromise<any>, disabling type-checking across all block/state/log/await tools that consume the returned object.getFreePortcorrectly exported for direct-HTTP tests in server.test.ts.Comments Outside Diff (3)
lib/node/pl-mcp-server/src/server.ts, line 830-832 (link)delete_projectusesentry.idinstead ofentry.ridAll other project operations (
open_project,close_project) callml.openProject(entry.rid)/ml.closeProject(entry.rid)using theResourceId. Onlydelete_projectcallsml.deleteProject(entry.id). Ifentry.idis a different type (e.g., a plain string database identifier rather than theResourceId), this will pass the wrong argument todeleteProjectand likely fail at runtime or silently delete nothing. Compare with the pattern used everywhere else:Prompt To Fix With AI
lib/node/pl-mcp-server/src/server.ts, line 971-973 (link)refreshStatebefore awaiting overviewlist_projectsandresolveProjectboth callml.projectList.refreshState()before awaiting the value, ensuring fresh data.get_project_overviewdirectly callsproject.overview.awaitStableValue()without first triggering a refresh. This may return a stale cached value, especially immediately after mutations (e.g.,set_block_data).Consider adding a refresh step:
Prompt To Fix With AI
lib/node/pl-mcp-server/src/server.ts, line 1009-1060 (link)await_block_donedoes not handle all terminal error statesThe polling loop only checks for
"Done"and"Limbo"as terminal states. If there are other terminal/error states in the Platforma calculation status enum (e.g.,"RunningError","Failed", or any future additions), the loop will keep polling until the fulltimeoutelapses rather than returning immediately with the error condition.This is a usability issue — an AI agent waiting for a failed block will silently wait the full timeout (default 2 minutes) instead of being told immediately that the block failed. Consider exhaustively listing all non-
Runningstates that should short-circuit the loop, or inverting the check to only continue looping on explicitly"Running"/"NotCalculated"states.Prompt To Fix With AI
Prompt To Fix All With AI
Reviews (2): Last reviewed commit: "fix: address PR review comments" | Re-trigger Greptile