27 changes: 16 additions & 11 deletions docs/plan.md
@@ -52,7 +52,9 @@ Schedule this weakness for the next A/B test improvement iteration.

**Boost initial prompt:** The `initial prompt` has the biggest impact on the generated plan; if it's bad, then the final plan is bad.
If it's well written and concise, there is a higher chance of a realistic/feasible plan.
Using the current PlanExe MCP interface, the initial prompts are of high quality: the LLM looks at reference prompts showing what a well-formatted plan looks like, and asks the user follow-up questions to gather sufficient details.
However, when users enter the prompt manually via the UI, the initial prompts are of low quality.
Before MCP, I used AIs to write the initial prompt for me by first having a long conversation about the topic,
and showing examples of other initial prompts that have worked well in the past.
Sometimes a small tweak to the initial prompt yields a better plan;
sometimes an entire rewrite of the initial prompt is needed.
@@ -93,20 +95,28 @@ Obtain info about what resources the user has available, and if they are willing

# Secondary issues

## MCP - Multiple API keys

As of 2026-mar-02, the user can only have one API key.
PlanExe is now integrated in quite a few places, and
I want stats on which API key is being used and what it costs.
Resetting the API key breaks all of those integrations at once.


## MCP - Polishing of MCP flow via planexe.org

As of 2026-mar-02, I'm focusing on improving MCP.
PlanExe is already working with OpenClaw.
But it's not as smooth as I would like.

The user adds credits here. Start with 5 USD, so you can create around 3 plans.
[https://home.planexe.org/](https://home.planexe.org/)

The agents use the API here. When AI agents connect to the MCP interface, credits are consumed: between 1 and 2 USD per plan creation.
[https://mcp.planexe.org/mcp](https://mcp.planexe.org/mcp)


## MCP - BYOK

Doing inference in the cloud costs money.
Users can BYOK (bring your own key) and choose which models they want to use.
@@ -128,11 +138,6 @@ Get step-by-step debugging working again.
Now that I have switched to Docker, I have multiple Python projects in the same repo that use different, incompatible packages.


## Table of content

Currently the generated report has expandable/collapsible sections. There is an overwhelming amount of content inside each section.
165 changes: 157 additions & 8 deletions mcp_cloud/AGENTS.md
@@ -4,11 +4,58 @@ Scope: Model Context Protocol (MCP) server that provides a standardized interface
for AI agents and developer tools to interact with PlanExe. Communicates with
`worker_plan_database` via the shared Postgres database (`database_api` models).

## Module map

| File | Purpose | Key exports |
|------|---------|-------------|
| `app.py` | Re-export facade for backward compatibility | 60+ symbols from all sub-modules |
| `http_server.py` | FastAPI HTTP wrapper: routes, middleware stack, FastMCP mount, SSE endpoint | `app` (FastAPI instance) |
| `handlers.py` | MCP tool handlers dispatched by `TOOL_HANDLERS` dict | `handle_plan_create`, `handle_plan_status`, etc. |
| `db_setup.py` | Flask + SQLAlchemy init, constants, request DTOs | `app` (Flask), `db`, `mcp_cloud_server`, `PLANEXE_SERVER_INSTRUCTIONS` |
| `db_queries.py` | Sync DB queries (run via `asyncio.to_thread()`) | `_create_plan_sync`, `_get_plan_status_snapshot_sync`, `get_plan_state_mapping` |
| `tool_models.py` | Pydantic input/output models for all 9 tools | `PlanCreateInput`, `PlanCreateOutput`, `PlanStatusOutput`, etc. |
| `schemas.py` | Auto-generate JSON Schema from tool_models via `.model_json_schema()` | `TOOL_DEFINITIONS` (list of `ToolDefinition` dataclasses) |
| `auth.py` | API key SHA-256 hashing + user resolution | `_hash_user_api_key`, `_resolve_user_from_api_key` |
| `download_tokens.py` | Signed HMAC-SHA256 download tokens + URL builders | `generate_download_token`, `validate_download_token`, `_get_download_base_url` |
| `sse.py` | SSE stream generator with connection tracking | `plan_progress_stream`, `_track_sse_connection`, `SSEConnectionLimitError` |
| `worker_fetchers.py` | Fetch artifacts (report, zip) from worker HTTP / local FS / DB | `fetch_artifact_from_worker_plan`, `fetch_user_downloadable_zip` |
| `model_profiles.py` | Load LLM models per profile from `llm_config/*.json` | `_get_model_profiles_sync` |
| `example_prompts.py` | Load example prompts from catalog or built-in fallbacks | `_load_mcp_example_prompts` |
| `zip_utils.py` | Zip extraction, legacy sanitization, SHA-256 | `list_files_from_zip_snapshot`, `_sanitize_legacy_zip_snapshot` |
| `http_utils.py` | Strip redundant `content` when `structuredContent` exists | `strip_redundant_content` |
| `dotenv_utils.py` | Load `.env` from mcp_cloud/ or repo root | `load_planexe_dotenv` |
| `config.py` | Flask config (`SQLALCHEMY_TRACK_MODIFICATIONS = False`) | — |

## Import graph

```
app.py (facade — re-exports everything)
├── db_setup.py (Flask, DB, constants)
├── auth.py (key hashing, user lookup)
├── db_queries.py (sync queries; uses db_setup.app context)
├── zip_utils.py (zip ops; uses db_setup.app context)
├── worker_fetchers.py (HTTP + local + DB fallbacks; uses zip_utils)
├── model_profiles.py (profile loading from llm_config/)
├── download_tokens.py (HMAC tokens, context var for base URL)
├── example_prompts.py (catalog loading)
├── schemas.py (auto-generated from tool_models.py)
└── handlers.py (tool handlers; imports all above)

http_server.py (top-level entry point)
├── app.py (re-export facade)
├── tool_models.py (type annotations for FastAPI)
├── auth.py (validate_api_key_secret)
├── download_tokens.py (validate_download_token_secret)
├── sse.py (SSE implementation)
└── http_utils.py (content stripping)
```

## Guidelines
- Keep database access wired through `database_api.planexe_db_singleton.db`;
do not create new engine/session instances here.
- Preserve the startup sequence in `mcp_cloud/db_setup.py`:
`.env` loading, logging setup, Flask app config, then `db.init_app(app)`.
`app.py` is a thin re-export facade; actual init happens in `db_setup.py`.
- Maintain the DB connection logic:
- Prefer `SQLALCHEMY_DATABASE_URI` when set.
- Otherwise build from `PLANEXE_POSTGRES_*` (see root `AGENTS.md` for keys).
@@ -31,6 +78,43 @@ for AI agents and developer tools to interact with PlanExe. Communicates with
- Forbidden imports: `worker_plan.app`, `worker_plan_internal`, `frontend_*`,
`open_dir_server`.

## Async/sync boundary

All handlers in `handlers.py` are `async`. All database operations in `db_queries.py`
are sync (require Flask app context). Bridge pattern:

```python
result = await asyncio.to_thread(_sync_db_function, args)
```

`db_queries.py` functions check `has_app_context()` and open one if needed, so they
work from both sync and async callers.
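A minimal sketch of this bridge, with a hypothetical query function standing in for the real `db_queries.py` code (the app-context handling is summarized in a comment):

```python
import asyncio

def _get_plan_row_sync(plan_id: str) -> dict:
    # Stand-in for a db_queries.py function. The real code checks
    # flask.has_app_context() and opens app.app_context() when missing
    # before touching the shared SQLAlchemy session.
    return {"plan_id": plan_id, "state": "running"}

async def handle_plan_status(plan_id: str) -> dict:
    # The async handler never blocks the event loop on DB work:
    # the sync query runs in a worker thread.
    return await asyncio.to_thread(_get_plan_row_sync, plan_id)

print(asyncio.run(handle_plan_status("plan-1")))
# → {'plan_id': 'plan-1', 'state': 'running'}
```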

## Context variables

Two `contextvars.ContextVar` instances are set per-request by the middleware in
`http_server.py` and cleared in the `finally` block:

- `_download_base_url_ctx` (in `download_tokens.py`): Base URL for building download
and SSE URLs. Set from request origin for `/mcp` and `/sse/` paths. Read by
`_get_download_base_url()` — used in `handlers.py` to build `download_url` and
`sse_url` fields.
- `_authenticated_user_api_key_ctx` (in `http_server.py`): Authenticated user's raw
API key. Injected into `plan_create` and `plan_list` arguments by FastMCP wrappers.
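The set-then-reset lifecycle can be sketched as follows; the context var name matches the docs above, but the request-handling function is a simplified stand-in for the middleware:

```python
import contextvars

# Per-request base URL, as in download_tokens.py; default None when
# no request is in flight.
_download_base_url_ctx = contextvars.ContextVar(
    "_download_base_url_ctx", default=None
)

def build_sse_url(plan_id: str):
    base = _download_base_url_ctx.get()
    return f"{base}/sse/plan/{plan_id}" if base else None

def handle_request(origin: str, plan_id: str):
    # Simplified middleware: set the context var for this request,
    # then clear it in the finally block.
    token = _download_base_url_ctx.set(origin)
    try:
        return build_sse_url(plan_id)
    finally:
        _download_base_url_ctx.reset(token)

print(handle_request("https://mcp.planexe.org", "plan-1"))
# → https://mcp.planexe.org/sse/plan/plan-1
print(build_sse_url("plan-1"))  # outside a request → None
```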

## Schema auto-generation

`schemas.py` generates JSON Schema from `tool_models.py` Pydantic classes:

1. Define input/output models in `tool_models.py` using `BaseModel` + `Field()`
2. `schemas.py` calls `.model_json_schema()` on each class
3. Wraps output schemas in `oneOf` for error/success variants
4. Packages into `ToolDefinition` dataclasses in `TOOL_DEFINITIONS` list
5. `handlers.py` returns `TOOL_DEFINITIONS` to MCP clients via `handle_list_tools()`

When adding a field to a tool response: update `tool_models.py` model → schemas
auto-update → no manual schema changes needed.
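The flow can be sketched with Pydantic v2; the model classes and their fields here are illustrative stand-ins for the real definitions in `tool_models.py`:

```python
from pydantic import BaseModel, Field

# Illustrative models; the real ones live in tool_models.py.
class PlanStatusOutput(BaseModel):
    plan_id: str = Field(description="Plan identifier")
    state: str = Field(description="Current plan state")

class ToolError(BaseModel):
    error: str = Field(description="Human-readable error message")

# Steps 2-3: derive JSON Schema per model, then wrap the success and
# error variants in a oneOf, as schemas.py does for output schemas.
output_schema = {
    "oneOf": [
        PlanStatusOutput.model_json_schema(),
        ToolError.model_json_schema(),
    ]
}

assert "plan_id" in output_schema["oneOf"][0]["properties"]
assert "error" in output_schema["oneOf"][1]["properties"]
```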

## plan_create contract
- Expose `model_profiles` as the discovery tool for profile selection.
- `model_profiles` must report profile guidance and currently available models after class whitelist filtering.
@@ -41,6 +125,24 @@ for AI agents and developer tools to interact with PlanExe. Communicates with
- `model_profile` (`baseline`, `premium`, `frontier`, `custom`)
- `user_api_key` (optional)

## HTTP middleware stack

The middleware in `http_server.py` processes requests in this order:

1. **`_NormalizeMcpPath`** (ASGI middleware): Rewrites `/mcp` → `/mcp/` at scope level
(avoids 307 redirect that breaks Smithery)
2. **`CORSMiddleware`** (FastAPI built-in): Added first, handles OPTIONS preflight
3. **`enforce_api_key`** (HTTP middleware via `BaseHTTPMiddleware`): Auth, body size,
rate limiting, context var setup
- Paths requiring auth: `/mcp`, `/download`
- **`/sse/` is intentionally excluded** — `BaseHTTPMiddleware` pipes the response
body through an internal `anyio.MemoryObjectStream`; for long-lived SSE streams
this keeps the middleware's task-group alive indefinitely and starves concurrent
requests. The SSE endpoint handles auth inline instead.
- Download tokens are self-authenticating (signed HMAC, no API key needed)
- Sets `_download_base_url_ctx` for `/mcp` paths
- Strips redundant `content` from `/mcp` JSON responses on the way out

## MCP Protocol
- The server communicates over stdio (standard input/output) following the MCP protocol.
- Tools are registered via `@mcp_cloud.list_tools()` and handled via `@mcp_cloud.call_tool()`.
@@ -52,11 +154,13 @@ for AI agents and developer tools to interact with PlanExe. Communicates with
## Authentication Policy
- PlanExe MCP cloud authentication is API-key header based.
- Canonical client header is `X-API-Key: pex_...`.
- API key sources (checked in order): `Authorization: Bearer {key}`, `X-API-Key` header, `api_key` query param.
- Validation order: `PLANEXE_MCP_API_KEY` (shared secret) → `UserApiKey` table lookup (SHA-256 hash).
- OAuth is not supported for the MCP API. Do not document, imply, or advertise OAuth support.
- In docs and user-facing error/help text, instruct clients to use `X-API-Key` custom headers.
- Keep the auth split used for connector health checks:
- Unauthenticated discovery/handshake is allowed for:
- MCP methods: `initialize`, `notifications/initialized`, `tools/list`, `prompts/list`, `prompts/get`, `resources/list`, `resources/templates/list`, `ping`
- Probe compatibility: `GET/HEAD/POST /mcp`, `GET/HEAD /mcp/`, and `GET /mcp/tools`
- `tools/call` without API key is allowed **only** for free setup tools:
- `example_plans`
@@ -65,17 +169,57 @@ for AI agents and developer tools to interact with PlanExe. Communicates with
- All other tool invocations (for example `plan_create`) must remain API-key protected.
- Keep auth-denial logging explicit (`Auth rejected: ...`) with method/path/user-agent and parsed JSON-RPC methods to make Railway debugging easier.
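The documented source order can be sketched as a small lookup helper; the header and parameter names come from the list above, while the function itself is hypothetical:

```python
def extract_api_key(headers: dict, query_params: dict):
    """Return the API key using the documented precedence:
    Authorization: Bearer, then X-API-Key, then ?api_key=."""
    auth = headers.get("authorization", "")
    if auth.lower().startswith("bearer "):
        return auth[len("Bearer "):].strip()
    if headers.get("x-api-key"):
        return headers["x-api-key"]
    return query_params.get("api_key")

assert extract_api_key({"authorization": "Bearer pex_abc"}, {}) == "pex_abc"
assert extract_api_key({"x-api-key": "pex_def"}, {}) == "pex_def"
assert extract_api_key({}, {"api_key": "pex_ghi"}) == "pex_ghi"
```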

## SSE endpoint

`GET /sse/plan/{plan_id}` streams real-time plan progress as Server-Sent Events.
This is a complementary REST endpoint, **not** an MCP transport replacement.

Implementation in `sse.py`:
- `plan_progress_stream(plan_id, disconnect_event)`: Async generator that polls
`_get_plan_status_snapshot_sync()` every ~3s and yields SSE-formatted events.
- Deduplication: only emits `status` events when `state` or `progress_percentage` changes.
- Heartbeat every ~20s of silence (keeps reverse proxies alive).
- `complete` event on terminal state (completed/failed), then generator closes.
- 60-minute absolute timeout.
- Connection tracking via `_track_sse_connection(client_id)` async context manager:
per-client limit (5) and server-wide limit (200). Raises `SSEConnectionLimitError` (HTTP 429).

Auth: `/sse/` paths require API key (handled inline in the endpoint, NOT via the
`enforce_api_key` middleware — see HTTP middleware stack section for rationale).
URL building: `handlers.py` adds `sse_url` to `plan_create` and `plan_status` responses
using `_get_download_base_url()` from `download_tokens.py`.

Event types: `status`, `heartbeat`, `complete`, `error`.

Config env vars: `PLANEXE_SSE_POLL_INTERVAL`, `PLANEXE_SSE_HEARTBEAT_INTERVAL`,
`PLANEXE_SSE_MAX_DURATION`, `PLANEXE_SSE_MAX_CONNECTIONS`, `PLANEXE_SSE_MAX_TOTAL_CONNECTIONS`.
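The deduplication and terminal-state rules can be sketched with a synchronous generator over pre-fetched snapshots (the real generator polls the DB asynchronously with sleeps and heartbeats):

```python
def sse_events(snapshots):
    """Yield SSE-formatted events from a sequence of status snapshots,
    applying the documented deduplication and terminal-state rules."""
    last = None
    for snap in snapshots:
        key = (snap["state"], snap["progress_percentage"])
        if key != last:
            # Only emit status when state or progress actually changed.
            last = key
            yield f"event: status\ndata: {snap['state']} {snap['progress_percentage']}%\n\n"
        if snap["state"] in ("completed", "failed"):
            # Terminal state: emit complete, then close the stream.
            yield f"event: complete\ndata: {snap['state']}\n\n"
            return

events = list(sse_events([
    {"state": "running", "progress_percentage": 10},
    {"state": "running", "progress_percentage": 10},   # duplicate: skipped
    {"state": "running", "progress_percentage": 40},
    {"state": "completed", "progress_percentage": 100},
]))
assert len(events) == 4 and events[-1].startswith("event: complete")
```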

## Download token system

`download_tokens.py` generates signed, time-limited HMAC-SHA256 tokens scoped to
a specific plan + artifact. Token format: `{expiry_unix_ts}.{hmac_hex}`.

- HMAC message: `plan_id:filename:expiry` — prevents token reuse across artifacts.
- Secret priority: `PLANEXE_DOWNLOAD_TOKEN_SECRET` → `PLANEXE_API_KEY_SECRET` → per-process random (dev only).
- Default TTL: 900 seconds (15 minutes), configurable via `PLANEXE_DOWNLOAD_TOKEN_TTL`.
- Tokens are stateless (no server-side storage or revocation tracking).

The same `_get_download_base_url()` function is used to build both `download_url`
(in `plan_file_info`) and `sse_url` (in `plan_create`/`plan_status`).
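A minimal sketch of the token scheme, following the documented format and HMAC message; the secret literal is a placeholder for the env-derived secret and the function bodies are assumptions, not the actual implementation:

```python
import hashlib
import hmac
import time

# Placeholder; real code derives this from PLANEXE_DOWNLOAD_TOKEN_SECRET
# (falling back to PLANEXE_API_KEY_SECRET).
SECRET = b"dev-secret"

def generate_download_token(plan_id: str, filename: str, ttl: int = 900) -> str:
    expiry = int(time.time()) + ttl
    msg = f"{plan_id}:{filename}:{expiry}".encode()
    sig = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return f"{expiry}.{sig}"  # {expiry_unix_ts}.{hmac_hex}

def validate_download_token(token: str, plan_id: str, filename: str) -> bool:
    try:
        expiry_str, sig = token.split(".", 1)
        expiry = int(expiry_str)
    except ValueError:
        return False
    if expiry < time.time():
        return False  # expired; tokens are stateless, no revocation list
    msg = f"{plan_id}:{filename}:{expiry}".encode()
    expected = hmac.new(SECRET, msg, hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)

token = generate_download_token("plan-1", "report.html")
assert validate_download_token(token, "plan-1", "report.html")
assert not validate_download_token(token, "plan-2", "report.html")  # wrong plan
```

Binding the HMAC message to `plan_id:filename:expiry` is what prevents a token issued for one artifact from being replayed against another.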

## Download URL environment behavior
- `plan_file_info.download_url` and `sse_url` are built from `PLANEXE_MCP_PUBLIC_BASE_URL` when set.
- If `PLANEXE_MCP_PUBLIC_BASE_URL` is unset in HTTP mode, use request host/scheme
(via `_download_base_url_ctx` context var set in middleware).
- If no public base URL is available, `download_url` and `sse_url` may be absent;
document this and guide operators to set `PLANEXE_MCP_PUBLIC_BASE_URL`.
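The precedence above can be sketched as a small resolver; the function name is hypothetical, and the per-request base stands in for the value read from `_download_base_url_ctx`:

```python
import os

def resolve_base_url(request_base_url):
    """Sketch of the documented precedence for download/SSE URL bases:
    PLANEXE_MCP_PUBLIC_BASE_URL wins, then the per-request base, else
    None (download_url/sse_url omitted from responses)."""
    return os.environ.get("PLANEXE_MCP_PUBLIC_BASE_URL") or request_base_url

os.environ.pop("PLANEXE_MCP_PUBLIC_BASE_URL", None)
assert resolve_base_url("https://mcp.planexe.org") == "https://mcp.planexe.org"
assert resolve_base_url(None) is None  # operators should set the env var

os.environ["PLANEXE_MCP_PUBLIC_BASE_URL"] = "https://public.example.org"
assert resolve_base_url("https://mcp.planexe.org") == "https://public.example.org"
```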

## HTTP Compatibility and Crawler Endpoints
- Keep `/mcp` -> `/mcp/` redirect behavior for slashless clients/probers.
- Keep CORS headers on early error responses (401/403/429/etc.) so browser inspectors do not fail with opaque CORS errors.
- Keep `PLANEXE_MCP_CORS_ORIGINS` parsing tolerant to quoted CSV and JSON-array env formats.
- Keep `GET /robots.txt` available (200) for crawler health checks and metadata discovery.
- FastMCP session lifecycle lines like `Terminating session: None` are expected informational logs; do not treat them as application failures solely based on Railway’s log-level labeling.


## mcp_local integration
- `mcp_local` runs on the user's machine and forwards tool calls to this server over HTTP.
@@ -128,8 +272,13 @@ for AI agents and developer tools to interact with PlanExe. Communicates with

## Testing
- Automated tests exist under `mcp_cloud/tests/`.
- Tests require Docker dependencies (flask_sqlalchemy, mcp, database_api) — they
run inside the Docker container, not from the host Python directly.
- If you change MCP tool behavior, state mapping, or tool surface, update/add unit
tests close to the changed logic.
- Test pattern: `unittest.TestCase` + `unittest.mock.patch` on `mcp_cloud.handlers.*`
and `mcp_cloud.sse.*` to mock DB calls without a real database.
- Run focused tests from repo root, for example:
- `python -m unittest mcp_cloud.tests.test_tool_surface_consistency`
- `python -m unittest mcp_cloud.tests.test_plan_status_tool`
- `python -m pytest mcp_cloud/tests/test_sse.py -v`
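The patch-based pattern can be sketched with local stand-ins; the real tests patch `mcp_cloud.handlers.*` and `mcp_cloud.sse.*` by dotted path, while this self-contained version uses `patch.object` on a hypothetical query holder:

```python
import asyncio
import unittest
from unittest import mock

class FakeQueries:
    """Local stand-in for db_queries; real tests patch the dotted
    mcp_cloud.* paths instead."""
    @staticmethod
    def get_snapshot(plan_id):
        raise RuntimeError("would need a real database")

async def handle_plan_status(plan_id):
    # Handler-style code under test: sync query via a worker thread.
    return await asyncio.to_thread(FakeQueries.get_snapshot, plan_id)

class PlanStatusTest(unittest.TestCase):
    def test_status_without_db(self):
        # Mock the sync query so no database is required.
        with mock.patch.object(FakeQueries, "get_snapshot",
                               return_value={"state": "completed"}):
            result = asyncio.run(handle_plan_status("plan-1"))
        self.assertEqual(result["state"], "completed")

outcome = PlanStatusTest("test_status_without_db").run()
assert outcome.wasSuccessful()
```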