diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json
index 6ac5a7f..e690104 100644
--- a/.claude-plugin/marketplace.json
+++ b/.claude-plugin/marketplace.json
@@ -6,13 +6,13 @@
     "url": "https://github.com/jjackson"
   },
   "metadata": {
-    "version": "0.13.277"
+    "version": "0.13.285"
   },
   "plugins": [
     {
       "name": "ace",
       "source": "./",
-      "version": "0.13.277",
+      "version": "0.13.285",
       "description": "AI Connect Engine — orchestrates the CRISPR-Connect lifecycle from idea through app building, Connect setup, LLO management, and closeout"
     }
   ]
diff --git a/.claude-plugin/plugin.json b/.claude-plugin/plugin.json
index d58b3c6..ae2c79f 100644
--- a/.claude-plugin/plugin.json
+++ b/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
 {
   "name": "ace",
-  "version": "0.13.277",
+  "version": "0.13.285",
   "description": "AI Connect Engine — orchestrates the CRISPR-Connect lifecycle from idea through app building, Connect setup, LLO management, and closeout",
   "author": {
     "name": "Jonathan Jackson",
diff --git a/CHANGELOG.md b/CHANGELOG.md
index 49c8b83..73be37d 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,20 @@ All notable changes to the ACE plugin will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and the plugin follows [semantic versioning](https://semver.org/spec/v2.0.0.html).
 
+## 0.13.285 — 2026-05-19
+
+**Add `file_path` mode to `ocs_upload_collection_files` — close the b64 context wedge that stalled Phase 5 twice.**
+
+Two consecutive `ace:ocs-setup` dispatches on `leep-paint-collection/20260517-1515` hit stream-idle timeouts (one at ~30 min / 49 tool calls, the second at ~114 min / 44 tool calls) without writing any Drive artifacts. Session-log bisect (see `docs/learnings/2026-05-19-ocs-upload-b64-context-wedge.md`) pinned the cause: the agent built its RAG content pack on disk (~67 KB), `base64`-encoded it via Bash, then **`Read` the resulting `.b64` chunks back into its own context** so it could emit them as the `ocs_upload_collection_files` tool_use `input.files[].content` field. Generating tens to hundreds of KB of b64 as output tokens stalls model generation either mid-emit or on the next turn. No OCS slowness, no auth churn, no QA loop: pure output-token budget exhaustion.
+
+Fix: `ocs_upload_collection_files` extended to accept `file_path` as an alternative source per file. The MCP reads the file server-side, no b64 ever crosses the agent's context. Each file MUST supply EXACTLY ONE of `content` (legacy inline b64) or `file_path` (absolute filesystem path); mixed or missing sources fail fast with a named error citing the offending file.
+
+Refactor: the file-decoding logic moved into `decodeUploadCollectionFileSource`, exported for unit-testability. 7 new vitest cases (UTF-8 text via file_path, arbitrary binary via file_path, inline content legacy mode, missing source, both sources, ENOENT propagation, error names the offending file). All pass.
+
+Skill-side guidance: Phase 5 `ocs-content-pack` and any future skill calling this atom SHOULD use `file_path` for any payload > ~1KB. For files on Drive, `drive_download_binary` into a tmp path first, then pass that as `file_path` — keeps the b64 entirely out of agent context.
+
+`docs/learnings/2026-05-12-boundary-probe-registry.md` updated with the new Shipped probe + a new pending row generalizing the audit ("every MCP atom whose input schema takes a `string` that may carry > ~10KB of payload should have a `_path` companion"; existing examples: `commcare_upload_multimedia.file_bytes_path`, `commcare_patch_xform.new_xform_xml_path`).
+
 ## 0.13.277 — 2026-05-18
 
 **Mirror Vellum's slug/name separation in the Nova architect brief (follow-up to 0.13.274).**
diff --git a/VERSION b/VERSION
index 35500ec..7c73e04 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-0.13.277
+0.13.285
diff --git a/docs/learnings/2026-05-12-boundary-probe-registry.md b/docs/learnings/2026-05-12-boundary-probe-registry.md
index 434ea9c..c24721c 100644
--- a/docs/learnings/2026-05-12-boundary-probe-registry.md
+++ b/docs/learnings/2026-05-12-boundary-probe-registry.md
@@ -21,6 +21,7 @@ ACE has a pattern called **boundary probes**: load-bearing client-side pre-fligh
 | `mobile_probe_maestro_driver` | `mcp/mobile-server.ts:143` (atom) → `mcp/mobile/client.ts:169` (impl); auto-invoked by `mobile_ensure_avd_running` | Maestro driver gRPC health check + auto-heal. Catches the "AVD up, Maestro driver wedged" case where every recipe times out without a recipe-side error. | PR #233 — commit `8b6e4f0` ("auto-heal Maestro driver in mobile_ensure_avd_running"). |
 | `connect_preflight_learn_app_user` | `mcp/connect/backends/commcare-preflight.ts` (atom impl) + `mcp/connect-server.ts` (wiring); recommended caller `skills/connect-opp-setup/SKILL.md` Step 7.5 | Auth / domain / user-conflict failures on the CCHQ side of `POST /users/start_learn_app/` — rotated API key, archived domain, CCHQ outage, already-linked-to-different-ConnectID user. Surfaces as structured `{ok, action, reason}` outcome before Phase 6 boots the AVD instead of as a runtime client-side noop. | PR #249 (commit `8677225`). |
 | `app-release` CCZ slug-length projection | `mcp/connect/backends/commcare.ts` (`SLUG_LENGTH_LIMIT`, `simulateConnectSync.oversized_slugs`, `max_slug_length`) + `skills/app-release/SKILL.md` § Step 6 (BLOCKER gate) + `skills/pdd-to-{learn,deliver}-app/SKILL.md` (architect-brief REQUIRED clause) | Connect's `LearnModule.slug` / `DeliverUnit.slug` are `SlugField()` with the default `max_length=50`. Nova's `compile_app` derives slugs as `module_<index>_<slugified-name>`; module names ≥ ~40 chars overflow. The DB INSERT raises Postgres `DataError: value too long for type character varying(50)`, which falls through `program/api/views.py:102`'s narrow except and surfaces as HTTP 500 with empty body from `connect_create_opportunity`. Same shape as the 2026-05-12 `short_description` 50-char trap but at the CCZ extract path rather than the serializer — so the *generalized serializer-vs-model length probe* (still pending below) would NOT have caught it. This is a sibling probe at a different boundary. | `docs/learnings/2026-05-17-connect-slug-length-50-char-trap.md` + reproducer in `leep-paint-collection/20260517-1515` Phase 4 (module name "Stage 2: Sample Preparation, Drying, Bagging, Shipment" → slug `module_6_stage_2_sample_prep_drying_bagging_shipment`, 52 chars). |
+| `ocs_upload_collection_files` `file_path` mode | `mcp/ocs-server.ts` (`decodeUploadCollectionFileSource`) + tests at `test/mcp/ocs/unit/upload-collection-files-decoder.test.ts` | Output-token budget exhaustion: caller-supplied `content` (base64) inputs > ~10KB stall model generation mid-tool_use-emit, surfacing as `API Error: Stream idle timeout - partial response received` with no actionable diagnostic. `file_path` mode lets the MCP read + b64-encode server-side so the agent never holds the payload as output tokens. Exactly-one-source-per-file invariant enforced (rejects both / neither). Companion examples already shipped at this same class boundary: `commcare_upload_multimedia.file_bytes_path` and `commcare_patch_xform.new_xform_xml_path`. | `docs/learnings/2026-05-19-ocs-upload-b64-context-wedge.md` + reproducer in `leep-paint-collection/20260517-1515` Phase 5 (two consecutive `ace:ocs-setup` dispatches at ~30min/49 calls + ~114min/44 calls both stalled mid-b64-emit on a ~67KB PDD payload; no Drive artifacts written). |
 
 ## Pending probes
 
@@ -31,6 +32,7 @@ Class-level preventers we know are needed but haven't shipped:
 | **Selector-map currency probe** — `bin/ace-doctor --preflight` cross-checks recipes in `mcp/mobile/recipes/static/` vs `mcp/mobile/selectors/<APK>.yaml` for the deployed APK version | Recipes go stale when Connect APK ships UI changes; current symptom is silent `btn_start` no-op at recipe runtime. Same class as `cloud_emu` but for selector-map vs deployed-APK skew. | Implied by `CLAUDE.md` ("`REPLACE_*` selectors that must be filled via `maestro studio` against the Connect APK before live runs") + the 2026-04-30 `btn_start` noop refuted in commit `caba0b8`. |
 | **`mobile_resolve_selectors` at Phase 2 authoring gate** — shift-left of the Phase 5 selector-resolution gate into `app-test-cases` | Same selector-currency class as above but a *producer-side* preventer (catch at authoring time, not at runtime). Currently the only check is the Phase 5 recipe-execution gate; an authoring-gate probe would fail closed before any mobile run. | Sibling of selector-map currency; surfaces when authoring touches a recipe whose selector map hasn't been re-resolved for the current APK. |
 | **Generalized serializer-vs-model length probe** — pattern-match across commcare-connect's `CharField` definitions, surface mismatches at MCP startup | The `short_description` 50-char trap is one instance; any other field where DRF serializer `max_length` exceeds the model `max_length` is the same bug class. A static scan over commcare-connect's `models.py` + `serializers.py` would surface all candidates as Zod caps. **Note 2026-05-17:** the slug-length trap (now Shipped above) was the SAME class but at a *different* boundary — the slug isn't sent through any serializer, it's derived server-side from CCZ XML. A truly generalized probe should walk every `Char`/`SlugField` in commcare-connect's models AND check both serializer-fed paths and CCZ-extracted paths (the latter is what `app-release` Step 6 now does for the slug case specifically). | Generalization of `docs/learnings/2026-05-12-connect-opp-short-description-50-char-trap.md § Generalization` + `docs/learnings/2026-05-17-connect-slug-length-50-char-trap.md`. |
+| **Generalized "MCP atom accepts large payload as string param" audit** — pattern-match every MCP atom whose input schema takes a `string` field that may carry > ~10KB of base64 / XML / JSON, and add a `_path` companion field where missing | The 2026-05-19 b64-context wedge (now Shipped above) is one instance. Companion atoms that already follow the pattern correctly: `commcare_upload_multimedia.file_bytes_path`, `commcare_patch_xform.new_xform_xml_path`. Still missing: `drive_upload_binary.content` (inline b64 only); `drive_create_file.content` and `drive_update_file.content` (text, but unbounded — > 100KB Drive docs could stall). A static scan of MCP server tool schemas would enumerate every remaining wedge candidate. | Generalization of `docs/learnings/2026-05-19-ocs-upload-b64-context-wedge.md § Generalization`. |
 
 ## Pattern characteristics
 
diff --git a/docs/learnings/2026-05-19-ocs-upload-b64-context-wedge.md b/docs/learnings/2026-05-19-ocs-upload-b64-context-wedge.md
new file mode 100644
index 0000000..fbd2629
--- /dev/null
+++ b/docs/learnings/2026-05-19-ocs-upload-b64-context-wedge.md
@@ -0,0 +1,100 @@
+# `ocs_upload_collection_files` Inline-Base64 Context Wedge (Phase 5 Stream-Idle Timeout)
+
+**Status:** Mitigated in ACE v0.13.285 via `file_path` mode on `ocs_upload_collection_files`. The old inline `content` mode is preserved for tiny strings (back-compat) but discouraged in the tool description.
+
+**Origin:** `leep-paint-collection` run `20260517-1515` Phase 5 — two consecutive `ace:ocs-setup` subagent dispatches hit stream-idle timeouts (one at ~30 min / 49 tool calls, second at ~114 min / 44 tool calls) without writing any Drive artifacts. Both stalled at the same point.
+
+## What was framed as the bug (early hypotheses, in order of refutation)
+
+1. **RAG indexing wedge** — `ocs_wait_for_collection_indexing` polling forever. Refuted: the wedge happened before any indexing call.
+2. **Per-prompt QA loop** — `ocs_send_test_message` taking minutes each. Refuted: Phase 5 never reached QA in either dispatch.
+3. **Auth / re-login churn** — Playwright session expired, atoms retrying. Refuted: no auth errors in either transcript.
+4. **OCS server slowness** — Refuted: every atom that ACTUALLY ran completed in seconds. `ocs_clone_chatbot` ~13s, `ocs_create_collection` ~2s, `ocs_upload_collection_files` ~4s.
+
+## What the bug actually is
+
+**Model-generation stall caused by inflated agent context.** Both dispatches followed the same pattern:
+
+1. Agent built the RAG content pack on disk (PDD + summaries + test prompts ≈ 50-67 KB combined).
+2. Agent ran `base64 <file >file.b64` via Bash to encode it (correct).
+3. **Agent then `Read` the resulting `.b64` files in quarters back into its own context** so it could emit the b64 string as part of the `ocs_upload_collection_files` tool_use `input.files[].content` field.
+4. Next assistant turn stalled mid-emission with `API Error: Stream idle timeout - partial response received`. The stall happened either WHILE emitting the b64 (dispatch 2 had a 15-minute mid-stall before one upload landed) OR AFTER the upload returned cleanly but with the b64 still in context (dispatch 2 ran for another 90 minutes after a successful upload before terminating).
+
+The root cause: the `ocs_upload_collection_files` MCP atom's `content` field required a base64 string the agent had to generate as output tokens. For non-trivial RAG payloads (10s of KB → 100s of KB of b64 ASCII), generating that many output tokens in a single tool_use input either stalls outright or accumulates enough context to stall the next turn.
+
+## Proof (session-log evidence)
+
+Source: `~/.claude/projects/-Users-jjackson-emdash-worktrees-ace-emdash-e2e-leep-paint-vsvc9/10b0a209-02b1-48ac-9c49-1a4a0309db96/subagents/agent-{a9539456f8738dccc,aabc8f9d0efda8e30}.jsonl`.
+
+Dispatch 1 (`a9539456f8738dccc`): last 10 tool calls all `Read` calls on b64-chunk tmp files (`/tmp/b64_0_q1..q4.txt`). Final assistant text reads "Excellent. Now I have all 4 b64 chunks... The cleanest path: build the JSON in...". Next token never arrives. Stream-idle terminator. **22m 42s silent gap** between the last tool call and the timeout.
+
+Dispatch 2 (`aabc8f9d0efda8e30`): same prefix shape (b64 chunks, Read calls), then one `ocs_upload_collection_files` succeeded with 3 files in 4s. Last assistant text reads "3 of 4 uploaded. Now upload the 4th." Next token never arrives. **1h 30m silent gap** before stream-idle. The successful upload's b64 was still in context, sufficient to stall the next turn.
+
+OCS atoms never reached `ocs_wait_for_collection_indexing`, `ocs_set_chatbot_system_prompt`, `ocs_set_chatbot_pipeline`, or any QA step.
+
+## Fix shipped (ACE v0.13.285)
+
+`mcp/ocs-server.ts` — `ocs_upload_collection_files` extended to accept `file_path` as an alternative source per file. The MCP reads the file server-side, no b64 ever crosses the agent's context. New exclusivity rule enforced server-side: each file MUST supply EXACTLY ONE of `content` (legacy inline b64) or `file_path` (absolute filesystem path). Mixed or missing sources fail fast with a named error citing the offending file.
+
+Refactor: the file-decoding logic moved into a standalone exported helper `decodeUploadCollectionFileSource` so it's unit-testable in isolation.
+
+7 new vitest cases in `test/mcp/ocs/unit/upload-collection-files-decoder.test.ts` covering:
+
+- `file_path` reads UTF-8 text bytes verbatim
+- `file_path` reads arbitrary binary bytes verbatim
+- `content` (legacy) decodes inline b64
+- Missing source and both sources each reject with an error (two cases)
+- The error message names the offending file
+- ENOENT on a missing `file_path` propagates cleanly
+
+## Skill / agent-side guidance
+
+Phase 5 `ocs-content-pack` + any future skill that calls `ocs_upload_collection_files` SHOULD use `file_path` for any payload > ~1KB. The pattern:
+
+```ts
+// Write the content to a tmp file via Bash. Never Read it back.
+await Bash(`echo "$content" > /tmp/leep-rag/pdd-summary.md`);
+// Or: drive_download_binary into a tmp path for files already on Drive.
+await Bash(`drive_download_binary ... | base64 -d > /tmp/leep-rag/pdd.md`);
+
+// Then upload by reference:
+await ocs_upload_collection_files({
+  collection_id: 123,
+  files: [{
+    name: 'pdd.md',
+    file_path: '/tmp/leep-rag/pdd.md',  // absolute path; MCP reads + b64s server-side
+    mime_type: 'text/markdown',
+  }],
+});
+```
+
+DO NOT `Read` the `.b64` files. DO NOT `Read` the original markdown files into context if all you're going to do is re-emit them through the upload tool — that's the wedge.
+
+## Generalization
+
+This is a different shape from the 50-char slug trap (#347/#1195) and the `short_description` 50-char trap (`docs/learnings/2026-05-12-connect-opp-short-description-50-char-trap.md`):
+
+| | short_description / slug trap | b64-context wedge |
+|---|---|---|
+| Layer | Connect DB column / serializer | Agent context / model generation |
+| Failure shape | Opaque HTTP 500 with empty body | Stream-idle timeout (no error response, just stall) |
+| Pre-fix preventer | Column width / serializer validation | None |
+| Post-fix preventer | Zod cap / CCZ projection gate | MCP atom accepts file_path (caller never holds payload) |
+| Class | Postgres column overflow | Output-token budget exhaustion |
+
+**Generalized boundary-probe candidate (for the registry):** any MCP atom whose input schema accepts large binary-or-encoded content as a string parameter is a wedge candidate. The systemic fix is to give every such atom a `file_path` (or `drive_file_id`) alternate source so the agent never holds the payload as output tokens. Audit candidates today:
+
+- `commcare_upload_multimedia` — already has `file_bytes_path` (correct pattern; this PR's `file_path` adoption matches it)
+- `drive_upload_binary` — currently inline `content` only; same wedge class
+- `ocs_upload_collection_files` — fixed by this PR
+- `commcare_patch_xform` — has both `new_xform_xml` and `new_xform_xml_path`; correct
+- `drive_create_file` / `drive_update_file` — content is typically text (markdown, YAML), so the wedge bound is higher but still real for >100KB docs
+
+The registry entry under § Shipped probes would name "input-payload size at MCP atom boundary" as the class, with this PR + `commcare_upload_multimedia`'s `file_bytes_path` + `commcare_patch_xform`'s `new_xform_xml_path` as the existing instances.
+
+## See also
+
+- `docs/learnings/2026-05-12-boundary-probe-registry.md` — registry update will add this as Shipped probe.
+- `mcp/ocs-server.ts` § `ocs_upload_collection_files` — the fix.
+- `test/mcp/ocs/unit/upload-collection-files-decoder.test.ts` — the tests.
+- Session log subagent transcripts (above) — the bisect evidence.
diff --git a/mcp/ocs-server.ts b/mcp/ocs-server.ts
index 8084515..297f97e 100644
--- a/mcp/ocs-server.ts
+++ b/mcp/ocs-server.ts
@@ -267,23 +267,26 @@ server.tool(
 
 server.tool(
   'ocs_upload_collection_files',
-  'Upload files to an existing Collection. Files will be chunked and embedded asynchronously. chunk_size and chunk_overlap are optional (default 800/400, matching the upstream NM Bot collection); if omitted the upload still works but uses the defaults.',
+  'Upload files to an existing Collection. Each file MUST supply EXACTLY ONE source: `file_path` (local filesystem path — MCP reads + base64-encodes server-side, preferred for any payload >1KB) OR `content` (caller-supplied base64 — legacy inline mode, only sensible for tiny strings). Mixing both, or supplying neither, fails fast. The file_path mode exists because emitting megabytes of base64 in the tool_use input wedges model generation (stream-idle timeout) — class-level preventer for the 2026-05-19 Phase 5 wedge (`docs/learnings/2026-05-19-ocs-upload-b64-context-wedge.md`). For files that live on Drive, `drive_download_binary` to a tmp path first, then pass that as `file_path` — keeps the b64 entirely out of agent context. Files will be chunked and embedded asynchronously. chunk_size and chunk_overlap are optional (default 800/400, matching the upstream NM Bot collection).',
   {
     collection_id: z.number(),
     files: z.array(z.object({
       name: z.string(),
-      content: z.string().describe('Base64-encoded file content'),
+      content: z.string().optional().describe(
+        'Base64-encoded file content. Legacy inline mode — use file_path for anything > ~1KB to avoid stalling model generation on large b64 tool_use inputs.',
+      ),
+      file_path: z.string().optional().describe(
+        'Local filesystem path. MCP reads the bytes + base64-encodes server-side, so the agent never holds the b64 in context. Pass an absolute path; relative paths resolve against the MCP subprocess CWD which is rarely predictable. Preferred for any payload > 1KB.',
+      ),
       mime_type: z.string(),
     })),
     chunk_size: z.number().optional().describe('Chunk size in tokens. Default 800.'),
     chunk_overlap: z.number().optional().describe('Chunk overlap in tokens. Must be < chunk_size. Default 400.'),
   },
   async (args) => {
-    const decoded = args.files.map((f) => ({
-      name: f.name,
-      content: Buffer.from(f.content, 'base64'),
-      mime_type: f.mime_type,
-    }));
+    const decoded = await Promise.all(
+      args.files.map((f) => decodeUploadCollectionFileSource(f)),
+    );
     return result(
       await composite.uploadCollectionFiles({
         collection_id: args.collection_id,
@@ -295,6 +298,42 @@ server.tool(
   },
 );
 
+/**
+ * Resolve a single `ocs_upload_collection_files` file-input entry to its
+ * decoded `Buffer` regardless of source (file_path read or inline b64 decode),
+ * enforcing exactly-one-source-per-file. Exported for unit-testability.
+ *
+ * Class-level preventer for the 2026-05-19 Phase 5 wedge: the inline `content`
+ * (base64) path forces the agent to emit the whole payload as b64 in its
+ * tool_use input, which stalls model generation on any payload past ~10KB.
+ * The file_path path keeps the bytes on disk and the MCP reads them
+ * server-side, so the agent never holds the encoded form in context.
+ */
+export async function decodeUploadCollectionFileSource(f: {
+  name: string;
+  content?: string;
+  file_path?: string;
+  mime_type: string;
+}): Promise<{ name: string; content: Buffer; mime_type: string }> {
+  const { readFile } = await import('node:fs/promises');
+  const hasContent = f.content !== undefined;
+  const hasPath = f.file_path !== undefined;
+  if (!hasContent && !hasPath) {
+    throw new Error(
+      `ocs_upload_collection_files: file "${f.name}" missing source — supply exactly one of content / file_path.`,
+    );
+  }
+  if (hasContent && hasPath) {
+    throw new Error(
+      `ocs_upload_collection_files: file "${f.name}" supplies both content and file_path — pick one.`,
+    );
+  }
+  const bytes = hasPath
+    ? await readFile(f.file_path!)
+    : Buffer.from(f.content!, 'base64');
+  return { name: f.name, content: bytes, mime_type: f.mime_type };
+}
+
 server.tool(
   'ocs_wait_for_collection_indexing',
   'Poll until the specified files in a Collection have been indexed (chunked + embedded). Pass the file_ids returned by ocs_upload_collection_files.',
diff --git a/package.json b/package.json
index d4b9986..7ab3dfc 100644
--- a/package.json
+++ b/package.json
@@ -1,6 +1,6 @@
 {
   "name": "ace",
-  "version": "0.13.277",
+  "version": "0.13.285",
   "description": "AI Connect Engine - orchestrator for building Connect Opps using AI",
   "type": "module",
   "scripts": {
diff --git a/test/mcp/ocs/unit/upload-collection-files-decoder.test.ts b/test/mcp/ocs/unit/upload-collection-files-decoder.test.ts
new file mode 100644
index 0000000..fe53004
--- /dev/null
+++ b/test/mcp/ocs/unit/upload-collection-files-decoder.test.ts
@@ -0,0 +1,101 @@
+/**
+ * Unit tests for `decodeUploadCollectionFileSource` — the file-input decoder
+ * shared by `ocs_upload_collection_files`. Asserts the exactly-one-source
+ * invariant, the file_path read path, and the inline-content b64 decode path.
+ *
+ * Class-level preventer documented in `docs/learnings/2026-05-19-ocs-upload-b64-context-wedge.md`.
+ */
+import { describe, it, expect, beforeAll, afterAll } from 'vitest';
+import { writeFile, mkdtemp, rm } from 'node:fs/promises';
+import { tmpdir } from 'node:os';
+import { join } from 'node:path';
+import { decodeUploadCollectionFileSource } from '../../../../mcp/ocs-server.js';
+
+describe('decodeUploadCollectionFileSource', () => {
+  let tmpDir: string;
+  let absTextPath: string;
+  let absBinPath: string;
+  const utf8Sample = '## LEEP PDD\n\nMulti-stage paint-collection survey.';
+  const binSample = Buffer.from([0xde, 0xad, 0xbe, 0xef, 0x00, 0xff, 0x42]);
+
+  beforeAll(async () => {
+    tmpDir = await mkdtemp(join(tmpdir(), 'ocs-upload-decoder-'));
+    absTextPath = join(tmpDir, 'pdd.md');
+    absBinPath = join(tmpDir, 'photo.bin');
+    await writeFile(absTextPath, utf8Sample, 'utf8');
+    await writeFile(absBinPath, binSample);
+  });
+
+  afterAll(async () => {
+    await rm(tmpDir, { recursive: true, force: true });
+  });
+
+  it('reads file_path bytes verbatim (UTF-8 text)', async () => {
+    const decoded = await decodeUploadCollectionFileSource({
+      name: 'pdd.md',
+      file_path: absTextPath,
+      mime_type: 'text/markdown',
+    });
+    expect(decoded.name).toBe('pdd.md');
+    expect(decoded.mime_type).toBe('text/markdown');
+    expect(decoded.content.toString('utf8')).toBe(utf8Sample);
+  });
+
+  it('reads file_path bytes verbatim (arbitrary binary)', async () => {
+    const decoded = await decodeUploadCollectionFileSource({
+      name: 'photo.bin',
+      file_path: absBinPath,
+      mime_type: 'application/octet-stream',
+    });
+    expect(decoded.content.equals(binSample)).toBe(true);
+  });
+
+  it('decodes inline base64 content (legacy mode)', async () => {
+    const b64 = Buffer.from('hello world', 'utf8').toString('base64');
+    const decoded = await decodeUploadCollectionFileSource({
+      name: 'inline.txt',
+      content: b64,
+      mime_type: 'text/plain',
+    });
+    expect(decoded.content.toString('utf8')).toBe('hello world');
+  });
+
+  it('throws when neither source supplied', async () => {
+    await expect(
+      decodeUploadCollectionFileSource({
+        name: 'orphan.md',
+        mime_type: 'text/markdown',
+      }),
+    ).rejects.toThrow(/missing source/);
+  });
+
+  it('throws when both sources supplied', async () => {
+    await expect(
+      decodeUploadCollectionFileSource({
+        name: 'both.md',
+        content: Buffer.from('a', 'utf8').toString('base64'),
+        file_path: absTextPath,
+        mime_type: 'text/markdown',
+      }),
+    ).rejects.toThrow(/both content and file_path/);
+  });
+
+  it('error message names the offending file (debuggability)', async () => {
+    await expect(
+      decodeUploadCollectionFileSource({
+        name: 'pdd-summary-for-leep.md',
+        mime_type: 'text/markdown',
+      }),
+    ).rejects.toThrow(/"pdd-summary-for-leep\.md"/);
+  });
+
+  it('propagates ENOENT from the underlying fs read', async () => {
+    await expect(
+      decodeUploadCollectionFileSource({
+        name: 'absent.md',
+        file_path: join(tmpDir, 'does-not-exist.md'),
+        mime_type: 'text/markdown',
+      }),
+    ).rejects.toThrow(/ENOENT/);
+  });
+});