Skip to content

feat: implement MCP Roots protocol for dynamic allowed-directories#95

Open
nicholasjayantylearns wants to merge 11 commits into
docling-project:mainfrom
nicholasjayantylearns:feat/mcp-roots-protocol
Open

feat: implement MCP Roots protocol for dynamic allowed-directories#95
nicholasjayantylearns wants to merge 11 commits into
docling-project:mainfrom
nicholasjayantylearns:feat/mcp-roots-protocol

Conversation

@nicholasjayantylearns
Copy link
Copy Markdown

What

Implement the MCP Roots protocol so that docling-mcp honors the allowed-directories the client (Claude Desktop, Claude Code, etc.) authorizes at runtime via roots/list_changed, with a static --allowed-directories CLI fallback for clients that don't support Roots.

Closes the Roots half of #86. (PR #88 fixes the tensor-padding crash; this PR fixes the path-resolution failure for clients that send Roots, which is the other reason the issue is open.)

Why

Per the canonical MCP filesystem reference at modelcontextprotocol/servers/tree/main/src/filesystem:

MCP clients that support Roots can dynamically update the Allowed directories. Roots notified by Client to Server, completely replace any server-side Allowed directories when provided.

Today, when a client like Claude Desktop sends notifications/roots/list_changed to docling-mcp, the server does not subscribe to it. Path resolution then fails for every path the client considered authorized — even when the user has explicitly granted access in the client UI. Multiple users in #86 hit this.

This PR brings docling-mcp in line with that reference behavior, in idiomatic Python on top of the existing FastMCP-based server, with no new top-level dependencies (the MCP SDK already pinned at >=1.9.4 exposes everything we need).

How

Five commits, each independently reviewable:

  1. feat(roots): add AllowedRootsRegistry + URL-vs-path utilities — pure, dependency-free state and helpers. AllowedRootsRegistry keeps two layers (static seed + client-provided) and exposes validate_source(source: str) that raises PermissionError for filesystem paths outside the active root set. Remote URLs (http://, https://, ftp://, s3://, …) pass through unchecked — Roots authorize filesystem access, not network access. file:// URLs are treated as filesystem paths.

  2. feat(roots): authorize source paths in conversion tools — wire validate_source into both convert_document_into_docling_document and convert_directory_files_into_docling_document. When the registry is unconstrained (no static flag, no client roots), it's a no-op — preserves the current behavior for users who haven't opted in.

  3. feat(roots): wire notification handlers + --allowed-directories CLI — adds the Typer flag and registers the two notification handlers on the underlying low-level Server:

    • notifications/initialized → seeds the registry by calling session.list_roots() if the client advertised the roots capability.
    • notifications/roots/list_changed → refreshes the registry the same way.

    The MCP Python SDK's notification dispatcher does not propagate the active ServerSession to handler callables, so this commit also installs a tiny ContextVar-based shim that captures session per-message in Server._handle_message. The patch is idempotent.

  4. test(roots): cover registry, path utils, and notification handlers — 29 unit tests covering: URL/path discriminator (incl. case-insensitive schemes), inside/outside path validation, remote-URL passthrough when constrained, the canonical "client roots fully replace static set" semantics, clear-then-restore-static, non-file:// URI filtering, symlink resolution (preventing bypass), ..-resolution (preventing escape), idempotent handler installation, and async handlers refreshing the registry against fake sessions.

  5. docs(roots): document MCP Roots support and --allowed-directories — new "Filesystem access (MCP Roots)" section in the README explaining the three states (Roots-capable client / static fallback / unconstrained legacy) and a concrete uvx invocation example.

Acceptance criteria checklist

  • PR opened against docling-project/docling-mcp main.
  • Implementation matches the pattern in modelcontextprotocol/servers/tree/main/src/filesystem — same "client roots completely replace static dirs" semantics, same notification subscription model.
  • References issue Not able to get docling mcp working on Claude Desktop #86 and explains how it closes the open half.
  • --allowed-directories backward-compat path is wired and unit-tested.
  • README updated to explicitly describe Roots support and the user-facing benefit.
  • Local test: Docling MCP installed via uvx in Claude Desktop, given a path under an allowed root, parses a PDF. (Pending live test on the PR author's machine — see follow-up comment.)
  • Local test: Docling MCP installed via uvx in Claude Code, given a cwd-derived root, parses a PNG with do_table_structure=True. (Pending — see follow-up comment.)
  • Local test: server with --allowed-directories and a non-Roots-capable client successfully parses a PDF under the allowed dir and rejects a path outside it. (Pending — see follow-up comment.)

The three live tests will be reported in a follow-up PR comment with screenshots / log excerpts so reviewers don't have to take it on faith.

Backward compatibility

  • Existing users running with no flag and no Roots-capable client see zero behavior changevalidate_source is a no-op when the registry is unconstrained.
  • Existing users with Roots-capable clients (Claude Desktop, Claude Code, etc.) start getting filesystem-scoping for free as soon as they update.
  • New users who want path containment without a Roots-capable client get --allowed-directories as a first-class CLI flag.

Notes for reviewers

  • The ContextVar shim in _roots_wiring.py exists because the MCP Python SDK doesn't currently propagate session into notification handlers. If a future SDK release exposes a public hook for this, the shim should be replaced — there's a comment to that effect in the file.
  • I deliberately did not advertise a server-side capability for roots: roots is a client capability in the MCP spec; the server consumes it.
  • The conversion tools already accept both URLs and local paths in a single source parameter. The validator only gates filesystem-style sources; this is documented in the new README section and in the registry's docstring.

Provenance

Authored as a deliverable for an upstream-tracking ticket (SKILLS-4) in the wearethehumansintheloop HITL project. The framing motivation is V-tax compression at the document-ingestion seam — every blueprint, PDF, screenshot, or structured doc that reaches a Docling-MCP-equipped Claude session today pays the V-tax independently because path resolution silently fails. With this fix, structure flows once and downstream consumers inherit it.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Apr 26, 2026

DCO Check Passed

Thanks @nicholasjayantylearns, all your commits are properly signed off. 🎉

@mergify
Copy link
Copy Markdown
Contributor

mergify Bot commented Apr 26, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:

@nicholasjayantylearns
Copy link
Copy Markdown
Author

Marking this draft. Accurate update on what was verified and what wasn't.

What works (verified live on macOS + Claude Desktop)

  • Server boots via uvx from this branch; all 19 tools register through the MCP tools/list handshake.
  • Client-sent roots populate the server's allowed-paths registry via notifications/roots/list_changed. Verified by inspecting the active-roots list in validate_source() error messages — the folders Claude Desktop authorized in its file-access permissions appeared in the registry.
  • The server correctly rejects paths outside any active root with a clean PermissionError-derived McpError. No silent failure.
  • Bug found and fixed live during testing (commit 6, fix(roots): URL-decode percent-escapes): clients send file:// roots with percent-encoded characters like %20 for space in macOS paths under Application Support/. The original implementation stored encoded paths and compared against real filesystem paths, so authorized folders containing spaces were silently rejected. Now decoded at the seam.
  • 33/33 unit tests passing on macOS including 4 new regression cases for the URL-decode fix.

What's blocked from "ship" (not introduced by this PR)

The single-document conversion path (convert_document_into_docling_document) hits Claude Desktop's 60-second MCP request timeout because Docling's first-call cold start (PyTorch import, OCR model load, layout detector initialization) exceeds that window. The same timeout reproduces against main for the same PDF in the same environment — this PR doesn't cause it, but the PR also doesn't deliver a usable end-to-end experience until it's addressed.

What was ruled out as a workaround for this PR:

  • notifications/progress as keepalive. The MCP Python SDK uses anyio.fail_after(timeout) — a hard wall-clock — and dispatches progress notifications via callback without resetting that timeout. The TypeScript client used by Claude Desktop appears to behave similarly. So plumbing progress reporting into the single-document tool (the way convert_directory_files_into_docling_document already does it) wouldn't reliably extend the timeout window.

Proposed path forward: an eager-model-preload option at server startup, behind a CLI flag, so cold-start happens during server boot (where Claude Desktop's startup timeout is more generous) instead of during the first user-facing tool call. Tracked separately at #98.

Why draft, not merged

This PR was filed for issue #86 (Roots support). Roots are functional and unit-tested. But the user-facing workflow — install via uvx, ask Claude Desktop to parse a PDF in an authorized folder — doesn't complete cleanly inside the MCP timeout. Shipping the PR while the workflow doesn't work would misrepresent what it delivers, even though the cold-start cause is outside this PR's scope.

Marking draft, filed the cold-start issue at #98, and proposing this PR re-opens to "ready for review" once either (a) the cold-start has a viable fix that lives alongside it, or (b) the maintainers prefer to merge this independently and track cold-start in its own ticket.

Happy with either path — your call.

cc maintainers from MAINTAINERS.md for guidance on which path you'd prefer.

@nicholasjayantylearns nicholasjayantylearns marked this pull request as draft May 13, 2026 15:23
@nicholasjayantylearns
Copy link
Copy Markdown
Author

Update — live verification passed, moving out of draft

The blocker described in the previous "marking draft" comment was addressed in subsequent commits (45528a8, 80860b3): a new --preload-models CLI flag warms DocumentConverter in a daemon thread at server boot, so the first user-facing convert call returns inside the MCP client's per-request timeout window.

End-to-end live test on macOS, 2026-05-13

  • Server installed via uvx from this branch with --preload-models --transport stdio
  • Claude Desktop ⌘Q + relaunch; server attached cleanly via the background-preload pattern (no spawn-timeout failure)
  • Tool called: convert_document_into_docling_document
  • Source: a real PDF under a Claude-Desktop-authorized roots path
  • Result: {"from_cache": false, "document_key": "ea2094342aaf57cf2968192326f0e835"}

Boot breadcrumbs from the server log confirm the full warm-up path executed before any tool call:

[docling-mcp-boot] main() entered (transport=stdio, preload=True)
[docling-mcp-boot] preload: background thread launched; proceeding to mcp.run()
[docling-mcp-boot] preload thread: _get_converter() returned; initializing pipelines
[docling-mcp-boot] preload thread: initialize_pipeline(PDF)
[docling-mcp-boot] preload thread: initialize_pipeline(IMAGE)
[docling-mcp-boot] preload thread: DocumentConverter READY

35/35 unit tests still passing on macOS (33 from earlier plus 2 wiring tests for the --preload-models flag).

Both halves of the original blocker (Roots + cold-start) are now functional in a single branch.

Acceptance criteria

Independent finding worth noting (out of scope for this PR)

On image-heavy PDFs with stylized fonts over graphics, Docling's OCR layer produces hallucinated text in image-overlay regions. Reproduces against main; not introduced by this PR. Tracked separately in our internal corpus work — happy to file an upstream issue against docling-project/docling if maintainers want.

#98 (cold-start timeout) is also addressed by the --preload-models work in this PR; comfortable closing it once this PR merges.

cc @cau-git @dolfim-ibm @vagenas @cberrosp @PeterStaar-IBM @Zaalouk for review when convenient.

@nicholasjayantylearns nicholasjayantylearns marked this pull request as ready for review May 13, 2026 17:12
@PeterStaar-IBM
Copy link
Copy Markdown
Member

@nicholasjayantylearns lovely, thanks for the PR!

@PeterStaar-IBM
Copy link
Copy Markdown
Member

@nicholasjayantylearns please run,

  1. linter:
uv run pre-commit run --all-files
  1. do DCO (essentially always commit with the flag -s)

Introduce the server-side state that will back MCP Roots support.

* docling_mcp/_path_utils.py — discriminator between remote URLs (http,
  https, ftp, ftps, s3) and filesystem-style sources (plain paths and
  file:// URLs). Roots authorize filesystem access only.

* docling_mcp/roots.py — AllowedRootsRegistry with two layers:
  static_roots (seeded once from --allowed-directories) and client_roots
  (refreshed from roots/list_changed notifications). Client roots, when
  present, fully replace the static set, per the canonical MCP semantics
  in the modelcontextprotocol/servers filesystem reference.

No callers wired up yet — this commit is the registry alone, smoke-tested
in isolation. Next commits wire it into the server handshake and the
conversion tools.

Refs: docling-project#86
Signed-off-by: nicholasjayantylearns <nicholasjayantylearnswithyou@gmail.com>
Both convert_document_into_docling_document and
convert_directory_files_into_docling_document now call
allowed_roots.validate_source() before touching the filesystem. Remote
URLs (http/https/ftp/etc.) pass through unchecked; local paths must lie
under an active root.

When neither --allowed-directories nor client-sent roots have populated
the registry, validate_source is a no-op — preserves the pre-roots
behavior for users who have not opted into the protocol.

Refs: docling-project#86
Signed-off-by: nicholasjayantylearns <nicholasjayantylearnswithyou@gmail.com>
* docling_mcp/_roots_wiring.py — installs handlers for
  notifications/initialized (seeds registry on connect) and
  notifications/roots/list_changed (refreshes on change).

  The MCP Python SDK's notification dispatcher does not propagate the
  active ServerSession to handler callables; this module monkey-patches
  Server._handle_message to expose the session through a ContextVar so
  notification handlers can issue server-to-client list_roots requests.
  The patch is idempotent.

* docling_mcp/servers/mcp_server.py — adds --allowed-directories Typer
  option, seeds allowed_roots.set_static_roots() before mcp.run(), and
  invokes install_roots_handlers() at startup.

Backward compat: when neither --allowed-directories nor client-sent
roots are present, the registry is unconstrained and validate_source()
is a no-op, matching the pre-roots behavior.

Refs: docling-project#86
Signed-off-by: nicholasjayantylearns <nicholasjayantylearnswithyou@gmail.com>
29 tests across three layers:

* _path_utils: is_remote_url discriminator (6 schemes including
  case-insensitive HTTPS), to_filesystem_path handling for plain paths,
  file:// URLs, and rejection of remote URLs.

* AllowedRootsRegistry: unconstrained default, static seeding, inside/
  outside path validation, remote URL passthrough when constrained,
  client-roots-replace-static semantics (the canonical MCP rule),
  clear_client_roots restoring static, non-file:// URI filtering,
  symlink resolution preventing bypass, and ../-resolution preventing
  escape.

* _roots_wiring: install_roots_handlers registers both notification
  types, is idempotent across multiple invocations, _on_initialized
  seeds from session.list_roots() when client advertises roots
  capability, skips list_roots() when client does not, and
  _on_roots_list_changed refreshes on subsequent notifications.

End-to-end against a live MCP client (Claude Desktop / Claude Code via
uvx) is documented in the PR description and run manually — these unit
tests cover the protocol-correct behavior in isolation.

Refs: docling-project#86
Signed-off-by: nicholasjayantylearns <nicholasjayantylearnswithyou@gmail.com>
New 'Filesystem access (MCP Roots)' section explains the three states the
server can be in:

* Roots-capable client connected — allowed paths are scoped to the
  user's client-side authorization, refreshed on list_changed.
* Static --allowed-directories fallback for non-Roots clients.
* No flag, no client roots — legacy unconstrained behavior preserved
  for backward compat.

Refs: docling-project#86
Signed-off-by: nicholasjayantylearns <nicholasjayantylearnswithyou@gmail.com>
Signed-off-by: nicholasjayantylearns <nicholasjayantylearnswithyou@gmail.com>
Real-world clients (Claude Desktop, Cowork) emit file:// roots with
URL-encoded characters — most commonly file:///Users/x/Library/Application%20Support/...
on macOS where the directory name contains a space. urlparse(uri).path
preserves the percent-encoding, so storing parsed.path directly meant the
registry held '/Library/Application%20Support/...' while the real
filesystem path was '/Library/Application Support/...' (literal space).
The two strings never matched in Path.relative_to(), so validate_source
rejected every path inside any space-containing authorized root.

Fix: wrap parsed.path in urllib.parse.unquote() in both
update_from_client_roots() (registry seeding) and to_filesystem_path()
(per-call validation). The conversion now happens once at the seam
between URL space and filesystem space; downstream code sees real paths
exclusively.

Symptom that motivated the fix:

    "path 'X/Library/Application Support/Y' is not under any allowed
     root. active roots: ['X/Library/Application%20Support/Y']"

Visible because the registry and the validator disagreed on encoding for
exactly one character. Reproduced live with Claude Desktop on macOS
(uploads folder).

Tests: four new regression cases —
* to_filesystem_path decodes %20
* to_filesystem_path handles colons + parens via %3A
* registry accepts a real-space path when the client sends %20
* active_roots() exposes decoded paths so error messages are readable.

29 -> 33 tests passing.

Refs: docling-project#86
Signed-off-by: nicholasjayantylearns <nicholasjayantylearnswithyou@gmail.com>
Adds an opt-in CLI flag that triggers the @lru_cache-backed
_get_converter() call once at server startup so the cold-start cost
(PyTorch import, OCR model load, layout detector init) is paid during
the MCP server-spawn window — which Claude Desktop honors with a more
generous timeout than its per-request timeout.

Without this flag, the very first call to
convert_document_into_docling_document hits Claude Desktop's 60-second
per-request MCP timeout because Docling's first invocation has to set
up PyTorch + models. Symptom: 'MCP error -32001: Request timed out'.
Reproduces against main; not specific to this PR's Roots work.

This is the proposed fix for docling-project#98. It's
gated on ToolGroups.CONVERSION being loaded so the warm-up doesn't fire
when conversion tools are disabled.

Default: False. Preserves the existing lazy-init behavior for users
who don't opt in. Recommended in the README for interactive clients
(Claude Desktop, Claude Code) where the per-request timeout matters.

Tests: 2 new wiring smoke tests verify the flag is registered with
Typer and defaults to False. End-to-end timing is left to manual
verification on each client (Docling's load time is hardware-dependent
and not deterministic).

Refs: docling-project#98
Refs: docling-project#86
Signed-off-by: nicholasjayantylearns <nicholasjayantylearnswithyou@gmail.com>
Visibility-first response to 'Could not attach to MCP server' failure
mode reported during live testing. Three behavior changes:

1. Add _crumb() helper that writes flushed stderr breadcrumbs at every
   boot step (main() entered, logger configured, preload scheduled,
   thread started, _get_converter() returned, pipeline init begin/end,
   READY or FAILED, mcp.run() called). Goes around Python logging
   buffering so output is visible in Claude Desktop's log file as
   each step actually executes.

2. Move --preload-models warm-up from foreground (which blocked
   mcp.run() and could exceed Claude Desktop's spawn timeout, causing
   'Could not attach') to a daemon background thread. The MCP server
   handshake now completes fast regardless of model load duration.
   First convert call may block briefly on @lru_cache lock if it
   arrives mid-warmup, but the server is reachable from second one.

3. Inside the warm-up thread, explicitly call
   converter.initialize_pipeline(InputFormat.PDF) and
   initialize_pipeline(InputFormat.IMAGE) (when available) to actually
   force model load — DocumentConverter.__init__ is lazy on its own.
   Falls through to partial warm-up if Docling version lacks the
   method.

The original synchronous, opaque preload is replaced. Tests still pass
(the wiring tests check the flag exists and defaults to False, which
they still do).

Refs: docling-project#98
Signed-off-by: nicholasjayantylearns <nicholasjayantylearnswithyou@gmail.com>
…C240)

Per maintainer review request to run pre-commit. Behavior unchanged;
this commit is pure cleanup of style/lint issues across the PR's diff:

- ruff format on the four files that needed reformatting
- I001 import-block sort fixes (auto-applied by ruff --fix) in tests
- D205 (blank line between summary and description) in the _crumb()
  docstring in mcp_server.py
- E402 (module-level imports must be at top): reordered mcp_server.py
  so all imports come before _crumb()'s definition
- ASYNC240 (sync os.path.realpath calls inside async test bodies):
  applied # noqa with rationale on the two test assertions where this
  is intentional unit-test hygiene, and on a pre-existing
  conversion.py iterdir() call that this PR's diff caused ruff to
  re-check

All 35 unit tests still pass.

Refs: docling-project#95 (review feedback)
Signed-off-by: nicholasjayantylearns <nicholasjayantylearnswithyou@gmail.com>
…ndex imports

Two follow-up mypy fixes for the cleanup pass:

* docling_mcp/_roots_wiring.py: type-annotate the
  _handle_message_with_session monkey-patch wrapper as Any across the
  board. The function is a pass-through; coupling to the MCP SDK's
  internal Server._handle_message signature would be brittle, so Any
  matches the level of precision. Resolves [no-untyped-def].

* pyproject.toml: add 'llama_index.*' to mypy.overrides
  ignore_missing_imports, alongside the existing llama_stack_client.*
  entry. llama-index is an optional extra (llama-index-rag), not pinned
  in the base install, so mypy can't always find its stubs.

Refs: docling-project#95 (review feedback)
Signed-off-by: nicholasjayantylearns <nicholasjayantylearnswithyou@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants