🥾 Trail Docs

Natural language documentation retrieval for AI agents — via CLI.

Built & maintained by Arkology Studio

trail-docs turns markdown docs into a deterministic, citation-backed navigation index. It gives agents a faster trail through your documentation than reading every file or grepping in the dark.

Built for agents. Usable by humans. No LLM required in retrieval.

📈 Benchmark Snapshot (full-2026-03-10T09-22-28-301Z): trail-docs used 83.0% fewer tokens than Context7 and 52.7% fewer tokens than grep, with +0.1517 comprehension vs grep (+18.4% relative) and +0.4084 vs Context7.

npm install -g trail-docs

Why Trail Docs

AI agents navigating software libraries typically do one of two things:

Read everything — dump many files into context, burn tokens, hope for the best.
Grep and pray — rg "configureSSL" --type ts returns fragments across files. Good luck reconstructing intent and sequence.

Trail Docs offers a third path: build an index once, navigate evidence units under strict budgets.

Grep vs Trail Docs

# Grep: "here are some matching lines"
$ rg "refresh token" ./docs
./docs/auth/oauth.md:5:To use refresh tokens, send a POST request...
./docs/auth/oauth.md:9:grant_type=refresh_token
./docs/troubleshooting.md:42:refresh token expired
# Fragments across files. Agent must reconstruct context.

# Trail Docs v2: deterministic multi-hop retrieval with citations
$ trail-docs find "How do I use refresh tokens?" \
  --index .trail-docs/index.json --budget 350 --max-items 6 --json

$ trail-docs extract "How do I use refresh tokens?" \
  --from "auth/oauth#refresh-token,auth/oauth#token-endpoint" \
  --index .trail-docs/index.json --budget 700 --max-items 8 --json
{
  "query": "How do I use refresh tokens?",
  "items": [
    {
      "ref": "auth/oauth#refresh-token",
      "type": "step",
      "text": "Send POST /oauth/token with grant_type=refresh_token and refresh_token.",
      "citation_id": "MyProject@1.0.0:auth/oauth#refresh-token:7-11"
    }
  ],
  "budget_tokens": 700,
  "spent_tokens": 124,
  "remaining_tokens": 576
}
# Structured evidence. Exact line-level citation IDs. Budget-aware output.

Grep gives you fragments. Trail Docs gives you bounded, cited evidence units.

They are different tools. Grep answers "where is this string?" Trail Docs answers "what evidence should the agent read next?"
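As a rough illustration of what "bounded, cited evidence units" means for a consumer, the sketch below parses the extract payload shown above and turns it into citation-tagged lines. The helper names (`summarize`, `budget_is_consistent`) are hypothetical, and the check that spent plus remaining tokens equals the budget is inferred from this one example rather than from a documented guarantee.

```python
import json

# Payload copied verbatim from the extract example above.
payload = json.loads("""
{
  "query": "How do I use refresh tokens?",
  "items": [
    {
      "ref": "auth/oauth#refresh-token",
      "type": "step",
      "text": "Send POST /oauth/token with grant_type=refresh_token and refresh_token.",
      "citation_id": "MyProject@1.0.0:auth/oauth#refresh-token:7-11"
    }
  ],
  "budget_tokens": 700,
  "spent_tokens": 124,
  "remaining_tokens": 576
}
""")


def summarize(payload: dict) -> list[str]:
    """Return one 'text [citation_id]' line per evidence item."""
    return [f'{item["text"]} [{item["citation_id"]}]' for item in payload["items"]]


def budget_is_consistent(payload: dict) -> bool:
    """Assumed invariant: spent + remaining == budget (holds for the example)."""
    return (
        payload["spent_tokens"] + payload["remaining_tokens"]
        == payload["budget_tokens"]
    )


lines = summarize(payload)
```

An agent could append `lines` to its working context and keep the citation IDs for later verification.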

How it works under the hood

Trail Docs retrieval is purely algorithmic.

Indexing: parses markdown into docs + sections, extracts deterministic evidence_units[], and builds anchor_graph[] links.
Ranking: lexical/heading/symbol/action scoring with deterministic tie-breakers and token-cost penalties.
Selection: hard token budgets, duplicate suppression, stable ordering, explicit citations.

This means retrieval is reproducible and fast. Agents own reasoning; Trail Docs owns navigation and evidence packing.
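To make the selection stage more concrete, here is a minimal sketch of budgeted selection with deterministic tie-breaking and duplicate suppression. It is not the trail-docs implementation: the `Unit` type, the greedy strategy, the sort key, and the whitespace/case normalization used to detect duplicates are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """Hypothetical evidence unit; field names are illustrative only."""
    ref: str
    text: str
    score: float
    tokens: int


def select(units, budget, max_items):
    """Greedy selection under a hard token budget.

    Ordering is score descending, then token cost ascending, then ref
    ascending, so identical inputs always yield identical output.
    """
    ordered = sorted(units, key=lambda u: (-u.score, u.tokens, u.ref))
    chosen, seen, spent = [], set(), 0
    for unit in ordered:
        if len(chosen) >= max_items:
            break
        key = " ".join(unit.text.lower().split())  # normalized text for dedup
        if key in seen or spent + unit.tokens > budget:
            continue  # skip duplicates and units that would exceed the budget
        seen.add(key)
        chosen.append(unit)
        spent += unit.tokens
    return chosen, spent


units = [
    Unit("auth/oauth#refresh-token", "Send POST /oauth/token.", 0.9, 40),
    Unit("auth/oauth#token-endpoint", "send post /oauth/token.", 0.9, 45),
    Unit("troubleshooting#expired", "Refresh token expired: re-authenticate.", 0.7, 80),
    Unit("auth/oauth#scopes", "Scopes are space separated.", 0.7, 30),
]
chosen, spent = select(units, budget=100, max_items=3)
```

In this run the second unit is dropped as a duplicate of the first and the 80-token unit is dropped because it would exceed the budget, leaving two units and 70 tokens spent.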

Core Workflows

🥾 Trail 1: Local docs (your own project)

Index your docs and run the v2 navigation flow:

# Build index
trail-docs build --src . --library "MyProject" --version "1.0.0" \
  --out .trail-docs/index.json

# Hop 1: find start refs
trail-docs find "How do I deploy to production?" \
  --index .trail-docs/index.json --budget 400 --max-items 6 --json

# Hop 2: inspect one anchor and neighbors
trail-docs expand "deploy/runbook#production" \
  --index .trail-docs/index.json --budget 300 --max-items 5 --json
trail-docs neighbors "deploy/runbook#production" \
  --index .trail-docs/index.json --json

# Hop 3: extract final evidence from explicit refs
trail-docs extract "How do I deploy to production?" \
  --from "deploy/runbook#production,deploy/checklist#preflight" \
  --index .trail-docs/index.json --budget 800 --max-items 8 --json

🔭 Trail 2: Pre-install research (unknown library)

Evaluate a library's docs and API surface before adopting it:

# Discover candidates
trail-docs discover "axios" --provider npm --max-results 5 --json

# Fetch docs snapshot with pinned source metadata
trail-docs fetch "npm:axios" --json

# One-shot discover -> fetch -> build -> manifest
trail-docs prep "axios" --path .trail-docs --json

# One-shot URL ingestion
trail-docs index "https://raw.githubusercontent.com/axios/axios/v1.x/README.md" \
  --path .trail-docs --json

🗺️ Trail 3: API surface + callable guidance

Understand a library's shape without manually reading source files:

# Extract exported API + signatures
trail-docs surface npm:openai --json

# Resolve one callable/type
trail-docs fn "npm:openai#OpenAI.responses.create" --json

# Keep trail state as you investigate
trail-docs trail create --objective "evaluate auth + retries" --json
trail-docs trail add --trail trail_xxxxx --ref "api/auth#authentication" --index .trail-docs/index.json --json
trail-docs trail show --trail trail_xxxxx --json

Commands

Command	What it does
`bootstrap`	Generate markdown from codebase and build index
`build`	Build deterministic index from markdown
`list`	List indexed documents
`stats`	Index metadata and coverage
`discover`	Find external libraries/docs candidates
`fetch`	Fetch docs snapshot with pinned source metadata
`prep` / `index`	One-shot flows: `prep` runs discover -> fetch -> build -> manifest for a library name; `index` ingests a single URL
`surface`	Extract exports, symbols, signatures
`fn`	Resolve callable/type with signature-level citations
`find`	Hop-1 retrieval: ranked start refs + top evidence units
`search`	Alias of `find`
`expand`	Hop-2 retrieval for one ref under token cap
`neighbors`	Graph neighbors (`heading_adjacent`, `intra_doc_link`, overlaps)
`extract`	Hop-3 query-conditioned evidence from explicit refs
`open`	Strict retrieval utility (`--mode units` default, or `--mode section`)
`cite`	Emit canonical citation details
`trail`	Persistent notebook state in `.trail-docs/trails/*.json`

All commands support --json for agent-friendly output.

Agent Integration

Trail Docs is designed to drop into agent workflows that can run shell commands.

Why CLI over MCP for this tool? It works immediately with most coding agents and automation environments: no protocol server required for core retrieval.

# Typical agent retrieval loop
trail-docs prep "some-library" --path .trail-docs --json
trail-docs find "How do I authenticate?" --index .trail-docs/index.json --budget 350 --max-items 6 --json
trail-docs extract "How do I authenticate?" --from "auth#overview,auth#token-refresh" \
  --index .trail-docs/index.json --budget 900 --max-items 8 --json

The --json flag is key: deterministic, parseable payloads without terminal scraping.
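For agents that drive the CLI from Python, the loop above could be wrapped roughly as follows. This is a sketch under assumptions: the helper names (`find_cmd`, `extract_cmd`, `run_json`) are hypothetical, and only flags that appear in the commands above are used. Building argument lists instead of shell strings avoids quoting problems with free-form queries.

```python
import json
import subprocess

INDEX = ".trail-docs/index.json"


def find_cmd(query, index=INDEX, budget=350, max_items=6):
    """Argument list for hop-1 retrieval, mirroring the find call above."""
    return ["trail-docs", "find", query, "--index", index,
            "--budget", str(budget), "--max-items", str(max_items), "--json"]


def extract_cmd(query, refs, index=INDEX, budget=900, max_items=8):
    """Argument list for hop-3 extraction from explicit refs."""
    return ["trail-docs", "extract", query, "--from", ",".join(refs),
            "--index", index, "--budget", str(budget),
            "--max-items", str(max_items), "--json"]


def run_json(cmd):
    """Run a command and parse stdout as JSON; raises on a non-zero exit."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)
```

Calling `run_json(find_cmd("How do I authenticate?"))` requires trail-docs to be installed and an index to exist at the given path; nothing is executed when the module is merely imported.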

Project Config (`trail-docs.toml`)

Optional project-level defaults:

library = "MyProject"
index_path = ".trail-docs/index.json"
manifest_path = ".trail-docs"
output = "json"

[trust]
policy = "trail-docs.policy.json"

[federation]
indexes = [".trail-docs/index.json", "../plugin/.trail-docs/index.json"]

Run trail-docs --help for full flags.

JSON Output

Example (extract):

{
  "library": "MyProject",
  "version": "1.0.0",
  "query": "How do I configure SSL?",
  "refs": ["docs/security#ssl-setup"],
  "items": [
    {
      "ref": "docs/security#ssl-setup",
      "unit_id": "unit_abc123",
      "type": "step",
      "text": "Enable TLS and point to cert + key paths.",
      "citation_id": "MyProject@1.0.0:docs/security#ssl-setup:10-30",
      "token_estimate": 22,
      "why_matched": ["token:ssl", "command:type"],
      "score": 0.8125,
      "score_components": {
        "lexical": 0.9,
        "heading_boost": 0.4,
        "symbol_boost": 0,
        "command_boost": 1,
        "novelty_penalty": 0,
        "token_cost_penalty": 0.18
      }
    }
  ],
  "budget_tokens": 700,
  "spent_tokens": 142,
  "remaining_tokens": 558
}

Full schema: docs/json_output_schema.md
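The citation IDs in the examples appear to follow the shape `library@version:ref:start-end`. The parser below is a sketch based only on that observed shape; the authoritative format is whatever the schema document specifies, and the `Citation` type and `parse_citation` name are hypothetical. Library names that themselves contain `@` are not handled.

```python
import re
from typing import NamedTuple


class Citation(NamedTuple):
    library: str
    version: str
    ref: str
    start_line: int
    end_line: int


# Pattern inferred from the example IDs in this README, not from a spec.
_PATTERN = re.compile(
    r"^(?P<library>[^@]+)@(?P<version>[^:]+):(?P<ref>.+):(?P<start>\d+)-(?P<end>\d+)$"
)


def parse_citation(citation_id: str) -> Citation:
    """Split a citation ID into its parts; raise ValueError if it does not match."""
    match = _PATTERN.match(citation_id)
    if match is None:
        raise ValueError(f"unrecognized citation id: {citation_id!r}")
    return Citation(
        match["library"],
        match["version"],
        match["ref"],
        int(match["start"]),
        int(match["end"]),
    )


citation = parse_citation("MyProject@1.0.0:docs/security#ssl-setup:10-30")
```

With the parts separated, an agent can reopen the cited section or check that the line range still exists after a rebuild.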

📊 Benchmark Snapshot

Latest comparable run:

full-2026-03-10T09-22-28-301Z (CI corpus set)
Same question set and judge across trail-docs, grep, and Context7

Tool	Mean Comprehension	Mean Tokens	Mean Latency
`trail-docs`	0.9767	418.33	6596.97 ms
`grep`	0.8250	884.00	3696.50 ms
`context7`	0.5683	2466.33	9983.95 ms

Pairwise (trail-docs vs grep):

Comprehension: +0.1517
Tokens: -465.67
Latency: +2900.47 ms (trail-docs is slower than grep)
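The headline percentages can be re-derived from the table above. The short script below does only that arithmetic; the numbers are copied from the table, and the function names are illustrative.

```python
# Mean values copied from the benchmark table above.
rows = {
    "trail-docs": {"comprehension": 0.9767, "tokens": 418.33, "latency_ms": 6596.97},
    "grep": {"comprehension": 0.8250, "tokens": 884.00, "latency_ms": 3696.50},
    "context7": {"comprehension": 0.5683, "tokens": 2466.33, "latency_ms": 9983.95},
}


def token_reduction_pct(tool: str, baseline: str) -> float:
    """Percentage of tokens saved by `tool` relative to `baseline`."""
    return 100 * (1 - rows[tool]["tokens"] / rows[baseline]["tokens"])


def comprehension_delta(tool: str, baseline: str) -> float:
    """Absolute difference in mean comprehension."""
    return rows[tool]["comprehension"] - rows[baseline]["comprehension"]


def relative_gain_pct(tool: str, baseline: str) -> float:
    """Comprehension delta as a percentage of the baseline score."""
    return 100 * comprehension_delta(tool, baseline) / rows[baseline]["comprehension"]


vs_grep = round(token_reduction_pct("trail-docs", "grep"), 1)
vs_context7 = round(token_reduction_pct("trail-docs", "context7"), 1)
latency_delta = round(rows["trail-docs"]["latency_ms"] - rows["grep"]["latency_ms"], 2)
```

The results match the figures quoted in the snapshot: 52.7% and 83.0% fewer tokens, with a latency cost of 2900.47 ms against grep.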

Artifacts:

eval/results/full-2026-03-10T09-22-28-301Z.summary.json
eval/results/full-2026-03-10T09-22-28-301Z.report.md

Safety Model for External Docs

Fetched documentation is treated as untrusted input.

fetch supports policy controls via trail-docs.policy.json:

{
  "allowed_hosts": ["registry.npmjs.org", "api.github.com", "github.com", "codeload.github.com"],
  "blocked_hosts": [],
  "allowed_extensions": [".md", ".markdown", ".mdx", ".txt"],
  "max_files": 2000,
  "max_total_bytes": 20971520
}

Source manifests track canonical URL, requested/resolved refs, integrity hash, fetch timestamp, and trust signals.
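As an illustration of how a policy like the one above might be evaluated, the sketch below checks a URL's host and file extension against the policy fields. It is an assumption-laden example, not the trail-docs enforcement logic: the `is_allowed` function is hypothetical, blocked hosts are assumed to take precedence, and the extension check is applied to the URL path, whereas the real tool may check files inside a fetched snapshot instead.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Policy copied from the trail-docs.policy.json example above.
POLICY = {
    "allowed_hosts": ["registry.npmjs.org", "api.github.com", "github.com", "codeload.github.com"],
    "blocked_hosts": [],
    "allowed_extensions": [".md", ".markdown", ".mdx", ".txt"],
    "max_files": 2000,
    "max_total_bytes": 20971520,
}


def is_allowed(url: str, policy: dict = POLICY) -> bool:
    """Return True if the URL's host and extension pass the policy."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host in policy["blocked_hosts"]:  # assumed: block list wins
        return False
    if host not in policy["allowed_hosts"]:
        return False
    suffix = PurePosixPath(parsed.path).suffix.lower()
    return suffix in policy["allowed_extensions"]


ok = is_allowed("https://github.com/axios/axios/blob/v1.x/README.md")
```

A fuller check would also enforce `max_files` and `max_total_bytes` while the snapshot is being downloaded.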

Performance

Build once, navigate many. Indexes are deterministic and cacheable.
Use --budget and --max-items to cap context growth.
Use --json for automation-safe outputs.
Reuse cached snapshots in .trail-docs/cache/sources for repeated external research.

Testing

npm test
npm run eval:smoke:ci

Covers deterministic builds, retrieval commands, manifest resolution, bootstrap flows, discovery/fetch, and eval harness metrics.

Documentation

Contributing

See CONTRIBUTING.md.

Trail Docs was shaped by agent feedback from the start: agents installed it, broke it, requested features, and other agents built those features. If you're an agent (or a human), we'd love contributions and feedback.

License

MIT — see LICENSE.

🥾 trail-docs is an Arkology Studio project.
