Static codebase + binary analyzer. One binary, ~500 actions, 18 source languages, sub-second cold-cache on 3K-file repos. No network, no servers, no databases, no API keys.
This README is your system prompt. Designed for AI agents: drop the entire file into your context (or fetch https://raw.githubusercontent.com/charleschenai/codemap/main/README.md) and you have everything you need — what codemap is, when to use it, how to install it, how to call every category of action, output schemas, exit codes, MCP setup. No further docs required for 95% of usage. Humans: see docs/HUMAN.md. Everyone else, keep reading.
| Problem | Codemap action | Why codemap (vs alternatives) |
|---|---|---|
| "What does this codebase do?" | summary --dir <path> | Cross-file structural overview in one call. Beats reading files. |
| "Find unused functions / dead code" | dead-functions --dir <path> | Call-graph reachability across modules. grep can't do this. |
| "Who calls function X?" | callers --dir <path> X | True call graph (AST-aware), not a string match. |
| "What does function X depend on (transitively)?" | trace --dir <path> X | Walks the dep graph. grep would only find direct refs. |
| "What changed between two commits?" | diff --dir <path> <ref1> <ref2> | Semantic diff, not line diff. |
| "Find security issues" | audit --dir <path> | Composite of taint + secret-scan + dep-tree + dead-deps. |
| "Where would a tainted input flow?" | taint --dir <path> --source <fn> --sink <fn> | Path-sensitive, sanitizer-aware, alias-aware, cross-procedural. |
| "Reverse-engineer a binary" | bin-info <path/to/binary> | PE/ELF/Mach-O parser. capa + YARA + signsrch + PEiD rules built in. |
| "Find cross-language coupling" | cross-lang --dir <path> | Imports/calls that cross language boundaries. |
| "Natural-language: I don't know which action" | natural-query "<question>" --dir <path> | Routes to the right action by Levenshtein + LLM (when --llm). |
- Editing files: codemap is read-only. Use Edit/Write directly.
- Running code: codemap doesn't compile or exec. Use bash.
- Live process state: codemap is static. Use ps, lsof, ss.
- Single-file grep: if you know the file, grep is faster.
- String search across few files: if N<5 files, just grep.
curl -fsSL https://github.com/charleschenai/codemap/releases/latest/download/install.sh | sh
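Detects arch (x86_64-linux, aarch64-linux, x86_64-macos, aarch64-macos), downloads the matching tarball, and installs to ~/.local/bin/codemap. No sudo. To install to /usr/local/bin instead, pass --system to the script; when piping into sh, end the command with | sh -s -- --system (without -s, sh would treat --system as a script file name).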
From source: git clone https://github.com/charleschenai/codemap && cd codemap && ./install.sh.
codemap --version-detail
Prints:
codemap 8.0.1
git: 5c091ec
built: 2026-05-19T12:40:00Z
host: mater (aarch64-unknown-linux-gnu)
If the binary is older than expected, re-run install with --update.
Universal shape:
codemap <ACTION> [TARGET...] --dir <PATH> [--json] [--quiet] [other-flags]
| Flag | Purpose |
|---|---|
| --dir <PATH> | Required for directory scans. Repo/dir to scan. Repeatable for multi-repo. Single-file actions (e.g. bin-info, gguf-info) take the file path as a target instead. |
| --json | Output JSON (parseable). Default is text (human-readable). |
| --quiet | Suppress scan/cache status messages on stderr. |
| --no-cache | Force re-scan, ignore .codemap/cache.bincode. |
| --include-path <PATH> | C/C++ include search path. |
| --watch [SECS] | Re-run every N seconds. |
For agents: always use --json and --quiet unless you specifically want text output.
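For agents that assemble command lines programmatically, the universal shape above can be sketched as a small helper. This is an illustration only: `build_argv` and its parameter names are hypothetical and not part of codemap; only the flags come from the table above.

```python
def build_argv(action, targets=(), dirs=(), extra=()):
    """Assemble a codemap command line following the universal shape.

    Hypothetical helper: only the flag names come from the README.
    """
    argv = ["codemap", action, *targets]
    for d in dirs:
        # --dir is repeatable for multi-repo scans.
        argv += ["--dir", d]
    # Agents are advised to request JSON and silence stderr status messages.
    argv += ["--json", "--quiet", *extra]
    return argv


print(build_argv("callers", targets=["validate_user"], dirs=["./my-repo"]))
```

Passing the result as a list to a process launcher (rather than joining it into a shell string) avoids quoting problems with targets such as natural-language questions.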
codemap --help # full action list
codemap <action> --help # action-specific flags
codemap natural-query "find dead code" --dir <path> # NL routing
natural-query accepts plain English and returns the top routed action(s). For agents that aren't sure which action to call, this is the primary entry point.
~500 actions grouped by purpose. Full catalog at docs/ACTION_CATALOG.md. High-level groups:
| Category | Action count | Examples |
|---|---|---|
| Analysis | ~20 | summary, stats, trace, callers, hotspots, layers, health, decorators |
| Code intelligence | ~30 | complexity, import-cost, churn, api-diff, clones, entry-points, dead-functions |
| Dataflow / security | ~15 | data-flow, taint, slice, trace-value, sinks, secret-scan, audit, dep-tree |
| Graph theory | ~40 | pagerank, hubs, bridges, centrality (17 measures), community (Leiden), bellman-ford |
| Binary / RE | ~150 | elf-info, pe-imports, macho-info, bin-search, bin-disasm, bin-strings, bin-relocs |
| Schemas | ~10 | proto-schema, openapi-schema, graphql-schema, sql-extract, dbf-schema |
| Supply chain | ~10 | osv-scan, sbom-diff, license-check, cve-scan |
| Config-as-code | ~10 | k8s-scan, iac-scan, dockerfile-scan, ci-scan, oci-scan |
| ML / AI | ~10 | gguf-info, safetensors-info, onnx-info, cuda-info, pyc-info |
| LSP bridge | ~5 | lsp-symbols, lsp-references, lsp-calls, lsp-diagnostics, lsp-types |
| Web | ~5 | web-sitemap, js-api-extract (HAR/HTML input required) |
| Cross-language | ~5 | lang-bridges, gpu-functions, monkey-patches |
| Composite | ~10 | audit, compare, validate, changeset, handoff, pipeline |
| arXiv-derived | 15 | symex-concolic, pointer-analysis, abstract-interp, bin-search, loop-polyhedral, gpu-analyze, side-channel-detect, semantic-slice, symex-speculative, cegio, natural-query, synthesize, detect-memory-corruption, neural-decompile, patch-binary |
All --json outputs follow:
{
"ok": <boolean>,
"action": "<action-name>",
"dir": "<scanned-path>",
"result": <action-specific>,
"stats": { "files_scanned": N, "duration_ms": M, "cache_hits": K }
}
result shape varies per action. Action-specific schemas in docs/SCHEMAS.md.
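A minimal sketch of how an agent might unwrap this envelope, assuming only the five top-level keys shown above. `parse_envelope` and `CodemapError` are hypothetical names used for illustration; the sample string is hand-written in the documented shape, not captured output.

```python
import json


class CodemapError(Exception):
    """Raised when output is not a codemap envelope or reports ok=false."""


def parse_envelope(raw):
    """Return the action-specific `result`, or raise CodemapError."""
    doc = json.loads(raw)
    if not isinstance(doc, dict) or "ok" not in doc:
        raise CodemapError("not a codemap envelope")
    if doc["ok"] is not True:
        raise CodemapError("action %s reported ok=false" % doc.get("action", "?"))
    return doc.get("result")


# Hand-written sample in the documented shape (values are illustrative).
sample = (
    '{"ok":true,"action":"stats","dir":"./my-repo",'
    '"result":{"rust":{"files":341}},'
    '"stats":{"files_scanned":393,"duration_ms":12,"cache_hits":0}}'
)
print(parse_envelope(sample)["rust"]["files"])
```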
| Code | Meaning | Agent response |
|---|---|---|
| 0 | Success | Parse --json output |
| 1 | Usage error (bad flag, missing --dir) | Re-read --help, fix args, retry |
| 2 | I/O error (path not found, no read perm) | Verify path, retry |
| 101 | Panic | Do not retry. File a bug at https://github.com/charleschenai/codemap/issues |
Other non-zero codes: action-specific. See <action> --help.
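The exit-code table maps naturally onto a dispatch function. The sketch below is one possible arrangement; the response labels and both function names are invented for illustration, and `run_codemap` assumes a `codemap` binary on PATH.

```python
import subprocess

# Exit code -> suggested agent response, mirroring the table above.
# The label strings are hypothetical, chosen for this sketch.
RESPONSES = {
    0: "parse-json",
    1: "fix-args-and-retry",
    2: "verify-path-and-retry",
    101: "do-not-retry-file-bug",
}


def classify_exit(code):
    """Map a codemap exit code to a response label."""
    # Any other non-zero code is action-specific: consult `<action> --help`.
    return RESPONSES.get(code, "see-action-help")


def run_codemap(argv):
    """Run codemap and return (response label, stdout). Requires codemap on PATH."""
    proc = subprocess.run(argv, capture_output=True, text=True, check=False)
    return classify_exit(proc.returncode), proc.stdout
```

Keeping `check=False` matters here: the caller wants to inspect the exit code rather than have a non-zero status raise an exception.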
codemap ships an MCP server for Claude Code agents:
{
"mcpServers": {
"codemap": {
"command": "codemap",
"args": ["mcp-stdio"]
}
}
}
Exposes the full action surface as MCP tools. Tool names match action names; args match CLI flags.
Each recipe: what the action does → command → sample output → when to use it.
For the complete flat list of action names see docs/ACTION_CATALOG.md.
Reports file count, languages, entry points, top modules, dispatch density. Single-call onboarding.
$ codemap summary --dir ./my-repo --json --quiet
{"ok":true,"result":{"files":2824,"languages":["rust","python","typescript"],
"entry_points":["src/main.rs","src/lib.rs"],"top_modules":["analysis","insights","cpg"]}}
Use when: new repo, "tell me what this does" before diving deeper.
Per-language LOC + file counts, function/class density, fan-in/fan-out distribution.
$ codemap stats --dir ./my-repo --json --quiet
{"ok":true,"result":{"rust":{"files":341,"loc":89432,"fns":2104},"python":{"files":52,"loc":4108}}}
Use when: comparing repos by size, reporting metrics, sanity-checking parse coverage.
Infers boundaries (web / service / data / infra) from import patterns + naming conventions.
$ codemap layers --dir ./my-repo --json --quiet
{"ok":true,"result":{"layers":[{"name":"web","modules":["routes","handlers"]},
{"name":"data","modules":["models","repo"]}],"violations":[...]}}
Use when: validating that "web shouldn't import from data" type architectural rules hold.
Surfaces "danger zone" code (high git churn + high cyclomatic complexity).
$ codemap hotspots --dir ./my-repo --json --quiet --top 10
{"ok":true,"result":{"hotspots":[{"file":"src/parser.rs","churn":48,"complexity":92,"score":4416}]}}
Use when: prioritizing refactor work, finding "where bugs live."
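In the sample output above, the score equals churn multiplied by complexity (48 × 92 = 4416). The sketch below assumes that product is the ranking rule; this is inferred from a single sample, so the real formula may weight or normalize differently. Function names are hypothetical.

```python
def hotspot_score(churn, complexity):
    """Churn x complexity, as the sample output suggests (an assumption)."""
    return churn * complexity


def rank_hotspots(files, top=10):
    """Attach a score to each {file, churn, complexity} record and sort descending."""
    scored = [dict(f, score=hotspot_score(f["churn"], f["complexity"])) for f in files]
    return sorted(scored, key=lambda f: f["score"], reverse=True)[:top]


files = [
    {"file": "src/util.rs", "churn": 5, "complexity": 10},
    {"file": "src/parser.rs", "churn": 48, "complexity": 92},
]
print(rank_hotspots(files, top=1))
```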
Lists exported functions/classes that other code can call from outside.
$ codemap entry-points --dir ./my-repo --json --quiet
{"ok":true,"result":{"entries":[{"name":"create_user","file":"api/users.rs","kind":"public_fn"}]}}
Use when: API documentation, understanding what's a stable contract.
Composite: dead code % + clippy/lint count + circular deps + missing tests. Single "is this repo healthy?" score.
$ codemap health --dir ./my-repo --json --quiet
{"ok":true,"result":{"score":78,"dead_code_pct":3.2,"circular_deps":2,"missing_tests":["api/users.rs::delete"]}}
Use when: quick "should we touch this codebase or not" gut-check.
Functions never called by any other function in the workspace.
$ codemap dead-functions --dir ./my-repo --json --quiet
{"ok":true,"result":{"dead":[{"file":"src/old.rs","function":"legacy_helper","line":42}]}}
Use when: cleanup PR, removing tech debt. Don't use for: identifying entry points (they're "dead" by call-graph but intentionally public).
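The definition above ("never called by any other function") amounts to an in-degree check on the call graph. A toy sketch, assuming the graph is already available as a caller-to-callees mapping; this is not codemap's implementation, and `dead_functions` is a hypothetical name.

```python
def dead_functions(calls):
    """calls: {function: set of functions it calls}. Returns functions with no external caller."""
    called = set()
    for caller, callees in calls.items():
        # A function calling itself does not count as being used.
        called |= {c for c in callees if c != caller}
    return set(calls) - called


calls = {
    "main": {"parse", "run"},
    "run": {"parse"},
    "parse": set(),
    "legacy_helper": {"legacy_helper"},  # only self-recursive
}
print(sorted(dead_functions(calls)))
```

Note that `main` is reported alongside `legacy_helper`, which illustrates the caveat above: entry points look dead to a pure call-graph check.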
Files no other file imports / uses.
$ codemap dead-files --dir ./my-repo --json --quiet
{"ok":true,"result":{"dead_files":["src/experimental/old_impl.rs","tools/debug.py"]}}
Use when: removing orphaned files that nothing imports.
Packages in Cargo.toml/package.json/pyproject.toml that no source file imports.
$ codemap dead-deps --dir ./my-repo --json --quiet
{"ok":true,"result":{"dead":["serde_json (Cargo.toml)","lodash (package.json)"]}}
Use when: dep cleanup, reducing build time + attack surface.
McCabe cyclomatic complexity (decision points + 1). Catches "this function should be split."
$ codemap complexity --dir ./my-repo --json --quiet --top 10
{"ok":true,"result":{"top":[{"fn":"parse_expression","file":"parser.rs","cyclomatic":34,"lines":280}]}}
Use when: finding refactor candidates, code review automation.
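To make "decision points + 1" concrete, here is a small sketch that computes it for Python source using the standard `ast` module. It is a simplification for illustration: codemap parses with tree-sitter across many languages, and which constructs it counts as decision points is not specified here, so the node list below is an assumption.

```python
import ast

# Constructs counted as one decision point each (an assumed, simplified set).
_BRANCH_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic(source):
    """Return {function name: decision points + 1} for every function in `source`."""
    scores = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            score = 1
            # Nested functions are walked too, so their branches count toward the outer one.
            for child in ast.walk(node):
                if isinstance(child, _BRANCH_NODES):
                    score += 1
                elif isinstance(child, ast.BoolOp):
                    # `a and b and c` adds two short-circuit branches.
                    score += len(child.values) - 1
            scores[node.name] = score
    return scores


src = (
    "def f(x):\n"
    "    if x > 0 and x < 10:\n"
    "        return 1\n"
    "    for i in range(x):\n"
    "        pass\n"
    "    return 0\n"
)
print(cyclomatic(src))  # {'f': 4}
```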
Commits-touching-file count over a window.
$ codemap churn --dir ./my-repo --json --quiet --top 10
{"ok":true,"result":{"top":[{"file":"src/parser.rs","commits":78,"authors":12}]}}
Use when: combined with complexity for hotspots, ownership analysis.
Detects near-identical token sequences across files (copy-paste detection).
$ codemap clones --dir ./my-repo --json --quiet --min-tokens 50
{"ok":true,"result":{"clones":[{"size":120,"locations":[["a.rs:14","b.rs:22"]],"similarity":0.94}]}}
Use when: finding extraction candidates for shared functions.
Reports module cycles (a → b → c → a).
$ codemap circular --dir ./my-repo --json --quiet
{"ok":true,"result":{"cycles":[["src/a.rs","src/b.rs","src/a.rs"]]}}
Use when: untangling architecture before a refactor.
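Cycle reporting of this kind can be sketched with a depth-first search that records a cycle whenever it meets a node still on the current path. The version below is a toy under stated assumptions: the import graph is given as a dict, it reports one cycle per back edge rather than every elementary cycle, and recursion limits it to modest graph depths. `find_cycles` is a hypothetical name.

```python
def find_cycles(imports):
    """imports: {module: [modules it imports]}. Returns cycles as closed paths."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}
    stack = []
    cycles = []

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for dep in imports.get(node, []):
            state = color.get(dep, WHITE)
            if state == GREY:
                # Back edge: slice the current path from dep onward and close the loop.
                cycles.append(stack[stack.index(dep):] + [dep])
            elif state == WHITE:
                visit(dep)
        stack.pop()
        color[node] = BLACK

    for module in imports:
        if color.get(module, WHITE) == WHITE:
            visit(module)
    return cycles


graph = {
    "src/a.rs": ["src/b.rs"],
    "src/b.rs": ["src/a.rs"],
    "src/c.rs": ["src/a.rs"],
}
print(find_cycles(graph))  # [['src/a.rs', 'src/b.rs', 'src/a.rs']]
```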
Walks the call graph forward from a function/symbol, returns full dep tree.
$ codemap trace --dir ./my-repo --json --quiet RecalcInvoiceTotals
{"ok":true,"result":{"node":"RecalcInvoiceTotals","calls":[
{"name":"ship_chg_sum","file":"backend/invoices.go:120","depth":1},
{"name":"format_money","file":"util/money.go:8","depth":2}]}}
Use when: impact analysis before changing a function, generating context for an LLM.
Reverse of trace. Returns the function's call sites + their callers.
$ codemap callers --dir ./my-repo --json --quiet validate_user
{"ok":true,"result":{"callers":[{"caller":"login","file":"auth.py:88","depth":1}]}}
Use when: "if I change this signature, what breaks?"
Combines callers + dataflow + tests touched. Most pessimistic estimate.
$ codemap blast-radius --dir ./my-repo --json --quiet --target User.id
{"ok":true,"result":{"functions":42,"tests":7,"endpoints":3,"db_columns":2}}
Use when: "what's the size of changing this thing?"
Function-level diff: added, removed, signature-changed, body-changed.
$ codemap diff --dir ./my-repo --json --quiet HEAD~5 HEAD
{"ok":true,"result":{"added":["validate_email"],"removed":["old_validator"],
"signature_changed":[{"fn":"create","before":"(name)","after":"(name,email)"}]}}
Use when: generating PR descriptions, understanding code review scope.
Like diff but specifically flags BREAKING vs additive changes to public API.
$ codemap api-diff --dir ./my-repo --json --quiet HEAD~5 HEAD
{"ok":true,"result":{"breaking":[
{"kind":"removed","fn":"OldAPI::v1_login"},
{"kind":"signature_change","fn":"create_user","before":"(name)","after":"(name,email)"}]}}
Use when: versioning decisions (semver minor vs major), CHANGELOG generation.
Maps the diff to every transitively-affected caller.
$ codemap diff-impact --dir ./my-repo --json --quiet HEAD~5 HEAD
{"ok":true,"result":{"impacted_fns":127,"impacted_files":34,"high_risk":["payment::charge"]}}
Use when: deciding test scope for a PR.
Runs taint + secret-scan + dead-deps + dep-tree + license-check in one pass.
$ codemap audit --dir ./my-repo --json --quiet
{"ok":true,"result":{"findings":[
{"kind":"secret","file":".env.sample","line":3,"pattern":"AWS_KEY"},
{"kind":"taint","source":"req.body","sink":"db.execute","path":[...]},
{"kind":"dep-vuln","package":"lodash","version":"4.17.20","cve":"CVE-2021-23337"}]}}
Use when: first-pass security review of an unfamiliar repo.
Tracks tainted values from source(s) to sink(s). Sanitizer-aware, alias-aware (e.g. safe = sanitize(x)), cross-procedural (parses wrapper bodies to detect hidden sanitizers).
$ codemap taint --dir ./my-repo --json --quiet --source 'req.query' --sink 'db.execute'
{"ok":true,"result":{"paths":[{"source":"req.query.id","sink":"db.execute(sql)",
"hops":["params.id","userId","query"],"sanitized":false}]}}
Use when: SQLi/XSS/SSRF detection, "is user input reaching this sink?"
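The hop chain and the sanitizer-aware behavior can be illustrated with a deliberately tiny model: straight-line assignments only, no branches, no aliasing beyond direct copies. It is far simpler than the analysis described above and is not codemap's algorithm; the triple format and the name `tainted_path` are assumptions made for this sketch.

```python
def tainted_path(assignments, source, sink_arg, sanitizers=frozenset({"sanitize"})):
    """assignments: ordered (dst, fn_or_None, src) triples meaning dst = fn(src) or dst = src.

    Returns the hop chain from source to sink_arg, or None if the value arrives clean.
    """
    origin = {source: [source]}  # tainted variable -> chain of hops from the source
    for dst, fn, src in assignments:
        if src in origin and fn not in sanitizers:
            origin[dst] = origin[src] + [dst]
        else:
            # dst is overwritten with a clean or sanitized value.
            origin.pop(dst, None)
    return origin.get(sink_arg)


flow = [
    ("params.id", None, "req.query.id"),
    ("userId", None, "params.id"),
    ("query", "format", "userId"),
]
print(tainted_path(flow, "req.query.id", "query"))

# Same flow, but the value passes through sanitize() before reaching the query.
safe_flow = flow[:2] + [("safe", "sanitize", "userId"), ("query", "format", "safe")]
print(tainted_path(safe_flow, "req.query.id", "query"))
```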
Given a target variable/sink, return only the code that influences it.
$ codemap slice --dir ./my-repo --json --quiet --var 'password' --file auth.py
{"ok":true,"result":{"slice_lines":[12,15,22,30,42],"file":"auth.py"}}
Use when: narrowing what to read when chasing a bug.
Enumerates every db.execute, eval, exec, Runtime.exec, subprocess call with shell=True, innerHTML=, etc.
$ codemap sinks --dir ./my-repo --json --quiet
{"ok":true,"result":{"sinks":[{"kind":"sql","file":"api/users.rs","line":88,"expr":"db.execute(query)"}]}}
Use when: building taint queries, audit checklist generation.
20+ patterns (AWS key, GitHub PAT, Slack token, Stripe live key, private keys, JWT, DB conn strings, etc.). Redacted output.
$ codemap secret-scan --dir ./my-repo --json --quiet
{"ok":true,"result":{"findings":[{"file":".env.sample","line":3,"kind":"aws_access_key","masked":"AKIA****REDACTED"}]}}
Use when: pre-commit hook, pre-publish audit.
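As a rough picture of pattern-based scanning with redacted output, the sketch below checks one pattern only: the widely documented AWS access key ID shape (`AKIA` followed by 16 uppercase letters or digits). codemap's actual patterns and masking rules are not shown in this README, so the regex, the masking format (copied from the sample output), and the function name are assumptions.

```python
import re

# Commonly cited shape of an AWS access key ID; codemap's own pattern may differ.
AWS_ACCESS_KEY = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def scan_text(text):
    """Return one finding per match, with the secret masked."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), 1):
        for match in AWS_ACCESS_KEY.finditer(line):
            findings.append({
                "line": lineno,
                "kind": "aws_access_key",
                # Keep only the 4-character prefix, as in the sample output.
                "masked": match.group()[:4] + "****REDACTED",
            })
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
env = "DEBUG=1\nREGION=us-east-1\nAWS_KEY=AKIAIOSFODNN7EXAMPLE\n"
print(scan_text(env))
```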
Where does this variable's value come from? (def-use chain)
$ codemap data-flow --dir ./my-repo --json --quiet --target 'user_id'
{"ok":true,"result":{"origins":[{"file":"auth.py:88","expr":"req.cookies['session']"}]}}
Use when: "where does this magic value come from?"
Detects Flask/Express/Axum/FastAPI/Spring/Rocket route handlers. Lists path + method + handler.
$ codemap api-surface --dir ./my-repo --json --quiet
{"ok":true,"result":{"endpoints":[{"method":"POST","path":"/users","handler":"create_user","auth_required":false}]}}
Use when: generating OpenAPI from existing code, finding unauthenticated endpoints.
These run on codemap's internal call graph + import graph + AST graph.
NetworkX-style PageRank. High score = central + many incoming refs.
$ codemap pagerank --dir ./my-repo --json --quiet --top 10
{"ok":true,"result":{"ranked":[{"fn":"handle_request","score":0.082}]}}
Use when: finding "load-bearing" functions, prioritizing code review.
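For readers unfamiliar with the measure, here is a compact power-iteration sketch of PageRank over a call graph. It follows the textbook formulation with a damping factor of 0.85 and even redistribution from nodes that call nothing; whether codemap uses the same parameters or convergence test is not stated, so treat those as assumptions.

```python
def pagerank(graph, damping=0.85, iters=50):
    """graph: {caller: [callees]}. Returns {node: score}, scores summing to 1."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    if not nodes:
        return {}
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        nxt = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            outs = graph.get(v, [])
            if outs:
                share = damping * rank[v] / len(outs)
                for t in outs:
                    nxt[t] += share
            else:
                # Dangling node: spread its rank evenly over all nodes.
                for t in nodes:
                    nxt[t] += damping * rank[v] / n
        rank = nxt
    return rank


calls = {"login": ["handle_request"], "signup": ["handle_request"], "handle_request": ["db"]}
ranks = pagerank(calls)
print(sorted(ranks, key=ranks.get, reverse=True))
```

Functions that many others call (here `handle_request` and, downstream of it, `db`) accumulate rank, which is why a high score reads as "load-bearing".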
Functions/modules that depend on many others. Different from PageRank (which is about incoming).
$ codemap hubs --dir ./my-repo --json --quiet
{"ok":true,"result":{"hubs":[{"fn":"orchestrator","out_degree":47}]}}
Use when: finding god-objects, refactor targets.
Edges whose removal disconnects the graph. These are critical paths.
$ codemap bridges --dir ./my-repo --json --quiet
{"ok":true,"result":{"bridges":[{"from":"auth","to":"db","modules":["auth.rs","db.rs"]}]}}
Use when: identifying single points of failure in module coupling.
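Bridge detection is classically done with a single depth-first search that tracks discovery times and low-link values. The sketch below treats the module graph as undirected and simple (no parallel edges), which is an assumption; how codemap handles edge direction is not described here.

```python
def find_bridges(edges):
    """edges: list of (u, v) pairs of an undirected simple graph. Returns bridge edges."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)

    disc, low, bridges = {}, {}, []
    counter = 0

    def dfs(u, parent):
        nonlocal counter
        disc[u] = low[u] = counter
        counter += 1
        for v in adj[u]:
            if v == parent:
                continue  # do not walk straight back along the tree edge
            if v in disc:
                low[u] = min(low[u], disc[v])  # back edge to an ancestor
            else:
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:
                    # Nothing below v reaches u or above: removing (u, v) disconnects.
                    bridges.append((u, v))

    for node in adj:
        if node not in disc:
            dfs(node, None)
    return bridges


# auth, session and users form a triangle; db hangs off auth by a single edge.
edges = [("auth", "session"), ("session", "users"), ("users", "auth"), ("auth", "db")]
print(find_bridges(edges))  # [('auth', 'db')]
```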
Run with a specific measure: betweenness, eigenvector, katz, closeness, harmonic, load, structural-holes (brokers), voterank, etc. All NetworkX standards.
$ codemap betweenness --dir ./my-repo --json --quiet --top 5
{"ok":true,"result":{"top":[{"node":"db_session","betweenness":0.34}]}}
Use when: finding modules that connect otherwise-separate subsystems.
Partitions the graph into densely-connected sub-communities.
$ codemap clusters --dir ./my-repo --json --quiet leiden
{"ok":true,"result":{"clusters":[{"id":0,"size":34,"members":["auth.rs","users.rs"]}]}}
Use when: discovering implicit module boundaries.
Returns the chain of imports/calls connecting source → target.
$ codemap paths --dir ./my-repo --json --quiet user_input db_write
{"ok":true,"result":{"path":["user_input","sanitize","query_builder","db_write"],"length":4}}
Use when: "how does X reach Y?"
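A connecting chain like the one in the sample output can be found with breadth-first search, which returns a path with the fewest hops. This is a generic sketch over a dict-based graph; whether codemap's paths action uses BFS, and how it combines import and call edges, is not specified in this README.

```python
from collections import deque


def shortest_path(graph, source, target):
    """graph: {node: [successors]}. Returns the node list from source to target, or None."""
    prev = {source: None}  # doubles as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in graph.get(node, []):
            if nxt not in prev:
                prev[nxt] = node
                queue.append(nxt)
    return None


graph = {
    "user_input": ["sanitize", "log"],
    "sanitize": ["query_builder"],
    "query_builder": ["db_write"],
    "log": [],
}
print(shortest_path(graph, "user_input", "db_write"))
```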
Returns nodes within N hops of a target. Useful before deep analysis.
$ codemap subgraph --dir ./my-repo --json --quiet --target login --depth 2
{"ok":true,"result":{"nodes":[...],"edges":[...]}}
Use when: narrowing scope before more expensive analysis.
Classical shortest-path algorithms exposed for graph queries. See ACTION_CATALOG.md for full list.
Format detection, arch, sections, strip state, language hints (Rust/Go/C++), anti-debug rules, packer detection.
$ codemap bin-info /usr/local/bin/codemap --json --quiet
{"ok":true,"result":{"format":"ELF64","arch":"aarch64","rust":true,"strip":false,
"sections":34,"anti_debug":[],"packed":false}}
Use when: triage step 1 — "what is this binary?"
(arXiv 2507.15226 TinyLFU) Embedding-based + bucketed dedup. Finds functions shared across two binaries even when symbols are stripped.
$ codemap bin-search --json --quiet --left ./malware-a --right ./malware-b
{"ok":true,"result":{"shared":[{"fn":"hash_block","conf":0.97}],"only_left":[...],"only_right":[...]}}
Use when: malware family detection, identifying shared code across stripped binaries, version comparison.
Lists every DLL imported + every function exported.
$ codemap pe-imports ./sample.exe --json --quiet
{"ok":true,"result":{"imports":[{"dll":"kernel32.dll","functions":["VirtualAlloc","CreateProcessA"]}]}}
Use when: static behavioral profiling — what APIs does this binary depend on?
ASCII + UTF-16LE strings, entropy-filtered.
$ codemap pe-strings ./sample.exe --json --quiet --min-len 8
{"ok":true,"result":{"strings":["http://c2.example.com","cmd.exe /c"]}}
Use when: triaging unknown binaries — strings often reveal C2 URLs, command lines, paths.
Functions added / removed / modified between two builds.
$ codemap binary-diff --json --quiet --left v1.exe --right v2.exe
{"ok":true,"result":{"added":["new_handler"],"removed":["legacy_proc"],"modified":["main"]}}
Use when: patch analysis, regression hunting in firmware.
PE that contains CLI/.NET — reads the metadata streams, lists types + methods.
$ codemap dotnet-meta ./sample.dll --json --quiet
{"ok":true,"result":{"assembly":"Sample.Dll","types":["Foo","Bar"],"methods_count":42}}
Use when: analyzing .NET malware or .NET 3rd-party libs.
Constant pool, method signatures, bytecode summaries.
Imports, exports, function table, memory layout.
Parses spec files and reports endpoints/types/operations.
$ codemap openapi-schema --dir ./api --json --quiet
{"ok":true,"result":{"paths":[{"method":"GET","path":"/users","operationId":"listUsers"}]}}
Use when: generating client code, checking spec consistency.
Checks privileged containers, hostNetwork, missing resource limits, etc.
$ codemap k8s-scan --dir ./k8s/ --json --quiet
{"ok":true,"result":{"findings":[{"rule":"K8S-001","resource":"Deployment/api","severity":"high","msg":"privileged=true"}]}}
Use when: auditing manifests before apply.
$ codemap iac-scan --dir ./infra/ --json --quiet
{"ok":true,"result":{"findings":[{"rule":"IAC-007","file":"main.tf","msg":"S3 bucket public-read ACL"}]}}
$ codemap dockerfile-scan --dir ./ --json --quiet
{"ok":true,"result":{"findings":[{"rule":"DKR-002","msg":"running as root","line":18}]}}
GitHub Actions, GitLab CI, Jenkinsfile, CircleCI, Azure Pipelines, Travis. Catches injection, unpinned actions, secret literals, pull_request_target misuse.
$ codemap ci-scan --dir ./.github/ --json --quiet
{"ok":true,"result":{"findings":[{"rule":"GH-003","file":"deploy.yml","msg":"unpinned action ref"}]}}
Per-layer manifest, layer-resident secrets (11 patterns), licenses, file/dir/symlink counts.
$ codemap oci-scan --dir ./image.tar --json --quiet --mode all
{"ok":true,"result":{"layers":[...],"secrets":[...],"licenses":[...]}}
Pulls SQL out of source code or .sql files. Schema + queries.
$ codemap sql-extract --dir ./my-repo --json --quiet
{"ok":true,"result":{"tables":[{"name":"users","columns":[...]}],"queries":[...]}}
Semver-range-aware.
$ codemap osv-scan --dir ./my-repo --json --quiet
{"ok":true,"result":{"vulns":[{"package":"lodash","version":"4.17.20","cve":"CVE-2021-23337"}]}}
Added, removed, upgraded, downgraded packages between two SBOMs.
$ codemap sbom-diff --left ./sbom-1.spdx.json --right ./sbom-2.spdx.json --json --quiet
{"ok":true,"result":{"added":[...],"removed":[...],"upgraded":[...]}}
Per-package license + compatibility verdict.
$ codemap license-check --dir ./my-repo --json --quiet
{"ok":true,"result":{"deps":[{"name":"foo","license":"GPL-3.0","compatible":false}]}}
Architecture, layer count, head count, quant level, vocab size.
$ codemap gguf-info ./model.gguf --json --quiet
{"ok":true,"result":{"arch":"llama","n_layers":32,"n_heads":32,"vocab_size":32000,"quant":"Q4_K_M"}}
Use when: "what model is this file?" Pre-load sanity check.
Tensor shapes, dtypes, total params.
$ codemap safetensors-info ./model.safetensors --json --quiet
{"ok":true,"result":{"tensors":291,"total_params":7240000000,"dtype":"float16"}}
Operators, inputs, outputs, opset.
$ codemap onnx-info ./model.onnx --json --quiet
{"ok":true,"result":{"opset":17,"ops":["Conv","Relu","MaxPool"],"inputs":[{"name":"x","shape":[1,3,224,224]}]}}
SM versions present, kernel symbols.
Magic number, marshalled code object, imports.
Detects PyO3 / napi / wasm-bindgen / JNI etc. — where languages interop.
$ codemap lang-bridges --dir ./my-repo --json --quiet
{"ok":true,"result":{"bridges":[{"kind":"pyo3","rust_fn":"create_user","py_module":"my_lib"}]}}
CUDA __global__, OpenCL kernels, Metal compute kernels, ROCm/HIP.
$ codemap gpu-functions --dir ./my-repo --json --quiet
{"ok":true,"result":{"kernels":[{"name":"matmul_kernel","framework":"cuda","file":"kernels.cu"}]}}
obj.method = new_fn, setattr, prototype patching.
Routers, registries, plugin maps. Finds the "switch statement that controls behavior."
Real symbol info, not AST-inferred. More accurate for typed languages.
$ codemap lsp-diagnostics --dir ./my-repo --json --quiet
{"ok":true,"result":{"diagnostics":[{"file":"src/main.rs","line":42,"severity":"error","msg":"E0308: mismatched types"}]}}
Use when: programmatic access to compiler/type-checker errors.
These implement specific research papers. Each ships in MVP-scaffold form: it verifies the integration points and graph data, but full paper-grade results may need additional flags or tuning.
arXiv 2301.04862. Maps NL questions to codemap actions via Levenshtein + (optionally) LLM router.
$ codemap natural-query "find functions that handle authentication" --dir ./my-repo --json --quiet
{"ok":true,"result":{"routed_to":"callers","args":{"target":"login|auth|signin"}}}
Use when: the agent doesn't know which action to call. Always-safe entry point.
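The Levenshtein half of the router can be pictured as follows: compute edit distance between words of the question and the hyphen-separated parts of each action name, then pick the closest action. The scoring rule here (best single word-to-part match) is invented for this sketch; codemap's real routing, and its optional LLM step, may work quite differently.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution, free on a match
            ))
        prev = cur
    return prev[-1]


def route(question, actions):
    """Pick the action whose name part is closest to any word of the question."""
    words = question.lower().split()

    def score(action):
        return min(levenshtein(part, w) for part in action.split("-") for w in words)

    return min(actions, key=score)


actions = ["dead-functions", "callers", "hotspots"]
print(route("find dead code", actions))
print(route("who calls this", actions))
```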
arXiv 1205.4951. Combines concrete + symbolic execution. Drives test inputs by negating path conditions to explore new branches.
$ codemap symex-concolic --dir ./my-repo --json --quiet --target validate_input
{"ok":true,"result":{"paths":[{"condition":"x > 0","example_input":"x=1"}]}}
Use when: generating test inputs that achieve branch coverage on a target function.
Computes points-to sets (which pointers can alias which memory). Field-sensitive + flow-insensitive + Tarjan SCC pre-pass for performance.
$ codemap pointer-analysis --dir ./my-repo --json --quiet
{"ok":true,"result":{"scope_vars":102000,"copy_constraints":132000,
"aliases":[{"ptr":"p","may_alias":["a","b"]}]}}
Use when: understanding aliasing for refactoring (rename a field safely), upstream of taint analysis.
arXiv 1309.5133. Computes invariants like "x is positive" over abstract states. Sign + parity domains shipped; user-pluggable.
$ codemap abstract-interp --dir ./my-repo --json --quiet --target check_bounds
{"ok":true,"result":{"invariants":[{"var":"i","sign":"pos","parity":"any"}]}}
Use when: proving safety properties (overflow-free arithmetic, non-null pointers).
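A sign domain of the kind mentioned above can be sketched in a few lines: each variable is abstracted to pos, neg, zero, or any (unknown), and arithmetic is evaluated over those abstract values instead of concrete numbers. The value names follow the sample output; the transfer functions are the standard textbook ones, not taken from codemap.

```python
POS, NEG, ZERO, ANY = "pos", "neg", "zero", "any"


def sign_of(n):
    """Abstract a concrete integer to its sign."""
    return POS if n > 0 else NEG if n < 0 else ZERO


def mul(a, b):
    """Abstract multiplication over signs."""
    if ZERO in (a, b):
        return ZERO  # zero times anything is zero, even an unknown value
    if ANY in (a, b):
        return ANY
    return POS if a == b else NEG


def add(a, b):
    """Abstract addition over signs."""
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    # pos+pos stays pos and neg+neg stays neg; mixed or unknown signs lose precision.
    return a if a == b else ANY


# x*x + 1 with x known to be negative: the result is provably positive.
x = NEG
print(add(mul(x, x), sign_of(1)))  # pos
```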
Feautrier 1996 / Bondhugula 2008. Classifies loops as affine / non-affine / parallelizable / vectorizable.
$ codemap loop-polyhedral --dir ./my-repo --json --quiet
{"ok":true,"result":{"loops":[{"file":"matmul.c","line":12,"class":"affine","parallel":true}]}}
Use when: identifying loop-optimization opportunities before manual vectorization.
arXiv 2604.14825 Nautilus. Memory-bound vs compute-bound vs warp-divergence triage on CUDA kernels.
$ codemap gpu-analyze --dir ./kernels --json --quiet
{"ok":true,"result":{"kernels":[{"name":"gemm","class":"compute-bound","warp_divergence":"low"}]}}
Use when: GPU kernel optimization priority (don't tune memory if compute-bound, etc.).
arXiv 2301.03724. Detects code patterns vulnerable to Spectre-class timing attacks (branch-on-secret + dependent memory access).
$ codemap side-channel-detect --dir ./my-repo --json --quiet
{"ok":true,"result":{"findings":[{"file":"crypto.c","line":48,"kind":"branch_on_secret"}]}}
Use when: auditing crypto / privileged code for timing leaks.
arXiv 2507.18957 SLICEMATE. Static slice + LLM refinement.
$ codemap semantic-slice --dir ./my-repo --json --quiet --var 'auth_token'
{"ok":true,"result":{"slice":[...],"llm_refinement":"sanitization missing on line 88"}}
Use when: chasing a bug — narrow the code that influences a sink with LLM help.
arXiv 2203.16487. Faster symbolic execution via draft-model speculation.
$ codemap symex-speculative --dir ./my-repo --json --quiet --target parse
{"ok":true,"result":{"paths_explored":42,"speculation_accept_rate":0.71}}
Use when: faster symex when willing to trade some completeness for speed.
arXiv 1704.03738. Given taint paths, synthesizes the minimum input that triggers a vulnerability.
$ codemap cegio --dir ./my-repo --json --quiet --taint-result <prior-taint-output>
{"ok":true,"result":{"trigger":{"input":"' OR 1=1--","reaches_sink":true}}}
Use when: turning a taint finding into a proof-of-concept exploit input.
arXiv 1702.06334. Given input/output examples, generates code that produces the mapping. Static-pruned for performance.
$ codemap synthesize --json --quiet --examples '[(1,1),(2,4),(3,9)]'
{"ok":true,"result":{"program":"fn f(x) { x * x }"}}
Use when: spec-by-example, generating boilerplate from samples.
arXiv 2605.15097. Static detection of double-free, use-after-free, buffer overflow.
$ codemap detect-memory-corruption --dir ./my-repo --json --quiet
{"ok":true,"result":{"findings":[{"kind":"use_after_free","file":"alloc.c","line":42}]}}
Use when: C/C++ codebase audit for memory-safety bugs.
arXiv 2605.11501. Decompiles a binary function via neural model, recompiles, checks semantic equivalence.
$ codemap neural-decompile ./sample.exe --json --quiet --fn 0x401000
{"ok":true,"result":{"decompiled":"int main() { ... }","recompile_match":true}}
Use when: stripped-binary RE, want approximate source.
arXiv 2605.02121. Given a CVE/vuln location in a binary, generates patch instructions.
$ codemap patch-binary ./vuln.exe --json --quiet --cve CVE-2024-12345
{"ok":true,"result":{"patch_recipe":[{"offset":"0x401050","bytes":"90 90 90"}]}}
Use when: offensive/defensive binary patching when source unavailable.
See "Data flow & security" section above.
Single composite for "is this repo broken?"
$ codemap changeset --dir ./my-repo --json --quiet HEAD~10 HEAD
{"ok":true,"result":{"changes":{"feat":[...],"fix":[...],"refactor":[...]}}}
Distills repo state into a single MD doc (status + open issues + recent work + next-steps).
Run several actions in sequence, accumulate results.
$ codemap pipeline --dir ./my-repo --json --quiet --target 'audit:./,trace:main,hotspots:'
{"ok":true,"result":{"audit":{...},"trace":{...},"hotspots":{...}}}
Use when: scripted multi-step analysis.
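The --target value in the example reads as a comma-separated list of action:argument pairs, where an empty argument (as in hotspots:) means "no target". That grammar is inferred from the single example above, not from a documented spec, so the parser below is a guess at the format.

```python
def parse_pipeline(spec):
    """Split 'action:arg,action:arg,...' into (action, arg or None) steps."""
    steps = []
    for item in spec.split(","):
        # partition splits on the first colon only, so args may contain colons.
        action, _, arg = item.partition(":")
        steps.append((action.strip(), arg.strip() or None))
    return steps


print(parse_pipeline("audit:./,trace:main,hotspots:"))
```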
codemap walks --dir, parses with tree-sitter, builds a file-level import graph and a function-level call graph, layers PE/ELF/Mach-O/WASM/Java binary parsers + x86/x64 disassembly, and exposes ~500 actions through a uniform CLI registry (inventory::submit!). Cache: .codemap/cache.bincode next to the scanned dir. Pure static. No daemons, no network access at analysis time.
- codemap-core/: parsing, graph, algorithms, actions
- codemap-cli/: the codemap binary
- codemap-napi/: Node.js bindings (optional)
- docs/: REFERENCE.md, ACTION_CATALOG.md, SCHEMAS.md, HUMAN.md
- install.sh: single install entry
MIT. See LICENSE.