jcg builds static Java call graphs, aggregates them across hierarchy levels, runs clustering, and serves an interactive graph UI.
This implementation provides:
- CLI: `analyze`, `serve`, `cluster`, `export`
- Stage-based pipeline with manifests and cache reuse
- Build-system detection (`gradle`, `maven`, fallback)
- Fast static call graph extraction from source (approximate)
- Aggregated graphs: method/class/package/module
- Clustering algorithm names: `leiden`, `louvain`, `infomap`, `lpa` (with deterministic fallback)
- UI + REST API for exploration and reclustering
- LLM retrieval artifacts (`context_index.jsonl`, cluster summaries, entrypoints)
pip install -e .
or run directly:
PYTHONPATH=src python -m jcg.cli --help
Analyze a local project and serve UI:
jcg analyze --path /path/to/java-repo --mode class --serve --port 8765 --open
Analyze a remote repo URL:
jcg analyze --repo https://github.com/org/project --mode package --cluster-algos leiden,louvain
Serve an existing analysis:
jcg serve --input out/<project_fingerprint>
Recluster an existing graph:
jcg cluster --input out/<project_fingerprint> --level class --algo leiden,louvain --resolution 1.2 --seed 7
Export graphs:
jcg export --input out/<project_fingerprint> --format graphml --level all
jcg is mostly local-first, but not fully offline in all modes.
What stays local:
- Static extraction, aggregation, clustering, and artifact generation run locally.
- Analysis outputs are written to local disk under `out/<project_fingerprint>/`.
- The built-in API/UI server binds to `127.0.0.1` (localhost) only.
- This package has no telemetry/analytics dependencies in `pyproject.toml`.
Where network access can happen:
- `jcg analyze --repo ...` runs `git clone/fetch/pull` against a remote repo.
- Gradle/Maven build steps may access remote artifact repositories and execute project build logic/plugins.
- The web UI currently loads Cytoscape from CDN: `https://unpkg.com/cytoscape@3.30.2/dist/cytoscape.min.js`.
Recommended hardening for internal/company projects:
- Prefer local source input: `jcg analyze --path /path/to/repo`.
- Avoid executing project build scripts when not required: `--build-system none`.
- Run in a restricted network environment (egress firewall, sandbox/container) when handling sensitive code.
- Use Maven/Gradle offline modes and pre-populated local caches if you need build-assisted analysis.
- Vendor UI assets locally (replace CDN script with a local file) if a fully offline UI is required.
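One way to confirm that UI assets have been vendored is to scan the served HTML for script tags that still point at a remote host. The snippet below is a minimal sketch using only the Python standard library; the helper name `find_remote_scripts` and the sample markup are illustrative and not part of `jcg`.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class RemoteScriptFinder(HTMLParser):
    """Collect <script src=...> URLs that point at a remote host."""

    def __init__(self):
        super().__init__()
        self.remote_sources = []

    def handle_starttag(self, tag, attrs):
        if tag != "script":
            return
        src = dict(attrs).get("src")
        if not src:
            return
        # Protocol-relative URLs ("//host/...") also leave the machine.
        if urlparse(src).scheme in ("http", "https") or src.startswith("//"):
            self.remote_sources.append(src)


def find_remote_scripts(html_text):
    """Return every remote script URL referenced by the given HTML (hypothetical helper)."""
    finder = RemoteScriptFinder()
    finder.feed(html_text)
    return finder.remote_sources


# Illustrative markup: one CDN script and one local script.
sample = (
    '<script src="https://unpkg.com/cytoscape@3.30.2/dist/cytoscape.min.js"></script>'
    '<script src="app.js"></script>'
)
remote = find_remote_scripts(sample)
```

An empty result for the real UI page would suggest that no script is loaded from a CDN; it does not cover stylesheets, fonts, or requests made from JavaScript.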
Threat model note:
- `jcg` itself does not implement explicit upload of analysis artifacts.
- If you analyze untrusted repos with build execution enabled, treat Gradle/Maven build scripts/plugins as arbitrary code from a security perspective.
Each run writes:
out/<project_fingerprint>/
  run_manifest.json
  analysis_index.json
  stage-01-acquire/
  stage-02-build/
  stage-03-extract/
  stage-04-aggregate/
  stage-05-cluster/
  exports/
Important files:
- `stage-03-extract/method_graph.json`
- `stage-04-aggregate/class_graph.json`
- `stage-04-aggregate/package_graph.json`
- `stage-04-aggregate/module_graph.json`
- `stage-05-cluster/clusters/<level>_<algo>_res...json`
- `stage-05-cluster/node2cluster/<level>_<algo>_res...json`
- `stage-05-cluster/cluster_graph/<level>_<algo>_res...json`
- `exports/context_index.jsonl`
- `exports/cluster_summaries.json`
- `exports/entrypoints.json`
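For scripting against a run directory, the cluster artifacts can be located by level and algorithm. This is a minimal sketch: the helper `cluster_files` is hypothetical, and because the part of the file name after `_res` is not spelled out above, it is matched with a wildcard rather than guessed.

```python
from pathlib import Path


def cluster_files(run_dir, level, algo):
    """Return cluster artifact paths for one level/algorithm pair.

    Hypothetical helper: the suffix after "_res" is matched with "*"
    because its exact form is not documented here.
    """
    base = Path(run_dir) / "stage-05-cluster"
    pattern = f"{level}_{algo}_res*.json"
    return {
        kind: sorted((base / kind).glob(pattern))
        for kind in ("clusters", "node2cluster", "cluster_graph")
    }
```

Calling `cluster_files("out/<project_fingerprint>", "class", "louvain")` would return a dictionary with one sorted list of paths per artifact kind; a kind with no matching files maps to an empty list.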
- `--build-system auto` detects Gradle/Maven by project files.
- If build fails, `jcg` attempts fallback `javac` compile.
- If compilation is not possible, extraction still runs from source roots in degraded mode.
- `run_manifest.json` records quality as `full`, `partial`, or `degraded`.
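A script that consumes analysis output may want to check the recorded quality before trusting the graph. The sketch below assumes the value is stored under a top-level `quality` key in `run_manifest.json`; that key name is an assumption, so it is exposed as a parameter, and `read_quality` is a hypothetical helper.

```python
import json
from pathlib import Path

# The three quality levels named above.
KNOWN_QUALITIES = ("full", "partial", "degraded")


def read_quality(run_dir, key="quality"):
    """Return the recorded quality, or None if it is missing or unrecognized.

    The key name is an assumption; adjust it to match the actual manifest.
    """
    manifest_path = Path(run_dir) / "run_manifest.json"
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    value = manifest.get(key)
    return value if value in KNOWN_QUALITIES else None
```

A caller could, for example, skip reclustering or print a warning when the result is `degraded` or `None`.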
- `GET /api/meta`
- `GET /api/graphs?level=class&min_weight=2`
- `GET /api/clusters?level=class&algo=louvain&resolution=1.0&seed=42`
- `POST /api/cluster/recompute`
- `GET /api/node/<id>?level=class`
- `GET /api/cluster/<id>?level=class&algo=louvain&resolution=1.0&seed=42`
- `GET /api/export/context?cluster_id=c0001`
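The GET endpoints above can be called from a script with the standard library. This sketch assumes the server is running on `127.0.0.1` with port `8765` (the port used in the example `analyze` command) and that responses are JSON; the response schema is not described here, so `fetch_json` simply returns the decoded body. `api_url`, `cluster_url`, and `fetch_json` are hypothetical helper names.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

# Assumed address: localhost binding plus the port from the example command.
BASE_URL = "http://127.0.0.1:8765"


def api_url(path, **params):
    """Build a full URL for an API path with optional query parameters."""
    query = urlencode(params)
    return f"{BASE_URL}{path}" + (f"?{query}" if query else "")


def cluster_url(cluster_id, level, algo, resolution, seed):
    """Build the URL for one cluster, escaping the id for use in the path."""
    return api_url(
        f"/api/cluster/{quote(cluster_id, safe='')}",
        level=level, algo=algo, resolution=resolution, seed=seed,
    )


def fetch_json(url, timeout=10):
    """GET a URL and decode the body as JSON (assumes a JSON response)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


graphs_url = api_url("/api/graphs", level="class", min_weight=2)
```

With a server running, `fetch_json(graphs_url)` would return the class-level graph filtered to edges of weight 2 or more. The request body for `POST /api/cluster/recompute` is not documented above, so it is left out of the sketch.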
- Reflection, dynamic proxies, AOP weaving, and runtime codegen are not fully captured.
- The default extractor is a fast static approximation from source; it is not a whole-program sound analysis.
- Framework lifecycle callbacks may be under-approximated.
- Start with `--mode package` or `--mode module`.
- Use filtering (`--exclude-packages`) to remove utility-heavy namespaces.
- Raise min edge weight in UI before expanding details.
- Recluster at aggregated levels first (`package` then `class`).
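The tips above rely on aggregation shrinking the graph: many method-level calls collapse into a few weighted class or package edges, and a minimum weight then removes incidental ones. The sketch below illustrates the idea only. It assumes method ids of the form `package.Class#method`, which is a hypothetical format, and it drops calls that stay inside one class or package, which is a choice made for this example rather than documented `jcg` behavior.

```python
from collections import Counter


def owner(node_id, level):
    """Map a method id like 'com.acme.Foo#bar' to its class or package."""
    class_name = node_id.split("#", 1)[0]
    if level == "class":
        return class_name
    if level == "package":
        return class_name.rpartition(".")[0]
    raise ValueError(f"unsupported level: {level}")


def aggregate(method_edges, level, min_weight=1):
    """Collapse method-level call edges into weighted edges at a coarser level."""
    weights = Counter()
    for src, dst in method_edges:
        a, b = owner(src, level), owner(dst, level)
        if a != b:  # skip calls that stay inside one class/package
            weights[(a, b)] += 1
    # Keep only edges that reach the minimum weight.
    return {edge: w for edge, w in weights.items() if w >= min_weight}


# Illustrative input: three method-level calls.
edges = [
    ("a.b.Foo#x", "a.c.Bar#y"),
    ("a.b.Foo#z", "a.c.Bar#y"),
    ("a.b.Foo#x", "a.b.Baz#w"),
]
class_edges = aggregate(edges, "class")
package_edges = aggregate(edges, "package", min_weight=2)
```

Here the three method edges become two class edges and, at package level with a minimum weight of 2, a single edge from `a.b` to `a.c`.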
Run:
PYTHONPATH=src python -m unittest discover -s tests -p 'test_*.py'