
feat(callgraph): C/C++ explicit type inference engines #673

Merged
shivasurya merged 1 commit into main from shiva/cpp-type-inference
May 3, 2026

@shivasurya
Owner

Summary

Adds the explicit type-tracking foundation for C/C++ call-graph resolution.

  • graph/callgraph/resolution/c_types.go: CTypeInferenceEngine, CFunctionScope, CVariableBinding. Tracks return types and per-function variable scopes drawn directly from source declarations.
  • graph/callgraph/resolution/cpp_types.go: CppTypeInferenceEngine embeds the C engine and adds class method / class field indices, plus auto handling.

TypeInfo contract

| Source | Confidence | Used for |
|---|---|---|
| declaration | 1.0 | Explicit types from source: return types, variable decls, class members |
| unresolved_auto | 0.0 | C++ auto x = ... placeholders awaiting Phase 2 deduction |
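
The contract above can be sketched in a few lines of Go. This is a hedged illustration rather than the code from the PR: the `TypeInfo` field names other than Source and Confidence, the constants, and the helper `classifyCppType` are all hypothetical. It only shows how a bare `auto` maps to a zero-confidence placeholder while every other spelling is treated as an explicit declaration.

```go
package main

import "fmt"

// TypeInfo is a hypothetical stand-in for the record described in the table.
// Only the Source and Confidence values come from the PR text.
type TypeInfo struct {
	TypeName   string
	Source     string
	Confidence float64
}

const (
	sourceDeclaration    = "declaration"
	sourceUnresolvedAuto = "unresolved_auto"
)

// classifyCppType applies the contract: a type spelled exactly "auto" becomes
// a zero-confidence placeholder, anything else is an explicit declaration.
// The comparison is deliberately exact, so "auto*" and "auto&" stay explicit.
func classifyCppType(typeName string) TypeInfo {
	if typeName == "auto" {
		return TypeInfo{TypeName: typeName, Source: sourceUnresolvedAuto, Confidence: 0.0}
	}
	return TypeInfo{TypeName: typeName, Source: sourceDeclaration, Confidence: 1.0}
}

func main() {
	for _, name := range []string{"int", "auto", "auto*", "std::vector<int>&"} {
		info := classifyCppType(name)
		fmt.Printf("%-20s source=%s confidence=%.1f\n", info.TypeName, info.Source, info.Confidence)
	}
}
```

A resolver that wants to skip placeholders can then gate on `Confidence > 0` instead of inspecting the type string again.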

Design notes

  • Embedding: CppTypeInferenceEngine embeds CTypeInferenceEngine by value, so every C-engine method (ExtractReturnType, GetScope, GetVariable, etc.) is callable on the C++ engine. The embedded Registry field aliases the C++ registry's CModuleRegistry facet so updates propagate.
  • Reassignment: each variable name keeps a slice of bindings; GetVariable returns the latest. GetAllBindings exposes the history for future flow analysis.
  • auto detection: exact equality on "auto". Modifiers like auto* and auto& are concrete types and keep full confidence; they survive the override branch and route through the C engine unchanged.
  • Void returns: explicitly dropped at registration so void functions never pollute downstream lookups (which gate on GetReturnType != nil).
  • Thread safety: four sync.RWMutex instances guard Scopes, ReturnTypes, ClassMethods, ClassFields. Snapshot accessors (GetAllReturnTypes, GetAllScopes) return defensive copies.
  • Lazy scope creation: ExtractVariableType creates the scope on first use; callers do not need to call AddScope before the first variable.
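
A minimal sketch of how embedding, binding history, lazy scope creation, and the lock discipline described above might fit together. All names here (`cEngine`, `cppEngine`, `binding`, `addVariable`, `latest`, `history`) are hypothetical stand-ins, not the identifiers from c_types.go or cpp_types.go, and a single mutex is used where the PR describes several.

```go
package main

import (
	"fmt"
	"sync"
)

// binding is a hypothetical, trimmed-down stand-in for CVariableBinding.
type binding struct {
	TypeName string
	Line     int
}

// scope keeps every binding per variable name, oldest first.
type scope struct {
	bindings map[string][]binding
}

// cEngine sketches the C-side engine: per-function scopes behind an RWMutex.
type cEngine struct {
	mu     sync.RWMutex
	scopes map[string]*scope
}

// cppEngine embeds cEngine by value, so cEngine's methods are promoted to it.
type cppEngine struct {
	cEngine
}

func newCppEngine() *cppEngine {
	return &cppEngine{cEngine: cEngine{scopes: make(map[string]*scope)}}
}

// addVariable records a binding and creates the function scope on first use.
func (e *cEngine) addVariable(fn, name, typeName string, line int) {
	e.mu.Lock()
	defer e.mu.Unlock()
	s, ok := e.scopes[fn]
	if !ok {
		s = &scope{bindings: make(map[string][]binding)}
		e.scopes[fn] = s
	}
	s.bindings[name] = append(s.bindings[name], binding{TypeName: typeName, Line: line})
}

// latest returns the most recent binding for a variable, if there is one.
func (e *cEngine) latest(fn, name string) (binding, bool) {
	e.mu.RLock()
	defer e.mu.RUnlock()
	s, ok := e.scopes[fn]
	if !ok || len(s.bindings[name]) == 0 {
		return binding{}, false
	}
	h := s.bindings[name]
	return h[len(h)-1], true
}

// history returns a defensive copy of all bindings for a variable.
func (e *cEngine) history(fn, name string) []binding {
	e.mu.RLock()
	defer e.mu.RUnlock()
	s, ok := e.scopes[fn]
	if !ok {
		return nil
	}
	return append([]binding(nil), s.bindings[name]...)
}

func main() {
	engine := newCppEngine()
	// No explicit scope setup: the first addVariable call creates it.
	engine.addVariable("main.c::process", "buf", "char*", 3)
	engine.addVariable("main.c::process", "buf", "const char*", 9)

	if b, ok := engine.latest("main.c::process", "buf"); ok {
		fmt.Println("latest:", b.TypeName)
	}
	fmt.Println("history length:", len(engine.history("main.c::process", "buf")))
}
```

Because the embedded struct carries its own mutex, the outer engine should only be handled through a pointer; copying it by value would also copy the lock.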

Test plan

  • go build ./...
  • go test ./... — full suite green (25 packages)
  • go test -race ./graph/callgraph/resolution/... — clean
  • go vet ./...
  • golangci-lint run ./graph/callgraph/resolution/ — 0 issues
  • Coverage on changed lines: c_types.go 100%, cpp_types.go 100%
  • Spec test cases covered: NewC*Engine, ExtractReturnType (success + void), AddReturnType, ExtractVariableType (basic + reassignment), GetScope miss, AddScope, concurrent access (-race), embedded methods, RegisterClassMethod (success + void/empty drops + redeclaration), RegisterClassField (success + empty drops), auto zero-confidence, auto*/auto& exact-match, complex C types (pointer/const/struct), complex C++ types (templates / refs / nested templates).

Stacked on

shiva/cpp-module-registry (#672)

@shivasurya shivasurya added enhancement New feature or request go Pull requests that update go code labels May 2, 2026
@shivasurya shivasurya self-assigned this May 2, 2026
@safedep

safedep Bot commented May 2, 2026

SafeDep Report Summary


No dependency changes detected. Nothing to scan.


This report is generated by SafeDep Github App

@github-actions

github-actions Bot commented May 2, 2026

Code Pathfinder Security Scan

Pass Critical High Medium Low Info

No security issues detected.

| Metric | Value |
|---|---|
| Files Scanned | 4 |
| Rules | 205 |

Powered by Code Pathfinder

@codecov

codecov Bot commented May 2, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 85.26%. Comparing base (ee277fb) to head (1fc411c).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #673      +/-   ##
==========================================
+ Coverage   85.18%   85.26%   +0.07%     
==========================================
  Files         180      182       +2     
  Lines       26237    26399     +162     
==========================================
+ Hits        22351    22508     +157     
- Misses       3024     3028       +4     
- Partials      862      863       +1     


Owner Author

shivasurya commented May 3, 2026

Merge activity

  • May 3, 1:15 PM UTC: A user started a stack merge that includes this pull request via Graphite.
  • May 3, 1:25 PM UTC: Graphite rebased this pull request as part of a merge.
  • May 3, 1:26 PM UTC: @shivasurya merged this pull request with Graphite.

@shivasurya shivasurya changed the base branch from shiva/cpp-module-registry to graphite-base/673 May 3, 2026 13:23
@shivasurya shivasurya changed the base branch from graphite-base/673 to main May 3, 2026 13:24
Add CTypeInferenceEngine and CppTypeInferenceEngine under the
resolution package. The engines index types that appear verbatim in
C/C++ source — function return types, variable declarations, class
method return types, and class field types — with Confidence=1.0 and
Source="declaration". No inference, deduction, or propagation; later
phases layer those on top.

Highlights:
  - Function-scoped variable bindings track reassignment history; the
    latest binding wins on lookup.
  - C++ engine embeds the C engine by value so callers can use
    ExtractReturnType / GetScope / GetVariable uniformly.
  - C++ `auto` is recognised by exact match and stored with
    Confidence=0.0, Source="unresolved_auto" so resolvers skip it
    until Phase 2 deduces a concrete type.
  - Thread-safe via four sync.RWMutex instances: two on the C engine
    (scopes and return types) and two more for the class-method and
    class-field maps; verified under `go test -race`.

Sets up the typed receiver lookups consumed by PR-07 (C call graph)
and PR-08 (C++ call graph).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@shivasurya shivasurya force-pushed the shiva/cpp-type-inference branch from 47d63d3 to 1fc411c Compare May 3, 2026 13:25
@shivasurya shivasurya merged commit d3ea40b into main May 3, 2026
6 checks passed
@shivasurya shivasurya deleted the shiva/cpp-type-inference branch May 3, 2026 13:26
shivasurya added a commit that referenced this pull request May 3, 2026
## Summary

Adds `BuildCCallGraph` — a four-pass algorithm that produces a `*core.CallGraph` for C projects, ready to merge into the unified graph alongside Python/Go.

| Pass | Purpose |
|---|---|
| 1 | Index every C `function_definition` under `"<relpath>::<name>"` and ensure the FQN is also in `registry.FunctionIndex` |
| 2 | Register explicit return types (skipping `void`) and emit `ParameterSymbol` entries for every named parameter |
| 3 | Walk parser-emitted edges (`function_definition → call_expression`) to extract one `CallSiteInternal` per call — no second AST traversal |
| 4 | Resolve targets in a definition-preferring order, then emit edges and `CallSite` records |
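
As a rough illustration of Pass 2, the sketch below registers return types while skipping `void` and collects only named parameters. The types `funcDef` and `param` and the function `registerTypes` are hypothetical; the real builder works on graph nodes and emits `ParameterSymbol` entries, which this sketch does not model.

```go
package main

import "fmt"

// param and funcDef are hypothetical, simplified views of parsed C functions.
type param struct {
	Name string
	Type string
}

type funcDef struct {
	FQN           string // "<relpath>::<name>"
	ReturnType    string
	Params        []param
	IsDeclaration bool
}

// registerTypes mirrors the intent of Pass 2: explicit return types are kept,
// void returns are dropped, and anonymous parameters are skipped.
// Declarations are ignored, matching the test plan entry about Pass 2.
func registerTypes(defs []funcDef) (map[string]string, map[string][]param) {
	returns := make(map[string]string)
	params := make(map[string][]param)
	for _, d := range defs {
		if d.IsDeclaration {
			continue
		}
		if d.ReturnType != "" && d.ReturnType != "void" {
			returns[d.FQN] = d.ReturnType
		}
		for _, p := range d.Params {
			if p.Name == "" {
				continue // anonymous parameter, e.g. int f(int)
			}
			params[d.FQN] = append(params[d.FQN], p)
		}
	}
	return returns, params
}

func main() {
	defs := []funcDef{
		{FQN: "src/math.c::add", ReturnType: "int", Params: []param{{"a", "int"}, {"b", "int"}}},
		{FQN: "src/log.c::log_msg", ReturnType: "void", Params: []param{{"", "const char*"}}},
	}
	returns, params := registerTypes(defs)
	fmt.Println("return types:", returns)
	fmt.Println("named params for add:", len(params["src/math.c::add"]))
}
```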

### Resolution order (Pass 4)

1. **Same-file definition** — common case (helper in same `.c`); deterministic and independent of include state.
2. **Global definition** — scan `registry.FunctionIndex[name]` for an FQN whose call-graph entry is a definition. Handles cross-`.c` calls.
3. **Same-file declaration** — accept a forward declaration when no definition exists project-wide.
4. **Declaration reachable through `#include`** — last resort so externs handed off to another translation unit still surface as edges.

Calls that don't match any source produce a `CallSite{Resolved: false, FailureReason: "external_or_unresolved"}`, so stdlib calls (`printf`, `malloc`) and unknown function pointers remain visible to rule writers without polluting the edge set.
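
The four-step order can be sketched as a single lookup function. This is an illustrative approximation under stated assumptions: `candidate`, `resolve`, and the `includes` set are hypothetical, and where several global definitions share a name the sketch simply takes the first one in index order.

```go
package main

import "fmt"

// candidate is a hypothetical view of one indexed function.
type candidate struct {
	FQN           string // "<relpath>::<name>"
	File          string
	IsDeclaration bool
}

// resolve picks a call target in the definition-preferring order described
// above. index maps a bare function name to its candidates; includes holds
// the files reachable through #include from callerFile.
func resolve(name, callerFile string, index map[string][]candidate, includes map[string]bool) (string, bool) {
	cands := index[name]
	// 1. Same-file definition.
	for _, c := range cands {
		if c.File == callerFile && !c.IsDeclaration {
			return c.FQN, true
		}
	}
	// 2. Global definition in any file.
	for _, c := range cands {
		if !c.IsDeclaration {
			return c.FQN, true
		}
	}
	// 3. Same-file declaration.
	for _, c := range cands {
		if c.File == callerFile && c.IsDeclaration {
			return c.FQN, true
		}
	}
	// 4. Declaration reachable through #include.
	for _, c := range cands {
		if c.IsDeclaration && includes[c.File] {
			return c.FQN, true
		}
	}
	// Nothing matched: the caller records an unresolved call site.
	return "", false
}

func main() {
	index := map[string][]candidate{
		"add": {
			{FQN: "math.h::add", File: "math.h", IsDeclaration: true},
			{FQN: "math.c::add", File: "math.c"},
		},
	}
	includes := map[string]bool{"math.h": true}

	target, ok := resolve("add", "main.c", index, includes)
	fmt.Println(target, ok)

	_, ok = resolve("printf", "main.c", index, includes)
	fmt.Println("printf resolved:", ok)
}
```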

### Design notes

- **Edges from the parser**: every `parseCCallExpression` adds an edge from the enclosing function to the call node. The builder walks `OutgoingEdges` of each indexed function instead of doing byte-range containment, keeping Pass 3 deterministic and trivially testable.
- **Definition vs declaration**: the parser sets `Metadata["is_declaration"]=true` on prototype/extern decls. `isDeclaration()` reads that key with a typed assertion so non-declaration nodes (no metadata) fall through correctly.
- **Recursion**: self-edges (`process → process`) are emitted as-is; the call graph already deduplicates via `AddEdge`.
- **Static functions**: same FQN-by-file mechanism — file-scope statics in different `.c` files map to disjoint FQNs.
- **Unique FunctionIndex entries**: Pass 1 dedupes against the registry's existing `FunctionIndex` so calling `BuildCCallGraph` after `BuildCModuleRegistry` is idempotent.
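
A hedged sketch of the typed-assertion check and the file-scoped FQN scheme from the notes above. The `node` struct and the `fqn` helper are hypothetical simplifications; only the `is_declaration` metadata key and the `<relpath>::<name>` format come from the description.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a parsed graph node.
type node struct {
	Name     string
	File     string
	Metadata map[string]any
}

// isDeclaration reads Metadata["is_declaration"] with a comma-ok type
// assertion, so a nil map, a missing key, or a non-bool value all yield false.
func isDeclaration(n *node) bool {
	if n == nil {
		return false
	}
	v, ok := n.Metadata["is_declaration"].(bool)
	return ok && v
}

// fqn builds the "<relpath>::<name>" key, which keeps file-scope statics with
// the same name in different files apart.
func fqn(n *node) string {
	return n.File + "::" + n.Name
}

func main() {
	def := &node{Name: "helper", File: "src/a.c"}
	decl := &node{Name: "helper", File: "include/a.h", Metadata: map[string]any{"is_declaration": true}}
	other := &node{Name: "helper", File: "src/b.c"}

	fmt.Println(fqn(def), isDeclaration(def))
	fmt.Println(fqn(decl), isDeclaration(decl))
	fmt.Println(fqn(def) != fqn(other))
}
```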

## Test plan

- [x] `go build ./...`
- [x] `go test ./...` — full suite green
- [x] `go vet ./...`
- [x] `golangci-lint run ./graph/callgraph/builder/` — 0 issues
- [x] Coverage on `c_builder.go` lines: ~89.6%
- [x] Spec scenarios covered:
  - Single-file `main()` → `add()` edge
  - Cross-file `.c` definition preferred over `.h` declaration
  - Header declaration via `#include` fallback
  - `printf` (stdlib) recorded as `Resolved:false` with failure reason
  - Recursive self-call
  - Same name in two `.c` files (file-scope statics)
  - Type engine populated; `void` returns dropped
  - Parameters indexed (anonymous params skipped)
  - Declarations skipped from Pass 2 type extraction
  - Merges cleanly into an empty unified graph
  - Non-C nodes ignored (mixed-language safety)
  - Anonymous / missing-file functions filtered
  - Empty-target call_expression produces no edge and no recorded site
  - Cross-`.c` global lookup works without an `#include`
  - Same-file forward declaration accepted when no definition exists

## Stacked on

`shiva/cpp-type-inference` (#673)