feat(cpp-phase2): PR-02 loader, resolver integration, linux validation#680

Open
shivasurya wants to merge 4 commits into shiva/cpp-phase2-pr01-generator from shiva/cpp-phase2-pr02-loader

@shivasurya shivasurya commented May 3, 2026

Summary

PR-02 of the C/C++ Phase 2 stdlib stack. Turns the per-header JSON manifests from PR-01 into a working stdlib resolution path inside Phase 1's call-graph builders. After this PR, pathfinder ci and pathfinder scan resolve printf, std::vector::push_back, std::move, etc. against the registries on disk.

Stacked on #679 — must merge AFTER PR-01. Base branch is shiva/cpp-phase2-pr01-generator; once #679 lands, this PR will auto-rebase onto main.

What's in here

  • core + registry: CStdlibLoader / CppStdlibLoader interfaces, file:// + HTTP-stubbed loaders mirroring go_stdlib_remote.go, platform detector (macros + path hints), 24h-TTL disk cache (consumed by PR-03's HTTP path), SystemIncludes index.
  • builder: resolveCCallTarget and resolveCppCallTarget gain a 3-tuple return; new lookupCStdlib, lookupCppStdlibMethod, lookupCppStdlibFreeFunction consult the loaders after Phase 1 resolution. Template parameters (T / U / V / K) are substituted into stdlib return types when applicable. Project definitions still shadow stdlib symbols.
  • cmd: --target and --stdlib-base-url flags on scan and resolution-report. Failures degrade to nil-loader-but-keep-scanning so a missing manifest never breaks a run.
  • testdata: testdata/c/stdlib/main.c and testdata/cpp/stl/main.cpp smoke fixtures.

Out of scope (per PR-02 spec)

  • HTTP loader implementation — PR-03 (calls return explicit "PR-03" errors today)
  • --diagnose-stdlib mode + resolution-report stdlib line — PR-04
  • Live large-codebase validation gate — runs as a manual step before PR-03

Verification

  • gradle buildGo — clean
  • go test ./... — all packages pass
  • golangci-lint run ./... — 0 issues
  • Coverage: 91.6% registry, 85.7% builder, 94.7% core; 100% on new cmd helpers (initClikeStdlib, loader builders, logger adapter)

Test plan

  • CI green
  • Manual smoke: pathfinder scan --project sast-engine/testdata/c/stdlib --stdlib-base-url=file://<pr01-output> --target=linux resolves all 5 calls in main.c
  • Manual smoke: same against testdata/cpp/stl/main.cpp

🤖 Generated with Claude Code

shivasurya and others added 3 commits May 3, 2026 18:38
Introduces the loader infrastructure that PR-02 will plug into the
Phase 1 call-graph resolvers:

- core: CStdlibLoader / CppStdlibLoader interfaces, SecurityTag on
  CallSite, and StdlibRegistry / StdlibCppRegistry hooks on the C and
  C++ module registries plus a SystemIncludes index.
- registry/c_stdlib_remote.go + cpp_stdlib_remote.go: dual-mode
  loaders (file:// active, HTTP stubbed for PR-03) with double-check
  locked header caches mirroring the Go stdlib loader.
- registry/clike_platform_detector.go: macro + path-hint based
  linux/darwin/windows detection, host-platform fallback.
- registry/clike_disk_cache.go: 24h-TTL on-disk cache wired for the
  PR-03 HTTP path; tested in isolation here.
- registry/c_module.go: BuildCSystemIncludeMap so the resolver can
  walk a caller file's <header> list.
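
As an illustration of the "double-check locked header cache" mentioned above, here is a minimal sketch. The names `headerManifest`, `headerCache`, and `newHeaderCache` are hypothetical and are not taken from the PR; the real loaders may be structured differently.

```go
package main

import (
	"fmt"
	"sync"
)

// headerManifest is a hypothetical stand-in for one parsed per-header manifest.
type headerManifest struct {
	Header    string
	Functions map[string]string // function name -> return type
}

// headerCache loads each header at most once, even under concurrent lookups.
type headerCache struct {
	mu      sync.RWMutex
	entries map[string]*headerManifest
	load    func(header string) (*headerManifest, error) // e.g. a read from disk
	loads   int                                          // number of real loads, for the demo
}

func newHeaderCache(load func(string) (*headerManifest, error)) *headerCache {
	return &headerCache{entries: make(map[string]*headerManifest), load: load}
}

// Get returns the cached manifest, loading it on first use.
func (c *headerCache) Get(header string) (*headerManifest, error) {
	// First check: a cheap read lock covers the common, already-cached case.
	c.mu.RLock()
	m, ok := c.entries[header]
	c.mu.RUnlock()
	if ok {
		return m, nil
	}

	c.mu.Lock()
	defer c.mu.Unlock()
	// Second check: another goroutine may have filled the entry while we waited.
	if m, ok := c.entries[header]; ok {
		return m, nil
	}
	m, err := c.load(header)
	if err != nil {
		return nil, err
	}
	c.loads++
	c.entries[header] = m
	return m, nil
}

func main() {
	cache := newHeaderCache(func(h string) (*headerManifest, error) {
		return &headerManifest{Header: h, Functions: map[string]string{"printf": "int"}}, nil
	})
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = cache.Get("stdio.h")
		}()
	}
	wg.Wait()
	fmt.Println("loads:", cache.loads) // prints "loads: 1"
}
```

The second check under the write lock is what keeps eight concurrent callers from each triggering their own load.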

Coverage: 91.4% on registry, 94.7% on core. HTTP fetch paths return
explicit "PR-03" errors and stay tested via stub assertions.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Extends Phase 1's resolution chain with a final stdlib lookup so
calls into <stdio.h>, <vector>, std::move, vector::push_back, etc.
become resolved edges with type, confidence, and security-tag
metadata.

C builder (c_builder.go):
- resolveCCallTarget signature → (string, bool, *CStdlibFunction).
- New lookupCStdlib walks SystemIncludes for the caller file and
  consults StdlibRegistry; first include with a matching symbol wins.
- buildCCallSite enriches the emitted CallSite from CStdlibFunction
  (TypeSource="stdlib", InferredType, TypeConfidence, SecurityTag).

C++ builder (cpp_builder.go):
- resolveCppCallTarget gains the same 3-tuple shape.
- lookupCppStdlibMethod uses the type engine to read the receiver
  type, canonicalises std::vector<int> → std::vector, and substitutes
  T/U/V/K placeholders into the return type when present.
- lookupCppStdlibFreeFunction handles std::move / std::swap via
  CppStdlibLoader.GetFreeFunction.
- C-shape calls (printf, malloc) from .cpp files keep flowing
  through the embedded C registry.
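
The receiver-type handling described above (canonicalising `std::vector<int>` to `std::vector` and reading the template arguments) could look roughly like the sketch below. `canonicalType` and `splitTemplateArgs` are hypothetical names chosen for illustration; the PR's own helpers may behave differently on edge cases.

```go
package main

import (
	"fmt"
	"strings"
)

// canonicalType strips the template argument list:
// "std::vector<int>" becomes "std::vector".
func canonicalType(t string) string {
	t = strings.TrimSpace(t)
	if i := strings.IndexByte(t, '<'); i >= 0 {
		return strings.TrimSpace(t[:i])
	}
	return t
}

// splitTemplateArgs returns the top-level template arguments,
// keeping nested argument lists such as std::vector<int> intact.
func splitTemplateArgs(t string) []string {
	open := strings.IndexByte(t, '<')
	end := strings.LastIndexByte(t, '>')
	if open < 0 || end <= open {
		return nil
	}
	var args []string
	depth, start := 0, open+1
	for i := open + 1; i < end; i++ {
		switch t[i] {
		case '<':
			depth++
		case '>':
			depth--
		case ',':
			if depth == 0 { // only split on commas outside nested <...>
				args = append(args, strings.TrimSpace(t[start:i]))
				start = i + 1
			}
		}
	}
	if last := strings.TrimSpace(t[start:end]); last != "" {
		args = append(args, last)
	}
	return args
}

func main() {
	recv := "std::map<std::string, std::vector<int>>"
	fmt.Println(canonicalType(recv))            // std::map
	fmt.Printf("%q\n", splitTemplateArgs(recv)) // ["std::string" "std::vector<int>"]
}
```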

Project-internal resolution still wins (project printf shadows
stdlib printf); receiver-less or untyped calls fall back to the
unresolved path with no panics.
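
The resolution order described above (project definitions first, then the caller's system includes, then unresolved) can be sketched as follows. The `resolver` type, its fields, and the `header::name` target format are assumptions made for this example; only the 3-tuple shape and the "first include wins" rule come from the description.

```go
package main

import "fmt"

// stdlibFunc is a hypothetical stand-in for the PR's CStdlibFunction.
type stdlibFunc struct {
	Name       string
	ReturnType string
	Header     string
}

// resolver holds illustrative lookup tables; the field names are assumptions.
type resolver struct {
	projectFuncs   map[string]string                // call name -> project FQN
	systemIncludes map[string][]string              // caller file -> headers, in include order
	stdlib         map[string]map[string]stdlibFunc // header -> symbol -> metadata
}

// lookupStdlib walks the caller's includes in order;
// the first header that declares the symbol wins.
func (r *resolver) lookupStdlib(callerFile, name string) *stdlibFunc {
	for _, h := range r.systemIncludes[callerFile] {
		if fn, ok := r.stdlib[h][name]; ok {
			return &fn
		}
	}
	return nil
}

// resolveCall returns (target, resolved, stdlib metadata).
// The metadata is nil unless the stdlib lookup produced the match.
func (r *resolver) resolveCall(callerFile, name string) (string, bool, *stdlibFunc) {
	if fqn, ok := r.projectFuncs[name]; ok {
		return fqn, true, nil // a project definition shadows the stdlib symbol
	}
	if fn := r.lookupStdlib(callerFile, name); fn != nil {
		return fn.Header + "::" + fn.Name, true, fn
	}
	return "", false, nil // unresolved: no panic, just no edge
}

func main() {
	r := &resolver{
		projectFuncs:   map[string]string{"printf": "myproj::printf"},
		systemIncludes: map[string][]string{"main.c": {"stdlib.h", "stdio.h"}},
		stdlib: map[string]map[string]stdlibFunc{
			"stdio.h": {
				"printf": {Name: "printf", ReturnType: "int", Header: "stdio.h"},
				"puts":   {Name: "puts", ReturnType: "int", Header: "stdio.h"},
			},
		},
	}
	for _, name := range []string{"printf", "puts", "mystery"} {
		target, ok, fn := r.resolveCall("main.c", name)
		fmt.Println(name, target, ok, fn != nil)
	}
}
```

Returning the metadata as a third value lets the call-site builder attach type and security-tag information only when the match actually came from the stdlib.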

Coverage: 85.1% on builder package, including the new stdlib paths.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Wires the C/C++ stdlib loaders into the CLI surface and adds smoke
fixtures that exercise the full pipeline.

scan.go / resolution_report.go:
- New --target=linux|darwin|windows flag overrides platform
  auto-detection.
- New --stdlib-base-url flag selects the registry source. file://
  paths and bare local paths read from disk; http(s):// will be
  honoured by PR-03's HTTP loader. Empty value disables stdlib
  resolution and keeps Phase 1 behavior.
- initClikeStdlib boots both loaders via DetectClikeTarget +
  buildC{,pp}StdlibLoader, calls LoadManifest with a logger adapter,
  and degrades to nil-loader-but-keep-scanning on every failure mode
  so a missing manifest never breaks a scan.
- buildClikeCallGraphs takes a clikeStdlibConfig; the C and C++
  merge helpers inject the loaders into the freshly-built registries
  before invoking the call-graph builders.
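
To make the flag semantics above concrete, here is a rough sketch of how a `--stdlib-base-url` value might be classified and how a failed manifest load can degrade to a nil loader. `classifyBaseURL`, `initLoader`, and `manifestLoader` are hypothetical names, the `file://` handling is a plain prefix strip, and returning nil for `http(s)://` is a simplification of this sketch rather than the PR's behaviour.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

type sourceKind int

const (
	sourceDisabled sourceKind = iota // empty flag: keep Phase 1 behavior
	sourceLocal                      // file:// URL or bare path: read from disk
	sourceHTTP                       // http(s)://: reserved for an HTTP loader
)

// errMissing is a sample failure used by the demo below.
var errMissing = errors.New("manifest not found")

// classifyBaseURL maps the raw flag value to a source kind and a location.
func classifyBaseURL(raw string) (sourceKind, string) {
	v := strings.TrimSpace(raw)
	switch {
	case v == "":
		return sourceDisabled, ""
	case strings.HasPrefix(v, "file://"):
		return sourceLocal, strings.TrimPrefix(v, "file://")
	case strings.HasPrefix(v, "http://"), strings.HasPrefix(v, "https://"):
		return sourceHTTP, v
	default:
		return sourceLocal, v
	}
}

// manifestLoader is a hypothetical loader handle; nil means "stdlib resolution off".
type manifestLoader struct{ root string }

// initLoader never returns an error: every failure is reported through warn
// and degrades to a nil loader so the scan can continue.
func initLoader(raw string, loadManifest func(root string) error, warn func(string)) *manifestLoader {
	kind, loc := classifyBaseURL(raw)
	switch kind {
	case sourceDisabled:
		return nil
	case sourceHTTP:
		warn("HTTP source not supported in this sketch; continuing without stdlib resolution")
		return nil
	}
	if err := loadManifest(loc); err != nil {
		warn("manifest load failed: " + err.Error())
		return nil
	}
	return &manifestLoader{root: loc}
}

func main() {
	warn := func(msg string) { fmt.Println("warn:", msg) }
	ok := func(string) error { return nil }
	missing := func(string) error { return errMissing }

	fmt.Println(initLoader("file:///tmp/registries", ok, warn) != nil)
	fmt.Println(initLoader("/tmp/registries", missing, warn) != nil)
	fmt.Println(initLoader("", ok, warn) != nil)
}
```

Callers then only need a nil check on the loader before consulting it, which is one way to keep a missing manifest from breaking a scan.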

testdata/c/stdlib/main.c + testdata/cpp/stl/main.cpp: small smoke
fixtures covering printf/malloc/strlen and vector::push_back /
std::move / std::printf for downstream e2e checks.

Coverage on the new cmd helpers: 100% across initClikeStdlib,
loadC{,pp}StdlibFromBase, buildC{,pp}StdlibLoader, and the logger
adapter.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

@shivasurya shivasurya added the enhancement and go labels May 3, 2026
@shivasurya shivasurya self-assigned this May 3, 2026

safedep Bot commented May 3, 2026

SafeDep Report Summary

No dependency changes detected. Nothing to scan.

This report is generated by SafeDep Github App


github-actions Bot commented May 3, 2026

Code Pathfinder Security Scan

Status: Pass

No security issues detected.

| Metric | Value |
|---|---|
| Files Scanned | 23 |
| Rules | 205 |

Powered by Code Pathfinder


codecov Bot commented May 3, 2026

Codecov Report

❌ Patch coverage is 89.56954% with 63 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.61%. Comparing base (682fa02) to head (6204dda).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...gine/graph/callgraph/registry/cpp_stdlib_remote.go | 87.15% | 7 Missing and 7 partials ⚠️ |
| sast-engine/graph/callgraph/builder/cpp_builder.go | 91.96% | 6 Missing and 3 partials ⚠️ |
| ...ngine/graph/callgraph/registry/clike_disk_cache.go | 83.92% | 6 Missing and 3 partials ⚠️ |
| sast-engine/cmd/scan.go | 88.57% | 6 Missing and 2 partials ⚠️ |
| sast-engine/graph/callgraph/registry/c_module.go | 61.90% | 6 Missing and 2 partials ⚠️ |
| ...raph/callgraph/registry/clike_platform_detector.go | 92.07% | 4 Missing and 4 partials ⚠️ |
| ...engine/graph/callgraph/registry/c_stdlib_remote.go | 93.40% | 3 Missing and 3 partials ⚠️ |
| sast-engine/graph/callgraph/builder/c_builder.go | 96.96% | 1 Missing ⚠️ |

Additional details and impacted files
@@                         Coverage Diff                         @@
##           shiva/cpp-phase2-pr01-generator     #680      +/-   ##
===================================================================
+ Coverage                            85.55%   85.61%   +0.06%     
===================================================================
  Files                                  196      200       +4     
  Lines                                28341    28912     +571     
===================================================================
+ Hits                                 24247    24754     +507     
- Misses                                3156     3195      +39     
- Partials                               938      963      +25     

Three changes that together push patch coverage past Codecov's
85.55% gate (84.98% → 85.7%):

1. White-box unit tests for the C++ stdlib helper suite:
   canonicalizeStdlibType, parseTemplateArgs, applyTemplateSubstitution,
   replaceWholeWord, substituteTemplateMethodReturn, plus the missing
   nil/empty-input guards in lookupCppStdlibMethod, lookupCStdlib,
   and lookupCppStdlibFreeFunction. All five helpers now hit 100%.

2. Bug fix uncovered while writing the K-alias test: the loop in
   applyTemplateSubstitution broke at V whenever args was shorter
   than 3, so K never ran. With map<K,V>-style return types written
   as "K", the placeholder stayed un-substituted. Drop the early
   break and rely on the per-iteration idx-bounds check.

3. clike_disk_cache_test.go gains an env-clearing test for the
   $HOME fallback branch in getStdlibCacheRoot (37.5% → 75%).

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
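
The fix in item 2 comes down to skipping an out-of-range placeholder instead of breaking out of the loop. A minimal sketch of that idea follows. The placeholder-to-index mapping (T→0, U→1, V→2, K→0) is an assumption inferred from the commit message, and `substitute` is a hypothetical name, not the PR's `applyTemplateSubstitution`.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholder pairs a template parameter name with the argument index it reads.
type placeholder struct {
	name string
	idx  int
}

// Assumed mapping: K is treated as an alias for the first template argument.
var placeholders = []placeholder{{"T", 0}, {"U", 1}, {"V", 2}, {"K", 0}}

// wordRE matches a placeholder only as a whole word, so "KeyType" is untouched.
var wordRE = regexp.MustCompile(`\b[TUVK]\b`)

// substitute replaces placeholders in returnType with the matching template argument.
func substitute(returnType string, args []string) string {
	bound := make(map[string]string)
	for _, p := range placeholders {
		if p.idx >= len(args) {
			continue // skip, do not break: a later placeholder (K) may still be in range
		}
		bound[p.name] = args[p.idx]
	}
	// A single pass avoids re-substituting text that came from an argument.
	return wordRE.ReplaceAllStringFunc(returnType, func(w string) string {
		if v, ok := bound[w]; ok {
			return v
		}
		return w
	})
}

func main() {
	mapArgs := []string{"std::string", "int"}
	fmt.Println(substitute("K", mapArgs))                 // std::string
	fmt.Println(substitute("const T&", []string{"int"})) // const int&
	fmt.Println(substitute("KeyType", mapArgs))           // KeyType
}
```

With a `break` in place of the `continue`, the two-argument case would stop at V (index 2) and leave `K` unsubstituted, which is the behaviour the commit message describes.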