[UPDATE PRIMITIVE] Auto-infer codeql_query_run format from @kind to enable result caching by Copilot · Pull Request #275 · advanced-security/codeql-development-mcp-server

Copilot · 2026-05-08T18:48:52Z

codeql_query_run's post-processor returned early when no format was supplied, so @kind problem / path-problem / graph queries skipped BQRS interpretation and were never written to the query results cache — despite the tool description already advertising kind-based defaults.

The post-processor now reads the query's @kind metadata and selects a default format when none is given (and no legacy evaluationFunction is in play), so SARIF/graphtext output is produced and cached automatically. The extracted metadata is also reused by interpretBQRSFile (passed as an optional argument) to avoid a second open/read of the same query file.

📝 Update Information

Primitive Details

Type: Tool
Name: codeql_query_run
Update Category: Bug Fix

⚠️ CRITICAL: PR SCOPE VALIDATION

This PR is for updating an existing MCP server primitive and must ONLY include these file types:

✅ ALLOWED FILES:

Server implementation files (server/src/**/*.ts)
Updated primitive implementations
Modified registration files (server/src/tools/*.ts)
Updated or new test files (server/test/**/*.ts)
Documentation updates (README.md, server docs)
Updated type definitions (server/src/types/*.ts)
Modified supporting library files (server/src/lib/*.ts)
Configuration updates if needed (package.json, tsconfig.json)

🚫 FORBIDDEN FILES:

Files unrelated to the primitive update
Temporary or test output files
IDE configuration files
Log files or debug output
Analysis or summary files

Rationale: This PR should contain only the files necessary to update and test the primitive.

🚨 PRs that include forbidden files will be rejected and must be revised.

🛑 MANDATORY PR VALIDATION CHECKLIST

BEFORE SUBMITTING THIS PR, CONFIRM:

ONLY server implementation files are included (plus CHANGELOG.md, the rebuilt server/dist/ bundle, and a new client/integration-tests/primitives/tools/codeql_query_run/auto_cache_problem_kind/ fixture)
NO temporary or output files are included
NO unrelated configuration files are included
ALL existing tests continue to pass (1486/1486 unit; affected client integration fixtures still pass)
NEW functionality is properly tested (unit + client integration)

Impact Scope: Localized — single function in result-processor.ts plus an optional pass-through parameter on interpretBQRSFile

Update Metadata

Breaking Changes: No
API Compatibility: Maintained — explicit format continues to take precedence; interpretBQRSFile gains an optional metadata argument with a backward-compatible default
Performance Impact: Neutral-to-positive — the inferred-format path now extracts query metadata exactly once per query_run invocation (previously twice)

🎯 Changes Description

Current Behavior

processQueryRunResults returns early whenever format is undefined and no legacy evaluationFunction is set, skipping BQRS interpretation, SARIF/graphtext rendering, and the auto-cache write that depends on the interpreted output. Callers had to pass format explicitly to get cached results, contradicting the documented default in the tool's format schema description.

Updated Behavior

When neither format nor evaluationFunction is provided, the processor inspects the query's @kind metadata and applies a default:

problem / path-problem → sarif-latest
graph → graphtext
other kinds (e.g. table) → unchanged (still returns early; nothing to interpret)

Inferred and explicit formats then flow through the existing interpret + cache path identically. The metadata extracted for inference is reused by interpretBQRSFile so the query file is opened/read at most once per query_run invocation.
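The mapping above can be sketched as a small pure function. This is an illustration only, not the code in result-processor.ts: `inferFormatFromKind` and `DEFAULT_FORMAT_BY_KIND` are hypothetical names, and `hasEvaluationFunction` stands in for the legacy evaluationFunction check.

```typescript
// Hypothetical lookup table mirroring the kind-to-format defaults listed above.
const DEFAULT_FORMAT_BY_KIND = new Map<string, string>([
  ['problem', 'sarif-latest'],
  ['path-problem', 'sarif-latest'],
  ['graph', 'graphtext'],
]);

// Returns the effective format, or undefined when no format applies
// (for example a table query with no explicit format).
function inferFormatFromKind(
  explicitFormat: string | undefined,
  kind: string | undefined,
  hasEvaluationFunction = false,
): string | undefined {
  // An explicit format always takes precedence over inference.
  if (explicitFormat) return explicitFormat;
  // Inference is skipped when a legacy evaluationFunction is in play
  // or when no @kind could be read from the query.
  if (hasEvaluationFunction || !kind) return undefined;
  return DEFAULT_FORMAT_BY_KIND.get(kind);
}

const inferredForProblem = inferFormatFromKind(undefined, 'problem');
const inferredForTable = inferFormatFromKind(undefined, 'table');
const explicitOverride = inferFormatFromKind('csv', 'problem');
```

Keeping the defaults in a table rather than an if/else chain is a design choice made for this sketch; the PR itself uses explicit comparisons.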

Motivation

The tool description already promised this behavior; auto-caching is a prerequisite for downstream tools (query_results_cache_retrieve, false-positive triage prompts) to find results without forcing every caller to know the format conventions.

🔄 Before vs. After Comparison

Functionality Changes

// BEFORE: any unset format short-circuits before interpretation/caching;
// interpretBQRSFile separately re-reads the same query metadata.
if (!format && !evaluationFunction) {
  return result;
}
// ... resolve queryPath, interpret, cache ...
const interpretResult = await interpretBQRSFile(bqrsPath, queryPath, format, outFile, logger);
// inside interpretBQRSFile:
const metadata = await extractQueryMetadata(queryPath); // duplicate read

// AFTER: resolve queryPath first, extract metadata once, infer format from @kind,
// and pass the same metadata into interpretBQRSFile.
let queryMetadata: QueryMetadata | undefined;
if (queryPath) {
  queryMetadata = await extractQueryMetadata(queryPath);
}
let effectiveFormat = format as string | undefined;
if (!effectiveFormat && !evaluationFunction && queryMetadata) {
  if (queryMetadata.kind === 'problem' || queryMetadata.kind === 'path-problem') {
    effectiveFormat = 'sarif-latest';
  } else if (queryMetadata.kind === 'graph') {
    effectiveFormat = 'graphtext';
  }
}
if (!effectiveFormat && !evaluationFunction) return result;
// ... interpret + cache, reusing metadata ...
const interpretResult = await interpretBQRSFile(
  bqrsPath, queryPath, effectiveFormat, outFile, logger, queryMetadata,
);

API Changes

No tool schema changes. The format parameter on codeql_query_run remains optional with the same enum. Internally, interpretBQRSFile gains an optional metadata?: QueryMetadata parameter (default: extract on demand), preserving its existing call sites.
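A minimal sketch of the optional pass-through pattern described above, under simplifying assumptions: `QueryMetadataSketch`, `extractMetadataSketch`, and `interpretSketch` are hypothetical stand-ins, the functions are synchronous for brevity (the real ones are async), and the counter exists only to make the reuse visible.

```typescript
// Hypothetical, simplified stand-in for the real QueryMetadata type.
interface QueryMetadataSketch {
  kind?: string;
}

// Counts how often the (simulated) query file is read.
let extractionCount = 0;

// Stand-in for extractQueryMetadata: pretends to read @kind from the query file.
function extractMetadataSketch(queryPath: string): QueryMetadataSketch {
  extractionCount += 1;
  return { kind: queryPath.endsWith('Graph.ql') ? 'graph' : 'problem' };
}

// Stand-in for interpretBQRSFile: the trailing parameter is optional, so
// callers that omit it still trigger an on-demand extraction.
function interpretSketch(
  queryPath: string,
  format: string,
  metadata?: QueryMetadataSketch,
): string {
  const effective = metadata ?? extractMetadataSketch(queryPath);
  return `interpreted ${effective.kind ?? 'unknown'} query as ${format}`;
}

// New-style caller: extract once, then hand the same object down.
const sharedMetadata = extractMetadataSketch('Example.ql');
const reusedResult = interpretSketch('Example.ql', 'sarif-latest', sharedMetadata);
const countAfterReuse = extractionCount; // 1: no second read

// Existing-style caller: omits the argument, so extraction happens inside.
const legacyResult = interpretSketch('Example.ql', 'sarif-latest');
const countAfterLegacy = extractionCount; // 2: one additional read
```

Both callers get the same result; only the number of reads differs, which is why the extra parameter can stay optional without breaking existing call sites.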

Output Format Changes

No structural change to tool output. For affected queries, the response now includes the existing "Query results interpreted successfully…" / "Results cached with key…" lines instead of returning the raw query run stdout untouched.

🧪 Testing & Validation

Test Coverage Updates

Existing Tests: All existing tests continue to pass
New Test Cases: Added seven unit tests (five for @kind inference, two asserting single extraction) and one client integration fixture
Regression Tests: Added tests covering each @kind branch and the metadata-extraction call count
Edge Case Tests: table-kind no-op, explicit-format-override, and single-extraction guarantees

Validation Scenarios

Backward Compatibility: Explicit format: 'csv' on a @kind problem query still routes to csv (override test).
New Functionality: @kind problem, @kind path-problem, @kind graph queries trigger bqrs interpret with the inferred format. End-to-end client integration test (auto_cache_problem_kind) asserts Query results interpreted successfully with format: sarif-latest and Results cached with key: appear in the tool response when format is omitted.
Error Handling: Metadata-extraction failures are logged and fall through to the original early-return (no interpretation attempted).
Performance: New unit tests assert extractQueryMetadata is called exactly once per processQueryRunResults invocation in both the inferred-format and explicit-format paths.
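The error-handling scenario above can be illustrated with a short sketch. It assumes a hypothetical `safeExtractKind` wrapper and a minimal logger shape; the real logger, message text, and failure modes may differ, and the legacy evaluationFunction branch is left out for brevity.

```typescript
// Minimal logger shape assumed for this sketch.
interface SketchLogger {
  warn(message: string): void;
}

// Wraps a metadata reader so that a failure is logged and reported as
// "no kind available" instead of propagating to the caller.
function safeExtractKind(
  readKind: () => string | undefined,
  logger: SketchLogger,
): string | undefined {
  try {
    return readKind();
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    logger.warn(`Failed to extract query metadata: ${reason}`);
    return undefined;
  }
}

// Decides whether post-processing should stop before interpretation.
function shouldReturnEarly(format: string | undefined, kind: string | undefined): boolean {
  if (format) return false;
  return !(kind === 'problem' || kind === 'path-problem' || kind === 'graph');
}

// A reader that throws leads to the original early return.
const warnings: string[] = [];
const collectingLogger: SketchLogger = {
  warn: (message) => {
    warnings.push(message);
  },
};
const failedKind = safeExtractKind(() => {
  throw new Error('query file not readable');
}, collectingLogger);
const earlyReturn = shouldReturnEarly(undefined, failedKind);
```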

Test Results

Unit Tests: All pass (1486/1486; 13/13 in result-processor.test.ts including 7 new cases — five for @kind inference and two asserting single-extraction)
Integration Tests: New codeql_query_run/auto_cache_problem_kind client integration fixture passes via gh-ql-mcp-client integration-tests; TDD-validated red→green by reverting the fix locally (RED: assertion failure on missing Query results interpreted successfully… line; GREEN: PASS after restoring the fix). Other JS/independent codeql_query_run fixtures (basic_query_run, custom_log_directory, evaluator_logging_with_tuple_counting, javascript_tools_print_ast) continue to pass.
Manual Testing: Not performed in this PR
Performance Testing: No regressions detected; the inferred-format path now performs one fewer metadata read

📋 Implementation Details

Files Modified

Core Implementation: server/src/lib/result-processor.ts (format inference + metadata reuse via optional interpretBQRSFile arg)
Tests: server/test/src/lib/result-processor.test.ts, client/integration-tests/primitives/tools/codeql_query_run/auto_cache_problem_kind/{test-config.json,before/monitoring-state.json,after/monitoring-state.json}
Bundle: server/dist/codeql-development-mcp-server.js(.map)
Documentation: CHANGELOG.md (Unreleased / Fixed)

Code Changes Summary

Algorithm Improvements: Reordered query-path resolution before format gating; added @kind → format inference; threaded extracted metadata into interpretBQRSFile to eliminate duplicate work
Error Handling: Metadata-extraction failures are caught, logged, and fall through to the original early return
Performance Optimization: Removed a duplicate extractQueryMetadata call by reusing the metadata object across inference and BQRS interpretation
Type Safety: Re-exported QueryMetadata and added it as an optional typed parameter to interpretBQRSFile
Input Validation: Unchanged (no tool schema changes)
Output Format: No structural change; affected queries now include the interpretation and cache-key lines

Dependencies

No New Dependencies: Reuses existing extractQueryMetadata

🔍 Quality Improvements

Bug Fixes (if applicable)

Issue: codeql_query_run does not auto-cache @kind problem / path-problem results unless format is passed explicitly.
Root Cause: processQueryRunResults returned before resolving the query path or inspecting @kind, so the interpret + cache branch only ran with an explicit format.
Solution: Resolve queryPath first, extract metadata once, then infer format from @kind when the caller did not specify one. Pass the extracted metadata into interpretBQRSFile so it does not re-read the query file.
Prevention: Unit tests assert bqrs interpret is invoked with the correct inferred format for each kind, that non-interpretable kinds remain a no-op, and that extractQueryMetadata is called exactly once per invocation. A client integration fixture independently asserts the auto-cache markers appear in the tool response when format is omitted.

Performance Improvements

Baseline Performance: Inferred-format path would have performed two extractQueryMetadata calls (one for inference, one inside interpretBQRSFile), each requiring an openSync + fstatSync (LRU-cached only on the second call within the same mtime window).
Improved Performance: Exactly one extractQueryMetadata call per processQueryRunResults invocation, regardless of whether format is inferred or explicit.
Optimization Techniques: Optional pass-through parameter on interpretBQRSFile to reuse caller-supplied metadata.

Code Quality Enhancements

Readability: Single effectiveFormat and single queryMetadata variable make the explicit-vs-inferred path obvious
Maintainability: Query-path resolution, metadata extraction, and format inference happen in sequence before the format gate
Testability: New tests mock executeCodeQLCommand via vi.hoisted to capture the interpret invocation and spy on extractQueryMetadata to assert call counts; client integration fixture exercises the full server-side path
Reusability: interpretBQRSFile can reuse caller-supplied metadata instead of always re-extracting it

🔗 References

Related Issues/PRs

Related PRs: #263, Fix invalid JSON Schema for query_results_cache_retrieve (schema fix in the same caching pipeline)

External References

CodeQL @kind metadata documented in the in-repo codeql://learning/query-basics resource.
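For readers unfamiliar with the metadata format, CodeQL queries declare `@kind` in a leading QLDoc comment. The sketch below pulls it out of query source text with a regular expression. `parseKindFromSource` is a hypothetical helper, not the server's extractQueryMetadata, and it deliberately ignores edge cases such as metadata placed outside the first comment block.

```typescript
// Extracts the @kind value from the first /** ... */ block of a query, if any.
function parseKindFromSource(source: string): string | undefined {
  const header = /\/\*\*([\s\S]*?)\*\//.exec(source);
  if (!header) return undefined;
  const body = header[1] ?? '';
  // Kinds such as "problem" and "path-problem" consist of letters and hyphens.
  const match = /@kind[ \t]+([A-Za-z-]+)/.exec(body);
  return match?.[1];
}

// Invented sample query text; only the header comment matters here.
const sampleQuery = [
  '/**',
  ' * @name Example query',
  ' * @kind path-problem',
  ' * @id sketch/example',
  ' */',
  'import javascript',
].join('\n');

const sampleKind = parseKindFromSource(sampleQuery); // 'path-problem'
const missingKind = parseKindFromSource('import javascript'); // undefined
```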

Validation Materials

Test Cases: Synthetic .ql fixtures with @kind problem, path-problem, graph, table (unit tests); ExampleQuery1.ql (@kind problem) for the client integration fixture
Performance Benchmarks: N/A

🚀 Compatibility & Migration

Backward Compatibility

Fully Compatible: No breaking changes
Deprecation Warnings: None
Breaking Changes: None (see below)

Breaking Changes (if any)

None. The new metadata argument on interpretBQRSFile is optional; existing call sites that do not pass it continue to extract metadata internally as before.

API Evolution

Enhanced Parameters: interpretBQRSFile accepts an optional pre-extracted QueryMetadata to avoid duplicate reads
Improved Responses: Affected queries now return interpreted-output and cache-key lines instead of raw query run stdout
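Better Error Messages: Metadata-extraction failures are logged before falling through to the early return
Maintained Contracts: Tool schema and format enum unchanged; an explicit format still takes precedence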

👥 Review Guidelines

For Reviewers

Please verify:

Testing Instructions

npm install
npm run lint --workspace=server
npm run bundle --workspace=server
npm test --workspace=server

# Targeted unit tests
npx vitest run test/src/lib/result-processor.test.ts --root server

# Client integration test (requires codeql CLI on PATH)
make -C client build
./client/gh-ql-mcp-client integration-tests \
  --tools codeql_test_extract,codeql_query_run \
  --tests extract_tests,auto_cache_problem_kind \
  --no-install-packs

Validation Checklist

Regression Testing: Existing result-processor and cli-tool-registry tests pass; existing JS-based codeql_query_run integration fixtures continue to pass
New Feature Testing: Five new unit tests for @kind inference (problem, path-problem, graph, table-skip, explicit-override), two new tests asserting extractQueryMetadata is called exactly once per invocation, plus one client integration fixture asserting the auto-cache markers appear when format is omitted
Performance Testing: Confirmed one fewer metadata read per inferred-format invocation; no measurable regression elsewhere
Error Testing: extractQueryMetadata failure path is exercised by an inner try/catch + logger
Integration Testing: New auto_cache_problem_kind fixture runs the bundled server end-to-end through gh-ql-mcp-client and asserts the response contains Query results interpreted successfully with format: sarif-latest and Results cached with key:
Documentation Review: CHANGELOG entry under Unreleased / Fixed

📊 Impact Assessment

Performance Impact

Memory Usage: Unchanged
Execution Time: One extractQueryMetadata call per query_run invocation (down from two on the inferred-format path); cached after first read per file mtime
Throughput: Unchanged

Server Impact

Startup Time: No significant impact on server startup
Runtime Stability: No impact on server stability
Resource Usage: Reasonable resource consumption
Concurrent Usage: Safe for concurrent access

AI Assistant Impact

Enhanced Accuracy: Cached SARIF makes downstream query_results_cache_* tools usable without re-running queries
Better Coverage: Default-path users now get interpretation + caching out of the box
Improved Reliability: Eliminates a subtle silent-skip path
Enhanced User Experience: Tool description now matches actual behavior

🔄 Deployment Strategy

Rollout Considerations

Safe Deployment: Can be deployed safely to production
Gradual Rollout: Consider gradual rollout if high-impact changes
Monitoring: Appropriate monitoring for the update
Rollback Plan: Pure post-processor change — revertable in isolation

Post-Deployment Validation

Monitoring: Key metrics to monitor after deployment
User Feedback: Channels for collecting user feedback
Performance Tracking: Performance metrics to track
Error Tracking: Error patterns to watch for

Update Methodology: This update follows best practices:

✅ Comprehensive backward compatibility analysis
✅ Thorough testing of all changes (unit + client integration with TDD red→green validation, including a metadata-call-count regression test)
✅ Performance impact assessment
✅ Clear documentation of changes
✅ Robust error handling improvements
✅ Maintained code quality standards

Co-authored-by: data-douser <70299490+data-douser@users.noreply.github.com>

data-douser

@copilot The current changes look alright, but the client integration tests under client/integration-tests/primitives/tools/codeql_query_run/** must be improved to ensure that the changes actually address the reported problem with query results caching. Use TDD methodology to detect (red) the reported problem and validate (green) the solution, iterating as necessary to ensure feature reliability.

Copilot · 2026-05-08T23:10:06Z

@copilot The current changes look alright, but there must be some improvement client integration tests for client/integration-tests/primitives/tools/codeql_query_run/** in order to ensure that the c...

Added a client integration-test fixture client/integration-tests/primitives/tools/codeql_query_run/auto_cache_problem_kind/ that calls codeql_query_run against the @kind problem example query (ExampleQuery1.ql) without passing format, and asserts the response contains both Query results interpreted successfully with format: sarif-latest and Results cached with key: — the two output markers proving SARIF interpretation and auto-caching ran.
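The check this fixture performs can be sketched as a plain substring test. `findMissingMarkers` is a hypothetical helper for illustration; the actual fixture expresses its expectations in test-config.json, whose schema is not shown here, and the sample responses below are invented apart from the two marker strings.

```typescript
// The two output markers named above.
const AUTO_CACHE_MARKERS: readonly string[] = [
  'Query results interpreted successfully with format: sarif-latest',
  'Results cached with key:',
];

// Returns the markers that do not occur in the tool response text.
function findMissingMarkers(response: string, markers: readonly string[]): string[] {
  return markers.filter((marker) => !response.includes(marker));
}

// Invented sample responses for illustration only.
const fixedResponse = [
  'Query results interpreted successfully with format: sarif-latest',
  'Results cached with key: <cache-key>',
].join('\n');
const preFixResponse = 'raw query run stdout';

const missingAfterFix = findMissingMarkers(fixedResponse, AUTO_CACHE_MARKERS); // empty
const missingBeforeFix = findMissingMarkers(preFixResponse, AUTO_CACHE_MARKERS); // both markers
```

An empty result corresponds to the GREEN state; any missing marker corresponds to RED.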

TDD validation (locally, before pushing the fixture):

🔴 RED: temporarily checked out the pre-fix result-processor.ts from 4338f1f and rebundled:

```
FAIL auto_cache_problem_kind (assertion failed: response does not contain
     "Query results interpreted successfully with format: sarif-latest")
```

🟢 GREEN: restored the fix and rebundled:

```
PASS auto_cache_problem_kind [5.8s]
```

The other JS/independent codeql_query_run fixtures (basic_query_run, custom_log_directory, evaluator_logging_with_tuple_counting, javascript_tools_print_ast) continue to pass. Pre-existing failures for cpp/python/rust call-graph fixtures are unrelated (their per-language .testproj databases aren't extracted in this filtered run).

github-actions · 2026-05-08T23:17:18Z

Dependency Review

✅ No vulnerabilities, license issues, or OpenSSF Scorecard issues found.

Scanned Files

None

Copilot

Pull request overview

Fixes codeql_query_run result post-processing so that when format is omitted (and no legacy evaluationFunction is used), the server infers an appropriate default format from the query’s @kind metadata. This enables interpretation (SARIF/graphtext) and therefore query-results caching for problem, path-problem, and graph queries—matching the tool’s documented behavior.

Changes:

Infer effectiveFormat from @kind (problem/path-problem → sarif-latest, graph → graphtext) when format is not provided.
Add unit tests covering the new inference branches and explicit-format override.
Add a client integration fixture asserting interpretation + caching markers appear when format is omitted; update bundle + changelog.

| File | Description |
| --- | --- |
| server/src/lib/result-processor.ts | Infers default output format from query `@kind` before deciding whether to interpret/cache results. |
| server/test/src/lib/result-processor.test.ts | Adds Vitest mocking for CLI execution and new unit tests validating `@kind`→format inference behavior. |
| client/integration-tests/primitives/tools/codeql_query_run/auto_cache_problem_kind/test-config.json | New integration test asserting interpreted output + cache-key markers when `format` is omitted for a `@kind problem` query. |
| client/integration-tests/primitives/tools/codeql_query_run/auto_cache_problem_kind/before/monitoring-state.json | Adds the “before” monitoring state for the new integration fixture. |
| client/integration-tests/primitives/tools/codeql_query_run/auto_cache_problem_kind/after/monitoring-state.json | Adds the “after” monitoring state for the new integration fixture (successful tool call recorded). |
| server/dist/codeql-development-mcp-server.js | Re-bundled server output reflecting the updated result processor logic. |
| CHANGELOG.md | Documents the bug fix under Unreleased → Fixed with PR link. |

Copilot's findings

Files reviewed: 6/8 changed files
Comments generated: 1

Copilot

Copilot's findings

Files reviewed: 6/8 changed files
Comments generated: 1

Initial plan

4338f1f

Copilot AI assigned Copilot and data-douser May 8, 2026

Copilot AI requested review from Copilot and removed request for Copilot May 8, 2026 18:48

Copilot started work on behalf of data-douser May 8, 2026 18:48

Copilot AI linked an issue May 8, 2026 that may be closed by this pull request

codeql_query_run does not auto-cache results for @kind problem / @kind path-problem queries without explicit format parameter #268

Closed

Auto-infer format from @kind for codeql_query_run when not provided

001d126

Co-authored-by: data-douser <70299490+data-douser@users.noreply.github.com>

Copilot AI requested review from Copilot and removed request for Copilot May 8, 2026 18:58

Copilot AI changed the title ~~[WIP] Fix auto-caching for kind problem queries in codeql_query_run~~ [UPDATE PRIMITIVE] Auto-infer codeql_query_run format from @kind to enable result caching May 8, 2026

Copilot finished work on behalf of data-douser May 8, 2026 19:00

Copilot AI requested a review from data-douser May 8, 2026 19:00

data-douser requested changes May 8, 2026

Copilot started work on behalf of data-douser May 8, 2026 22:55

Add codeql_query_run integration fixture for auto-cache on @kind problem

0d8b5ea

Co-authored-by: data-douser <70299490+data-douser@users.noreply.github.com>

Copilot AI requested review from Copilot and removed request for Copilot May 8, 2026 23:09

Copilot finished work on behalf of data-douser May 8, 2026 23:12

Copilot AI requested a review from data-douser May 8, 2026 23:12

data-douser marked this pull request as ready for review May 9, 2026 03:11

data-douser requested a review from enyil as a code owner May 9, 2026 03:11

Copilot AI review requested due to automatic review settings May 9, 2026 03:11

data-douser requested a review from a team as a code owner May 9, 2026 03:11

data-douser approved these changes May 9, 2026

Copilot started reviewing on behalf of data-douser May 9, 2026 03:12

Copilot AI reviewed May 9, 2026

Comment thread server/src/lib/result-processor.ts

Copilot started work on behalf of data-douser May 10, 2026 19:01

Reuse query metadata in processQueryRunResults to avoid duplicate read

227d96f

Co-authored-by: data-douser <70299490+data-douser@users.noreply.github.com>

Copilot finished work on behalf of data-douser May 10, 2026 19:13

Copilot AI requested a review from data-douser May 10, 2026 19:13

data-douser requested a review from Copilot May 11, 2026 00:07

Copilot started reviewing on behalf of data-douser May 11, 2026 00:08

Copilot AI reviewed May 11, 2026

Comment thread server/src/lib/result-processor.ts

data-douser merged commit eb7e2a6 into main May 11, 2026
24 checks passed

data-douser deleted the copilot/fix-cache-issue-for-kind-problem branch May 11, 2026 13:43

data-douser merged 4 commits into main from copilot/fix-cache-issue-for-kind-problem

3 participants
