feat: adopt CodeQL advanced setup + Models-as-Data sanitiser pack (#1836)#1839
feat: adopt CodeQL advanced setup + Models-as-Data sanitiser pack (#1836)#1839Aureliolo wants to merge 7 commits into
Conversation
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.OpenSSF Scorecard
Scanned Files
|
WalkthroughThis PR implements two CodeQL custom query packs ( 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces a custom CodeQL extension pack to model project-specific sanitiser helpers as taint barriers for Python, Go, and JavaScript, including a validation framework and updated documentation. The review identifies a critical issue where the Models-as-Data definitions incorrectly use summaryModel with kind: "value", which preserves taint instead of blocking it. Feedback also notes potential module resolution issues in the Python models and a test fixture that fails to exercise the helper it is intended to validate.
| - addsTo: | ||
| pack: codeql/python-all | ||
| extensible: summaryModel | ||
| data: | ||
| - ["synthorg.observability.redaction", "Member[safe_error_description]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"] | ||
| - ["synthorg.observability", "Member[safe_error_description]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"] | ||
| - ["synthorg.observability.redaction", "Member[scrub_secret_tokens]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"] |
There was a problem hiding this comment.
The models are incorrectly using the summaryModel extensible predicate with kind: "value". In CodeQL Models-as-Data, kind: "value" indicates that the value (and thus any taint) is preserved from the input to the output, which is the opposite of a sanitiser's purpose. To model a sanitiser that blocks taint flow, you should use the sanitizerModel extensible predicate.
Additionally, the data format for sanitizerModel in Python is simpler, requiring only 3 columns: (package, member, kind). For these helpers, the kind should be "taint".
- addsTo:
pack: codeql/python-all
extensible: sanitizerModel
data:
- ["synthorg.observability.redaction", "Member[safe_error_description]", "taint"]
- ["synthorg.observability", "Member[safe_error_description]", "taint"]
- ["synthorg.observability.redaction", "Member[scrub_secret_tokens]", "taint"]| - addsTo: | ||
| pack: codeql/go-all | ||
| extensible: summaryModel | ||
| data: | ||
| - ["github.com/Aureliolo/synthorg/cli/internal/config", "", "False", "SecurePath", "", "", "Argument[0]", "ReturnValue[0]", "value", "manual"] |
There was a problem hiding this comment.
Similar to the Python models, Go sanitisers should be modelled using the sanitizerModel extensible predicate rather than summaryModel. Using kind: "value" in a summary model explicitly tells CodeQL that the return value is tainted if the input is tainted.
The sanitizerModel for Go uses 9 columns (removing the Output column used in summaries).
- addsTo:
pack: codeql/go-all
extensible: sanitizerModel
data:
- ["github.com/Aureliolo/synthorg/cli/internal/config", "", "False", "SecurePath", "", "", "Argument[0]", "taint", "manual"]| - addsTo: | ||
| pack: codeql/javascript-all | ||
| extensible: summaryModel | ||
| data: | ||
| - ["web/src/utils/logging", "Member[sanitizeForLog]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"] |
There was a problem hiding this comment.
JavaScript sanitisers should use the sanitizerModel extensible predicate. The summaryModel with kind: "value" indicates that taint is preserved. The sanitizerModel format for JavaScript requires 3 columns: (package, member, kind).
- addsTo:
pack: codeql/javascript-all
extensible: sanitizerModel
data:
- ["web/src/utils/logging", "Member[sanitizeForLog]", "taint"]| - addsTo: | ||
| pack: codeql/python-all | ||
| extensible: summaryModel | ||
| data: | ||
| - ["scripts.check_currency_aggregation_invariant", "Member[_resolve_root]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"] | ||
| - ["scripts.check_dependency_inversion", "Member[_resolve_root]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"] | ||
| - ["scripts.check_domain_error_hierarchy", "Member[_resolve_root]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"] | ||
| - ["scripts.check_forbidden_literals", "Member[_resolve_root]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"] | ||
| - ["scripts.check_no_magic_numbers", "Member[_resolve_root]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"] | ||
| - ["scripts.check_no_migration_framing", "Member[_resolve_root]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"] | ||
| - ["scripts.check_no_review_origin_in_code", "Member[_resolve_root]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"] | ||
| - ["scripts.check_persistence_boundary", "Member[_resolve_root]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"] |
There was a problem hiding this comment.
The package names for these scripts (e.g., scripts.check_currency_aggregation_invariant) assume that the scripts/ directory is a Python package. If scripts/ does not contain an __init__.py file, CodeQL will likely treat these as top-level modules (e.g., just check_currency_aggregation_invariant). Please verify the module resolution in the CodeQL database; if they are not in a package, the first column should be empty or match the module name directly.
| def negative_path_injection_relative_to( | ||
| user_input: str, project_root: Path | ||
| ) -> str | None: | ||
| """py/path-injection must NOT fire here. | ||
|
|
||
| Pattern: resolve, then assert containment via ``Path.relative_to`` in | ||
| a try/except block. The ValueError branch is the safe-exit. | ||
| """ | ||
| candidate = (project_root / user_input).resolve() | ||
| try: | ||
| candidate.relative_to(project_root) | ||
| except ValueError: | ||
| return None | ||
| return candidate.read_text() |
There was a problem hiding this comment.
The function negative_path_injection_relative_to implements path containment logic inline but does not call the _resolve_root helper that is modelled in python-sanitisers.model.yml. As a result, this fixture does not actually validate that the custom CodeQL model for _resolve_root is working correctly.
To properly test the model, this fixture should import and call _resolve_root from one of the scripts listed in the model file.
Merging this PR will not alter performance
Comparing Footnotes
|
There was a problem hiding this comment.
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/codeql/codeql-config.yml:
- Around line 10-14: Add the missing CodeQL ignore entry for the intentionally
vulnerable fixtures by adding the pattern 'cli/internal/codeqlfixtures/**' to
the paths-ignore list in the CodeQL config (maintain the same YAML indentation
and quoting style as the existing entries so the paths-ignore key remains a YAML
sequence).
In @.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml:
- Around line 22-39: The summaryModel rows for Member[sanitizeForLog],
Member[sanitizeWsString], and Member[sanitizeWsEnum] use an invalid 10-field
tuple shape which breaks CodeQL; update each data row so it matches the required
5- or 6-field summaryModel shape (remove the extra empty-string entries and the
manual/Argument/ReturnValue/value tail so each row is a 5- or 6-element list),
ensure any 6th element is an integer origin when used, and keep the
module/member names (e.g., "web/src/utils/logging" Member[sanitizeForLog],
"web/src/utils/ws-sanitize" Member[sanitizeWsString], Member[sanitizeWsEnum])
unchanged; then run the provided Python validation snippet to confirm all
summaryModel rows now have valid 5/6-field shape.
In @.github/codeql/fixtures/javascript/negative.ts:
- Around line 7-8: The deep relative imports of sanitizeForLog,
sanitizeWsString, and sanitizeWsEnum from '../../../../web/src/utils/' are
fragile; update the imports to a stable reference by either co-locating the
fixture with the source utilities, publishing/using an internal package
(importing from that package name), or adding a module path alias (e.g., via
tsconfig/webpack) and changing the import statements to use the package or
alias; ensure sanitizeForLog, sanitizeWsString, and sanitizeWsEnum are resolved
from the new stable location and update any build/test config accordingly so
tests still run under the CodeQL fixtures constraint.
In @.github/codeql/fixtures/python/negative.py:
- Around line 14-18: The test fixture imports private helpers
_validate_image_tag and _validate_repo_prefix alongside ImageTag; to fix, either
make those validators public (rename _validate_image_tag and
_validate_repo_prefix to validate_image_tag and validate_repo_prefix in
scripts.check_image_signatures and update imports), or remove the private
imports and duplicate the needed validator logic inside the fixture, or move the
validator functions into a new public module and import them there; update all
references to use the new public names (e.g., validate_image_tag,
validate_repo_prefix) or the locally duplicated functions to avoid importing
private symbols.
In `@scripts/check_codeql_fixtures.py`:
- Line 96: The code currently defaults start_line = int(region.get("startLine",
0)) which can mask malformed SARIF; update the logic around region and startLine
to explicitly validate existence and type: check region contains "startLine" and
that it's convertible to int (referencing the region variable and start_line
assignment), and if missing/invalid either log a warning with context (e.g.,
which result/file) or raise an exception so parsing fails fast; ensure any
logging uses the existing logger in this module to provide useful diagnostic
info.
In `@scripts/run_codeql_pack_validation.sh`:
- Around line 81-83: The script currently copies javascript-negative.sarif to
javascript-positive.sarif producing identical results; replace that cp operation
with an actual run_fixture invocation for the positive JS fixtures (mirror how
Python was handled) so run_fixture produces
"$RESULTS_DIR/javascript-positive.sarif" instead of copying; update the code
around the existing run_fixture call and remove the cp of
"$RESULTS_DIR/javascript-negative.sarif" to ensure distinct analysis outputs for
the positive fixture.
- Around line 70-72: The script currently copies python-negative.sarif to
python-positive.sarif which yields identical results; instead re-run analysis
for the positive fixture (call the existing run_fixture invocation or the CodeQL
analysis step that produced python-negative.sarif) to generate a true
python-positive.sarif output, i.e. invoke run_fixture (or the analyze step that
takes the DB and source-root) with the positive fixture and write its output to
"$RESULTS_DIR/python-positive.sarif" rather than using cp; if the positive and
negative use the exact same source and database, add a comment or assertion
explaining why differing must_fire/must_not_fire expectations are valid for
identical SARIFs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 73cb4f85-2e65-4fe9-8264-e31de2a0e067
📒 Files selected for processing (21)
.github/codeql/codeql-config.yml.github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml.github/codeql/extensions/synthorg-sanitisers/qlpack.yml.github/codeql/fixtures/expected.json.github/codeql/fixtures/javascript/negative.ts.github/codeql/fixtures/javascript/positive.ts.github/codeql/fixtures/python/negative.py.github/codeql/fixtures/python/positive.py.github/workflows/codeql-pack-validate.yml.github/workflows/codeql.ymlCLAUDE.mdcli/internal/codeqlfixtures/fixture.godata/runtime_stats.yamldocs/reference/convention-gates.mddocs/reference/sec-prompt-safety.mdpyproject.tomlscripts/check_codeql_fixtures.pyscripts/convention_gate_map.yamlscripts/run_codeql_pack_validation.sh
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
- GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
- GitHub Check: Build Backend
- GitHub Check: Build Web Assets (melange)
- GitHub Check: CLI Test (windows-latest)
- GitHub Check: CLI Bench Regression
- GitHub Check: CodSpeed Python benchmarks
- GitHub Check: Lighthouse Site
- GitHub Check: Dashboard Test
- GitHub Check: Test (Python 3.14)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (7)
**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Read the relevant design specification from
docs/design/before implementing; deviations require approval
Files:
CLAUDE.mddocs/reference/convention-gates.mddocs/reference/sec-prompt-safety.md
**/*
📄 CodeRabbit inference engine (CLAUDE.md)
Apply Regional Defaults: no region/currency/locale privileged; use metric units; use British English
Files:
CLAUDE.mddocs/reference/convention-gates.mdpyproject.tomlscripts/run_codeql_pack_validation.shcli/internal/codeqlfixtures/fixture.goscripts/convention_gate_map.yamldata/runtime_stats.yamldocs/reference/sec-prompt-safety.mdscripts/check_codeql_fixtures.py
**/*.{d2,md}
📄 CodeRabbit inference engine (CLAUDE.md)
Diagrams: use
d2for architecture/nested containers (theme 200 Dark Mauve, CI pinned to v0.7.1),mermaidfor flowcharts/sequence/pipelines. Use Markdown tables for tabular data
Files:
CLAUDE.mddocs/reference/convention-gates.mddocs/reference/sec-prompt-safety.md
{.github/codeql/extensions/synthorg-sanitisers/*.model.yml,scripts/check_codeql_fixtures.py,cli/internal/codeqlfixtures/**}
📄 CodeRabbit inference engine (CLAUDE.md)
Any change to a project sanitiser requires a matching row in
.github/codeql/extensions/synthorg-sanitisers/*.model.ymland a fixture pair under.github/codeql/fixtures/orcli/internal/codeqlfixtures/. Enforced by.github/workflows/codeql-pack-validate.yml+scripts/check_codeql_fixtures.py
Files:
.github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.ymlcli/internal/codeqlfixtures/fixture.go.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.ymlscripts/check_codeql_fixtures.py
{README.md,docs/**/*.md}
📄 CodeRabbit inference engine (CLAUDE.md)
Numeric claims in README and public docs must be sourced from
data/runtime_stats.yamlvia<!--RS:NAME-->markers
Files:
docs/reference/convention-gates.mddocs/reference/sec-prompt-safety.md
cli/**/*.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
Use hardcoded
localhost,postgres:5432, andnats:4222literals in CLI code by design (correct by security/audit rationale; do not make these configurable). See docs/reference/cli-env-vars.md for audit justification
Files:
cli/internal/codeqlfixtures/fixture.go
cli/**
📄 CodeRabbit inference engine (CLAUDE.md)
CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API. Use
go -C cli, nevercd cli. Seecli/CLAUDE.md
Files:
cli/internal/codeqlfixtures/fixture.go
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T11:53:18.276Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T11:53:18.276Z
Learning: Every convention PR ships its enforcement gate; details in `docs/reference/convention-gates.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T11:53:18.276Z
Learning: After issue completion: branch + commit + push (no auto-PR); use `/pre-pr-review`. After PR: use `/aurelio-review-pr` for external feedback. Fix everything valid; never defer
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T11:53:18.276Z
Learning: Git commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T11:53:18.276Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T11:53:18.276Z
Learning: Branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T11:53:18.276Z
Learning: Squash merge PRs. PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T11:53:18.276Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T11:53:18.276Z
Learning: Finished issue: branch + commit + push (no auto-PR). Use `/pre-pr-review`. After every squash merge: `/post-merge-cleanup`. Always read `docs/design/` before implementing
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T11:53:18.276Z
Learning: PR review: fix everything valid; never skip as out-of-scope
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T11:53:18.276Z
Learning: Framework for synthetic organisations (autonomous AI agents orchestrated as a virtual company). Python 3.14+. Layout: `src/synthorg/` (src layout), `tests/`, `web/` (React 19), `cli/` (Go binary). BUSL-1.1 → Apache 2.0 after Change Date
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
scripts/check_codeql_fixtures.py
🪛 GitHub Actions: CodeQL / 0_Analyze (javascript-typescript).txt
.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml
[error] 22-124: CodeQL extension validation failed for codeql/javascript-all:summaryModel. Row 1 has invalid data: found '"web/src/uti...", "Member[sani...", ... "ReturnValue", "value", "manual"' which does not match signature 'summaryModel(string type, string path, string input, string output, string kind, [int origin])'. ( /opt/hostedtoolcache/.../ApiGraphModelsExtensions.qll:51,1-53,3 )
[error] 30-130: CodeQL extension validation failed for codeql/javascript-all:summaryModel. Row 1 has invalid data: found '"web/src/uti...", "Member[sani...", ... "ReturnValue", "value", "manual"' which does not match signature 'summaryModel(string type, string path, string input, string output, string kind, [int origin])'. ( /opt/hostedtoolcache/.../ApiGraphModelsExtensions.qll:51,1-53,3 )
[error] 39-128: CodeQL extension validation failed for codeql/javascript-all:summaryModel. Row 1 has invalid data: found '"web/src/uti...", "Member[sani...", ... "ReturnValue", "value", "manual"' which does not match signature 'summaryModel(string type, string path, string input, string output, string kind, [int origin])'. ( /opt/hostedtoolcache/.../ApiGraphModelsExtensions.qll:51,1-53,3 )
🪛 GitHub Actions: CodeQL / Analyze (javascript-typescript)
.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml
[error] 22-22: In extension for codeql/javascript-all:summaryModel, row 1 is invalid. Found entries that do not match signature 'summaryModel(string type, string path, string input, string output, string kind, [int origin])'.
[error] 30-30: In extension for codeql/javascript-all:summaryModel, row 1 is invalid. Found entries that do not match signature 'summaryModel(string type, string path, string input, string output, string kind, [int origin])'.
[error] 39-39: In extension for codeql/javascript-all:summaryModel, row 1 is invalid. Found entries that do not match signature 'summaryModel(string type, string path, string input, string output, string kind, [int origin])'.
🪛 GitHub Actions: CodeQL / Analyze (typescript)
.github/codeql/codeql-config.yml
[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v
.github/codeql/fixtures/expected.json
[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v
.github/codeql/extensions/synthorg-sanitisers/qlpack.yml
[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v
.github/codeql/fixtures/javascript/positive.ts
[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v
.github/codeql/fixtures/javascript/negative.ts
[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v
.github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml
[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v
.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml
[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v
.github/workflows/codeql-pack-validate.yml
[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v
scripts/run_codeql_pack_validation.sh
[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v
cli/internal/codeqlfixtures/fixture.go
[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v
.github/workflows/codeql.yml
[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v
.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml
[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v
.github/codeql/fixtures/python/negative.py
[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v
.github/codeql/fixtures/python/positive.py
[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v
scripts/check_codeql_fixtures.py
[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v
🪛 GitHub Actions: CodeQL Pack Validate / 0_Validate sanitiser pack.txt
scripts/run_codeql_pack_validation.sh
[error] 1-1: CodeQL validation failed: "codeql CLI not on PATH". Install via codeql-action bundle (see script header).
🪛 GitHub Actions: CodeQL Pack Validate / Validate sanitiser pack
scripts/run_codeql_pack_validation.sh
[error] 1-1: CodeQL CLI not on PATH. Install via codeql-action bundle (see script header).
🪛 LanguageTool
CLAUDE.md
[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...a-flow rule) requires a matching row in `.github/codeql/extensions/synthorg-sanitisers/*...
(GITHUB)
[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...s/*.model.ymland a fixture pair under.github/codeql/fixtures/(orcli/internal/cod...
(GITHUB)
[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...internal/codeqlfixtures/). Enforced by .github/workflows/codeql-pack-validate.yml+...
(GITHUB)
docs/reference/sec-prompt-safety.md
[uncategorized] ~151-~151: The official name of this software platform is spelled with a capital “H”.
Context: ...CodeQL Models-as-Data extension pack at .github/codeql/extensions/synthorg-sanitisers/...
(GITHUB)
[uncategorized] ~151-~151: The official name of this software platform is spelled with a capital “H”.
Context: ... to fire. The pack is provably precise: .github/workflows/codeql-pack-validate.yml run...
(GITHUB)
[uncategorized] ~166-~166: The official name of this software platform is spelled with a capital “H”.
Context: ...ixture pair (negative + positive) under .github/codeql/fixtures/ (or `cli/internal/cod...
(GITHUB)
[uncategorized] ~174-~174: The official name of this software platform is spelled with a capital “H”.
Context: ... paths-ignore: ['tests/**'] clause in .github/codeql/codeql-config.yml. Every histor...
(GITHUB)
🪛 Shellcheck (0.11.0)
scripts/run_codeql_pack_validation.sh
[info] 36-36: The mentioned syntax error was in this brace group.
(SC1009)
[error] 46-46: Couldn't parse this shellcheck directive. Fix to allow more checks.
(SC1073)
[error] 46-46: Expected '=' after directive key. Fix any mentioned problems and try again.
(SC1072)
🔇 Additional comments (25)
docs/reference/sec-prompt-safety.md (1)
149-180: LGTM! Clear barrier inventory and validation guarantees.The new CodeQL barrier inventory section clearly documents the sanitiser-to-rule mappings and the validation mechanism. The cross-references to the fixture validation workflow and the note about three-part changes (helper + model + fixture) align with the coding guideline for sanitiser changes.
Note on static analysis: The LanguageTool hints flagging
.github/...paths as needing capitalization are false positives—these are literal file paths in inline code, not references to the GitHub platform..github/codeql/fixtures/python/negative.py (1)
39-52: Fixture pattern matches documentedPath.relative_tocontainment check.The negative path-injection fixture correctly implements the containment pattern: resolve the candidate, then assert it's within
project_rootviarelative_to()in a try/except block, returningNoneon escape. This aligns with the barrier inventory insec-prompt-safety.md.pyproject.toml (1)
270-275: LGTM! Fixture-specific lint ignores are appropriate.The new per-file ignores for CodeQL fixture files are well-justified:
TRY301: Deliberate raise patterns for exercising data-flowEM101: String-literal exceptions are part of the fixture test shapeS310:urllibcalls are intentional data-flow sinks for SSRF testingD: Fixture files are not public API surface.github/codeql/fixtures/python/positive.py (3)
16-24: LGTM! Deliberate cleartext logging vulnerability.The positive fixture correctly demonstrates the anti-pattern: using
str(exc)directly in the logger'sextradict. This should triggerpy/clear-text-logging-sensitive-dataeven with the sanitiser pack loaded, proving the models don't over-suppress.
27-33: LGTM! Deliberate path injection vulnerability.The positive fixture passes user input directly to
Path().read_text()with no containment validation, which should triggerpy/path-injection. This validates that thePath.relative_tomodel in the negative fixture isn't over-broad.
36-45: LGTM! Deliberate partial SSRF vulnerability.The positive fixture constructs a URL with unvalidated user input (
repo,tag) and passes it tourlopen, which should triggerpy/partial-ssrf. This ensures the_validate_repo_prefix/_validate_image_tagmodels only suppress when the validators are called..github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml (2)
44-57: LGTM! SSRF validator models are well-scoped.The
_validate_repo_prefixand_validate_image_tagmodels correctly targetscripts.check_image_signaturesand document the regex anchoring +SystemExit(2)enforcement. The note abouturllib.parse.quote()being belt-and-braces is helpful context.
25-42: ⚡ Quick winRemove
check_domain_error_hierarchyfrom the model: it does not define_resolve_root.The model lists 8 modules, but only 7 scripts in
scripts/actually define_resolve_root:check_currency_aggregation_invariant,check_dependency_inversion,check_forbidden_literals,check_no_magic_numbers,check_no_migration_framing,check_no_review_origin_in_code, andcheck_persistence_boundary. The filecheck_domain_error_hierarchy.pyexists but does not contain_resolve_rootand should be removed from the model data block. Additionally, the comment claiming_resolve_rootis "duplicated across 12 scripts" is inaccurate; it exists in only 7 scripts.> Likely an incorrect or invalid review comment.docs/reference/convention-gates.md (1)
15-15: LGTM! New gate correctly inserted in alphabetical order.The
check_codeql_fixtures.pyentry is correctly positioned alphabetically in the convention gate inventory, and the total count on line 51 has been updated to 38 with the proper RS marker..github/codeql/fixtures/javascript/positive.ts (1)
1-16: ⚡ Quick winjs-sanitisers.model.yml exists and is properly configured.
The file is declared in
.github/codeql/extensions/synthorg-sanitisers/qlpack.ymlunderdataExtensionsand contains valid JavaScript/TypeScript sanitizer models forsanitizeForLog,sanitizeWsString, andsanitizeWsEnum. The pack configuration is complete with all extension targets (codeql/python-all,codeql/go-all,codeql/javascript-all) correctly specified, and.github/codeql/codeql-config.ymlproperly references the pack. No configuration issues were found.> Likely an incorrect or invalid review comment..github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml (1)
23-33: ⚡ Quick winIncorrect premise: cli/cmd is
package cmd, notpackage main, and CodeQL can model it.The package declaration in cli/cmd is
package cmd—a regular importable Go package, notpackage main. CodeQL's Models-as-Data can model non-main packages without issue, so the concern about unresolvable package paths does not apply here.However, note that the model entry for
safeStateDircurrently lacks test fixtures incli/internal/codeqlfixtures/fixture.go. Per the coding guidelines, any sanitiser model should have corresponding positive and negative fixtures to validate it works as expected. The existing fixture coversconfig.SecurePathdirectly but not thesafeStateDirwrapper.> Likely an incorrect or invalid review comment.data/runtime_stats.yaml (1)
46-47: LGTM: convention gate count update is consistent.The numeric bump to
38is coherent for this metadata file.scripts/convention_gate_map.yaml (1)
78-82: LGTM: new mandatory gate is correctly registered.The new rule entry is well-formed and wired to an explicit enforcement script.
CLAUDE.md (1)
19-19: LGTM: policy text and gate count update are coherent.The new mandatory CodeQL sanitiser requirement is clearly documented and the gate inventory count is synchronised.
Also applies to: 46-46
.github/workflows/codeql-pack-validate.yml (1)
1-73: LGTM: validation workflow is well-scoped and hardened.Permissions, trigger paths, SHA pinning, and failure-only SARIF artifact upload all look solid.
.github/workflows/codeql.yml (1)
1-83: LGTM: advanced setup workflow is structured correctly.The job-level permissions, concurrency policy, pinned action SHAs, and scheduling setup are all in good shape.
.github/codeql/fixtures/expected.json (1)
1-59: LGTM: fixture expectation manifest is clear and complete.The positive/negative split and Go function-scoped expectations are well specified for deterministic validation.
scripts/run_codeql_pack_validation.sh (1)
46-46: Shellcheck false positive on disable directive.The shellcheck errors (SC1073, SC1072) flagging this line appear to be a shellcheck version or parser issue. The syntax
# shellcheck disable=SC2086 -- <comment>is the standard format for shellcheck disable directives and is correct..github/codeql/fixtures/javascript/negative.ts (1)
13-28: Fixture logic looks correct.The three exported functions correctly exercise the sanitizer barriers:
sanitizeForLogwrapping unknown inputsanitizeWsStringclamping WebSocket payloadssanitizeWsEnumvalidating against an allowlistEach should prevent
js/log-injectionfrom firing when the synthorg-sanitisers pack is loaded.cli/internal/codeqlfixtures/fixture.go (2)
26-39: Path injection fixture pair is well-designed.The negative case properly demonstrates the sanitizer idiom (
config.SecurePathvalidation beforeos.ReadFile), while the positive case shows the vulnerable pattern (direct user input to file operation). This clear contrast should correctly validate the Models-as-Data barrier definition.
54-63: Log injection fixture pair correctly models safe vs. unsafe logging.The negative case uses only static labels and non-string scalars (int, bool), while the positive case directly logs a user-controlled string. This properly tests that CodeQL's
go/log-injectionrule distinguishes between safe structured logging and tainted string interpolation.scripts/check_codeql_fixtures.py (3)
113-124: Scan path matching logic is correct.The prefix matching correctly handles:
- Empty
scan_paths→ match all alerts- Exact path match
- Prefix match with
/separator to avoid false substring matchesThis will correctly filter alerts to the intended fixture scope.
127-170: Fixture validation logic is comprehensive and correct.The checks properly validate:
must_not_fire: reports violations with file:line contextmust_fire: reports missing expected alertsmust_not_fire_at/must_fire_at: function-scoped variantsThe failure messages clearly distinguish under-modeling (unexpected fires) from over-suppression (missing fires).
181-238: Main orchestration logic is well-structured.The function:
- Validates inputs before processing
- Uses appropriate exit codes for different failure modes (USAGE_EXIT_CODE vs FAILURE_EXIT_CODE)
- Accumulates failures across all fixtures before reporting
- Provides clear pass/fail output
.github/codeql/extensions/synthorg-sanitisers/qlpack.yml (1)
1-11: ⚡ Quick winThe CodeQL pack manifest syntax and file structure are valid.
All referenced files exist and the pack configuration follows the correct syntax for CodeQL model packs. The
extensionTargetsusing wildcard"*"is valid and idiomatic, andcodeql/javascript-allcorrectly covers TypeScript analysis.The pipeline error "resolve extensions-by-pack (error code 2)" is therefore unlikely to be caused by the manifest structure itself. Investigate the content and syntax of the referenced model files (
python-sanitisers.model.yml,go-sanitisers.model.yml,js-sanitisers.model.yml) instead, as they may contain invalid definitions or syntax errors.
| import { sanitizeForLog } from '../../../../web/src/utils/logging' | ||
| import { sanitizeWsString, sanitizeWsEnum } from '../../../../web/src/utils/ws-sanitize' |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | ⚖️ Poor tradeoff
Deep relative imports create fragile coupling.
The ../../../../web/src/utils/ imports are brittle—moving either the fixture or the source files will break the tests. Consider whether the fixtures could be co-located with the source code, or whether the source utilities could be published as an internal package that can be imported by path.
That said, this may be a necessary trade-off given the CodeQL pack validation requirements for fixture placement under .github/codeql/fixtures/.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/codeql/fixtures/javascript/negative.ts around lines 7 - 8, The deep
relative imports of sanitizeForLog, sanitizeWsString, and sanitizeWsEnum from
'../../../../web/src/utils/' are fragile; update the imports to a stable
reference by either co-locating the fixture with the source utilities,
publishing/using an internal package (importing from that package name), or
adding a module path alias (e.g., via tsconfig/webpack) and changing the import
statements to use the package or alias; ensure sanitizeForLog, sanitizeWsString,
and sanitizeWsEnum are resolved from the new stable location and update any
build/test config accordingly so tests still run under the CodeQL fixtures
constraint.
| from scripts.check_image_signatures import ( | ||
| ImageTag, | ||
| _validate_image_tag, | ||
| _validate_repo_prefix, | ||
| ) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Check if scripts.check_image_signatures exports the required symbols
# Check if scripts/ is a package (has __init__.py)
fd __init__.py scripts/
# Check if the functions exist in check_image_signatures.py
rg -n "^(def _validate_image_tag|def _validate_repo_prefix|class ImageTag)" scripts/check_image_signatures.py
# Try importing in Python to verify module resolution
python3 -c "from scripts.check_image_signatures import ImageTag, _validate_image_tag, _validate_repo_prefix; print('OK')"Repository: Aureliolo/synthorg
Length of output: 193
The imports are valid and work correctly, but consider the style implications of importing private helpers.
The fixture successfully imports ImageTag, _validate_image_tag, and _validate_repo_prefix from scripts.check_image_signatures. While the imports function correctly (despite scripts/ lacking an __init__.py), importing functions prefixed with _ violates convention as they are intended as private implementation details. Consider either:
- Making the validators public by removing the
_prefix (if they are legitimately part of the API) - Duplicating the validator logic within the fixture
- Moving validators to a public module and importing from there
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/codeql/fixtures/python/negative.py around lines 14 - 18, The test
fixture imports private helpers _validate_image_tag and _validate_repo_prefix
alongside ImageTag; to fix, either make those validators public (rename
_validate_image_tag and _validate_repo_prefix to validate_image_tag and
validate_repo_prefix in scripts.check_image_signatures and update imports), or
remove the private imports and duplicate the needed validator logic inside the
fixture, or move the validator functions into a new public module and import
them there; update all references to use the new public names (e.g.,
validate_image_tag, validate_repo_prefix) or the locally duplicated functions to
avoid importing private symbols.
| artifact = physical.get("artifactLocation", {}) | ||
| region = physical.get("region", {}) | ||
| file_uri = artifact.get("uri") or "" | ||
| start_line = int(region.get("startLine", 0)) |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | 💤 Low value
Consider validating startLine is present.
Defaulting to 0 when startLine is missing may hide SARIF parsing issues. Consider logging a warning or failing if this critical field is absent:
🔍 Proposed defensive check
- start_line = int(region.get("startLine", 0))
+ start_line_raw = region.get("startLine")
+ if start_line_raw is None:
+ print(
+ f"warning: missing startLine in SARIF result for {rule_id} in {file_uri}",
+ file=sys.stderr
+ )
+ start_line = 0
+ else:
+ start_line = int(start_line_raw)That said, CodeQL SARIF output is generally well-formed, so this may be overly defensive.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@scripts/check_codeql_fixtures.py` at line 96, The code currently defaults
start_line = int(region.get("startLine", 0)) which can mask malformed SARIF;
update the logic around region and startLine to explicitly validate existence
and type: check region contains "startLine" and that it's convertible to int
(referencing the region variable and start_line assignment), and if
missing/invalid either log a warning with context (e.g., which result/file) or
raise an exception so parsing fails fast; ensure any logging uses the existing
logger in this module to provide useful diagnostic info.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1839 +/- ##
==========================================
- Coverage 84.79% 84.79% -0.01%
==========================================
Files 1800 1800
Lines 104855 104855
Branches 9200 9200
==========================================
- Hits 88910 88907 -3
- Misses 13712 13714 +2
- Partials 2233 2234 +1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml:
- Around line 27-43: The comment and model are inconsistent: update the
inventory comment to reflect the actual number of scripts that define
_resolve_root (7, not 12) and remove the incorrect model entry for
"scripts.check_domain_error_hierarchy" from the data list (or replace it with
the correct module that actually defines _resolve_root); ensure the YAML entries
only reference real members like "Member[_resolve_root]" for modules that
contain that function so the model and comment match the true set of scripts.
In `@scripts/run_codeql_pack_validation.sh`:
- Line 60: The ShellCheck directive on the line containing "# shellcheck
disable=SC2086 -- extra_args is a deliberately-split arg list" is malformed;
remove the inline "--" comment and separate the directive from the explanatory
comment so ShellCheck recognizes it. Replace that single invalid line with a
directive-only line "# shellcheck disable=SC2086" followed by a plain comment
line like "# extra_args is a deliberately-split arg list" so the directive
applies to SC2086 while preserving the explanation for maintainers.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: bc1e09e8-8d85-488d-be14-b0232494717f
📒 Files selected for processing (6)
.github/codeql/codeql-config.yml.github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml.github/codeql/fixtures/python/negative.pyscripts/run_codeql_pack_validation.sh
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)
- GitHub Check: Build Backend
- GitHub Check: Deploy Preview
- GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
- GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
- GitHub Check: CodSpeed Python benchmarks
- GitHub Check: Lighthouse Site
- GitHub Check: CLI Test (windows-latest)
- GitHub Check: CLI Bench Regression
- GitHub Check: Dashboard Build
- GitHub Check: Dashboard Storybook Build
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Dashboard Test
- GitHub Check: Test (Python 3.14)
- GitHub Check: Type Check
- GitHub Check: Lint
- GitHub Check: Analyze (python)
- GitHub Check: Validate sanitiser pack
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (1)
**/*.sh
📄 CodeRabbit inference engine (CLAUDE.md)
Shell scripting: see
~/.claude/rules/common/bash.md(canonical forcd/git -C/ Bash file-write rules)
Files:
scripts/run_codeql_pack_validation.sh
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T13:11:29.851Z
Learning: Read design specification from `docs/design/` page before implementing; deviations require approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T13:11:29.851Z
Learning: Present every plan for acceptance/denial before coding (MANDATORY planning requirement)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T13:11:29.851Z
Learning: No region/currency/locale privileged in code; use metric units; use British English
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T13:11:29.851Z
Learning: Every convention PR must ship its enforcement gate per docs/reference/convention-gates.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T13:11:29.851Z
Learning: Timeout/slow test failures indicate source-code regression; never edit `tests/baselines/unit_timing.json`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T13:11:29.851Z
Learning: Post-Implementation workflow (MANDATORY): branch + commit + push (no auto-PR); use `/pre-pr-review` before PR creation; use `/aurelio-review-pr` for external feedback; fix EVERYTHING valid; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T13:11:29.851Z
Learning: xdist: use `-n 8 --dist=loadfile` (prevents 3.14+Windows ProactorEventLoop leak); auto-applied via pyproject addopts
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T13:11:29.851Z
Learning: Git commits: use `<type>: <description>` format (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T13:11:29.851Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T13:11:29.851Z
Learning: Branch naming: use `<type>/<slug>` format from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T13:11:29.851Z
Learning: Pre-commit/pre-push hooks configured in `.pre-commit-config.yaml`; Hookify rules in `.claude/hookify.*.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T13:11:29.851Z
Learning: Squash merge with PR body as squash commit message; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T13:11:29.851Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues` tool
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T13:11:29.851Z
Learning: Web implementation: see `web/CLAUDE.md` for React 19 dashboard guidelines
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T13:11:29.851Z
Learning: Python 3.14+ required for synthorg framework; BUSL-1.1 → Apache 2.0 after Change Date
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T13:11:29.851Z
Learning: Project layout: src layout with `src/synthorg/` (Python), `tests/`, `web/` (React 19), `cli/` (Go binary)
🪛 Shellcheck (0.11.0)
scripts/run_codeql_pack_validation.sh
[info] 50-50: The mentioned syntax error was in this brace group.
(SC1009)
[error] 60-60: Couldn't parse this shellcheck directive. Fix to allow more checks.
(SC1073)
[error] 60-60: Expected '=' after directive key. Fix any mentioned problems and try again.
(SC1072)
🔇 Additional comments (4)
.github/codeql/codeql-config.yml (1)
1-14: Configuration looks correct and complete.This matches the advanced CodeQL setup goals, and the fixture ignore patterns (including
cli/internal/codeqlfixtures/**) correctly prevent intentional fixture findings from polluting main scan results..github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml (1)
14-35: Go sanitiser modelling block looks consistent.The two
sanitizerModelentries are clear and scoped/documented well for the intended path-safety flow..github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml (1)
19-40: JS/TS sanitiser rows look correctly structured and focused.This keeps the model surface narrow to the documented leaf sanitisers.
.github/codeql/fixtures/python/negative.py (1)
26-69: Negative fixture cases are well-formed and map cleanly to the modelled barriers.The three functions are explicit and suitable for
must_not_firevalidation.
3049fad to
ee744f1
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/codeql/fixtures/javascript/negative.ts:
- Around line 24-28: The test fixture negativeWsEnumLogInjection returns the
result of sanitizeWsEnum(…) but never passes it into a js/log-injection sink, so
CodeQL cannot validate the sanitizer; change negativeWsEnumLogInjection to send
the sanitized value into a recognized log-injection sink (e.g., call the
project’s logging function or a test sink that CodeQL recognizes) instead of
only returning it, and when you modify the sanitizer model also add the required
CodeQL binding and matching fixture pair under .github/codeql/fixtures (or
cli/internal/codeqlfixtures/) so sanitizeWsEnum and the sink invocation are both
covered.
In @.github/workflows/codeql.yml:
- Around line 34-40: The language matrix under the "language" key currently
lists duplicate analyzers ("javascript", "typescript", and
"javascript-typescript"); remove the redundant "javascript" and "typescript"
entries and keep only "javascript-typescript" so the CodeQL job analyzes JS/TS
code once (update the language array where it lists these three values).
- Around line 69-74: Remove the workflow-level "queries" parameter from the
CodeQL init step (the uses: github/codeql-action/init block) so the config-file
(.github/codeql/codeql-config.yml) can control which queries run; specifically
delete the line setting queries: security-extended in that init step to allow
your custom queries (synthorg-cleartext-logging, synthorg-partial-ssrf) and
language-specific packs from the config file to be applied.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7d618099-e652-47f9-9f4c-d12e34031cac
📒 Files selected for processing (27)
.github/codeql/codeql-config.yml.github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml.github/codeql/extensions/synthorg-sanitisers/qlpack.yml.github/codeql/fixtures/expected.json.github/codeql/fixtures/javascript/negative.ts.github/codeql/fixtures/javascript/positive.ts.github/codeql/fixtures/python/negative.py.github/codeql/fixtures/python/positive.py.github/codeql/queries/synthorg-cleartext-logging/CleartextLoggingWithBarriers.ql.github/codeql/queries/synthorg-cleartext-logging/Sanitizers.qll.github/codeql/queries/synthorg-cleartext-logging/qlpack.yml.github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql.github/codeql/queries/synthorg-partial-ssrf/Sanitizers.qll.github/codeql/queries/synthorg-partial-ssrf/qlpack.yml.github/workflows/codeql-pack-validate.yml.github/workflows/codeql.ymlCLAUDE.mdcli/internal/codeqlfixtures/fixture.godata/runtime_stats.yamldocs/reference/convention-gates.mddocs/reference/sec-prompt-safety.mdpyproject.tomlscripts/check_codeql_fixtures.pyscripts/convention_gate_map.yamlscripts/run_codeql_pack_validation.sh
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: Build Backend
- GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
- GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
- GitHub Check: Build Web Assets (melange)
- GitHub Check: CLI Test (windows-latest)
- GitHub Check: CLI Bench Regression
- GitHub Check: CodSpeed Python benchmarks
- GitHub Check: Dashboard Test
- GitHub Check: Test (Python 3.14)
- GitHub Check: Lighthouse Site
- GitHub Check: Analyze (python)
- GitHub Check: Validate sanitiser pack
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (5)
{README.md,docs/**/*.md}
📄 CodeRabbit inference engine (CLAUDE.md)
Numerics in README and public docs must be sourced from
data/runtime_stats.yamlvia<!--RS:NAME-->markers
Files:
docs/reference/convention-gates.mddocs/reference/sec-prompt-safety.md
docs/**/*.{md,d2}
📄 CodeRabbit inference engine (CLAUDE.md)
Use
d2for architecture/nested container diagrams; usemermaidfor flowcharts/sequence/pipelines; use markdown tables for tabular data; D2 theme 200 (Dark Mauve); CI pinned to v0.7.1
Files:
docs/reference/convention-gates.mddocs/reference/sec-prompt-safety.md
pyproject.toml
📄 CodeRabbit inference engine (CLAUDE.md)
Apply xdist
-n 8 --dist=loadfilevia pyprojectaddopts(loadfile prevents Python 3.14+ Windows ProactorEventLoop leak)
Files:
pyproject.toml
{src/synthorg/**/*.py,.github/codeql/**,cli/internal/codeqlfixtures/**}
📄 CodeRabbit inference engine (CLAUDE.md)
Any change to a project sanitiser requires a matching CodeQL binding (barrierModel in
.github/codeql/extensions/synthorg-sanitisers/*.model.ymlorSanitizersubclass in.github/codeql/queries/synthorg-<rule>/Sanitizers.qll) plus a fixture pair under.github/codeql/fixtures/orcli/internal/codeqlfixtures/
Files:
.github/codeql/fixtures/javascript/positive.ts.github/codeql/extensions/synthorg-sanitisers/qlpack.yml.github/codeql/queries/synthorg-cleartext-logging/qlpack.yml.github/codeql/queries/synthorg-partial-ssrf/qlpack.yml.github/codeql/queries/synthorg-partial-ssrf/Sanitizers.qll.github/codeql/queries/synthorg-cleartext-logging/Sanitizers.qll.github/codeql/codeql-config.yml.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml.github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml.github/codeql/fixtures/python/positive.py.github/codeql/queries/synthorg-cleartext-logging/CleartextLoggingWithBarriers.ql.github/codeql/fixtures/javascript/negative.ts.github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql.github/codeql/fixtures/python/negative.py.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.ymlcli/internal/codeqlfixtures/fixture.go.github/codeql/fixtures/expected.json
cli/**/*.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
Use hardcoded
localhost,postgres:5432, andnats:4222literals in CLI code by design (correct by security/audit rationale; do not make these configurable). See docs/reference/cli-env-vars.md for audit justification
Files:
cli/internal/codeqlfixtures/fixture.go
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:11:42.550Z
Learning: Read design specification in `docs/design/` before implementing; deviations require approval
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:11:42.550Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:11:42.550Z
Learning: No region/currency/locale should be privileged; use metric units and British English
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:11:42.550Z
Learning: Every convention PR must ship its enforcement gate
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:11:42.550Z
Learning: Timeout/slow test failures indicate source-code regression; never edit `tests/baselines/unit_timing.json`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:11:42.550Z
Learning: After issue: create branch + commit + push (no auto-PR); use `/pre-pr-review`; after PR: use `/aurelio-review-pr` for external feedback; fix everything valid and do not defer
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:11:42.550Z
Learning: Commits use format `<type>: <description>` where type is {feat,fix,refactor,docs,test,chore,perf,ci}; enforced by commitizen
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:11:42.550Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:11:42.550Z
Learning: Branch naming: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:11:42.550Z
Learning: Use squash merge with PR body becoming squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:11:42.550Z
Learning: Use `gh issue list` via Bash for GitHub queries, NOT MCP `list_issues`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:11:42.550Z
Learning: Finished issue: branch + commit + push (no auto-PR); use `/pre-pr-review`; after squash merge use `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:11:42.550Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:11:42.550Z
Learning: Use `go -C cli` never `cd cli` for CLI operations
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:11:42.550Z
Learning: See `web/CLAUDE.md` for web framework guidelines and `cli/CLAUDE.md` for CLI guidelines
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:11:42.550Z
Learning: See `~/.claude/rules/common/bash.md` for canonical Bash rules (`cd` / `git -C` / file-write rules)
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
scripts/check_codeql_fixtures.py
🪛 LanguageTool
docs/reference/sec-prompt-safety.md
[uncategorized] ~151-~151: The official name of this software platform is spelled with a capital “H”.
Context: ...aces. Each binding is provably precise: .github/workflows/codeql-pack-validate.yml run...
(GITHUB)
[uncategorized] ~153-~153: The official name of this software platform is spelled with a capital “H”.
Context: ...e 1 -- Models-as-Data extension pack** (.github/codeql/extensions/synthorg-sanitisers/...
(GITHUB)
[uncategorized] ~155-~155: The official name of this software platform is spelled with a capital “H”.
Context: ...Surface 2 -- custom QL query packs (.github/codeql/queries/synthorg-*/). For queri...
(GITHUB)
[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...nthorg-*/). For queries whose abstract Sanitizerclass has nobarrierNode` MaD hook up...
(EN_WORD_COHERENCY)
[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...aD hook upstream, the pack defines a QL Sanitizer subclass and ships a custom query (re-...
(EN_WORD_COHERENCY)
[uncategorized] ~170-~170: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...-sanitisers.model.yml(Surface 1) or aSanitizersubclass under.github/codeql/queries...
(EN_WORD_COHERENCY)
[uncategorized] ~170-~170: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...face 1) or a Sanitizer subclass under .github/codeql/queries/synthorg-<rule>/Sanitizers.qll (Surface 2), (3) a fixture pair (n...
(EN_WORD_COHERENCY)
[uncategorized] ~170-~170: The official name of this software platform is spelled with a capital “H”.
Context: ...ixture pair (negative + positive) under .github/codeql/fixtures/ (or `cli/internal/cod...
(GITHUB)
[uncategorized] ~178-~178: The official name of this software platform is spelled with a capital “H”.
Context: ... paths-ignore: ['tests/**'] clause in .github/codeql/codeql-config.yml. Every histor...
(GITHUB)
CLAUDE.md
[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...tching binding (a barrierModel row in `.github/codeql/extensions/synthorg-sanitisers/*...
(GITHUB)
[uncategorized] ~19-~19: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...rNodeModels-as-Data hook, otherwise aSanitizersubclass under.github/codeql/queries...
(EN_WORD_COHERENCY)
[uncategorized] ~19-~19: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ... otherwise a Sanitizer subclass under .github/codeql/queries/synthorg-<rule>/Sanitizers.qll plus the matching custom .ql) an...
(EN_WORD_COHERENCY)
[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ... custom .ql) and a fixture pair under .github/codeql/fixtures/ (or `cli/internal/cod...
(GITHUB)
[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...internal/codeqlfixtures/). Enforced by .github/workflows/codeql-pack-validate.yml+...
(GITHUB)
🔇 Additional comments (25)
pyproject.toml (1)
270-275: Per-file ignores are appropriate for CodeQL test fixtures.The Ruff rule exceptions (TRY301, EM101, S310, D) are well-justified for security-scanner fixtures that must contain deliberate anti-patterns to validate CodeQL detection. Verification confirms:
- ✓ Fixture files are isolated (no imports outside
.github/codeql/fixtures/)- ✓ Pytest excludes fixtures via
testpaths = ["tests"]- Fixtures contain intentional vulnerable patterns (e.g., S310 urllib security sinks) needed to validate CodeQL's detection capabilities
No blocking concerns. The fixture structure prevents accidental execution in test suites and eliminates risk of anti-patterns leaking into production code.
.github/codeql/queries/synthorg-cleartext-logging/qlpack.yml (1)
1-5: LGTM!The query pack manifest is correctly structured with appropriate naming (
synthorg/cleartext-logging-queries),library: falsefor a query pack, and the standardcodeql/python-alldependency..github/codeql/queries/synthorg-partial-ssrf/Sanitizers.qll (1)
28-35: LGTM!The sanitizer correctly models the OCI validators using the CodeQL API graph. The characteristic predicate properly resolves
scripts.check_image_signatures._validate_repo_prefixand_validate_image_tagcall sites. The documentation clearly explains why the custom QL approach is needed (nobarrierNodeMaD hook for partial-SSRF)..github/codeql/queries/synthorg-cleartext-logging/Sanitizers.qll (1)
24-38: LGTM!The sanitizer correctly models both import paths for the redaction helpers: the full path via
synthorg.observability.redactionand the re-exported path viasynthorg.observability. The disjunction ensures coverage regardless of how callers import the functions.docs/reference/sec-prompt-safety.md (1)
148-183: LGTM!The CodeQL barrier inventory documentation is comprehensive and well-structured. The two-surface approach (Models-as-Data vs custom QL query packs) is clearly explained, and the inventory table provides actionable information for maintainers adding new sanitisers. The mixed spelling of "sanitiser" (prose) vs "Sanitizer" (CodeQL class names) is appropriate—the latter are fixed upstream identifiers.
.github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml (1)
17-33: LGTM!The
barrierModelrows correctly model both Go sanitisers with proper package paths,ReturnValue[0]targeting the cleaned path (not the error), andpath-injectionkind. The comment explaining whysafeStateDirneeds separate modeling (inter-procedural precision loss) is a helpful maintenance note..github/codeql/queries/synthorg-cleartext-logging/CleartextLoggingWithBarriers.ql (1)
1-36: LGTM!The custom query correctly re-uses the upstream
CleartextLoggingFlowwith project-specific sanitisers in scope. Theimport SanitizersbringsSynthorgRedactionSanitizerinto the analysis, and the path-problem structure with source classification is appropriate for security queries..github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql (1)
1-34: LGTM!The partial-SSRF query correctly targets cases where only part of the URL is user-controlled (
not fullyControlledRequest(request)). The high@security-severity 9.1is appropriate for SSRF, and@precision mediumacknowledges the inherent noise in partial-control detection. The sanitiser import ensuresSynthorgImageRefSanitizeris in scope..github/codeql/queries/synthorg-partial-ssrf/qlpack.yml (1)
1-5: LGTM!The query pack manifest is correctly structured, consistent with the cleartext-logging pack format.
.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml (1)
17-35:_resolve_rootbarrier inventory looks consistent.The explicit 7-module
barrierModelmapping is coherent and matches the inline inventory notes..github/codeql/fixtures/python/negative.py (1)
26-69: Negative fixture flows are well-constructed.All three paths keep the intended sanitiser/barrier on the dataflow route for “must not fire” assertions.
.github/codeql/fixtures/python/positive.py (1)
16-45: Positive fixtures correctly preserve genuine leak paths.The three functions are clean “must fire” counterparts to the negative cases.
cli/internal/codeqlfixtures/fixture.go (1)
26-63: Go fixture coverage is solid for both path and log injection cases.The positive/negative function pairs are precise and suitable for pack validation.
.github/codeql/codeql-config.yml (1)
11-29: Config-level fixture exclusions and Python query replacement filters look correct.docs/reference/convention-gates.md (1)
15-51: Gate inventory update is consistent with the RS-backed total.data/runtime_stats.yaml (1)
46-47: Runtime stat bump forconvention_gatesis aligned with the docs inventory update.scripts/convention_gate_map.yaml (1)
78-82: Good gate registration for the new CodeQL sanitiser convention.This mandatory entry correctly wires the enforcement script into the convention inventory.
.github/workflows/codeql-pack-validate.yml (1)
20-73: Workflow hardening and validation flow look solid.Pinned actions, least-privilege permissions, path-scoped triggers, and failure artifact upload are all implemented well.
CLAUDE.md (1)
19-19: The new MANDATORY CodeQL sanitiser rule is clearly specified and enforceable.The linkage to workflow + fixture checker is explicit and actionable.
.github/codeql/fixtures/javascript/positive.ts (1)
7-15: Positive JS fixtures are well targeted for over-suppression checks.Raw payloads flow directly to logging sinks, which is exactly what this harness needs.
.github/codeql/fixtures/expected.json (1)
4-58: Expectation manifest is structured well for cross-language validation.The split between
must_fire/must_not_fireand Go’s function-scoped assertions is appropriate..github/codeql/extensions/synthorg-sanitisers/qlpack.yml (1)
1-11: Pack manifest wiring looks correct.
extensionTargetsanddataExtensionsare set up cleanly for the three model files.scripts/run_codeql_pack_validation.sh (1)
83-119: Validation harness now correctly runs distinct positive/negative analyses.Good to see explicit per-variant analysis and clear rationale in comments.
.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml (1)
13-39: JS barrier models look correctly scoped and structured.The
barrierModelentries are consistent and target the intended log-injection sanitisers with a minimal model surface.scripts/check_codeql_fixtures.py (1)
80-242: Fixture validator implementation is solid and CI-friendly.Parsing, scope filtering, expectation checks, and exit-code handling are all well structured for reliable gate enforcement.
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@CLAUDE.md`:
- Line 19: The long single-line bullet under "CodeQL Sanitisers (MANDATORY)" in
CLAUDE.md is hard to read; split it into a short lead sentence and a bulleted
sub-list that separates: 1) the requirement (matching binding/barrierModel or
Sanitizer subclass), 2) where to put the binding (language-specific
.github/codeql/extensions/.../*.model.yml or
.github/codeql/queries/.../Sanitizers.qll + custom .ql), 3) the required
fixtures (.github/codeql/fixtures/ or cli/internal/codeqlfixtures/), and 4) the
enforcing checks (.github/workflows/codeql-pack-validate.yml and
scripts/check_codeql_fixtures.py) and reference link
(docs/reference/sec-prompt-safety.md#codeql-barrier-inventory); keep each item
on its own line to improve scannability.
- Line 19: The text mixes British "sanitiser" with American
"Sanitizer"/"Sanitizers" in code artifact names; standardize to British spelling
across prose and referenced artifacts to match the codebase policy: rename
usages of "Sanitizer", "Sanitizers.qll" and any occurrences in filenames or
class names referenced (e.g.
`.github/codeql/queries/synthorg-<rule>/Sanitizers.qll`, `Sanitizer` subclass,
`.github/codeql/extensions/.../*.model.yml` bindings, and fixture references
under `.github/codeql/fixtures/`) to use "Sanitiser"/"Sanitisers" (and update
CLAUDE.md line accordingly), and update any docs or references such as
docs/reference/sec-prompt-safety.md to the British spelling so everything is
consistent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 794e1005-2715-4630-be48-044a7e5f6bba
📒 Files selected for processing (11)
.github/codeql/extensions/synthorg-go-sanitisers/go-sanitisers.model.yml.github/codeql/extensions/synthorg-go-sanitisers/qlpack.yml.github/codeql/extensions/synthorg-js-sanitisers/js-sanitisers.model.yml.github/codeql/extensions/synthorg-js-sanitisers/qlpack.yml.github/codeql/extensions/synthorg-python-sanitisers/python-sanitisers.model.yml.github/codeql/extensions/synthorg-python-sanitisers/qlpack.yml.github/codeql/fixtures/javascript/negative.ts.github/workflows/codeql.ymlCLAUDE.mddocs/reference/sec-prompt-safety.mdscripts/run_codeql_pack_validation.sh
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: Build Backend
- GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
- GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
- GitHub Check: Validate sanitiser pack
- GitHub Check: CodSpeed Python benchmarks
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Lighthouse Site
- GitHub Check: CLI Test (windows-latest)
- GitHub Check: CLI Bench Regression
- GitHub Check: Dashboard Test
- GitHub Check: Test (Python 3.14)
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (3)
{.github/codeql/extensions/**,cli/internal/codeqlfixtures/**,.github/codeql/fixtures/**}
📄 CodeRabbit inference engine (CLAUDE.md)
Any change to a CodeQL sanitiser requires matching binding in language-specific
.github/codeql/extensions/synthorg-{python,js,go}-sanitisers/*.model.ymlor custom.qlfiles plus fixture pair in.github/codeql/fixtures/orcli/internal/codeqlfixtures/
Files:
.github/codeql/extensions/synthorg-python-sanitisers/qlpack.yml.github/codeql/extensions/synthorg-js-sanitisers/qlpack.yml.github/codeql/extensions/synthorg-js-sanitisers/js-sanitisers.model.yml.github/codeql/extensions/synthorg-go-sanitisers/qlpack.yml.github/codeql/extensions/synthorg-go-sanitisers/go-sanitisers.model.yml.github/codeql/extensions/synthorg-python-sanitisers/python-sanitisers.model.yml.github/codeql/fixtures/javascript/negative.ts
**/*.sh
📄 CodeRabbit inference engine (CLAUDE.md)
Shell scripts: reference
~/.claude/rules/common/bash.mdfor canonical rules oncd/git -C/ Bash file-write
Files:
scripts/run_codeql_pack_validation.sh
docs/**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
D2 diagrams for architecture/nested containers; mermaid for flowcharts/sequence/pipelines. Markdown tables for tabular data. D2 theme 200 (Dark Mauve), CI pinned to v0.7.1.
Files:
docs/reference/sec-prompt-safety.md
🧠 Learnings (1)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: Read `docs/design/` page before implementing; deviations need approval
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: No region/currency/locale privileged; use metric units; British English in text and code
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: Every convention PR must ship its enforcement gate
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: Timeout/slow test failures are source-code regressions; never edit `tests/baselines/unit_timing.json`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: After issue completion: branch + commit + push (no auto-PR); use `/pre-pr-review`. After PR merge: `/aurelio-review-pr` for external feedback. Fix EVERYTHING valid; no deferring.
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: Git commits: `<type>: <description>` format (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: Branches: `<type>/<slug>` format from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: Use `.pre-commit-config.yaml` for pre-commit/pre-push hooks; see `.claude/hookify.*.md` for rules
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: Squash merge PRs; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: Finished issue: create branch + commit + push (no auto-PR). Use `/pre-pr-review` slash command.
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: After every squash merge: run `/post-merge-cleanup` slash command
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: Web dashboard: reference `web/CLAUDE.md` for additional guidelines
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:26:51.406Z
Learning: CLI (Go binary): reference `cli/CLAUDE.md` for additional guidelines; use `go -C cli`, never `cd cli`
🪛 LanguageTool
docs/reference/sec-prompt-safety.md
[uncategorized] ~151-~151: The official name of this software platform is spelled with a capital “H”.
Context: ...aces. Each binding is provably precise: .github/workflows/codeql-pack-validate.yml run...
(GITHUB)
[uncategorized] ~153-~153: The official name of this software platform is spelled with a capital “H”.
Context: ... 1 -- Models-as-Data extension packs** (`.github/codeql/extensions/synthorg-{python,js,g...
(GITHUB)
[uncategorized] ~155-~155: The official name of this software platform is spelled with a capital “H”.
Context: ...Surface 2 -- custom QL query packs (.github/codeql/queries/synthorg-*/). For queri...
(GITHUB)
[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...nthorg-*/). For queries whose abstract Sanitizerclass has nobarrierNode` MaD hook up...
(EN_WORD_COHERENCY)
[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...aD hook upstream, the pack defines a QL Sanitizer subclass and ships a custom query (re-...
(EN_WORD_COHERENCY)
[uncategorized] ~170-~170: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...-sanitisers.model.yml(Surface 1) or aSanitizersubclass under.github/codeql/queries...
(EN_WORD_COHERENCY)
[uncategorized] ~170-~170: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...face 1) or a Sanitizer subclass under .github/codeql/queries/synthorg-<rule>/Sanitizers.qll (Surface 2), (3) a fixture pair (n...
(EN_WORD_COHERENCY)
[uncategorized] ~170-~170: The official name of this software platform is spelled with a capital “H”.
Context: ...ixture pair (negative + positive) under .github/codeql/fixtures/ (or `cli/internal/cod...
(GITHUB)
[uncategorized] ~178-~178: The official name of this software platform is spelled with a capital “H”.
Context: ... paths-ignore: ['tests/**'] clause in .github/codeql/codeql-config.yml. Every histor...
(GITHUB)
CLAUDE.md
[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...rierModelrow in the language-specific.github/codeql/extensions/synthorg-{python,js,g...
(GITHUB)
[uncategorized] ~19-~19: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...rNodeModels-as-Data hook, otherwise aSanitizersubclass under.github/codeql/queries...
(EN_WORD_COHERENCY)
[uncategorized] ~19-~19: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ... otherwise a Sanitizer subclass under .github/codeql/queries/synthorg-<rule>/Sanitizers.qll plus the matching custom .ql) an...
(EN_WORD_COHERENCY)
[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ... custom .ql) and a fixture pair under .github/codeql/fixtures/ (or `cli/internal/cod...
(GITHUB)
[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...internal/codeqlfixtures/). Enforced by .github/workflows/codeql-pack-validate.yml+...
(GITHUB)
🔇 Additional comments (14)
.github/workflows/codeql.yml (3)
15-30: Least-privilege permissions are well implemented.Workflow-level
permissions: {}plus narrowly scoped job permissions is solid and reduces token blast radius.
34-38: Language matrix is correctly de-duplicated.Using
javascript-typescriptonce (instead of separate JS/TS entries) is the right setup for CodeQL and avoids duplicate analysis cost.
67-79: CodeQL init/analyse wiring looks correct.Using only
config-filein init and a language-scoped analysis category keeps query selection centralised and avoids accidental override behaviour..github/codeql/extensions/synthorg-js-sanitisers/js-sanitisers.model.yml (1)
13-39: JS barrier models are correctly scoped and explicit.The three
barrierModelbindings are clear, minimal, and aligned to the intendedlog-injectionkind..github/codeql/extensions/synthorg-js-sanitisers/qlpack.yml (1)
1-7: Pack metadata and data extension linkage look good.
extensionTargetsanddataExtensionsare correctly configured for the JS sanitiser pack..github/codeql/extensions/synthorg-go-sanitisers/go-sanitisers.model.yml (1)
12-33: Go path-injection barrier bindings are well-formed.The two modeled return-value barriers are explicit and appropriately constrained to
path-injection..github/codeql/extensions/synthorg-go-sanitisers/qlpack.yml (1)
1-7: Go extension pack wiring is correct.The qlpack cleanly targets
codeql/go-alland includes the expected model file..github/codeql/fixtures/javascript/negative.ts (1)
24-32:sanitizeWsEnumfixture now validates the barrier path properly.Routing the sanitized enum into
console.warnmakes this negative test non-vacuous and meaningfully checks the model..github/codeql/extensions/synthorg-python-sanitisers/qlpack.yml (1)
1-7: Python extension pack configuration is solid.The pack metadata and model registration are correctly set for Python.
docs/reference/sec-prompt-safety.md (1)
149-179: Barrier inventory section is clear and operationally complete.The two-surface model explanation plus rule-to-sanitiser tables make maintenance expectations explicit and auditable.
.github/codeql/extensions/synthorg-python-sanitisers/python-sanitisers.model.yml (1)
17-35: Python path-injection barrier mappings are explicit and maintainable.Enumerating each
_resolve_rootmodule keeps the model precise for current script entry points.scripts/run_codeql_pack_validation.sh (1)
1-133: LGTM!Well-structured validation script with good practices:
- Uses
set -euo pipefailfor robust error handling- Avoids
cdin favour of absolute paths derived fromgit rev-parse --show-toplevel- Properly handles CI environment PATH discovery for CodeQL CLI
- Clear documentation explaining fixture strategy per language
The previous critical issues (copying SARIF instead of running separate analyses, malformed ShellCheck directive) have been correctly addressed — Python and JavaScript fixtures now run separate
run_fixtureinvocations, and the ShellCheck disable is properly formatted on its own line.CLAUDE.md (2)
46-46: LGTM: Convention-gates count updated correctly.The update from 35 to 38 enforcement gates aligns with this PR adding CodeQL sanitiser validation as a new enforcement gate, consistent with the mandatory requirement that convention PRs ship their enforcement mechanisms.
19-19: ⚡ Quick winPath pattern in line 19 is accurate.
The directory structure verification confirms that
.github/codeql/extensions/synthorg-{python,js,go}-sanitisers/*.model.ymlcorrectly describes the actual per-language extension pack layout. The three separate directories (synthorg-go-sanitisers/,synthorg-js-sanitisers/,synthorg-python-sanitisers/), each containing a language-specific.model.ymlfile, align exactly with the documented path pattern.> Likely an incorrect or invalid review comment.
f01f6c2 to
04a9fac
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/reference/sec-prompt-safety.md`:
- Around line 153-170: The doc contains hard-coded numeric claims (e.g., counts
referenced near the "Surface 1 -- Models-as-Data extension packs" and "Surface 2
-- custom QL query packs" sections and in the table rows describing sanitiser
coverage) that must be replaced with runtime-stats markers or rewritten
non-numerically; update the markdown to use <!--RS:NAME--> markers pulling from
runtime_stats.yaml (or change phrasing to avoid explicit numbers) where the text
currently asserts concrete counts, and ensure references to specific sanitiser
examples like sanitizeForLog, sanitizeWsString, safe_error_description,
scrub_secret_tokens, and the Surface 1/Surface 2 headings reflect the marker
usage or non-numeric wording.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1083ddb6-b777-47d9-9a96-ac45d23abfb8
📒 Files selected for processing (29)
.github/codeql/codeql-config.yml.github/codeql/extensions/synthorg-go-sanitisers/go-sanitisers.model.yml.github/codeql/extensions/synthorg-go-sanitisers/qlpack.yml.github/codeql/extensions/synthorg-js-sanitisers/js-sanitisers.model.yml.github/codeql/extensions/synthorg-js-sanitisers/qlpack.yml.github/codeql/extensions/synthorg-python-sanitisers/python-sanitisers.model.yml.github/codeql/extensions/synthorg-python-sanitisers/qlpack.yml.github/codeql/fixtures/expected.json.github/codeql/fixtures/javascript/negative.ts.github/codeql/fixtures/javascript/positive.ts.github/codeql/fixtures/python/negative.py.github/codeql/fixtures/python/positive.py.github/codeql/queries/synthorg-cleartext-logging/CleartextLoggingWithBarriers.ql.github/codeql/queries/synthorg-cleartext-logging/Sanitisers.qll.github/codeql/queries/synthorg-cleartext-logging/qlpack.yml.github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql.github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll.github/codeql/queries/synthorg-partial-ssrf/qlpack.yml.github/workflows/codeql-pack-validate.yml.github/workflows/codeql.ymlCLAUDE.mdcli/internal/codeqlfixtures/fixture.godata/runtime_stats.yamldocs/reference/convention-gates.mddocs/reference/sec-prompt-safety.mdpyproject.tomlscripts/check_codeql_fixtures.pyscripts/convention_gate_map.yamlscripts/run_codeql_pack_validation.sh
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: Build Backend
- GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
- GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
- GitHub Check: Dashboard Test
- GitHub Check: Test (Python 3.14)
- GitHub Check: Lighthouse Site
- GitHub Check: Build Web Assets (melange)
- GitHub Check: CLI Test (windows-latest)
- GitHub Check: CLI Bench Regression
- GitHub Check: CodSpeed Python benchmarks
- GitHub Check: Validate sanitiser pack
- GitHub Check: Analyze (python)
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (6)
.github/codeql/**/*.{yml,yaml,qll,ql}
📄 CodeRabbit inference engine (CLAUDE.md)
CodeQL Sanitisers: any change to a project sanitiser requires a matching binding in language-specific
.github/codeql/extensions/synthorg-{python,js,go}-sanitisers/*.model.ymlor custom.qlquery, plus fixture pairs under.github/codeql/fixtures/orcli/internal/codeqlfixtures/. Enforced by.github/workflows/codeql-pack-validate.ymlandscripts/check_codeql_fixtures.py.
Files:
.github/codeql/queries/synthorg-cleartext-logging/qlpack.yml.github/codeql/extensions/synthorg-js-sanitisers/qlpack.yml.github/codeql/extensions/synthorg-go-sanitisers/qlpack.yml.github/codeql/codeql-config.yml.github/codeql/queries/synthorg-partial-ssrf/qlpack.yml.github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll.github/codeql/extensions/synthorg-js-sanitisers/js-sanitisers.model.yml.github/codeql/queries/synthorg-cleartext-logging/Sanitisers.qll.github/codeql/extensions/synthorg-python-sanitisers/python-sanitisers.model.yml.github/codeql/extensions/synthorg-python-sanitisers/qlpack.yml.github/codeql/queries/synthorg-cleartext-logging/CleartextLoggingWithBarriers.ql.github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql.github/codeql/extensions/synthorg-go-sanitisers/go-sanitisers.model.yml
**/*.{md,rst}
📄 CodeRabbit inference engine (CLAUDE.md)
Doc Numeric Claims: numerics in README and public docs sourced from
data/runtime_stats.yamlvia<!--RS:NAME-->markers.
Files:
docs/reference/convention-gates.mdCLAUDE.mddocs/reference/sec-prompt-safety.md
**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Diagrams: use
d2for architecture / nested containers,mermaidfor flowcharts / sequence / pipelines. Use markdown tables for tabular data. D2 theme 200 (Dark Mauve), D2 CLI pinned to v0.7.1 in CI.
Files:
docs/reference/convention-gates.mdCLAUDE.mddocs/reference/sec-prompt-safety.md
cli/**/*.go
📄 CodeRabbit inference engine (cli/CLAUDE.md)
cli/**/*.go: Organize CLI Go code withcmd/for Cobra commands andinternal/for packages (version, config, docker, compose, health, diagnostics, images, selfupdate, completion, ui, verify, backup, scaffold)
SetSYNTHORG_IMAGE_VERIFY_TIMEOUTdefault to 120s with a hard minimum of 1s
SetSYNTHORG_IMAGE_PULL_ATTEMPTSto range 1..100 with default 3, andSYNTHORG_IMAGE_PULL_RETRY_DELAYto default 2s with exponential backoff
SetSYNTHORG_MAX_API_RESPONSE_BYTESdefault to 4MiB,SYNTHORG_MAX_BINARY_BYTESto 256MiB, andSYNTHORG_MAX_ARCHIVE_ENTRY_BYTESto 128MiB
When bothSYNTHORG_DATABASE_URLandSYNTHORG_DB_PATHare present,SYNTHORG_DATABASE_URLtakes precedence; a malformed URL raises loudly at startup rather than silently falling back
Use port3000for web,3001for backend,3002for postgres, and3003for NATS client; validate port collisions across all enabled services ingenerate.go
In generatedcompose.yml, include a one-shotdata-inithelper that chowns each named volume to its non-root owner (65532:65532for backend/NATS,70:70with mode0700for Postgres) before stateful services start
Declare Postgres and NATS services withdepends_on: data-init: condition: service_completed_successfullyto ensure volume initialization completes before services start
In hint tier selection, useHintErrorfor error recovery (always visible unless--quiet),HintNextStepfor natural next actions,HintTipfor config automation suggestions (once/session deduplication), andHintGuidancefor flag/feature discovery (invisible in defaultautomode)
UseHintTipdeduplication within a session (same message shown at most once) and makeHintGuidanceinvisible in defaultautomode, visible only whensynthorg config set hints always
Config-driven overrides apply:color neverimplies--no-color,color alwaysforces color on non-TTYs,output jsonimplies--json, andhintsmode is config-only with options `always/auto/...
Files:
cli/internal/codeqlfixtures/fixture.go
**/*.{sh,bash}
📄 CodeRabbit inference engine (CLAUDE.md)
Shell: follow
~/.claude/rules/common/bash.md(canonical forcd/git -C/ Bash file-write rules).
Files:
scripts/run_codeql_pack_validation.sh
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
Read
docs/design/page before implementing; deviations need approval. See DESIGN_SPEC.md.Present every plan for accept/deny before coding.
Apply Regional Defaults: no region/currency/locale privileged; use metric units; British English.
Every convention PR ships its enforcement gate.
Configuration Precedence: DB > env > YAML > code default via
SettingsService/ConfigResolver; noos.environ.getoutside startup.No Hardcoded Numeric Values: numerics live in
settings/definitions/; allowlist only 0/1/-1, HTTP codes, hex masks, powers-of-2. Enforced byscripts/check_no_magic_numbers.py.Post-Implementation + Pre-PR Review: after issue work, branch + commit + push (no auto-PR); use
/pre-pr-reviewcommand. After PR creation: use/aurelio-review-prfor external feedback. Fix EVERYTHING valid; no deferring.Comments WHY only; no reviewer citations / issue back-refs / migration framing. Enforced by
check_no_review_origin_in_code.pyandcheck_no_migration_framing.py.No
from __future__ import annotations(3.14 has PEP 649). Use PEP 758 except:except A, B:no parens unless binding.Type hints on public functions; mypy strict. Google-style docstrings. Line length 88; functions <50 lines; files <800 lines.
Errors: define as
<Domain><Condition>Errorinheriting fromDomainError; never inherit fromException/RuntimeErrordirectly. Enforced bycheck_domain_error_hierarchy.py.Pydantic v2: use frozen +
extra='forbid'on API DTOs (Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffixes);@computed_fieldfor derived;NotBlankStrfor identifiers.Args models at every system boundary; use
parse_typed()for every external dict ingestion. Enforced bycheck_boundary_typed.py.Immutability: use
model_copy(update=...)orcopy.deepcopy(); apply deepcopy at system boundaries.Async patterns: use
asyncio.TaskGroupfor fan-out/fan-in; helpers must catchExceptionand re-raiseMemoryError/RecursionError.Clock se...
Files:
scripts/check_codeql_fixtures.py
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:56:10.785Z
Learning: Git commits: use format `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); enforced by commitizen.
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:56:10.785Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`).
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:56:10.785Z
Learning: Branches: use format `<type>/<slug>` from main.
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:56:10.785Z
Learning: Squash merge: PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body.
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:56:10.785Z
Learning: Workflow: after every squash merge, run `/post-merge-cleanup`.
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T14:56:10.785Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`.
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
scripts/check_codeql_fixtures.py
🪛 LanguageTool
CLAUDE.md
[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...rierModelrow in the language-specific.github/codeql/extensions/synthorg-{python,js,g...
(GITHUB)
[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...k, otherwise a sanitiser subclass under `.github/codeql/queries/synthorg-/Sanitise...
(GITHUB)
[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ... custom .ql) and a fixture pair under .github/codeql/fixtures/ (or `cli/internal/cod...
(GITHUB)
[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...internal/codeqlfixtures/). Enforced by .github/workflows/codeql-pack-validate.yml+...
(GITHUB)
docs/reference/sec-prompt-safety.md
[uncategorized] ~151-~151: The official name of this software platform is spelled with a capital “H”.
Context: ...aces. Each binding is provably precise: .github/workflows/codeql-pack-validate.yml run...
(GITHUB)
[uncategorized] ~153-~153: The official name of this software platform is spelled with a capital “H”.
Context: ... 1 -- Models-as-Data extension packs** (`.github/codeql/extensions/synthorg-{python,js,g...
(GITHUB)
[uncategorized] ~155-~155: The official name of this software platform is spelled with a capital “H”.
Context: ...Surface 2 -- custom QL query packs (.github/codeql/queries/synthorg-*/). For queri...
(GITHUB)
[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...nthorg-*/). For queries whose abstract Sanitizerclass has nobarrierNode` MaD hook up...
(EN_WORD_COHERENCY)
[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...ass extends the upstream abstract base (CleartextLogging::Sanitizer, ServerSideRequestForgery::Sanitizer...
(EN_WORD_COHERENCY)
[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...ct base (CleartextLogging::Sanitizer, ServerSideRequestForgery::Sanitizer, etc.), which uses American spelling b...
(EN_WORD_COHERENCY)
[uncategorized] ~170-~170: The official name of this software platform is spelled with a capital “H”.
Context: ...urface 1) or a sanitiser subclass under `.github/codeql/queries/synthorg-/Sanitise...
(GITHUB)
[uncategorized] ~170-~170: The official name of this software platform is spelled with a capital “H”.
Context: ...ixture pair (negative + positive) under .github/codeql/fixtures/ (or `cli/internal/cod...
(GITHUB)
[uncategorized] ~178-~178: The official name of this software platform is spelled with a capital “H”.
Context: ... paths-ignore: ['tests/**'] clause in .github/codeql/codeql-config.yml. Every histor...
(GITHUB)
🔇 Additional comments (30)
pyproject.toml (1)
270-275: Scoped Ruff fixture ignores are well-contained.The new per-file ignore block is narrowly targeted to CodeQL Python fixtures and avoids weakening linting globally.
data/runtime_stats.yaml (1)
46-47: Convention gate count update is consistent.
convention_gatesnow reflects the new enforcement gate added in this PR.CLAUDE.md (2)
19-19: New mandatory CodeQL sanitiser policy is explicit and actionable.The rule clearly links sanitiser changes to binding requirements, fixture coverage, and enforcement checks.
46-46: Gate inventory reference is in sync with this PR.Updating to “38 enforcement gates + meta-gate” matches the introduced convention-gate expansion.
.github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll (1)
1-36: Custom partial-SSRF sanitiser modelling looks correct.Good choice using a custom
ServerSideRequestForgery::Sanitizersubclass for the partial-SSRF path where Models-as-DatabarrierNodeisn’t available..github/codeql/extensions/synthorg-js-sanitisers/js-sanitisers.model.yml (1)
13-39: JS barrier models are appropriately scoped.The three
barrierModelentries are tightly scoped tolog-injectionand target the intended sanitizer return values..github/codeql/extensions/synthorg-js-sanitisers/qlpack.yml (1)
1-7: QL pack wiring is clean and sufficient.
extensionTargetsanddataExtensionsare correctly configured for the JS sanitiser model pack..github/codeql/fixtures/javascript/positive.ts (1)
7-15: Positive fixture cases are clear and purposeful.These functions provide straightforward unsanitized sink flows to validate that
js/log-injectionstill fires under the extension pack..github/codeql/fixtures/javascript/negative.ts (1)
13-33: Negative fixtures now exercise each sanitiser barrier effectively.The
sanitizeWsEnumpath correctly includes sink-flow coverage, so this file now validates all three JS models in non-alert scenarios..github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql (1)
21-34: Partial-SSRF query wiring looks correct.The flow/path construction and project-barrier import are consistent with a custom path-problem override.
.github/codeql/queries/synthorg-partial-ssrf/qlpack.yml (1)
1-5: QLPack manifest is clean and valid.Pack metadata and dependency declaration are correctly structured for this custom query pack.
cli/internal/codeqlfixtures/fixture.go (2)
22-39: Path-injection fixture pair is well-formed.The negative/positive split is explicit and suitable for deterministic CodeQL sanitizer validation.
As per coding guidelines, CodeQL sanitiser changes should include fixture pairs, and this segment provides that coverage pattern.
41-63: Log-injection fixture pair is clear and testable.The interface shape and argument choices create a good sanitized-vs-unsanitized contrast for rule assertions.
docs/reference/convention-gates.md (1)
15-51: Convention-gate inventory update is consistent.The new gate entry and RS total increment are synchronized correctly.
.github/codeql/queries/synthorg-cleartext-logging/qlpack.yml (1)
1-5: Cleartext-logging qlpack manifest looks good.Metadata and dependency declaration are correctly defined.
scripts/convention_gate_map.yaml (1)
78-81: New mandatory rule registration is correctly wired.The
CLAUDE.mdheader mapping toscripts/check_codeql_fixtures.pyis complete and consistent.As per coding guidelines, new conventions should be backed by enforcement gates, and this mapping adds that enforcement hook.
.github/codeql/extensions/synthorg-go-sanitisers/qlpack.yml (1)
1-7: Go sanitiser extension pack manifest is correct.Target binding and model extension registration are properly configured.
.github/codeql/extensions/synthorg-python-sanitisers/qlpack.yml (1)
1-7: Python sanitiser extension pack manifest is sound.The manifest correctly wires Python target scope to the model file.
.github/codeql/codeql-config.yml (1)
3-33: CodeQL config wiring looks consistent.The query replacement (
query-filters+ local custom packs) and fixture exclusions are correctly aligned..github/codeql/queries/synthorg-cleartext-logging/Sanitisers.qll (1)
24-38: Sanitiser subclass implementation is well-scoped.The API graph bindings cover both direct redaction-module calls and the observability re-export path.
.github/codeql/fixtures/expected.json (1)
4-57: Fixture expectation manifest is comprehensive.Positive/negative assertions are clearly separated per language and should support stable SARIF diffing.
.github/codeql/fixtures/python/negative.py (1)
26-69: Negative fixture paths are correctly constructed for barrier validation.The three functions cleanly represent sanitized flows for the target Python rules.
.github/workflows/codeql-pack-validate.yml (1)
20-72: Workflow hardening is solid.Least-privilege permissions, pinned actions, scoped triggers, and failure artifact upload are all in place.
.github/codeql/extensions/synthorg-python-sanitisers/python-sanitisers.model.yml (1)
25-35: Python barrier rows are explicit and well-targeted.Module-specific
_resolve_rootmodeling is clear and avoids overbroad matching..github/codeql/extensions/synthorg-go-sanitisers/go-sanitisers.model.yml (1)
17-33: Go sanitizer barrier entries look correct.Both
SecurePathand wrappersafeStateDirare modeled with the expected GobarrierModelshape andpath-injectionkind..github/workflows/codeql.yml (1)
15-31: Strong hardening in the advanced CodeQL workflow.Pinned SHAs, root-level empty permissions with per-job grants, and clean init/analyze wiring look solid.
Also applies to: 67-79
.github/codeql/queries/synthorg-cleartext-logging/CleartextLoggingWithBarriers.ql (1)
29-37: Query implementation looks correct for barrier-aware flow reporting.The source/sink path selection and custom rule metadata are consistent with the intended sanitizer-aware variant.
scripts/run_codeql_pack_validation.sh (1)
56-87: Validation harness structure is clean and maintainable.
run_fixtureplus the final expectation diff step gives a clear single-path validation flow for local and CI runs.Also applies to: 103-133
.github/codeql/fixtures/python/positive.py (1)
16-45: Good positive-fixture coverage for leak paths.The three unsanitized cases (cleartext logging, path injection, partial SSRF) are well targeted for must-fire regression checks.
scripts/check_codeql_fixtures.py (1)
127-170: The SARIF assertion pipeline is robust.Scope filtering plus rule/function assertions and aggregated failure reporting make this a strong convention-gate checker.
Also applies to: 181-238
| **Surface 1 -- Models-as-Data extension packs** (`.github/codeql/extensions/synthorg-{python,js,go}-sanitisers/`). One pack per target language, each declaring `barrierModel` rows for the queries whose CodeQL standard-library `Customizations.qll` exposes a `barrierNode("<kind>")` Models-as-Data hook. The per-language split is required because `barrierModel` signatures differ across language packs (Python and JS take 3 user-provided columns, Go takes 9), and a single mixed pack with multi-language `extensionTargets` causes CodeQL to validate every YAML row's arity against every target pack's signature -- the Go 9-col rows would then fail the Python and JS signature checks. Today: Python `path-injection`, JavaScript `log-injection`, Go `path-injection`. | ||
|
|
||
| **Surface 2 -- custom QL query packs** (`.github/codeql/queries/synthorg-*/`). For queries whose abstract `Sanitizer` class has no `barrierNode` MaD hook upstream, the pack defines a QL sanitiser subclass (in `Sanitisers.qll`) and ships a custom query (re-running the standard flow analysis with the subclass in scope). The subclass extends the upstream abstract base (`CleartextLogging::Sanitizer`, `ServerSideRequestForgery::Sanitizer`, etc.), which uses American spelling because the upstream class is named that way; our subclass names use the British "Sanitiser" form. The standard rule is excluded via `query-filters` in `codeql-config.yml`, and the custom query emits results under a `synthorg/<rule>` ID. Today that covers Python `clear-text-logging-sensitive-data` and `partial-ssrf`. | ||
|
|
||
| | Sanitiser | File | Rule | Surface | Runtime guarantee | | ||
| |---|---|---|---|---| | ||
| | `safe_error_description` | `src/synthorg/observability/redaction.py` | `synthorg/clear-text-logging-sensitive-data` | Custom QL query | Strips OAuth tokens, JSON credential values, URI userinfo, `Authorization:` headers, and Fernet ciphertexts from exception messages; truncates to `MAX_SCRUBBED_LENGTH` | | ||
| | `scrub_secret_tokens` | `src/synthorg/observability/redaction.py` | `synthorg/clear-text-logging-sensitive-data` | Custom QL query | Lower-level helper called by `safe_error_description`; pattern-replace credential shapes idempotently | | ||
| | `_resolve_root` (7 sites under `scripts/`) | `scripts/check_*.py` | `py/path-injection` | `barrierModel` | Resolves the candidate, asserts containment via `os.path.commonpath` and `Path.relative_to`; returns `None` on escape | | ||
| | `_validate_repo_prefix` | `scripts/check_image_signatures.py` | `synthorg/partial-ssrf` | Custom QL query | Anchored regex matches the OCI repo grammar; `\A...\Z` rejects newlines and CR | | ||
| | `_validate_image_tag` | `scripts/check_image_signatures.py` | `synthorg/partial-ssrf` | Custom QL query | Anchored regex matches the OCI image-name + tag grammar | | ||
| | `config.SecurePath` | `cli/internal/config/paths.go` | `go/path-injection` | `barrierModel` | `filepath.Clean` + absolute-path assertion | | ||
| | `safeStateDir` | `cli/cmd/root.go` | `go/path-injection` | `barrierModel` | Wrapper around `config.SecurePath` keyed on the loaded `State.DataDir` | | ||
| | `sanitizeForLog` | `web/src/utils/logging.ts` | `js/log-injection` | `barrierModel` | Strips C0/C1 controls + Unicode BIDI overrides; truncates to `maxLen` | | ||
| | `sanitizeWsString` | `web/src/utils/ws-sanitize.ts` | `js/log-injection` | `barrierModel` | Calls `sanitizeForLog` with the WS-payload length cap | | ||
| | `sanitizeWsEnum` | `web/src/utils/ws-sanitize.ts` | `js/log-injection` | `barrierModel` | Allowlist-validated; returns either the input or a literal fallback constant | | ||
|
|
||
| Adding a new sanitiser is a four-part change in one PR: (1) the helper itself, (2) a matching binding -- either a row in the language-specific `synthorg-{python,js,go}-sanitisers/*-sanitisers.model.yml` (Surface 1) or a sanitiser subclass under `.github/codeql/queries/synthorg-<rule>/Sanitisers.qll` (Surface 2), (3) a fixture pair (negative + positive) under `.github/codeql/fixtures/` (or `cli/internal/codeqlfixtures/` for Go), and (4) its `expected.json` entry. Pick Surface 1 if the upstream query exposes a `barrierNode` MaD hook for the relevant kind; otherwise Surface 2. The standalone `codeql-pack-validate.yml` workflow runs on every pack-related change and fails if either fixture diverges from its expectation. |
There was a problem hiding this comment.
Replace hard-coded numeric claims with runtime-stats markers (or non-numeric phrasing).
Line 153, Line 170, and Line 174 introduce fixed numeric claims in a public doc without <!--RS:...--> markers. Please source these values via runtime-stats markers or rewrite to avoid explicit counts.
As per coding guidelines: **/*.{md,rst}: Doc Numeric Claims: numerics in README and public docs sourced from data/runtime_stats.yaml via <!--RS:NAME--> markers.
Also applies to: 174-179
🧰 Tools
🪛 LanguageTool
[uncategorized] ~153-~153: The official name of this software platform is spelled with a capital “H”.
Context: ... 1 -- Models-as-Data extension packs** (`.github/codeql/extensions/synthorg-{python,js,g...
(GITHUB)
[uncategorized] ~155-~155: The official name of this software platform is spelled with a capital “H”.
Context: ...Surface 2 -- custom QL query packs (.github/codeql/queries/synthorg-*/). For queri...
(GITHUB)
[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...nthorg-*/). For queries whose abstract Sanitizerclass has nobarrierNode` MaD hook up...
(EN_WORD_COHERENCY)
[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...ass extends the upstream abstract base (CleartextLogging::Sanitizer, ServerSideRequestForgery::Sanitizer...
(EN_WORD_COHERENCY)
[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...ct base (CleartextLogging::Sanitizer, ServerSideRequestForgery::Sanitizer, etc.), which uses American spelling b...
(EN_WORD_COHERENCY)
[uncategorized] ~170-~170: The official name of this software platform is spelled with a capital “H”.
Context: ...urface 1) or a sanitiser subclass under `.github/codeql/queries/synthorg-/Sanitise...
(GITHUB)
[uncategorized] ~170-~170: The official name of this software platform is spelled with a capital “H”.
Context: ...ixture pair (negative + positive) under .github/codeql/fixtures/ (or `cli/internal/cod...
(GITHUB)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/reference/sec-prompt-safety.md` around lines 153 - 170, The doc contains
hard-coded numeric claims (e.g., counts referenced near the "Surface 1 --
Models-as-Data extension packs" and "Surface 2 -- custom QL query packs"
sections and in the table rows describing sanitiser coverage) that must be
replaced with runtime-stats markers or rewritten non-numerically; update the
markdown to use <!--RS:NAME--> markers pulling from runtime_stats.yaml (or
change phrasing to avoid explicit numbers) where the text currently asserts
concrete counts, and ensure references to specific sanitiser examples like
sanitizeForLog, sanitizeWsString, safe_error_description, scrub_secret_tokens,
and the Surface 1/Surface 2 headings reflect the marker usage or non-numeric
wording.
… discovery + per-variant SARIF analysis + _resolve_root fixture call)
…r queries lacking MaD hooks (cleartext-logging, partial-ssrf), drop stale _resolve_root entry, fix shellcheck directive
…void go 9-col rows tripping js/py 3-col signature check), remove duplicate codeql language matrix entries + workflow-level queries override, fix sanitizeWsEnum fixture to flow into a sink
…ig (was forcing cross-language barrierModel signature collisions), rename Sanitizers.qll to British Sanitisers.qll
…erModel arity collision blocks every analysis); keep custom QL query packs (cleartext-logging + partial-ssrf) which work via Sanitiser subclass mechanism; defer remaining barriers to follow-up
04a9fac to
aae666b
Compare
| extra={ | ||
| "error_type": type(exc).__name__, | ||
| "error": safe_error_description(exc), | ||
| }, |
| safe_repo = urllib.parse.quote(f"{repo_prefix}{pair.image}", safe="/") | ||
| safe_tag = urllib.parse.quote(pair.tag, safe="") | ||
| url = f"https://ghcr.io/v2/{safe_repo}/manifests/{safe_tag}" | ||
| req = urllib.request.Request(url, method="HEAD") |
| components flow straight into the URL path. | ||
| """ | ||
| url = f"https://ghcr.io/v2/{repo}/manifests/{tag}" | ||
| req = urllib.request.Request(url, method="HEAD") |
There was a problem hiding this comment.
Actionable comments posted: 2
♻️ Duplicate comments (4)
CLAUDE.md (1)
19-19:⚠️ Potential issue | 🟠 Major | ⚡ Quick winThe CodeQL sanitiser rule should allow both binding mechanisms.
This wording mandates only the custom
Sanitisers.qllpath. It should also allow Models-as-DatabarrierModelrows when the upstream query exposes abarrierNodehook, otherwise the policy is stricter than the intended implementation model.Suggested patch
-- **CodeQL Sanitisers (MANDATORY)**: any change to a project sanitiser (helper that closes a CodeQL data-flow rule) requires a sanitiser subclass under `.github/codeql/queries/synthorg-<rule>/Sanitisers.qll` plus the matching custom `.ql` and a fixture pair under `.github/codeql/fixtures/`. Enforced by `.github/workflows/codeql-pack-validate.yml` + `scripts/check_codeql_fixtures.py`. See [docs/reference/sec-prompt-safety.md](docs/reference/sec-prompt-safety.md#codeql-barrier-inventory). +- **CodeQL Sanitisers (MANDATORY)**: any change to a project sanitiser (helper that closes a CodeQL data-flow rule) requires either a `barrierModel` row in `.github/codeql/extensions/synthorg-*/**/*.model.yml` (when the upstream query exposes a `barrierNode` hook) or a sanitiser subclass under `.github/codeql/queries/synthorg-<rule>/Sanitisers.qll` plus the matching custom `.ql`, and a fixture pair under `.github/codeql/fixtures/` (or `cli/internal/codeqlfixtures/`). Enforced by `.github/workflows/codeql-pack-validate.yml` + `scripts/check_codeql_fixtures.py`. See [docs/reference/sec-prompt-safety.md](docs/reference/sec-prompt-safety.md#codeql-barrier-inventory).🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@CLAUDE.md` at line 19, The CLAUDE.md sentence forces only custom Sanitisers.qll path; change it to permit either a sanitiser subclass under Sanitisers.qll OR Models-as-Data barrierModel rows when the upstream query exposes a barrierNode hook, and clarify that the barrierModel mechanism is acceptable and equivalent to the Sanitisers.qll requirement; reference Sanitisers.qll, barrierModel, and barrierNode in the updated wording so reviewers can verify both binding mechanisms are allowed..github/codeql/codeql-config.yml (1)
7-12:⚠️ Potential issue | 🟠 Major | ⚡ Quick winExclude
cli/internal/codeqlfixtures/**from the main CodeQL scan.Intentionally vulnerable Go fixtures can be reported as real repository alerts if this directory is not ignored.
Suggested patch
paths-ignore: - 'tests/**' - 'web/__mocks__/**' - 'web/e2e/**' - '.github/codeql/fixtures/**' + - 'cli/internal/codeqlfixtures/**'🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/codeql/codeql-config.yml around lines 7 - 12, Add the cli/internal/codeqlfixtures/** pattern to the CodeQL paths-ignore list so the intentionally vulnerable Go fixtures are excluded from scans; update the paths-ignore block (the existing paths-ignore entry containing 'tests/**', 'web/__mocks__/**', etc.) to include 'cli/internal/codeqlfixtures/**' ensuring the pattern matches that directory and its contents.docs/reference/sec-prompt-safety.md (1)
160-168:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUse runtime-stats markers (or non-numeric phrasing) for numeric claims in this public doc.
This section hard-codes numeric claims in prose. Please replace with
<!--RS:...-->markers or rewrite to avoid explicit numerics.As per coding guidelines:
{README.md,docs/**/*.md}: Numerics in README + public docs sourced from data/runtime_stats.yaml via <!--RS:NAME--> markers per data/README.md.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/reference/sec-prompt-safety.md` around lines 160 - 168, The doc section currently contains hard-coded numeric claims; replace those explicit numerics with runtime-stats markers using the exact syntax <!--RS:NAME--> (or rephrase to avoid numeric claims), update the corresponding NAME key in data/runtime_stats.yaml if missing, and ensure the README.md/docs/**/*.md files follow the guideline by referencing the same <!--RS:...--> marker; locate the text in sec-prompt-safety.md and swap the numerics for the marker(s) and verify the marker name exists in data/runtime_stats.yaml and is documented in README.md per the project's coding guidelines.scripts/run_codeql_pack_validation.sh (1)
81-102:⚠️ Potential issue | 🟠 Major | 🏗️ Heavy liftFixture validation is currently Python-only, leaving non-Python regressions unchecked.
The script states it validates fixture expectations, but it only runs
python-negative/python-positive. Add Go and JavaScript/TypeScript fixture analyses (or drive execution fromexpected.json) so pack validation actually covers all declared fixture sets.Suggested direction
run_fixture python-negative python "." \ "codeql/python-queries:codeql-suites/python-security-extended.qls" \ "$PYTHON_CUSTOM_QUERIES" run_fixture python-positive python "." \ "codeql/python-queries:codeql-suites/python-security-extended.qls" \ "$PYTHON_CUSTOM_QUERIES" +# Add language fixtures that exist in expected.json +# (or replace this block with a manifest-driven loop). +run_fixture javascript-negative javascript-typescript "." \ + "codeql/javascript-queries:codeql-suites/javascript-security-extended.qls" +run_fixture javascript-positive javascript-typescript "." \ + "codeql/javascript-queries:codeql-suites/javascript-security-extended.qls" +run_fixture go-negative go "cli/internal" \ + "codeql/go-queries:codeql-suites/go-security-extended.qls" +run_fixture go-positive go "cli/internal" \ + "codeql/go-queries:codeql-suites/go-security-extended.qls"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/run_codeql_pack_validation.sh` around lines 81 - 102, The validation only runs Python fixtures (PYTHON_CUSTOM_QUERIES + run_fixture calls for python-negative and python-positive); add equivalent runs for Go and JavaScript/TypeScript fixture sets (e.g., invoke run_fixture with go-negative/go-positive and js-negative/js-positive using appropriate CodeQL suites and any custom queries) or update the harness to iterate expected.json to drive run_fixture for every declared fixture set; update the script around the existing run_fixture calls so run_fixture is invoked for Go and JS/TS fixtures (or replace the hardcoded python calls with a loop over expected.json) and ensure rule-filtering behavior mirrors the Python path.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/codeql/fixtures/expected.json:
- Around line 4-24: The manifest only lists Python fixtures ("python-negative"
and "python-positive"), so add equivalent fixture objects for
JavaScript/TypeScript and Go to ensure those languages are validated; create
negative and positive entries for JS/TS (e.g., "js-negative"/"js-positive" or
"javascript-negative"/"javascript-positive") and for Go (e.g.,
"go-negative"/"go-positive") that mirror the Python structure, include
appropriate "scan_paths" pointing to the JS/TS and Go fixture files, and set
"must_fire" / "must_not_fire" arrays to the same rule IDs
("synthorg/clear-text-logging-sensitive-data", "synthorg/partial-ssrf") so the
pack validates multi-language sanitiser modelling.
In @.github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll:
- Around line 30-35: Update the binding to also match a fully-qualified import
of the target module: add an alternate pattern that uses
API::moduleImport("scripts.check_image_signatures") with
getMember(["_validate_repo_prefix", "_validate_image_tag"]).getACall() (in
addition to the existing
API::moduleImport("scripts").getMember("check_image_signatures") chain) so both
chained imports and direct moduleImport("scripts.check_image_signatures") cases
are covered for the functions _validate_repo_prefix and _validate_image_tag.
---
Duplicate comments:
In @.github/codeql/codeql-config.yml:
- Around line 7-12: Add the cli/internal/codeqlfixtures/** pattern to the CodeQL
paths-ignore list so the intentionally vulnerable Go fixtures are excluded from
scans; update the paths-ignore block (the existing paths-ignore entry containing
'tests/**', 'web/__mocks__/**', etc.) to include
'cli/internal/codeqlfixtures/**' ensuring the pattern matches that directory and
its contents.
In `@CLAUDE.md`:
- Line 19: The CLAUDE.md sentence forces only custom Sanitisers.qll path; change
it to permit either a sanitiser subclass under Sanitisers.qll OR Models-as-Data
barrierModel rows when the upstream query exposes a barrierNode hook, and
clarify that the barrierModel mechanism is acceptable and equivalent to the
Sanitisers.qll requirement; reference Sanitisers.qll, barrierModel, and
barrierNode in the updated wording so reviewers can verify both binding
mechanisms are allowed.
In `@docs/reference/sec-prompt-safety.md`:
- Around line 160-168: The doc section currently contains hard-coded numeric
claims; replace those explicit numerics with runtime-stats markers using the
exact syntax <!--RS:NAME--> (or rephrase to avoid numeric claims), update the
corresponding NAME key in data/runtime_stats.yaml if missing, and ensure the
README.md/docs/**/*.md files follow the guideline by referencing the same
<!--RS:...--> marker; locate the text in sec-prompt-safety.md and swap the
numerics for the marker(s) and verify the marker name exists in
data/runtime_stats.yaml and is documented in README.md per the project's coding
guidelines.
In `@scripts/run_codeql_pack_validation.sh`:
- Around line 81-102: The validation only runs Python fixtures
(PYTHON_CUSTOM_QUERIES + run_fixture calls for python-negative and
python-positive); add equivalent runs for Go and JavaScript/TypeScript fixture
sets (e.g., invoke run_fixture with go-negative/go-positive and
js-negative/js-positive using appropriate CodeQL suites and any custom queries)
or update the harness to iterate expected.json to drive run_fixture for every
declared fixture set; update the script around the existing run_fixture calls so
run_fixture is invoked for Go and JS/TS fixtures (or replace the hardcoded
python calls with a loop over expected.json) and ensure rule-filtering behavior
mirrors the Python path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: d2aefa2b-007c-48a2-bd2c-496095557c41
📒 Files selected for processing (20)
.github/codeql/codeql-config.yml.github/codeql/fixtures/expected.json.github/codeql/fixtures/python/negative.py.github/codeql/fixtures/python/positive.py.github/codeql/queries/synthorg-cleartext-logging/CleartextLoggingWithBarriers.ql.github/codeql/queries/synthorg-cleartext-logging/Sanitisers.qll.github/codeql/queries/synthorg-cleartext-logging/qlpack.yml.github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql.github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll.github/codeql/queries/synthorg-partial-ssrf/qlpack.yml.github/workflows/codeql-pack-validate.yml.github/workflows/codeql.ymlCLAUDE.mddata/runtime_stats.yamldocs/reference/convention-gates.mddocs/reference/sec-prompt-safety.mdpyproject.tomlscripts/check_codeql_fixtures.pyscripts/convention_gate_map.yamlscripts/run_codeql_pack_validation.sh
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: Build Backend
- GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
- GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
- GitHub Check: Build Web Assets (melange)
- GitHub Check: Dashboard Test
- GitHub Check: Test (Python 3.14)
- GitHub Check: Lighthouse Site
- GitHub Check: CodSpeed Python benchmarks
- GitHub Check: Analyze (python)
- GitHub Check: Validate sanitiser pack
- GitHub Check: Analyze (python)
🧰 Additional context used
📓 Path-based instructions (4)
.github/codeql/**/*.qll
📄 CodeRabbit inference engine (CLAUDE.md)
Any change to project sanitiser requires subclass under
.github/codeql/queries/synthorg-<rule>/Sanitisers.qllplus matching custom.qland fixture pair under.github/codeql/fixtures/per sec-prompt-safety.md
Files:
.github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll.github/codeql/queries/synthorg-cleartext-logging/Sanitisers.qll
**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
No
from __future__ import annotations(Python 3.14 has PEP 649). Use PEP 758 except forexcept A, B:which needs parens when binding
Files:
scripts/check_codeql_fixtures.py
{README.md,docs/**/*.md}
📄 CodeRabbit inference engine (CLAUDE.md)
Numerics in README + public docs sourced from
data/runtime_stats.yamlvia<!--RS:NAME-->markers perdata/README.md
Files:
docs/reference/sec-prompt-safety.mddocs/reference/convention-gates.md
docs/**/*.md
📄 CodeRabbit inference engine (CLAUDE.md)
Use
d2for architecture / nested containers; usemermaidfor flowcharts / sequence / pipelines; markdown tables for tabular data; D2 theme 200 (Dark Mauve), D2 CLI pinned to v0.7.1 in CI
Files:
docs/reference/sec-prompt-safety.mddocs/reference/convention-gates.md
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T15:24:52.675Z
Learning: Read `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T15:24:52.675Z
Learning: Present every plan for accept/deny before coding
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T15:24:52.675Z
Learning: No region/currency/locale privileged; use metric units; use British English per `docs/reference/regional-defaults.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T15:24:52.675Z
Learning: Every convention PR ships its enforcement gate per `docs/reference/convention-gates.md`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T15:24:52.675Z
Learning: Timeout/slow test failures indicate source-code regression; never edit `tests/baselines/unit_timing.json`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T15:24:52.675Z
Learning: After issue: branch + commit + push (no auto-PR); use `/pre-pr-review`; use `/aurelio-review-pr` for external feedback after PR; fix EVERYTHING valid; no deferring
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T15:24:52.675Z
Learning: xdist `-n 8 --dist=loadfile` auto-applied via pyproject `addopts` (`loadfile` prevents 3.14+Windows ProactorEventLoop leak)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T15:24:52.675Z
Learning: Git commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T15:24:52.675Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T15:24:52.675Z
Learning: Git branches: `<type>/<slug>` from main
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T15:24:52.675Z
Learning: Squash merge with PR body as squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T15:24:52.675Z
Learning: After every squash merge → `/post-merge-cleanup`
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T15:24:52.675Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T15:24:52.675Z
Learning: Web: see `web/CLAUDE.md`; CLI: see `cli/CLAUDE.md` (use `go -C cli`, never `cd cli`); Shell: see `~/.claude/rules/common/bash.md` (canonical for `cd` / `git -C` / Bash file-write rules)
Learnt from: CR
Repo: Aureliolo/synthorg
Timestamp: 2026-05-09T15:24:52.675Z
Learning: Use `uv sync` for dependency management and all project commands via UV
📚 Learning: 2026-05-05T09:04:46.195Z
Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.
Applied to files:
scripts/check_codeql_fixtures.py
🪛 LanguageTool
CLAUDE.md
[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...le) requires a sanitiser subclass under `.github/codeql/queries/synthorg-/Sanitise...
(GITHUB)
[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...g custom .ql and a fixture pair under .github/codeql/fixtures/. Enforced by `.github...
(GITHUB)
[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ....github/codeql/fixtures/. Enforced by .github/workflows/codeql-pack-validate.yml + `...
(GITHUB)
docs/reference/sec-prompt-safety.md
[uncategorized] ~151-~151: The official name of this software platform is spelled with a capital “H”.
Context: ...tiser helpers via custom query packs at .github/codeql/queries/synthorg-*/. Each pack ...
(GITHUB)
[uncategorized] ~151-~151: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...ass extends the upstream abstract base (CleartextLogging::Sanitizer, ServerSideRequestForgery::Sanitizer...
(EN_WORD_COHERENCY)
[uncategorized] ~151-~151: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...ct base (CleartextLogging::Sanitizer, ServerSideRequestForgery::Sanitizer); the upstream base uses American spel...
(EN_WORD_COHERENCY)
[uncategorized] ~151-~151: The official name of this software platform is spelled with a capital “H”.
Context: ... ID. The bindings are provably precise: .github/workflows/codeql-pack-validate.yml run...
(GITHUB)
[uncategorized] ~160-~160: The official name of this software platform is spelled with a capital “H”.
Context: ... itself, (2) a sanitiser subclass under `.github/codeql/queries/synthorg-/Sanitise...
(GITHUB)
[uncategorized] ~172-~172: The official name of this software platform is spelled with a capital “H”.
Context: ... paths-ignore: ['tests/**'] clause in .github/codeql/codeql-config.yml. Every histor...
(GITHUB)
🔇 Additional comments (8)
pyproject.toml (1)
270-275: Scoped Ruff ignores for CodeQL fixtures look correct.The new per-file ignore target is narrowly scoped and appropriate for intentionally unsafe fixture patterns.
.github/codeql/queries/synthorg-cleartext-logging/CleartextLoggingWithBarriers.ql (1)
29-36: Query wiring and result projection look solid.Flow, classification extraction, and path-problem reporting are consistent with the custom-barrier rule intent.
.github/codeql/queries/synthorg-cleartext-logging/qlpack.yml (1)
1-5: Pack manifest is correctly structured for a non-library query pack..github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql (1)
26-34: Partial-SSRF query logic is well-formed.Source/sink/request binding and the
fullyControlledRequestexclusion are correctly composed for this variant..github/codeql/queries/synthorg-partial-ssrf/qlpack.yml (1)
1-5: Pack declaration is clean and consistent with the custom query setup..github/codeql/fixtures/python/negative.py (1)
24-52: Negative fixtures are well-targeted for barrier validation.Both functions cleanly exercise the intended “must not fire” paths for cleartext logging and partial SSRF.
.github/codeql/queries/synthorg-cleartext-logging/Sanitisers.qll (1)
27-37: ⚡ Quick winThe current sanitizer implementation is correct and already covers both import patterns used throughout the codebase (
from synthorg.observability.redaction import ...andfrom synthorg.observability import ...). The chainedgetMember()pattern is the standard CodeQL idiom used consistently across this repository's sanitizers. The proposed fully-qualifiedmoduleImport()paths do not appear in any existing CodeQL examples in the codebase and do not represent a valid improvement to the current working implementation.> Likely an incorrect or invalid review comment..github/codeql/fixtures/python/positive.py (1)
15-35: Positive fixture intent is clear and correctly implemented.The deliberate unsanitised paths are appropriate for
must_firevalidation and make regression signals explicit.
| { | ||
| "name": "python-negative", | ||
| "language": "python", | ||
| "source_root": ".", | ||
| "scan_paths": [".github/codeql/fixtures/python/negative.py"], | ||
| "must_not_fire": [ | ||
| "synthorg/clear-text-logging-sensitive-data", | ||
| "synthorg/partial-ssrf" | ||
| ] | ||
| }, | ||
| { | ||
| "name": "python-positive", | ||
| "language": "python", | ||
| "source_root": ".", | ||
| "scan_paths": [".github/codeql/fixtures/python/positive.py"], | ||
| "must_fire": [ | ||
| "synthorg/clear-text-logging-sensitive-data", | ||
| "synthorg/partial-ssrf" | ||
| ] | ||
| } | ||
| ] |
There was a problem hiding this comment.
Fixture expectation coverage is Python-only.
The manifest currently validates only Python fixtures, so JS/TS and Go sanitiser modelling can regress without failing this gate. Please add fixture entries for JavaScript/TypeScript and Go to keep validation aligned with the multi-language pack objectives.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/codeql/fixtures/expected.json around lines 4 - 24, The manifest only
lists Python fixtures ("python-negative" and "python-positive"), so add
equivalent fixture objects for JavaScript/TypeScript and Go to ensure those
languages are validated; create negative and positive entries for JS/TS (e.g.,
"js-negative"/"js-positive" or "javascript-negative"/"javascript-positive") and
for Go (e.g., "go-negative"/"go-positive") that mirror the Python structure,
include appropriate "scan_paths" pointing to the JS/TS and Go fixture files, and
set "must_fire" / "must_not_fire" arrays to the same rule IDs
("synthorg/clear-text-logging-sensitive-data", "synthorg/partial-ssrf") so the
pack validates multi-language sanitiser modelling.
| this = | ||
| API::moduleImport("scripts") | ||
| .getMember("check_image_signatures") | ||
| .getMember(["_validate_repo_prefix", "_validate_image_tag"]) | ||
| .getACall() | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Verify import style used by Python callsites and compare with current QL API binding.
# Expected: callsites use scripts.check_image_signatures directly.
rg -n --type=py 'from\s+scripts\.check_image_signatures\s+import|import\s+scripts\.check_image_signatures'
rg -n --type=ql 'moduleImport\("scripts"\)\s*\.getMember\("check_image_signatures"\)|moduleImport\("scripts\.check_image_signatures"\)' \
.github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qllRepository: Aureliolo/synthorg
Length of output: 89
🏁 Script executed:
#!/bin/bash
# Search for Python imports of scripts.check_image_signatures
echo "=== Python imports of scripts.check_image_signatures ==="
rg -n --type py 'scripts\.check_image_signatures' 2>/dev/null || echo "No Python matches found"
echo -e "\n=== Python imports using 'from scripts' ==="
rg -n --type py 'from\s+scripts' 2>/dev/null | head -20
echo -e "\n=== Examining the actual Sanitisers.qll file ==="
cat -n .github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll 2>/dev/null | head -50
echo -e "\n=== Search for other moduleImport patterns in .qll files ==="
find .github/codeql -name '*.qll' -exec grep -l "moduleImport" {} \; 2>/dev/null
echo -e "\n=== Show moduleImport patterns from found files ==="
find .github/codeql -name '*.qll' -exec grep -n "API::moduleImport" {} + 2>/dev/null | head -30Repository: Aureliolo/synthorg
Length of output: 4801
🏁 Script executed:
#!/bin/bash
echo "=== Search for check_image_signatures definitions and usage ==="
rg -n "check_image_signatures" --type py
echo -e "\n=== Fixture files for partial-ssrf query ==="
find .github/codeql/fixtures -name "*ssrf*" -o -name "*partial*" 2>/dev/null | head -20
echo -e "\n=== Check if there's a check_image_signatures.py file ==="
find . -name "check_image_signatures.py" 2>/dev/null
echo -e "\n=== Scripts directory structure ==="
ls -la scripts/ 2>/dev/null | grep -i "check\|image" || echo "Listing scripts directory..."
ls -la scripts/ 2>/dev/null | head -20
echo -e "\n=== Examine the partial-ssrf query file ==="
cat -n .github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql 2>/dev/null | head -50Repository: Aureliolo/synthorg
Length of output: 8648
🏁 Script executed:
#!/bin/bash
echo "=== Contents of check_image_signatures.py ==="
cat -n scripts/check_image_signatures.py | head -100
echo -e "\n=== Search for _validate_repo_prefix and _validate_image_tag ==="
rg -n "_validate_repo_prefix|_validate_image_tag" scripts/check_image_signatures.pyRepository: Aureliolo/synthorg
Length of output: 4924
🏁 Script executed:
#!/bin/bash
echo "=== Search for actual imports of check_image_signatures ==="
rg -n "from.*check_image_signatures|import.*check_image_signatures"
echo -e "\n=== Search for direct calls to _validate_repo_prefix or _validate_image_tag ==="
rg -n "_validate_repo_prefix|_validate_image_tag" --type py
echo -e "\n=== Check if scripts is a Python package ==="
ls -la scripts/__init__.py 2>/dev/null || echo "No __init__.py found - scripts is not a package"
echo -e "\n=== Look for fixture files ==="
find .github/codeql/fixtures -type f -name "*.py" | head -10Repository: Aureliolo/synthorg
Length of output: 1677
🏁 Script executed:
#!/bin/bash
echo "=== Contents of positive.py fixture ==="
cat -n .github/codeql/fixtures/python/positive.py
echo -e "\n=== Contents of negative.py fixture ==="
cat -n .github/codeql/fixtures/python/negative.py
echo -e "\n=== Check for partial-ssrf specific fixtures ==="
find .github/codeql/fixtures -type f | xargs grep -l "partial.*ssrf\|check_image_signatures" 2>/dev/nullRepository: Aureliolo/synthorg
Length of output: 3981
🏁 Script executed:
#!/bin/bash
echo "=== Examine the cleartext-logging Sanitisers.qll for comparison ==="
cat -n .github/codeql/queries/synthorg-cleartext-logging/Sanitisers.qll
echo -e "\n=== Check the fixture for cleartext-logging ==="
find .github/codeql/fixtures -name "*.py" -exec grep -l "synthorg.observability.redaction\|safe_error_description" {} \;
echo -e "\n=== Verify how safe_error_description is imported ==="
rg -n "from.*redaction|import.*safe_error_description" --type pyRepository: Aureliolo/synthorg
Length of output: 50374
Consider using fully-qualified module import for defensive binding.
At Line 31, the binding uses a chained member access pattern API::moduleImport("scripts").getMember("check_image_signatures"). While this matches the actual import style in the fixture (from scripts.check_image_signatures import ...), using a fully-qualified module path would be more defensive against future code patterns where someone might import scripts and access check_image_signatures as a member.
The synthorg-cleartext-logging Sanitisers.qll demonstrates this defensive approach by binding both chained and direct patterns separately. If you want stronger future-proofing:
Proposed fix
- API::moduleImport("scripts")
- .getMember("check_image_signatures")
+ API::moduleImport("scripts.check_image_signatures")
.getMember(["_validate_repo_prefix", "_validate_image_tag"])
.getACall()📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| this = | |
| API::moduleImport("scripts") | |
| .getMember("check_image_signatures") | |
| .getMember(["_validate_repo_prefix", "_validate_image_tag"]) | |
| .getACall() | |
| } | |
| this = | |
| API::moduleImport("scripts.check_image_signatures") | |
| .getMember(["_validate_repo_prefix", "_validate_image_tag"]) | |
| .getACall() | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In @.github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll around lines 30
- 35, Update the binding to also match a fully-qualified import of the target
module: add an alternate pattern that uses
API::moduleImport("scripts.check_image_signatures") with
getMember(["_validate_repo_prefix", "_validate_image_tag"]).getACall() (in
addition to the existing
API::moduleImport("scripts").getMember("check_image_signatures") chain) so both
chained imports and direct moduleImport("scripts.check_image_signatures") cases
are covered for the functions _validate_repo_prefix and _validate_image_tag.
|
Closing without merging. After 5 rounds of fixes, the architecture turned out to cost more than it delivered:
Net: substantial implementation effort, marginal residual value, and the surviving 4 sanitisers only matter if the advanced setup is enabled (which itself requires manual disabling of default setup). If there's ever a concrete reason to switch from default to advanced CodeQL setup, the architecture decisions and dead-ends explored here are documented in the babysit-pr round history at |
Closes #1836.
Why
GitHub CodeQL on default-setup repeatedly flags the same idioms in every PR because its built-in data-flow models can't see through SynthOrg's custom sanitiser helpers. Roughly 70 manual dismissals to date across
py/clear-text-logging-sensitive-data,py/path-injection,py/partial-ssrf,go/path-injection,js/log-injection, andgo/log-injection. Default-setup forbids custom queries, sanitiser models, and query-suite filters, so closing the gap requires advanced setup plus a Models-as-Data extension pack pinned in the repo.What changed
Workflows
.github/workflows/codeql.yml— advanced-setup scan (push + PR + Mon 07:23 UTC cron). 6-language matrix (actions, go, javascript, javascript-typescript, python, typescript);security-extendedquery suite explicit on both the action and the config; pinned action SHAs;permissions: {}at workflow root with per-job grants; concurrency-grouped..github/workflows/codeql-pack-validate.yml— runs only when pack files / fixtures / driver scripts change. Installs CodeQL CLI via the action, then drivesscripts/run_codeql_pack_validation.sh. Uploads SARIF artifacts on failure.Extension pack (
.github/codeql/extensions/synthorg-sanitisers/)qlpack.yml— extension targetscodeql/{python,go,javascript}-all.python-sanitisers.model.yml—safe_error_description,scrub_secret_tokens,_resolve_rootacross 7 scripts (the 12 sites listed in the issue collapse to 7 distinct modules in the diff),_validate_repo_prefix,_validate_image_tag.kind: "value"summary rows so taint is dropped without affecting unrelated rules.go-sanitisers.model.yml—config.SecurePath,cmd.safeStateDir.js-sanitisers.model.yml—sanitizeForLog,sanitizeWsString,sanitizeWsEnum.Fixtures (negative + positive pair per language)
.github/codeql/fixtures/python/{negative,positive}.py.github/codeql/fixtures/javascript/{negative,positive}.tscli/internal/codeqlfixtures/fixture.go(lives inside thecli/module sointernal/configimports resolve). Covers bothgo/path-injectionandgo/log-injectionshapes (the latter mirroringLogger.Info(msg, kvs ...any)with int+bool args fromdocker/sidecar/internal/health/health.go).expected.jsonmanifest ofmust_fire/must_not_fireassertions per fixture.scripts/check_codeql_fixtures.py— SARIF assertion script (stdlib only; per-rule + per-function granularity).scripts/run_codeql_pack_validation.sh— driver that callscodeql database create / analyzeper language and pipes the result through the assertion script.Docs + conventions
docs/reference/sec-prompt-safety.md— new "CodeQL barrier inventory" table (10 sanitisers → rules), a "Rules covered without a barrier model" subsection (paths-ignore strategy forpy/incomplete-url-substring-sanitization; typed-args fixture forgo/log-injection), and a cross-link from the Key reference call sites.CLAUDE.md— newCodeQL Sanitisers (MANDATORY)one-liner; bumped the gate-inventory count from 35 to 38.docs/reference/convention-gates.md— addedcheck_codeql_fixtures.py; bumped the RS counter from 37 to 38.data/runtime_stats.yaml— same counter bump.scripts/convention_gate_map.yaml— registeredclaude-md::codeql-sanitisersagainstscripts/check_codeql_fixtures.pyso the meta-gate accepts the new MANDATORY paragraph.pyproject.toml— per-file ruff ignores for.github/codeql/fixtures/python/*.py(deliberate anti-pattern demonstrations).Issue FP-table coverage
py/clear-text-logging-sensitive-datasafe_error_description+scrub_secret_tokens; fixtures assertpy/path-injection_resolve_root(7 modules); fixtures assertgo/path-injectionconfig.SecurePath+safeStateDir; fixtures assertjs/log-injectionsanitizeForLog+sanitizeWsString+sanitizeWsEnum; fixtures assertpy/partial-ssrf_validate_repo_prefix+_validate_image_tag; fixtures assertpy/incomplete-url-substring-sanitizationpaths-ignore: ['tests/**'](every historical dismissal was on test code); documented in inventorygo/log-injectionNegativeLogInjectionint+bool) asserts CodeQL's built-inLogger.Info(msg, kvs ...any)handling is sufficient; positive fixture proves the rule still fires on raw user-string argsTest plan
go -C cli vet ./internal/codeqlfixtures/✅go -C cli build ./internal/codeqlfixtures/✅uv run pre-commit run --all-files✅ (incl. zizmor + actionlint on both new workflows)uv run pre-commit run --hook-stage pre-push convention-gate-inventory✅ (meta-gate accepts the new MANDATORY)codeql-pack-validateworkflow itself runs on this PR's diff and is the end-to-end regression test for the pack's precision.Deploy checklist (BLOCKING — manual repo-settings step)
code-scanningSARIF channel and only one may be enabled at a time.maincommit to complete (~15 min wall-clock for 6 languages).(ruleId, file, startLine, fingerprint). Already-dismissed alerts stay dismissed; new alerts that the pack does not suppress will (correctly) re-fire.Review coverage
/aurelio-review-prfor external-reviewer feedback (CodeRabbit, etc.).Out of scope