feat: adopt CodeQL advanced setup + Models-as-Data sanitiser pack (#1836) by Aureliolo · Pull Request #1839 · Aureliolo/synthorg

Aureliolo · 2026-05-09T11:52:22Z

Closes #1836.

Why

GitHub CodeQL on default-setup repeatedly flags the same idioms in every PR because its built-in data-flow models can't see through SynthOrg's custom sanitiser helpers. Roughly 70 manual dismissals to date across py/clear-text-logging-sensitive-data, py/path-injection, py/partial-ssrf, go/path-injection, js/log-injection, and go/log-injection. Default-setup forbids custom queries, sanitiser models, and query-suite filters, so closing the gap requires advanced setup plus a Models-as-Data extension pack pinned in the repo.

What changed

Workflows

.github/workflows/codeql.yml — advanced-setup scan (push + PR + Mon 07:23 UTC cron). 6-language matrix (actions, go, javascript, javascript-typescript, python, typescript); security-extended query suite explicit on both the action and the config; pinned action SHAs; permissions: {} at workflow root with per-job grants; concurrency-grouped.
.github/workflows/codeql-pack-validate.yml — runs only when pack files / fixtures / driver scripts change. Installs CodeQL CLI via the action, then drives scripts/run_codeql_pack_validation.sh. Uploads SARIF artifacts on failure.

Extension pack (.github/codeql/extensions/synthorg-sanitisers/)

qlpack.yml — extension targets codeql/{python,go,javascript}-all.
python-sanitisers.model.yml — safe_error_description, scrub_secret_tokens, _resolve_root across 7 scripts (the 12 sites listed in the issue collapse to 7 distinct modules in the diff), _validate_repo_prefix, _validate_image_tag. kind: "value" summary rows so taint is dropped without affecting unrelated rules.
go-sanitisers.model.yml — config.SecurePath, cmd.safeStateDir.
js-sanitisers.model.yml — sanitizeForLog, sanitizeWsString, sanitizeWsEnum.

Fixtures (negative + positive pair per language)

Python: .github/codeql/fixtures/python/{negative,positive}.py
JS/TS: .github/codeql/fixtures/javascript/{negative,positive}.ts
Go: cli/internal/codeqlfixtures/fixture.go (lives inside the cli/ module so internal/config imports resolve). Covers both go/path-injection and go/log-injection shapes (the latter mirroring Logger.Info(msg, kvs ...any) with int+bool args from docker/sidecar/internal/health/health.go).
expected.json manifest of must_fire / must_not_fire assertions per fixture.
scripts/check_codeql_fixtures.py — SARIF assertion script (stdlib only; per-rule + per-function granularity).
scripts/run_codeql_pack_validation.sh — driver that calls codeql database create / analyze per language and pipes the result through the assertion script.

Docs + conventions

docs/reference/sec-prompt-safety.md — new "CodeQL barrier inventory" table (10 sanitisers → rules), a "Rules covered without a barrier model" subsection (paths-ignore strategy for py/incomplete-url-substring-sanitization; typed-args fixture for go/log-injection), and a cross-link from the Key reference call sites.
CLAUDE.md — new CodeQL Sanitisers (MANDATORY) one-liner; bumped the gate-inventory count from 35 to 38.
docs/reference/convention-gates.md — added check_codeql_fixtures.py; bumped the RS counter from 37 to 38.
data/runtime_stats.yaml — same counter bump.
scripts/convention_gate_map.yaml — registered claude-md::codeql-sanitisers against scripts/check_codeql_fixtures.py so the meta-gate accepts the new MANDATORY paragraph.
pyproject.toml — per-file ruff ignores for .github/codeql/fixtures/python/*.py (deliberate anti-pattern demonstrations).

Issue FP-table coverage

Rule	Coverage
`py/clear-text-logging-sensitive-data`	Barrier rows for `safe_error_description` + `scrub_secret_tokens`; fixtures assert
`py/path-injection`	Barrier rows for `_resolve_root` (7 modules); fixtures assert
`go/path-injection`	Barrier rows for `config.SecurePath` + `safeStateDir`; fixtures assert
`js/log-injection`	Barrier rows for `sanitizeForLog` + `sanitizeWsString` + `sanitizeWsEnum`; fixtures assert
`py/partial-ssrf`	Barrier rows for `_validate_repo_prefix` + `_validate_image_tag`; fixtures assert
`py/incomplete-url-substring-sanitization`	Suppressed by `paths-ignore: ['tests/**']` (every historical dismissal was on test code); documented in inventory
`go/log-injection`	Typed-args fixture (`NegativeLogInjection` int+bool) asserts CodeQL's built-in `Logger.Info(msg, kvs ...any)` handling is sufficient; positive fixture proves the rule still fires on raw user-string args

Test plan

go -C cli vet ./internal/codeqlfixtures/ ✅
go -C cli build ./internal/codeqlfixtures/ ✅
uv run pre-commit run --all-files ✅ (incl. zizmor + actionlint on both new workflows)
uv run pre-commit run --hook-stage pre-push convention-gate-inventory ✅ (meta-gate accepts the new MANDATORY)
The new codeql-pack-validate workflow itself runs on this PR's diff and is the end-to-end regression test for the pack's precision.

Deploy checklist (BLOCKING — manual repo-settings step)

Disable default-setup CodeQL in repo Settings → Code security → Code scanning → Default → Disable. The advanced-setup workflow cannot start while default-setup is still active; both compete for the same code-scanning SARIF channel and only one may be enabled at a time.
Wait for the first advanced-setup run on the merged main commit to complete (~15 min wall-clock for 6 languages).
Verify all six language entries appear under Security → Code scanning alerts.
Existing dismissals carry over: CodeQL dedupes by (ruleId, file, startLine, fingerprint). Already-dismissed alerts stay dismissed; new alerts that the pack does not suppress will (correctly) re-fire.

Review coverage

7 local-agent reviewers (docs-consistency, comment-quality-rot, infra-reviewer, go-reviewer, go-security-reviewer, go-conventions-enforcer, issue-resolution-verifier) plus a triage round that surfaced 9 valid findings; all 9 implemented in commit 2 of this branch.
After merge: run /aurelio-review-pr for external-reviewer feedback (CodeRabbit, etc.).

Out of scope

Dismissing the 12 currently-open alerts (handled in fix existing security issues in github alerts #1825).
Any change to sanitiser implementations themselves (the pack only models existing helpers).

github-actions · 2026-05-09T11:52:34Z

Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

OpenSSF Scorecard

Package

Version

Score

Details

actions/actions/checkout

de0fac2e4500dabe0009e67214ff5f5447ce83dd

🟢 5.7

Details

Check	Score	Reason
Code-Review	🟢 10	all changesets reviewed
Maintained	⚠️ 0	0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0
Dangerous-Workflow	🟢 10	no dangerous workflow patterns detected
Binary-Artifacts	🟢 10	no binaries found in the repo
CII-Best-Practices	⚠️ 0	no effort to earn an OpenSSF best practices badge detected
Token-Permissions	⚠️ 0	detected GitHub workflow tokens with excessive permissions
Fuzzing	⚠️ 0	project is not fuzzed
Packaging	⚠️ -1	packaging workflow not detected
License	🟢 10	license file detected
Pinned-Dependencies	🟢 3	dependency not pinned by hash detected -- score normalized to 3
Signed-Releases	⚠️ -1	no releases found
Security-Policy	🟢 9	security policy file detected
Branch-Protection	🟢 5	branch protection is not maximal on development and all release branches
SAST	🟢 8	SAST tool detected but not run on all commits

actions/actions/setup-go

4a3601121dd01d1626a1e23e37211e3254c1c06c

🟢 5.6

Details

Check	Score	Reason
Code-Review	🟢 10	all changesets reviewed
Maintained	🟢 5	6 commit(s) and 1 issue activity found in the last 90 days -- score normalized to 5
Packaging	⚠️ -1	packaging workflow not detected
Binary-Artifacts	🟢 10	no binaries found in the repo
Dangerous-Workflow	🟢 10	no dangerous workflow patterns detected
CII-Best-Practices	⚠️ 0	no effort to earn an OpenSSF best practices badge detected
Token-Permissions	⚠️ 0	detected GitHub workflow tokens with excessive permissions
Pinned-Dependencies	⚠️ 0	dependency not pinned by hash detected -- score normalized to 0
Fuzzing	⚠️ 0	project is not fuzzed
License	🟢 10	license file detected
Signed-Releases	⚠️ -1	no releases found
Security-Policy	🟢 9	security policy file detected
Branch-Protection	⚠️ 0	branch protection not enabled on development/release branches
SAST	🟢 10	SAST tool is run on all commits

actions/actions/setup-node

48b55a011bda9f5d6aeb4c2d9c7362e8dae4041e

🟢 5.8

Details

Check	Score	Reason
Maintained	🟢 7	9 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 7
Code-Review	🟢 10	all changesets reviewed
Binary-Artifacts	🟢 9	binaries present in source code
Dangerous-Workflow	🟢 10	no dangerous workflow patterns detected
CII-Best-Practices	⚠️ 0	no effort to earn an OpenSSF best practices badge detected
Packaging	⚠️ -1	packaging workflow not detected
Token-Permissions	⚠️ 0	detected GitHub workflow tokens with excessive permissions
Pinned-Dependencies	⚠️ 0	dependency not pinned by hash detected -- score normalized to 0
Fuzzing	⚠️ 0	project is not fuzzed
License	🟢 10	license file detected
Signed-Releases	⚠️ -1	no releases found
Security-Policy	🟢 9	security policy file detected
Branch-Protection	⚠️ 1	branch protection is not maximal on development and all release branches
SAST	🟢 9	SAST tool is not run on all commits -- score normalized to 9

actions/astral-sh/setup-uv

08807647e7069bb48b6ef5acd8ec9567f424441b

Unknown

actions/github/codeql-action/analyze

68bde559dea0fdcac2102bfdf6230c5f70eb485e

Unknown

actions/github/codeql-action/autobuild

68bde559dea0fdcac2102bfdf6230c5f70eb485e

Unknown

actions/github/codeql-action/init

68bde559dea0fdcac2102bfdf6230c5f70eb485e

Unknown

Scanned Files

.github/workflows/codeql.yml

coderabbitai · 2026-05-09T11:53:18Z

Walkthrough

This PR implements two CodeQL custom query packs (synthorg-cleartext-logging and synthorg-partial-ssrf) with project-specific sanitizer barriers, integrated via a fixture-based validation framework. The configuration enables default CodeQL queries while routing analysis through security-extended plus the custom packs, which define taint-flow rules that suppress alerts when validated by helper functions (safe_error_description() for logging, image/repo validators for SSRF). The validation infrastructure comprises negative and positive Python fixtures, a SARIF result checker, and a bash orchestration script that generates and validates CodeQL databases against expected.json. GitHub Actions workflows automate both fixture validation on PRs and production analysis across Python, Go, and JavaScript code. Convention gates, documentation, and linting configuration are updated to enforce and document the new sanitizer barrier requirement.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title 'feat: adopt CodeQL advanced setup + Models-as-Data sanitiser pack' clearly and specifically summarizes the main change: adopting CodeQL advanced setup with a Models-as-Data sanitiser pack.
Description check	✅ Passed	The PR description is comprehensive and directly addresses the changeset, explaining the motivation, what changed, coverage of issue objectives, test plan, and deployment requirements.
Linked Issues check	✅ Passed	The PR fulfills the linked issue `#1836` objectives: implements advanced-setup CodeQL workflow, adds Models-as-Data extension pack modeling sanitizers across Python/Go/JS, validates with fixtures and SARIF assertions, documents the barrier inventory, and registers validation script in conventions.
Out of Scope Changes check	✅ Passed	All changes are scoped to implementing the CodeQL advanced setup and Models-as-Data pack per issue `#1836`; no unrelated modifications are present. Dismissing open alerts and sanitizer implementation changes are explicitly marked out of scope.
Docstring Coverage	✅ Passed	Docstring coverage is 75.00% which is sufficient. The required threshold is 40.00%.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

gemini-code-assist

Code Review

This pull request introduces a custom CodeQL extension pack to model project-specific sanitiser helpers as taint barriers for Python, Go, and JavaScript, including a validation framework and updated documentation. The review identifies a critical issue where the Models-as-Data definitions incorrectly use summaryModel with kind: "value", which preserves taint instead of blocking it. Feedback also notes potential module resolution issues in the Python models and a test fixture that fails to exercise the helper it is intended to validate.

gemini-code-assist · 2026-05-09T11:55:00Z

+  - addsTo:
+      pack: codeql/python-all
+      extensible: summaryModel
+    data:
+      - ["synthorg.observability.redaction", "Member[safe_error_description]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"]
+      - ["synthorg.observability", "Member[safe_error_description]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"]
+      - ["synthorg.observability.redaction", "Member[scrub_secret_tokens]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"]


The models are incorrectly using the summaryModel extensible predicate with kind: "value". In CodeQL Models-as-Data, kind: "value" indicates that the value (and thus any taint) is preserved from the input to the output, which is the opposite of a sanitiser's purpose. To model a sanitiser that blocks taint flow, you should use the sanitizerModel extensible predicate.

Additionally, the data format for sanitizerModel in Python is simpler, requiring only 3 columns: (package, member, kind). For these helpers, the kind should be "taint".

- addsTo: pack: codeql/python-all extensible: sanitizerModel data: - ["synthorg.observability.redaction", "Member[safe_error_description]", "taint"] - ["synthorg.observability", "Member[safe_error_description]", "taint"] - ["synthorg.observability.redaction", "Member[scrub_secret_tokens]", "taint"]

gemini-code-assist · 2026-05-09T11:55:00Z

+  - addsTo:
+      pack: codeql/go-all
+      extensible: summaryModel
+    data:
+      - ["github.com/Aureliolo/synthorg/cli/internal/config", "", "False", "SecurePath", "", "", "Argument[0]", "ReturnValue[0]", "value", "manual"]


Similar to the Python models, Go sanitisers should be modelled using the sanitizerModel extensible predicate rather than summaryModel. Using kind: "value" in a summary model explicitly tells CodeQL that the return value is tainted if the input is tainted.

The sanitizerModel for Go uses 9 columns (removing the Output column used in summaries).

- addsTo: pack: codeql/go-all extensible: sanitizerModel data: - ["github.com/Aureliolo/synthorg/cli/internal/config", "", "False", "SecurePath", "", "", "Argument[0]", "taint", "manual"]

gemini-code-assist · 2026-05-09T11:55:00Z

+  - addsTo:
+      pack: codeql/javascript-all
+      extensible: summaryModel
+    data:
+      - ["web/src/utils/logging", "Member[sanitizeForLog]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"]


JavaScript sanitisers should use the sanitizerModel extensible predicate. The summaryModel with kind: "value" indicates that taint is preserved. The sanitizerModel format for JavaScript requires 3 columns: (package, member, kind).

- addsTo: pack: codeql/javascript-all extensible: sanitizerModel data: - ["web/src/utils/logging", "Member[sanitizeForLog]", "taint"]

gemini-code-assist · 2026-05-09T11:55:00Z

+  - addsTo:
+      pack: codeql/python-all
+      extensible: summaryModel
+    data:
+      - ["scripts.check_currency_aggregation_invariant", "Member[_resolve_root]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"]
+      - ["scripts.check_dependency_inversion", "Member[_resolve_root]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"]
+      - ["scripts.check_domain_error_hierarchy", "Member[_resolve_root]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"]
+      - ["scripts.check_forbidden_literals", "Member[_resolve_root]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"]
+      - ["scripts.check_no_magic_numbers", "Member[_resolve_root]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"]
+      - ["scripts.check_no_migration_framing", "Member[_resolve_root]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"]
+      - ["scripts.check_no_review_origin_in_code", "Member[_resolve_root]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"]
+      - ["scripts.check_persistence_boundary", "Member[_resolve_root]", "", "", "", "", "Argument[0]", "ReturnValue", "value", "manual"]


The package names for these scripts (e.g., scripts.check_currency_aggregation_invariant) assume that the scripts/ directory is a Python package. If scripts/ does not contain an __init__.py file, CodeQL will likely treat these as top-level modules (e.g., just check_currency_aggregation_invariant). Please verify the module resolution in the CodeQL database; if they are not in a package, the first column should be empty or match the module name directly.

gemini-code-assist · 2026-05-09T11:55:00Z

+def negative_path_injection_relative_to(
+    user_input: str, project_root: Path
+) -> str | None:
+    """py/path-injection must NOT fire here.
+
+    Pattern: resolve, then assert containment via ``Path.relative_to`` in
+    a try/except block. The ValueError branch is the safe-exit.
+    """
+    candidate = (project_root / user_input).resolve()
+    try:
+        candidate.relative_to(project_root)
+    except ValueError:
+        return None
+    return candidate.read_text()


The function negative_path_injection_relative_to implements path containment logic inline but does not call the _resolve_root helper that is modelled in python-sanitisers.model.yml. As a result, this fixture does not actually validate that the custom CodeQL model for _resolve_root is working correctly.

To properly test the model, this fixture should import and call _resolve_root from one of the scripts listed in the model file.

codspeed-hq · 2026-05-09T11:55:03Z

Merging this PR will not alter performance

✅ 33 untouched benchmarks
⏩ 21 skipped benchmarks¹

_{Comparing chore/issue-1836-codeql-advanced (aae666b) with main (11132cd)}

21 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports. ↩

coderabbitai

Actionable comments posted: 7

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/codeql/codeql-config.yml:
- Around line 10-14: Add the missing CodeQL ignore entry for the intentionally
vulnerable fixtures by adding the pattern 'cli/internal/codeqlfixtures/**' to
the paths-ignore list in the CodeQL config (maintain the same YAML indentation
and quoting style as the existing entries so the paths-ignore key remains a YAML
sequence).

In @.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml:
- Around line 22-39: The summaryModel rows for Member[sanitizeForLog],
Member[sanitizeWsString], and Member[sanitizeWsEnum] use an invalid 10-field
tuple shape which breaks CodeQL; update each data row so it matches the required
5- or 6-field summaryModel shape (remove the extra empty-string entries and the
manual/Argument/ReturnValue/value tail so each row is a 5- or 6-element list),
ensure any 6th element is an integer origin when used, and keep the
module/member names (e.g., "web/src/utils/logging" Member[sanitizeForLog],
"web/src/utils/ws-sanitize" Member[sanitizeWsString], Member[sanitizeWsEnum])
unchanged; then run the provided Python validation snippet to confirm all
summaryModel rows now have valid 5/6-field shape.

In @.github/codeql/fixtures/javascript/negative.ts:
- Around line 7-8: The deep relative imports of sanitizeForLog,
sanitizeWsString, and sanitizeWsEnum from '../../../../web/src/utils/' are
fragile; update the imports to a stable reference by either co-locating the
fixture with the source utilities, publishing/using an internal package
(importing from that package name), or adding a module path alias (e.g., via
tsconfig/webpack) and changing the import statements to use the package or
alias; ensure sanitizeForLog, sanitizeWsString, and sanitizeWsEnum are resolved
from the new stable location and update any build/test config accordingly so
tests still run under the CodeQL fixtures constraint.

In @.github/codeql/fixtures/python/negative.py:
- Around line 14-18: The test fixture imports private helpers
_validate_image_tag and _validate_repo_prefix alongside ImageTag; to fix, either
make those validators public (rename _validate_image_tag and
_validate_repo_prefix to validate_image_tag and validate_repo_prefix in
scripts.check_image_signatures and update imports), or remove the private
imports and duplicate the needed validator logic inside the fixture, or move the
validator functions into a new public module and import them there; update all
references to use the new public names (e.g., validate_image_tag,
validate_repo_prefix) or the locally duplicated functions to avoid importing
private symbols.

In `@scripts/check_codeql_fixtures.py`:
- Line 96: The code currently defaults start_line = int(region.get("startLine",
0)) which can mask malformed SARIF; update the logic around region and startLine
to explicitly validate existence and type: check region contains "startLine" and
that it's convertible to int (referencing the region variable and start_line
assignment), and if missing/invalid either log a warning with context (e.g.,
which result/file) or raise an exception so parsing fails fast; ensure any
logging uses the existing logger in this module to provide useful diagnostic
info.

In `@scripts/run_codeql_pack_validation.sh`:
- Around line 81-83: The script currently copies javascript-negative.sarif to
javascript-positive.sarif producing identical results; replace that cp operation
with an actual run_fixture invocation for the positive JS fixtures (mirror how
Python was handled) so run_fixture produces
"$RESULTS_DIR/javascript-positive.sarif" instead of copying; update the code
around the existing run_fixture call and remove the cp of
"$RESULTS_DIR/javascript-negative.sarif" to ensure distinct analysis outputs for
the positive fixture.
- Around line 70-72: The script currently copies python-negative.sarif to
python-positive.sarif which yields identical results; instead re-run analysis
for the positive fixture (call the existing run_fixture invocation or the CodeQL
analysis step that produced python-negative.sarif) to generate a true
python-positive.sarif output, i.e. invoke run_fixture (or the analyze step that
takes the DB and source-root) with the positive fixture and write its output to
"$RESULTS_DIR/python-positive.sarif" rather than using cp; if the positive and
negative use the exact same source and database, add a comment or assertion
explaining why differing must_fire/must_not_fire expectations are valid for
identical SARIFs.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 73cb4f85-2e65-4fe9-8264-e31de2a0e067

📥 Commits

Reviewing files that changed from the base of the PR and between 4515c8e and d300a7e.

📒 Files selected for processing (21)

.github/codeql/codeql-config.yml
.github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml
.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml
.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml
.github/codeql/extensions/synthorg-sanitisers/qlpack.yml
.github/codeql/fixtures/expected.json
.github/codeql/fixtures/javascript/negative.ts
.github/codeql/fixtures/javascript/positive.ts
.github/codeql/fixtures/python/negative.py
.github/codeql/fixtures/python/positive.py
.github/workflows/codeql-pack-validate.yml
.github/workflows/codeql.yml
CLAUDE.md
cli/internal/codeqlfixtures/fixture.go
data/runtime_stats.yaml
docs/reference/convention-gates.md
docs/reference/sec-prompt-safety.md
pyproject.toml
scripts/check_codeql_fixtures.py
scripts/convention_gate_map.yaml
scripts/run_codeql_pack_validation.sh

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)

GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
GitHub Check: Build Backend
GitHub Check: Build Web Assets (melange)
GitHub Check: CLI Test (windows-latest)
GitHub Check: CLI Bench Regression
GitHub Check: CodSpeed Python benchmarks
GitHub Check: Lighthouse Site
GitHub Check: Dashboard Test
GitHub Check: Test (Python 3.14)
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (7)

**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Read the relevant design specification from docs/design/ before implementing; deviations require approval

Files:

CLAUDE.md
docs/reference/convention-gates.md
docs/reference/sec-prompt-safety.md

**/*

📄 CodeRabbit inference engine (CLAUDE.md)

Apply Regional Defaults: no region/currency/locale privileged; use metric units; use British English

Files:

CLAUDE.md
docs/reference/convention-gates.md
pyproject.toml
scripts/run_codeql_pack_validation.sh
cli/internal/codeqlfixtures/fixture.go
scripts/convention_gate_map.yaml
data/runtime_stats.yaml
docs/reference/sec-prompt-safety.md
scripts/check_codeql_fixtures.py

**/*.{d2,md}

📄 CodeRabbit inference engine (CLAUDE.md)

Diagrams: use d2 for architecture/nested containers (theme 200 Dark Mauve, CI pinned to v0.7.1), mermaid for flowcharts/sequence/pipelines. Use Markdown tables for tabular data

Files:

CLAUDE.md
docs/reference/convention-gates.md
docs/reference/sec-prompt-safety.md

{.github/codeql/extensions/synthorg-sanitisers/*.model.yml,scripts/check_codeql_fixtures.py,cli/internal/codeqlfixtures/**}

📄 CodeRabbit inference engine (CLAUDE.md)

Any change to a project sanitiser requires a matching row in .github/codeql/extensions/synthorg-sanitisers/*.model.yml and a fixture pair under .github/codeql/fixtures/ or cli/internal/codeqlfixtures/. Enforced by .github/workflows/codeql-pack-validate.yml + scripts/check_codeql_fixtures.py

Files:

.github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml
.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml
cli/internal/codeqlfixtures/fixture.go
.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml
scripts/check_codeql_fixtures.py

{README.md,docs/**/*.md}

📄 CodeRabbit inference engine (CLAUDE.md)

Numeric claims in README and public docs must be sourced from data/runtime_stats.yaml via  markers

Files:

docs/reference/convention-gates.md
docs/reference/sec-prompt-safety.md

cli/**/*.go

📄 CodeRabbit inference engine (cli/CLAUDE.md)

Use hardcoded localhost, postgres:5432, and nats:4222 literals in CLI code by design (correct by security/audit rationale; do not make these configurable). See docs/reference/cli-env-vars.md for audit justification

Files:

cli/internal/codeqlfixtures/fixture.go

cli/**

📄 CodeRabbit inference engine (CLAUDE.md)

CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API. Use go -C cli, never cd cli. See cli/CLAUDE.md

Files:

cli/internal/codeqlfixtures/fixture.go

🧠 Learnings (2)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T11:53:18.276Z
Learning: Present every plan for accept/deny before coding

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T11:53:18.276Z
Learning: Every convention PR ships its enforcement gate; details in `docs/reference/convention-gates.md`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T11:53:18.276Z
Learning: After issue completion: branch + commit + push (no auto-PR); use `/pre-pr-review`. After PR: use `/aurelio-review-pr` for external feedback. Fix everything valid; never defer

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T11:53:18.276Z
Learning: Git commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T11:53:18.276Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T11:53:18.276Z
Learning: Branches: `<type>/<slug>` from main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T11:53:18.276Z
Learning: Squash merge PRs. PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T11:53:18.276Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T11:53:18.276Z
Learning: Finished issue: branch + commit + push (no auto-PR). Use `/pre-pr-review`. After every squash merge: `/post-merge-cleanup`. Always read `docs/design/` before implementing

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T11:53:18.276Z
Learning: PR review: fix everything valid; never skip as out-of-scope

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T11:53:18.276Z
Learning: Framework for synthetic organisations (autonomous AI agents orchestrated as a virtual company). Python 3.14+. Layout: `src/synthorg/` (src layout), `tests/`, `web/` (React 19), `cli/` (Go binary). BUSL-1.1 → Apache 2.0 after Change Date

📚 Learning: 2026-05-05T09:04:46.195Z

Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

scripts/check_codeql_fixtures.py

🪛 GitHub Actions: CodeQL / 0_Analyze (javascript-typescript).txt

.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml

[error] 22-124: CodeQL extension validation failed for codeql/javascript-all:summaryModel. Row 1 has invalid data: found '"web/src/uti...", "Member[sani...", ... "ReturnValue", "value", "manual"' which does not match signature 'summaryModel(string type, string path, string input, string output, string kind, [int origin])'. ( /opt/hostedtoolcache/.../ApiGraphModelsExtensions.qll:51,1-53,3 )

[error] 30-130: CodeQL extension validation failed for codeql/javascript-all:summaryModel. Row 1 has invalid data: found '"web/src/uti...", "Member[sani...", ... "ReturnValue", "value", "manual"' which does not match signature 'summaryModel(string type, string path, string input, string output, string kind, [int origin])'. ( /opt/hostedtoolcache/.../ApiGraphModelsExtensions.qll:51,1-53,3 )

[error] 39-128: CodeQL extension validation failed for codeql/javascript-all:summaryModel. Row 1 has invalid data: found '"web/src/uti...", "Member[sani...", ... "ReturnValue", "value", "manual"' which does not match signature 'summaryModel(string type, string path, string input, string output, string kind, [int origin])'. ( /opt/hostedtoolcache/.../ApiGraphModelsExtensions.qll:51,1-53,3 )

🪛 GitHub Actions: CodeQL / Analyze (javascript-typescript)

.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml

[error] 22-22: In extension for codeql/javascript-all:summaryModel, row 1 is invalid. Found entries that do not match signature 'summaryModel(string type, string path, string input, string output, string kind, [int origin])'.

[error] 30-30: In extension for codeql/javascript-all:summaryModel, row 1 is invalid. Found entries that do not match signature 'summaryModel(string type, string path, string input, string output, string kind, [int origin])'.

[error] 39-39: In extension for codeql/javascript-all:summaryModel, row 1 is invalid. Found entries that do not match signature 'summaryModel(string type, string path, string input, string output, string kind, [int origin])'.

🪛 GitHub Actions: CodeQL / Analyze (typescript)

.github/codeql/codeql-config.yml

[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v

.github/codeql/fixtures/expected.json

[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v

.github/codeql/extensions/synthorg-sanitisers/qlpack.yml

[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v

.github/codeql/fixtures/javascript/positive.ts

[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v

.github/codeql/fixtures/javascript/negative.ts

[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v

.github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml

[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v

.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml

[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v

.github/workflows/codeql-pack-validate.yml

[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v

scripts/run_codeql_pack_validation.sh

[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v

cli/internal/codeqlfixtures/fixture.go

[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v

.github/workflows/codeql.yml

[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v

.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml

[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v

.github/codeql/fixtures/python/negative.py

[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v

.github/codeql/fixtures/python/positive.py

[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v

scripts/check_codeql_fixtures.py

[error] 1-1: Command failed fatally during CodeQL resolve extensions-by-pack (error code 2). Analysis step failed while running: codeql database run-queries --ram=14575 --threads=4 --expect-discarded-cache --additional-packs=/home/runner/work/_temp/pr-diff-range --extension-packs=codeql-action/pr-diff-range /home/runner/work/_temp/codeql_databases/javascript --min-disk-free=1024 -v

🪛 GitHub Actions: CodeQL Pack Validate / 0_Validate sanitiser pack.txt

scripts/run_codeql_pack_validation.sh

[error] 1-1: CodeQL validation failed: "codeql CLI not on PATH". Install via codeql-action bundle (see script header).

🪛 GitHub Actions: CodeQL Pack Validate / Validate sanitiser pack

scripts/run_codeql_pack_validation.sh

[error] 1-1: CodeQL CLI not on PATH. Install via codeql-action bundle (see script header).

🪛 LanguageTool

CLAUDE.md

[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...a-flow rule) requires a matching row in `.github/codeql/extensions/synthorg-sanitisers/*...

(GITHUB)

[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...s/*.model.ymland a fixture pair under.github/codeql/fixtures/(orcli/internal/cod...

(GITHUB)

[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...internal/codeqlfixtures/). Enforced by .github/workflows/codeql-pack-validate.yml+...

(GITHUB)

docs/reference/sec-prompt-safety.md

[uncategorized] ~151-~151: The official name of this software platform is spelled with a capital “H”.
Context: ...CodeQL Models-as-Data extension pack at .github/codeql/extensions/synthorg-sanitisers/...

(GITHUB)

[uncategorized] ~151-~151: The official name of this software platform is spelled with a capital “H”.
Context: ... to fire. The pack is provably precise: .github/workflows/codeql-pack-validate.yml run...

(GITHUB)

[uncategorized] ~166-~166: The official name of this software platform is spelled with a capital “H”.
Context: ...ixture pair (negative + positive) under .github/codeql/fixtures/ (or `cli/internal/cod...

(GITHUB)

[uncategorized] ~174-~174: The official name of this software platform is spelled with a capital “H”.
Context: ... paths-ignore: ['tests/**'] clause in .github/codeql/codeql-config.yml. Every histor...

(GITHUB)

🪛 Shellcheck (0.11.0)

scripts/run_codeql_pack_validation.sh

[info] 36-36: The mentioned syntax error was in this brace group.

(SC1009)

[error] 46-46: Couldn't parse this shellcheck directive. Fix to allow more checks.

(SC1073)

[error] 46-46: Expected '=' after directive key. Fix any mentioned problems and try again.

(SC1072)

🔇 Additional comments (25)

docs/reference/sec-prompt-safety.md (1)

149-180: LGTM! Clear barrier inventory and validation guarantees.

The new CodeQL barrier inventory section clearly documents the sanitiser-to-rule mappings and the validation mechanism. The cross-references to the fixture validation workflow and the note about three-part changes (helper + model + fixture) align with the coding guideline for sanitiser changes.

Note on static analysis: The LanguageTool hints flagging .github/... paths as needing capitalization are false positives—these are literal file paths in inline code, not references to the GitHub platform.

.github/codeql/fixtures/python/negative.py (1)

39-52: Fixture pattern matches documented Path.relative_to containment check.

The negative path-injection fixture correctly implements the containment pattern: resolve the candidate, then assert it's within project_root via relative_to() in a try/except block, returning None on escape. This aligns with the barrier inventory in sec-prompt-safety.md.

pyproject.toml (1)

270-275: LGTM! Fixture-specific lint ignores are appropriate.

The new per-file ignores for CodeQL fixture files are well-justified:

TRY301: Deliberate raise patterns for exercising data-flow

EM101: String-literal exceptions are part of the fixture test shape

S310: urllib calls are intentional data-flow sinks for SSRF testing

D: Fixture files are not public API surface

.github/codeql/fixtures/python/positive.py (3)

16-24: LGTM! Deliberate cleartext logging vulnerability.

The positive fixture correctly demonstrates the anti-pattern: using str(exc) directly in the logger's extra dict. This should trigger py/clear-text-logging-sensitive-data even with the sanitiser pack loaded, proving the models don't over-suppress.

27-33: LGTM! Deliberate path injection vulnerability.

The positive fixture passes user input directly to Path().read_text() with no containment validation, which should trigger py/path-injection. This validates that the Path.relative_to model in the negative fixture isn't over-broad.

36-45: LGTM! Deliberate partial SSRF vulnerability.

The positive fixture constructs a URL with unvalidated user input (repo, tag) and passes it to urlopen, which should trigger py/partial-ssrf. This ensures the _validate_repo_prefix / _validate_image_tag models only suppress when the validators are called.
.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml (2)
44-57: LGTM! SSRF validator models are well-scoped.

The _validate_repo_prefix and _validate_image_tag models correctly target scripts.check_image_signatures and document the regex anchoring + SystemExit(2) enforcement. The note about urllib.parse.quote() being belt-and-braces is helpful context.

25-42: ⚡ Quick win

Remove check_domain_error_hierarchy from the model: it does not define _resolve_root.

The model lists 8 modules, but only 7 scripts in scripts/ actually define _resolve_root: check_currency_aggregation_invariant, check_dependency_inversion, check_forbidden_literals, check_no_magic_numbers, check_no_migration_framing, check_no_review_origin_in_code, and check_persistence_boundary. The file check_domain_error_hierarchy.py exists but does not contain _resolve_root and should be removed from the model data block. Additionally, the comment claiming _resolve_root is "duplicated across 12 scripts" is inaccurate; it exists in only 7 scripts.
			> Likely an incorrect or invalid review comment.
docs/reference/convention-gates.md (1)

15-15: LGTM! New gate correctly inserted in alphabetical order.

The check_codeql_fixtures.py entry is correctly positioned alphabetically in the convention gate inventory, and the total count on line 51 has been updated to 38 with the proper RS marker.
.github/codeql/fixtures/javascript/positive.ts (1)
1-16: ⚡ Quick win

js-sanitisers.model.yml exists and is properly configured.

The file is declared in .github/codeql/extensions/synthorg-sanitisers/qlpack.yml under dataExtensions and contains valid JavaScript/TypeScript sanitizer models for sanitizeForLog, sanitizeWsString, and sanitizeWsEnum. The pack configuration is complete with all extension targets (codeql/python-all, codeql/go-all, codeql/javascript-all) correctly specified, and .github/codeql/codeql-config.yml properly references the pack. No configuration issues were found.
			> Likely an incorrect or invalid review comment.
.github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml (1)
23-33: ⚡ Quick win

Incorrect premise: cli/cmd is package cmd, not package main, and CodeQL can model it.

The package declaration in cli/cmd is package cmd—a regular importable Go package, not package main. CodeQL's Models-as-Data can model non-main packages without issue, so the concern about unresolvable package paths does not apply here.

However, note that the model entry for safeStateDir currently lacks test fixtures in cli/internal/codeqlfixtures/fixture.go. Per the coding guidelines, any sanitiser model should have corresponding positive and negative fixtures to validate it works as expected. The existing fixture covers config.SecurePath directly but not the safeStateDir wrapper.
			> Likely an incorrect or invalid review comment.
data/runtime_stats.yaml (1)

46-47: LGTM: convention gate count update is consistent.

The numeric bump to 38 is coherent for this metadata file.

scripts/convention_gate_map.yaml (1)

78-82: LGTM: new mandatory gate is correctly registered.

The new rule entry is well-formed and wired to an explicit enforcement script.

CLAUDE.md (1)

19-19: LGTM: policy text and gate count update are coherent.

The new mandatory CodeQL sanitiser requirement is clearly documented and the gate inventory count is synchronised.

Also applies to: 46-46

.github/workflows/codeql-pack-validate.yml (1)

1-73: LGTM: validation workflow is well-scoped and hardened.

Permissions, trigger paths, SHA pinning, and failure-only SARIF artifact upload all look solid.

.github/workflows/codeql.yml (1)

1-83: LGTM: advanced setup workflow is structured correctly.

The job-level permissions, concurrency policy, pinned action SHAs, and scheduling setup are all in good shape.

.github/codeql/fixtures/expected.json (1)

1-59: LGTM: fixture expectation manifest is clear and complete.

The positive/negative split and Go function-scoped expectations are well specified for deterministic validation.

scripts/run_codeql_pack_validation.sh (1)

46-46: Shellcheck false positive on disable directive.

The shellcheck errors (SC1073, SC1072) flagging this line appear to be a shellcheck version or parser issue. The syntax # shellcheck disable=SC2086 -- <comment> is the standard format for shellcheck disable directives and is correct.

.github/codeql/fixtures/javascript/negative.ts (1)

13-28: Fixture logic looks correct.

The three exported functions correctly exercise the sanitizer barriers:

sanitizeForLog wrapping unknown input

sanitizeWsString clamping WebSocket payloads

sanitizeWsEnum validating against an allowlist

Each should prevent js/log-injection from firing when the synthorg-sanitisers pack is loaded.

cli/internal/codeqlfixtures/fixture.go (2)

26-39: Path injection fixture pair is well-designed.

The negative case properly demonstrates the sanitizer idiom (config.SecurePath validation before os.ReadFile), while the positive case shows the vulnerable pattern (direct user input to file operation). This clear contrast should correctly validate the Models-as-Data barrier definition.

54-63: Log injection fixture pair correctly models safe vs. unsafe logging.

The negative case uses only static labels and non-string scalars (int, bool), while the positive case directly logs a user-controlled string. This properly tests that CodeQL's go/log-injection rule distinguishes between safe structured logging and tainted string interpolation.

scripts/check_codeql_fixtures.py (3)

113-124: Scan path matching logic is correct.

The prefix matching correctly handles:

Empty scan_paths → match all alerts

Exact path match

Prefix match with / separator to avoid false substring matches

This will correctly filter alerts to the intended fixture scope.

127-170: Fixture validation logic is comprehensive and correct.

The checks properly validate:

must_not_fire: reports violations with file:line context

must_fire: reports missing expected alerts

must_not_fire_at / must_fire_at: function-scoped variants

The failure messages clearly distinguish under-modeling (unexpected fires) from over-suppression (missing fires).

181-238: Main orchestration logic is well-structured.

The function:

Validates inputs before processing

Uses appropriate exit codes for different failure modes (USAGE_EXIT_CODE vs FAILURE_EXIT_CODE)

Accumulates failures across all fixtures before reporting

Provides clear pass/fail output

.github/codeql/extensions/synthorg-sanitisers/qlpack.yml (1)

1-11: ⚡ Quick win

The CodeQL pack manifest syntax and file structure are valid.

All referenced files exist and the pack configuration follows the correct syntax for CodeQL model packs. The extensionTargets using wildcard "*" is valid and idiomatic, and codeql/javascript-all correctly covers TypeScript analysis.

The pipeline error "resolve extensions-by-pack (error code 2)" is therefore unlikely to be caused by the manifest structure itself. Investigate the content and syntax of the referenced model files (python-sanitisers.model.yml, go-sanitisers.model.yml, js-sanitisers.model.yml) instead, as they may contain invalid definitions or syntax errors.

coderabbitai · 2026-05-09T12:00:20Z

+import { sanitizeForLog } from '../../../../web/src/utils/logging'
+import { sanitizeWsString, sanitizeWsEnum } from '../../../../web/src/utils/ws-sanitize'


🧹 Nitpick | 🔵 Trivial | ⚖️ Poor tradeoff

Deep relative imports create fragile coupling.

The ../../../../web/src/utils/ imports are brittle—moving either the fixture or the source files will break the tests. Consider whether the fixtures could be co-located with the source code, or whether the source utilities could be published as an internal package that can be imported by path.

That said, this may be a necessary trade-off given the CodeQL pack validation requirements for fixture placement under .github/codeql/fixtures/.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/codeql/fixtures/javascript/negative.ts around lines 7 - 8, The deep relative imports of sanitizeForLog, sanitizeWsString, and sanitizeWsEnum from '../../../../web/src/utils/' are fragile; update the imports to a stable reference by either co-locating the fixture with the source utilities, publishing/using an internal package (importing from that package name), or adding a module path alias (e.g., via tsconfig/webpack) and changing the import statements to use the package or alias; ensure sanitizeForLog, sanitizeWsString, and sanitizeWsEnum are resolved from the new stable location and update any build/test config accordingly so tests still run under the CodeQL fixtures constraint.

coderabbitai · 2026-05-09T12:00:20Z

+from scripts.check_image_signatures import (
+    ImageTag,
+    _validate_image_tag,
+    _validate_repo_prefix,
+)


🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Description: Check if scripts.check_image_signatures exports the required symbols # Check if scripts/ is a package (has __init__.py) fd __init__.py scripts/ # Check if the functions exist in check_image_signatures.py rg -n "^(def _validate_image_tag|def _validate_repo_prefix|class ImageTag)" scripts/check_image_signatures.py # Try importing in Python to verify module resolution python3 -c "from scripts.check_image_signatures import ImageTag, _validate_image_tag, _validate_repo_prefix; print('OK')"

Repository: Aureliolo/synthorg

Length of output: 193

The imports are valid and work correctly, but consider the style implications of importing private helpers.

The fixture successfully imports ImageTag, _validate_image_tag, and _validate_repo_prefix from scripts.check_image_signatures. While the imports function correctly (despite scripts/ lacking an __init__.py), importing functions prefixed with _ violates convention as they are intended as private implementation details. Consider either:

Making the validators public by removing the _ prefix (if they are legitimately part of the API)

Duplicating the validator logic within the fixture

Moving validators to a public module and importing from there

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/codeql/fixtures/python/negative.py around lines 14 - 18, The test fixture imports private helpers _validate_image_tag and _validate_repo_prefix alongside ImageTag; to fix, either make those validators public (rename _validate_image_tag and _validate_repo_prefix to validate_image_tag and validate_repo_prefix in scripts.check_image_signatures and update imports), or remove the private imports and duplicate the needed validator logic inside the fixture, or move the validator functions into a new public module and import them there; update all references to use the new public names (e.g., validate_image_tag, validate_repo_prefix) or the locally duplicated functions to avoid importing private symbols.

coderabbitai · 2026-05-09T12:00:20Z

+                artifact = physical.get("artifactLocation", {})
+                region = physical.get("region", {})
+                file_uri = artifact.get("uri") or ""
+                start_line = int(region.get("startLine", 0))


🧹 Nitpick | 🔵 Trivial | 💤 Low value

Consider validating startLine is present.

Defaulting to 0 when startLine is missing may hide SARIF parsing issues. Consider logging a warning or failing if this critical field is absent:

🔍 Proposed defensive check

- start_line = int(region.get("startLine", 0)) + start_line_raw = region.get("startLine") + if start_line_raw is None: + print( + f"warning: missing startLine in SARIF result for {rule_id} in {file_uri}", + file=sys.stderr + ) + start_line = 0 + else: + start_line = int(start_line_raw)

That said, CodeQL SARIF output is generally well-formed, so this may be overly defensive.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/check_codeql_fixtures.py` at line 96, The code currently defaults start_line = int(region.get("startLine", 0)) which can mask malformed SARIF; update the logic around region and startLine to explicitly validate existence and type: check region contains "startLine" and that it's convertible to int (referencing the region variable and start_line assignment), and if missing/invalid either log a warning with context (e.g., which result/file) or raise an exception so parsing fails fast; ensure any logging uses the existing logger in this module to provide useful diagnostic info.

codecov · 2026-05-09T12:02:17Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.79%. Comparing base (11132cd) to head (aae666b).
✅ All tests successful. No failed tests found.

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1839      +/-   ##
==========================================
- Coverage   84.79%   84.79%   -0.01%     
==========================================
  Files        1800     1800              
  Lines      104855   104855              
  Branches     9200     9200              
==========================================
- Hits        88910    88907       -3     
- Misses      13712    13714       +2     
- Partials     2233     2234       +1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml:
- Around line 27-43: The comment and model are inconsistent: update the
inventory comment to reflect the actual number of scripts that define
_resolve_root (7, not 12) and remove the incorrect model entry for
"scripts.check_domain_error_hierarchy" from the data list (or replace it with
the correct module that actually defines _resolve_root); ensure the YAML entries
only reference real members like "Member[_resolve_root]" for modules that
contain that function so the model and comment match the true set of scripts.

In `@scripts/run_codeql_pack_validation.sh`:
- Line 60: The ShellCheck directive on the line containing "# shellcheck
disable=SC2086 -- extra_args is a deliberately-split arg list" is malformed;
remove the inline "--" comment and separate the directive from the explanatory
comment so ShellCheck recognizes it. Replace that single invalid line with a
directive-only line "# shellcheck disable=SC2086" followed by a plain comment
line like "# extra_args is a deliberately-split arg list" so the directive
applies to SC2086 while preserving the explanation for maintainers.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: bc1e09e8-8d85-488d-be14-b0232494717f

📥 Commits

Reviewing files that changed from the base of the PR and between d300a7e and 3049fad.

📒 Files selected for processing (6)

.github/codeql/codeql-config.yml
.github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml
.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml
.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml
.github/codeql/fixtures/python/negative.py
scripts/run_codeql_pack_validation.sh

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (18)

GitHub Check: Build Backend
GitHub Check: Deploy Preview
GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
GitHub Check: CodSpeed Python benchmarks
GitHub Check: Lighthouse Site
GitHub Check: CLI Test (windows-latest)
GitHub Check: CLI Bench Regression
GitHub Check: Dashboard Build
GitHub Check: Dashboard Storybook Build
GitHub Check: Build Web Assets (melange)
GitHub Check: Dashboard Test
GitHub Check: Test (Python 3.14)
GitHub Check: Type Check
GitHub Check: Lint
GitHub Check: Analyze (python)
GitHub Check: Validate sanitiser pack
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (1)

**/*.sh

📄 CodeRabbit inference engine (CLAUDE.md)

Shell scripting: see ~/.claude/rules/common/bash.md (canonical for cd / git -C / Bash file-write rules)

Files:

scripts/run_codeql_pack_validation.sh

🧠 Learnings (1)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T13:11:29.851Z
Learning: Read design specification from `docs/design/` page before implementing; deviations require approval per DESIGN_SPEC.md

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T13:11:29.851Z
Learning: Present every plan for acceptance/denial before coding (MANDATORY planning requirement)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T13:11:29.851Z
Learning: No region/currency/locale privileged in code; use metric units; use British English

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T13:11:29.851Z
Learning: Every convention PR must ship its enforcement gate per docs/reference/convention-gates.md

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T13:11:29.851Z
Learning: Timeout/slow test failures indicate source-code regression; never edit `tests/baselines/unit_timing.json`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T13:11:29.851Z
Learning: Post-Implementation workflow (MANDATORY): branch + commit + push (no auto-PR); use `/pre-pr-review` before PR creation; use `/aurelio-review-pr` for external feedback; fix EVERYTHING valid; no deferring

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T13:11:29.851Z
Learning: xdist: use `-n 8 --dist=loadfile` (prevents 3.14+Windows ProactorEventLoop leak); auto-applied via pyproject addopts

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T13:11:29.851Z
Learning: Git commits: use `<type>: <description>` format (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T13:11:29.851Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T13:11:29.851Z
Learning: Branch naming: use `<type>/<slug>` format from main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T13:11:29.851Z
Learning: Pre-commit/pre-push hooks configured in `.pre-commit-config.yaml`; Hookify rules in `.claude/hookify.*.md`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T13:11:29.851Z
Learning: Squash merge with PR body as squash commit message; trailers (`Release-As`, `Closes `#N``) must be in PR body

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T13:11:29.851Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues` tool

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T13:11:29.851Z
Learning: Web implementation: see `web/CLAUDE.md` for React 19 dashboard guidelines

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T13:11:29.851Z
Learning: Python 3.14+ required for synthorg framework; BUSL-1.1 → Apache 2.0 after Change Date

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T13:11:29.851Z
Learning: Project layout: src layout with `src/synthorg/` (Python), `tests/`, `web/` (React 19), `cli/` (Go binary)

🪛 Shellcheck (0.11.0)

scripts/run_codeql_pack_validation.sh

[info] 50-50: The mentioned syntax error was in this brace group.

(SC1009)

[error] 60-60: Couldn't parse this shellcheck directive. Fix to allow more checks.

(SC1073)

[error] 60-60: Expected '=' after directive key. Fix any mentioned problems and try again.

(SC1072)

🔇 Additional comments (4)

.github/codeql/codeql-config.yml (1)

1-14: Configuration looks correct and complete.

This matches the advanced CodeQL setup goals, and the fixture ignore patterns (including cli/internal/codeqlfixtures/**) correctly prevent intentional fixture findings from polluting main scan results.

.github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml (1)

14-35: Go sanitiser modelling block looks consistent.

The two sanitizerModel entries are clear and scoped/documented well for the intended path-safety flow.

.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml (1)

19-40: JS/TS sanitiser rows look correctly structured and focused.

This keeps the model surface narrow to the documented leaf sanitisers.

.github/codeql/fixtures/python/negative.py (1)

26-69: Negative fixture cases are well-formed and map cleanly to the modelled barriers.

The three functions are explicit and suitable for must_not_fire validation.

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/codeql/fixtures/javascript/negative.ts:
- Around line 24-28: The test fixture negativeWsEnumLogInjection returns the
result of sanitizeWsEnum(…) but never passes it into a js/log-injection sink, so
CodeQL cannot validate the sanitizer; change negativeWsEnumLogInjection to send
the sanitized value into a recognized log-injection sink (e.g., call the
project’s logging function or a test sink that CodeQL recognizes) instead of
only returning it, and when you modify the sanitizer model also add the required
CodeQL binding and matching fixture pair under .github/codeql/fixtures (or
cli/internal/codeqlfixtures/) so sanitizeWsEnum and the sink invocation are both
covered.

In @.github/workflows/codeql.yml:
- Around line 34-40: The language matrix under the "language" key currently
lists duplicate analyzers ("javascript", "typescript", and
"javascript-typescript"); remove the redundant "javascript" and "typescript"
entries and keep only "javascript-typescript" so the CodeQL job analyzes JS/TS
code once (update the language array where it lists these three values).
- Around line 69-74: Remove the workflow-level "queries" parameter from the
CodeQL init step (the uses: github/codeql-action/init block) so the config-file
(.github/codeql/codeql-config.yml) can control which queries run; specifically
delete the line setting queries: security-extended in that init step to allow
your custom queries (synthorg-cleartext-logging, synthorg-partial-ssrf) and
language-specific packs from the config file to be applied.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7d618099-e652-47f9-9f4c-d12e34031cac

📥 Commits

Reviewing files that changed from the base of the PR and between 3049fad and ee744f1.

📒 Files selected for processing (27)

.github/codeql/codeql-config.yml
.github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml
.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml
.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml
.github/codeql/extensions/synthorg-sanitisers/qlpack.yml
.github/codeql/fixtures/expected.json
.github/codeql/fixtures/javascript/negative.ts
.github/codeql/fixtures/javascript/positive.ts
.github/codeql/fixtures/python/negative.py
.github/codeql/fixtures/python/positive.py
.github/codeql/queries/synthorg-cleartext-logging/CleartextLoggingWithBarriers.ql
.github/codeql/queries/synthorg-cleartext-logging/Sanitizers.qll
.github/codeql/queries/synthorg-cleartext-logging/qlpack.yml
.github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql
.github/codeql/queries/synthorg-partial-ssrf/Sanitizers.qll
.github/codeql/queries/synthorg-partial-ssrf/qlpack.yml
.github/workflows/codeql-pack-validate.yml
.github/workflows/codeql.yml
CLAUDE.md
cli/internal/codeqlfixtures/fixture.go
data/runtime_stats.yaml
docs/reference/convention-gates.md
docs/reference/sec-prompt-safety.md
pyproject.toml
scripts/check_codeql_fixtures.py
scripts/convention_gate_map.yaml
scripts/run_codeql_pack_validation.sh

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)

GitHub Check: Build Backend
GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
GitHub Check: Build Web Assets (melange)
GitHub Check: CLI Test (windows-latest)
GitHub Check: CLI Bench Regression
GitHub Check: CodSpeed Python benchmarks
GitHub Check: Dashboard Test
GitHub Check: Test (Python 3.14)
GitHub Check: Lighthouse Site
GitHub Check: Analyze (python)
GitHub Check: Validate sanitiser pack
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (5)

{README.md,docs/**/*.md}

📄 CodeRabbit inference engine (CLAUDE.md)

Numerics in README and public docs must be sourced from data/runtime_stats.yaml via  markers

Files:

docs/reference/convention-gates.md
docs/reference/sec-prompt-safety.md

docs/**/*.{md,d2}

📄 CodeRabbit inference engine (CLAUDE.md)

Use d2 for architecture/nested container diagrams; use mermaid for flowcharts/sequence/pipelines; use markdown tables for tabular data; D2 theme 200 (Dark Mauve); CI pinned to v0.7.1

Files:

docs/reference/convention-gates.md
docs/reference/sec-prompt-safety.md

pyproject.toml

📄 CodeRabbit inference engine (CLAUDE.md)

Apply xdist -n 8 --dist=loadfile via pyproject addopts (loadfile prevents Python 3.14+ Windows ProactorEventLoop leak)

Files:

pyproject.toml

{src/synthorg/**/*.py,.github/codeql/**,cli/internal/codeqlfixtures/**}

📄 CodeRabbit inference engine (CLAUDE.md)

Any change to a project sanitiser requires a matching CodeQL binding (barrierModel in .github/codeql/extensions/synthorg-sanitisers/*.model.yml or Sanitizer subclass in .github/codeql/queries/synthorg-<rule>/Sanitizers.qll) plus a fixture pair under .github/codeql/fixtures/ or cli/internal/codeqlfixtures/

Files:

.github/codeql/fixtures/javascript/positive.ts
.github/codeql/extensions/synthorg-sanitisers/qlpack.yml
.github/codeql/queries/synthorg-cleartext-logging/qlpack.yml
.github/codeql/queries/synthorg-partial-ssrf/qlpack.yml
.github/codeql/queries/synthorg-partial-ssrf/Sanitizers.qll
.github/codeql/queries/synthorg-cleartext-logging/Sanitizers.qll
.github/codeql/codeql-config.yml
.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml
.github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml
.github/codeql/fixtures/python/positive.py
.github/codeql/queries/synthorg-cleartext-logging/CleartextLoggingWithBarriers.ql
.github/codeql/fixtures/javascript/negative.ts
.github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql
.github/codeql/fixtures/python/negative.py
.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml
cli/internal/codeqlfixtures/fixture.go
.github/codeql/fixtures/expected.json

cli/**/*.go

📄 CodeRabbit inference engine (cli/CLAUDE.md)

Use hardcoded localhost, postgres:5432, and nats:4222 literals in CLI code by design (correct by security/audit rationale; do not make these configurable). See docs/reference/cli-env-vars.md for audit justification

Files:

cli/internal/codeqlfixtures/fixture.go

🧠 Learnings (2)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:11:42.550Z
Learning: Read design specification in `docs/design/` before implementing; deviations require approval

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:11:42.550Z
Learning: Present every plan for accept/deny before coding

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:11:42.550Z
Learning: No region/currency/locale should be privileged; use metric units and British English

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:11:42.550Z
Learning: Every convention PR must ship its enforcement gate

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:11:42.550Z
Learning: Timeout/slow test failures indicate source-code regression; never edit `tests/baselines/unit_timing.json`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:11:42.550Z
Learning: After issue: create branch + commit + push (no auto-PR); use `/pre-pr-review`; after PR: use `/aurelio-review-pr` for external feedback; fix everything valid and do not defer

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:11:42.550Z
Learning: Commits use format `<type>: <description>` where type is {feat,fix,refactor,docs,test,chore,perf,ci}; enforced by commitizen

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:11:42.550Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:11:42.550Z
Learning: Branch naming: `<type>/<slug>` from main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:11:42.550Z
Learning: Use squash merge with PR body becoming squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:11:42.550Z
Learning: Use `gh issue list` via Bash for GitHub queries, NOT MCP `list_issues`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:11:42.550Z
Learning: Finished issue: branch + commit + push (no auto-PR); use `/pre-pr-review`; after squash merge use `/post-merge-cleanup`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:11:42.550Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:11:42.550Z
Learning: Use `go -C cli` never `cd cli` for CLI operations

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:11:42.550Z
Learning: See `web/CLAUDE.md` for web framework guidelines and `cli/CLAUDE.md` for CLI guidelines

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:11:42.550Z
Learning: See `~/.claude/rules/common/bash.md` for canonical Bash rules (`cd` / `git -C` / file-write rules)

📚 Learning: 2026-05-05T09:04:46.195Z

Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

scripts/check_codeql_fixtures.py

🪛 LanguageTool

docs/reference/sec-prompt-safety.md

[uncategorized] ~151-~151: The official name of this software platform is spelled with a capital “H”.
Context: ...aces. Each binding is provably precise: .github/workflows/codeql-pack-validate.yml run...

(GITHUB)

[uncategorized] ~153-~153: The official name of this software platform is spelled with a capital “H”.
Context: ...e 1 -- Models-as-Data extension pack** (.github/codeql/extensions/synthorg-sanitisers/...

(GITHUB)

[uncategorized] ~155-~155: The official name of this software platform is spelled with a capital “H”.
Context: ...Surface 2 -- custom QL query packs (.github/codeql/queries/synthorg-*/). For queri...

(GITHUB)

[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...nthorg-*/). For queries whose abstract Sanitizerclass has nobarrierNode` MaD hook up...

(EN_WORD_COHERENCY)

[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...aD hook upstream, the pack defines a QL Sanitizer subclass and ships a custom query (re-...

(EN_WORD_COHERENCY)

[uncategorized] ~170-~170: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...-sanitisers.model.yml(Surface 1) or aSanitizersubclass under.github/codeql/queries...

(EN_WORD_COHERENCY)

[uncategorized] ~170-~170: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...face 1) or a Sanitizer subclass under .github/codeql/queries/synthorg-<rule>/Sanitizers.qll (Surface 2), (3) a fixture pair (n...

(EN_WORD_COHERENCY)

[uncategorized] ~170-~170: The official name of this software platform is spelled with a capital “H”.
Context: ...ixture pair (negative + positive) under .github/codeql/fixtures/ (or `cli/internal/cod...

(GITHUB)

[uncategorized] ~178-~178: The official name of this software platform is spelled with a capital “H”.
Context: ... paths-ignore: ['tests/**'] clause in .github/codeql/codeql-config.yml. Every histor...

(GITHUB)

CLAUDE.md

[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...tching binding (a barrierModel row in `.github/codeql/extensions/synthorg-sanitisers/*...

(GITHUB)

[uncategorized] ~19-~19: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...rNodeModels-as-Data hook, otherwise aSanitizersubclass under.github/codeql/queries...

(EN_WORD_COHERENCY)

[uncategorized] ~19-~19: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ... otherwise a Sanitizer subclass under .github/codeql/queries/synthorg-<rule>/Sanitizers.qll plus the matching custom .ql) an...

(EN_WORD_COHERENCY)

[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ... custom .ql) and a fixture pair under .github/codeql/fixtures/ (or `cli/internal/cod...

(GITHUB)

[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...internal/codeqlfixtures/). Enforced by .github/workflows/codeql-pack-validate.yml+...

(GITHUB)

🔇 Additional comments (25)

pyproject.toml (1)

270-275: Per-file ignores are appropriate for CodeQL test fixtures.

The Ruff rule exceptions (TRY301, EM101, S310, D) are well-justified for security-scanner fixtures that must contain deliberate anti-patterns to validate CodeQL detection. Verification confirms:

✓ Fixture files are isolated (no imports outside .github/codeql/fixtures/)

✓ Pytest excludes fixtures via testpaths = ["tests"]

Fixtures contain intentional vulnerable patterns (e.g., S310 urllib security sinks) needed to validate CodeQL's detection capabilities

No blocking concerns. The fixture structure prevents accidental execution in test suites and eliminates risk of anti-patterns leaking into production code.

.github/codeql/queries/synthorg-cleartext-logging/qlpack.yml (1)

1-5: LGTM!

The query pack manifest is correctly structured with appropriate naming (synthorg/cleartext-logging-queries), library: false for a query pack, and the standard codeql/python-all dependency.

.github/codeql/queries/synthorg-partial-ssrf/Sanitizers.qll (1)

28-35: LGTM!

The sanitizer correctly models the OCI validators using the CodeQL API graph. The characteristic predicate properly resolves scripts.check_image_signatures._validate_repo_prefix and _validate_image_tag call sites. The documentation clearly explains why the custom QL approach is needed (no barrierNode MaD hook for partial-SSRF).

.github/codeql/queries/synthorg-cleartext-logging/Sanitizers.qll (1)

24-38: LGTM!

The sanitizer correctly models both import paths for the redaction helpers: the full path via synthorg.observability.redaction and the re-exported path via synthorg.observability. The disjunction ensures coverage regardless of how callers import the functions.

docs/reference/sec-prompt-safety.md (1)

148-183: LGTM!

The CodeQL barrier inventory documentation is comprehensive and well-structured. The two-surface approach (Models-as-Data vs custom QL query packs) is clearly explained, and the inventory table provides actionable information for maintainers adding new sanitisers. The mixed spelling of "sanitiser" (prose) vs "Sanitizer" (CodeQL class names) is appropriate—the latter are fixed upstream identifiers.

.github/codeql/extensions/synthorg-sanitisers/go-sanitisers.model.yml (1)

17-33: LGTM!

The barrierModel rows correctly model both Go sanitisers with proper package paths, ReturnValue[0] targeting the cleaned path (not the error), and path-injection kind. The comment explaining why safeStateDir needs separate modeling (inter-procedural precision loss) is a helpful maintenance note.

.github/codeql/queries/synthorg-cleartext-logging/CleartextLoggingWithBarriers.ql (1)

1-36: LGTM!

The custom query correctly re-uses the upstream CleartextLoggingFlow with project-specific sanitisers in scope. The import Sanitizers brings SynthorgRedactionSanitizer into the analysis, and the path-problem structure with source classification is appropriate for security queries.

.github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql (1)

1-34: LGTM!

The partial-SSRF query correctly targets cases where only part of the URL is user-controlled (not fullyControlledRequest(request)). The high @security-severity 9.1 is appropriate for SSRF, and @precision medium acknowledges the inherent noise in partial-control detection. The sanitiser import ensures SynthorgImageRefSanitizer is in scope.

.github/codeql/queries/synthorg-partial-ssrf/qlpack.yml (1)

1-5: LGTM!

The query pack manifest is correctly structured, consistent with the cleartext-logging pack format.

.github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml (1)

17-35: _resolve_root barrier inventory looks consistent.

The explicit 7-module barrierModel mapping is coherent and matches the inline inventory notes.

.github/codeql/fixtures/python/negative.py (1)

26-69: Negative fixture flows are well-constructed.

All three paths keep the intended sanitiser/barrier on the dataflow route for “must not fire” assertions.

.github/codeql/fixtures/python/positive.py (1)

16-45: Positive fixtures correctly preserve genuine leak paths.

The three functions are clean “must fire” counterparts to the negative cases.

cli/internal/codeqlfixtures/fixture.go (1)

26-63: Go fixture coverage is solid for both path and log injection cases.

The positive/negative function pairs are precise and suitable for pack validation.

.github/codeql/codeql-config.yml (1)

11-29: Config-level fixture exclusions and Python query replacement filters look correct.

docs/reference/convention-gates.md (1)

15-51: Gate inventory update is consistent with the RS-backed total.

data/runtime_stats.yaml (1)

46-47: Runtime stat bump for convention_gates is aligned with the docs inventory update.

scripts/convention_gate_map.yaml (1)

78-82: Good gate registration for the new CodeQL sanitiser convention.

This mandatory entry correctly wires the enforcement script into the convention inventory.

.github/workflows/codeql-pack-validate.yml (1)

20-73: Workflow hardening and validation flow look solid.

Pinned actions, least-privilege permissions, path-scoped triggers, and failure artifact upload are all implemented well.

CLAUDE.md (1)

19-19: The new MANDATORY CodeQL sanitiser rule is clearly specified and enforceable.

The linkage to workflow + fixture checker is explicit and actionable.

.github/codeql/fixtures/javascript/positive.ts (1)

7-15: Positive JS fixtures are well targeted for over-suppression checks.

Raw payloads flow directly to logging sinks, which is exactly what this harness needs.

.github/codeql/fixtures/expected.json (1)

4-58: Expectation manifest is structured well for cross-language validation.

The split between must_fire / must_not_fire and Go’s function-scoped assertions is appropriate.

.github/codeql/extensions/synthorg-sanitisers/qlpack.yml (1)

1-11: Pack manifest wiring looks correct.

extensionTargets and dataExtensions are set up cleanly for the three model files.

scripts/run_codeql_pack_validation.sh (1)

83-119: Validation harness now correctly runs distinct positive/negative analyses.

Good to see explicit per-variant analysis and clear rationale in comments.

.github/codeql/extensions/synthorg-sanitisers/js-sanitisers.model.yml (1)

13-39: JS barrier models look correctly scoped and structured.

The barrierModel entries are consistent and target the intended log-injection sanitisers with a minimal model surface.

scripts/check_codeql_fixtures.py (1)

80-242: Fixture validator implementation is solid and CI-friendly.

Parsing, scope filtering, expectation checks, and exit-code handling are all well structured for reliable gate enforcement.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@CLAUDE.md`:
- Line 19: The long single-line bullet under "CodeQL Sanitisers (MANDATORY)" in
CLAUDE.md is hard to read; split it into a short lead sentence and a bulleted
sub-list that separates: 1) the requirement (matching binding/barrierModel or
Sanitizer subclass), 2) where to put the binding (language-specific
.github/codeql/extensions/.../*.model.yml or
.github/codeql/queries/.../Sanitizers.qll + custom .ql), 3) the required
fixtures (.github/codeql/fixtures/ or cli/internal/codeqlfixtures/), and 4) the
enforcing checks (.github/workflows/codeql-pack-validate.yml and
scripts/check_codeql_fixtures.py) and reference link
(docs/reference/sec-prompt-safety.md#codeql-barrier-inventory); keep each item
on its own line to improve scannability.
- Line 19: The text mixes British "sanitiser" with American
"Sanitizer"/"Sanitizers" in code artifact names; standardize to British spelling
across prose and referenced artifacts to match the codebase policy: rename
usages of "Sanitizer", "Sanitizers.qll" and any occurrences in filenames or
class names referenced (e.g.
`.github/codeql/queries/synthorg-<rule>/Sanitizers.qll`, `Sanitizer` subclass,
`.github/codeql/extensions/.../*.model.yml` bindings, and fixture references
under `.github/codeql/fixtures/`) to use "Sanitiser"/"Sanitisers" (and update
CLAUDE.md line accordingly), and update any docs or references such as
docs/reference/sec-prompt-safety.md to the British spelling so everything is
consistent.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 794e1005-2715-4630-be48-044a7e5f6bba

📥 Commits

Reviewing files that changed from the base of the PR and between ee744f1 and f01f6c2.

📒 Files selected for processing (11)

.github/codeql/extensions/synthorg-go-sanitisers/go-sanitisers.model.yml
.github/codeql/extensions/synthorg-go-sanitisers/qlpack.yml
.github/codeql/extensions/synthorg-js-sanitisers/js-sanitisers.model.yml
.github/codeql/extensions/synthorg-js-sanitisers/qlpack.yml
.github/codeql/extensions/synthorg-python-sanitisers/python-sanitisers.model.yml
.github/codeql/extensions/synthorg-python-sanitisers/qlpack.yml
.github/codeql/fixtures/javascript/negative.ts
.github/workflows/codeql.yml
CLAUDE.md
docs/reference/sec-prompt-safety.md
scripts/run_codeql_pack_validation.sh

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)

GitHub Check: Build Backend
GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
GitHub Check: Validate sanitiser pack
GitHub Check: CodSpeed Python benchmarks
GitHub Check: Build Web Assets (melange)
GitHub Check: Lighthouse Site
GitHub Check: CLI Test (windows-latest)
GitHub Check: CLI Bench Regression
GitHub Check: Dashboard Test
GitHub Check: Test (Python 3.14)
GitHub Check: Analyze (python)
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (3)

{.github/codeql/extensions/**,cli/internal/codeqlfixtures/**,.github/codeql/fixtures/**}

📄 CodeRabbit inference engine (CLAUDE.md)

Any change to a CodeQL sanitiser requires matching binding in language-specific .github/codeql/extensions/synthorg-{python,js,go}-sanitisers/*.model.yml or custom .ql files plus fixture pair in .github/codeql/fixtures/ or cli/internal/codeqlfixtures/

Files:

.github/codeql/extensions/synthorg-python-sanitisers/qlpack.yml
.github/codeql/extensions/synthorg-js-sanitisers/qlpack.yml
.github/codeql/extensions/synthorg-js-sanitisers/js-sanitisers.model.yml
.github/codeql/extensions/synthorg-go-sanitisers/qlpack.yml
.github/codeql/extensions/synthorg-go-sanitisers/go-sanitisers.model.yml
.github/codeql/extensions/synthorg-python-sanitisers/python-sanitisers.model.yml
.github/codeql/fixtures/javascript/negative.ts

**/*.sh

📄 CodeRabbit inference engine (CLAUDE.md)

Shell scripts: reference ~/.claude/rules/common/bash.md for canonical rules on cd / git -C / Bash file-write

Files:

scripts/run_codeql_pack_validation.sh

docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

D2 diagrams for architecture/nested containers; mermaid for flowcharts/sequence/pipelines. Markdown tables for tabular data. D2 theme 200 (Dark Mauve), CI pinned to v0.7.1.

Files:

docs/reference/sec-prompt-safety.md

🧠 Learnings (1)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: Read `docs/design/` page before implementing; deviations need approval

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: Present every plan for accept/deny before coding

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: No region/currency/locale privileged; use metric units; British English in text and code

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: Every convention PR must ship its enforcement gate

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: Timeout/slow test failures are source-code regressions; never edit `tests/baselines/unit_timing.json`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: After issue completion: branch + commit + push (no auto-PR); use `/pre-pr-review`. After PR merge: `/aurelio-review-pr` for external feedback. Fix EVERYTHING valid; no deferring.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: Git commits: `<type>: <description>` format (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: Branches: `<type>/<slug>` format from main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: Use `.pre-commit-config.yaml` for pre-commit/pre-push hooks; see `.claude/hookify.*.md` for rules

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: Squash merge PRs; PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: Finished issue: create branch + commit + push (no auto-PR). Use `/pre-pr-review` slash command.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: After every squash merge: run `/post-merge-cleanup` slash command

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: Web dashboard: reference `web/CLAUDE.md` for additional guidelines

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:26:51.406Z
Learning: CLI (Go binary): reference `cli/CLAUDE.md` for additional guidelines; use `go -C cli`, never `cd cli`

🪛 LanguageTool

docs/reference/sec-prompt-safety.md

[uncategorized] ~151-~151: The official name of this software platform is spelled with a capital “H”.
Context: ...aces. Each binding is provably precise: .github/workflows/codeql-pack-validate.yml run...

(GITHUB)

[uncategorized] ~153-~153: The official name of this software platform is spelled with a capital “H”.
Context: ... 1 -- Models-as-Data extension packs** (`.github/codeql/extensions/synthorg-{python,js,g...

(GITHUB)

[uncategorized] ~155-~155: The official name of this software platform is spelled with a capital “H”.
Context: ...Surface 2 -- custom QL query packs (.github/codeql/queries/synthorg-*/). For queri...

(GITHUB)

[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...nthorg-*/). For queries whose abstract Sanitizerclass has nobarrierNode` MaD hook up...

(EN_WORD_COHERENCY)

[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...aD hook upstream, the pack defines a QL Sanitizer subclass and ships a custom query (re-...

(EN_WORD_COHERENCY)

[uncategorized] ~170-~170: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...-sanitisers.model.yml(Surface 1) or aSanitizersubclass under.github/codeql/queries...

(EN_WORD_COHERENCY)

[uncategorized] ~170-~170: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...face 1) or a Sanitizer subclass under .github/codeql/queries/synthorg-<rule>/Sanitizers.qll (Surface 2), (3) a fixture pair (n...

(EN_WORD_COHERENCY)

[uncategorized] ~170-~170: The official name of this software platform is spelled with a capital “H”.
Context: ...ixture pair (negative + positive) under .github/codeql/fixtures/ (or `cli/internal/cod...

(GITHUB)

[uncategorized] ~178-~178: The official name of this software platform is spelled with a capital “H”.
Context: ... paths-ignore: ['tests/**'] clause in .github/codeql/codeql-config.yml. Every histor...

(GITHUB)

CLAUDE.md

[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...rierModelrow in the language-specific.github/codeql/extensions/synthorg-{python,js,g...

(GITHUB)

[uncategorized] ~19-~19: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...rNodeModels-as-Data hook, otherwise aSanitizersubclass under.github/codeql/queries...

(EN_WORD_COHERENCY)

[uncategorized] ~19-~19: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ... otherwise a Sanitizer subclass under .github/codeql/queries/synthorg-<rule>/Sanitizers.qll plus the matching custom .ql) an...

(EN_WORD_COHERENCY)

[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ... custom .ql) and a fixture pair under .github/codeql/fixtures/ (or `cli/internal/cod...

(GITHUB)

[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...internal/codeqlfixtures/). Enforced by .github/workflows/codeql-pack-validate.yml+...

(GITHUB)

🔇 Additional comments (14)

.github/workflows/codeql.yml (3)

15-30: Least-privilege permissions are well implemented.

Workflow-level permissions: {} plus narrowly scoped job permissions is solid and reduces token blast radius.

34-38: Language matrix is correctly de-duplicated.

Using javascript-typescript once (instead of separate JS/TS entries) is the right setup for CodeQL and avoids duplicate analysis cost.

67-79: CodeQL init/analyse wiring looks correct.

Using only config-file in init and a language-scoped analysis category keeps query selection centralised and avoids accidental override behaviour.

.github/codeql/extensions/synthorg-js-sanitisers/js-sanitisers.model.yml (1)

13-39: JS barrier models are correctly scoped and explicit.

The three barrierModel bindings are clear, minimal, and aligned to the intended log-injection kind.

.github/codeql/extensions/synthorg-js-sanitisers/qlpack.yml (1)

1-7: Pack metadata and data extension linkage look good.

extensionTargets and dataExtensions are correctly configured for the JS sanitiser pack.

.github/codeql/extensions/synthorg-go-sanitisers/go-sanitisers.model.yml (1)

12-33: Go path-injection barrier bindings are well-formed.

The two modeled return-value barriers are explicit and appropriately constrained to path-injection.

.github/codeql/extensions/synthorg-go-sanitisers/qlpack.yml (1)

1-7: Go extension pack wiring is correct.

The qlpack cleanly targets codeql/go-all and includes the expected model file.

.github/codeql/fixtures/javascript/negative.ts (1)

24-32: sanitizeWsEnum fixture now validates the barrier path properly.

Routing the sanitized enum into console.warn makes this negative test non-vacuous and meaningfully checks the model.

.github/codeql/extensions/synthorg-python-sanitisers/qlpack.yml (1)

1-7: Python extension pack configuration is solid.

The pack metadata and model registration are correctly set for Python.

docs/reference/sec-prompt-safety.md (1)

149-179: Barrier inventory section is clear and operationally complete.

The two-surface model explanation plus rule-to-sanitiser tables make maintenance expectations explicit and auditable.

.github/codeql/extensions/synthorg-python-sanitisers/python-sanitisers.model.yml (1)

17-35: Python path-injection barrier mappings are explicit and maintainable.

Enumerating each _resolve_root module keeps the model precise for current script entry points.

scripts/run_codeql_pack_validation.sh (1)

1-133: LGTM!

Well-structured validation script with good practices:

Uses set -euo pipefail for robust error handling

Avoids cd in favour of absolute paths derived from git rev-parse --show-toplevel

Properly handles CI environment PATH discovery for CodeQL CLI

Clear documentation explaining fixture strategy per language

The previous critical issues (copying SARIF instead of running separate analyses, malformed ShellCheck directive) have been correctly addressed — Python and JavaScript fixtures now run separate run_fixture invocations, and the ShellCheck disable is properly formatted on its own line.
CLAUDE.md (2)
46-46: LGTM: Convention-gates count updated correctly.

The update from 35 to 38 enforcement gates aligns with this PR adding CodeQL sanitiser validation as a new enforcement gate, consistent with the mandatory requirement that convention PRs ship their enforcement mechanisms.

19-19: ⚡ Quick win

Path pattern in line 19 is accurate.

The directory structure verification confirms that .github/codeql/extensions/synthorg-{python,js,go}-sanitisers/*.model.yml correctly describes the actual per-language extension pack layout. The three separate directories (synthorg-go-sanitisers/, synthorg-js-sanitisers/, synthorg-python-sanitisers/), each containing a language-specific .model.yml file, align exactly with the documented path pattern.
			> Likely an incorrect or invalid review comment.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/reference/sec-prompt-safety.md`:
- Around line 153-170: The doc contains hard-coded numeric claims (e.g., counts
referenced near the "Surface 1 -- Models-as-Data extension packs" and "Surface 2
-- custom QL query packs" sections and in the table rows describing sanitiser
coverage) that must be replaced with runtime-stats markers or rewritten
non-numerically; update the markdown to use <!--RS:NAME--> markers pulling from
runtime_stats.yaml (or change phrasing to avoid explicit numbers) where the text
currently asserts concrete counts, and ensure references to specific sanitiser
examples like sanitizeForLog, sanitizeWsString, safe_error_description,
scrub_secret_tokens, and the Surface 1/Surface 2 headings reflect the marker
usage or non-numeric wording.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 1083ddb6-b777-47d9-9a96-ac45d23abfb8

📥 Commits

Reviewing files that changed from the base of the PR and between f01f6c2 and 04a9fac.

📒 Files selected for processing (29)

.github/codeql/codeql-config.yml
.github/codeql/extensions/synthorg-go-sanitisers/go-sanitisers.model.yml
.github/codeql/extensions/synthorg-go-sanitisers/qlpack.yml
.github/codeql/extensions/synthorg-js-sanitisers/js-sanitisers.model.yml
.github/codeql/extensions/synthorg-js-sanitisers/qlpack.yml
.github/codeql/extensions/synthorg-python-sanitisers/python-sanitisers.model.yml
.github/codeql/extensions/synthorg-python-sanitisers/qlpack.yml
.github/codeql/fixtures/expected.json
.github/codeql/fixtures/javascript/negative.ts
.github/codeql/fixtures/javascript/positive.ts
.github/codeql/fixtures/python/negative.py
.github/codeql/fixtures/python/positive.py
.github/codeql/queries/synthorg-cleartext-logging/CleartextLoggingWithBarriers.ql
.github/codeql/queries/synthorg-cleartext-logging/Sanitisers.qll
.github/codeql/queries/synthorg-cleartext-logging/qlpack.yml
.github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql
.github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll
.github/codeql/queries/synthorg-partial-ssrf/qlpack.yml
.github/workflows/codeql-pack-validate.yml
.github/workflows/codeql.yml
CLAUDE.md
cli/internal/codeqlfixtures/fixture.go
data/runtime_stats.yaml
docs/reference/convention-gates.md
docs/reference/sec-prompt-safety.md
pyproject.toml
scripts/check_codeql_fixtures.py
scripts/convention_gate_map.yaml
scripts/run_codeql_pack_validation.sh

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)

GitHub Check: Build Backend
GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
GitHub Check: Dashboard Test
GitHub Check: Test (Python 3.14)
GitHub Check: Lighthouse Site
GitHub Check: Build Web Assets (melange)
GitHub Check: CLI Test (windows-latest)
GitHub Check: CLI Bench Regression
GitHub Check: CodSpeed Python benchmarks
GitHub Check: Validate sanitiser pack
GitHub Check: Analyze (python)
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (6)

.github/codeql/**/*.{yml,yaml,qll,ql}

📄 CodeRabbit inference engine (CLAUDE.md)

CodeQL Sanitisers: any change to a project sanitiser requires a matching binding in language-specific .github/codeql/extensions/synthorg-{python,js,go}-sanitisers/*.model.yml or custom .ql query, plus fixture pairs under .github/codeql/fixtures/ or cli/internal/codeqlfixtures/. Enforced by .github/workflows/codeql-pack-validate.yml and scripts/check_codeql_fixtures.py.

Files:

.github/codeql/queries/synthorg-cleartext-logging/qlpack.yml
.github/codeql/extensions/synthorg-js-sanitisers/qlpack.yml
.github/codeql/extensions/synthorg-go-sanitisers/qlpack.yml
.github/codeql/codeql-config.yml
.github/codeql/queries/synthorg-partial-ssrf/qlpack.yml
.github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll
.github/codeql/extensions/synthorg-js-sanitisers/js-sanitisers.model.yml
.github/codeql/queries/synthorg-cleartext-logging/Sanitisers.qll
.github/codeql/extensions/synthorg-python-sanitisers/python-sanitisers.model.yml
.github/codeql/extensions/synthorg-python-sanitisers/qlpack.yml
.github/codeql/queries/synthorg-cleartext-logging/CleartextLoggingWithBarriers.ql
.github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql
.github/codeql/extensions/synthorg-go-sanitisers/go-sanitisers.model.yml

**/*.{md,rst}

📄 CodeRabbit inference engine (CLAUDE.md)

Doc Numeric Claims: numerics in README and public docs sourced from data/runtime_stats.yaml via  markers.

Files:

docs/reference/convention-gates.md
CLAUDE.md
docs/reference/sec-prompt-safety.md

**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Diagrams: use d2 for architecture / nested containers, mermaid for flowcharts / sequence / pipelines. Use markdown tables for tabular data. D2 theme 200 (Dark Mauve), D2 CLI pinned to v0.7.1 in CI.

Files:

docs/reference/convention-gates.md
CLAUDE.md
docs/reference/sec-prompt-safety.md

cli/**/*.go

📄 CodeRabbit inference engine (cli/CLAUDE.md)

cli/**/*.go: Organize CLI Go code with cmd/ for Cobra commands and internal/ for packages (version, config, docker, compose, health, diagnostics, images, selfupdate, completion, ui, verify, backup, scaffold)
Set SYNTHORG_IMAGE_VERIFY_TIMEOUT default to 120s with a hard minimum of 1s
Set SYNTHORG_IMAGE_PULL_ATTEMPTS to range 1..100 with default 3, and SYNTHORG_IMAGE_PULL_RETRY_DELAY to default 2s with exponential backoff
Set SYNTHORG_MAX_API_RESPONSE_BYTES default to 4MiB, SYNTHORG_MAX_BINARY_BYTES to 256MiB, and SYNTHORG_MAX_ARCHIVE_ENTRY_BYTES to 128MiB
When both SYNTHORG_DATABASE_URL and SYNTHORG_DB_PATH are present, SYNTHORG_DATABASE_URL takes precedence; a malformed URL raises loudly at startup rather than silently falling back
Use port 3000 for web, 3001 for backend, 3002 for postgres, and 3003 for NATS client; validate port collisions across all enabled services in generate.go
In generated compose.yml, include a one-shot data-init helper that chowns each named volume to its non-root owner (65532:65532 for backend/NATS, 70:70 with mode 0700 for Postgres) before stateful services start
Declare Postgres and NATS services with depends_on: data-init: condition: service_completed_successfully to ensure volume initialization completes before services start
In hint tier selection, use HintError for error recovery (always visible unless --quiet), HintNextStep for natural next actions, HintTip for config automation suggestions (once/session deduplication), and HintGuidance for flag/feature discovery (invisible in default auto mode)
Use HintTip deduplication within a session (same message shown at most once) and make HintGuidance invisible in default auto mode, visible only when synthorg config set hints always
Config-driven overrides apply: color never implies --no-color, color always forces color on non-TTYs, output json implies --json, and hints mode is config-only with options `always/auto/...

Files:

cli/internal/codeqlfixtures/fixture.go

**/*.{sh,bash}

📄 CodeRabbit inference engine (CLAUDE.md)

Shell: follow ~/.claude/rules/common/bash.md (canonical for cd / git -C / Bash file-write rules).

Files:

scripts/run_codeql_pack_validation.sh

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Read docs/design/ page before implementing; deviations need approval. See DESIGN_SPEC.md.

Present every plan for accept/deny before coding.

Apply Regional Defaults: no region/currency/locale privileged; use metric units; British English.

Every convention PR ships its enforcement gate.

Configuration Precedence: DB > env > YAML > code default via SettingsService/ConfigResolver; no os.environ.get outside startup.

No Hardcoded Numeric Values: numerics live in settings/definitions/; allowlist only 0/1/-1, HTTP codes, hex masks, powers-of-2. Enforced by scripts/check_no_magic_numbers.py.

Post-Implementation + Pre-PR Review: after issue work, branch + commit + push (no auto-PR); use /pre-pr-review command. After PR creation: use /aurelio-review-pr for external feedback. Fix EVERYTHING valid; no deferring.

Comments WHY only; no reviewer citations / issue back-refs / migration framing. Enforced by check_no_review_origin_in_code.py and check_no_migration_framing.py.

No from __future__ import annotations (3.14 has PEP 649). Use PEP 758 except: except A, B: no parens unless binding.

Type hints on public functions; mypy strict. Google-style docstrings. Line length 88; functions <50 lines; files <800 lines.

Errors: define as <Domain><Condition>Error inheriting from DomainError; never inherit from Exception/RuntimeError directly. Enforced by check_domain_error_hierarchy.py.

Pydantic v2: use frozen + extra='forbid' on API DTOs (Request/Response/Snapshot/Result/Envelope/Status/Info/Summary suffixes); @computed_field for derived; NotBlankStr for identifiers.

Args models at every system boundary; use parse_typed() for every external dict ingestion. Enforced by check_boundary_typed.py.

Immutability: use model_copy(update=...) or copy.deepcopy(); apply deepcopy at system boundaries.

Async patterns: use asyncio.TaskGroup for fan-out/fan-in; helpers must catch Exception and re-raise MemoryError/RecursionError.

Clock se...

Files:

scripts/check_codeql_fixtures.py

🧠 Learnings (2)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:56:10.785Z
Learning: Git commits: use format `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); enforced by commitizen.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:56:10.785Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`).

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:56:10.785Z
Learning: Branches: use format `<type>/<slug>` from main.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:56:10.785Z
Learning: Squash merge: PR body becomes squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:56:10.785Z
Learning: Workflow: after every squash merge, run `/post-merge-cleanup`.

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T14:56:10.785Z
Learning: GitHub queries: use `gh issue list` via Bash, NOT MCP `list_issues`.

📚 Learning: 2026-05-05T09:04:46.195Z

Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

scripts/check_codeql_fixtures.py

🪛 LanguageTool

CLAUDE.md

[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...rierModelrow in the language-specific.github/codeql/extensions/synthorg-{python,js,g...

(GITHUB)

[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...k, otherwise a sanitiser subclass under `.github/codeql/queries/synthorg-/Sanitise...

(GITHUB)

[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ... custom .ql) and a fixture pair under .github/codeql/fixtures/ (or `cli/internal/cod...

(GITHUB)

[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...internal/codeqlfixtures/). Enforced by .github/workflows/codeql-pack-validate.yml+...

(GITHUB)

docs/reference/sec-prompt-safety.md

[uncategorized] ~151-~151: The official name of this software platform is spelled with a capital “H”.
Context: ...aces. Each binding is provably precise: .github/workflows/codeql-pack-validate.yml run...

(GITHUB)

[uncategorized] ~153-~153: The official name of this software platform is spelled with a capital “H”.
Context: ... 1 -- Models-as-Data extension packs** (`.github/codeql/extensions/synthorg-{python,js,g...

(GITHUB)

[uncategorized] ~155-~155: The official name of this software platform is spelled with a capital “H”.
Context: ...Surface 2 -- custom QL query packs (.github/codeql/queries/synthorg-*/). For queri...

(GITHUB)

[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...nthorg-*/). For queries whose abstract Sanitizerclass has nobarrierNode` MaD hook up...

(EN_WORD_COHERENCY)

[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...ass extends the upstream abstract base (CleartextLogging::Sanitizer, ServerSideRequestForgery::Sanitizer...

(EN_WORD_COHERENCY)

[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...ct base (CleartextLogging::Sanitizer, ServerSideRequestForgery::Sanitizer, etc.), which uses American spelling b...

(EN_WORD_COHERENCY)

[uncategorized] ~170-~170: The official name of this software platform is spelled with a capital “H”.
Context: ...urface 1) or a sanitiser subclass under `.github/codeql/queries/synthorg-/Sanitise...

(GITHUB)

[uncategorized] ~170-~170: The official name of this software platform is spelled with a capital “H”.
Context: ...ixture pair (negative + positive) under .github/codeql/fixtures/ (or `cli/internal/cod...

(GITHUB)

[uncategorized] ~178-~178: The official name of this software platform is spelled with a capital “H”.
Context: ... paths-ignore: ['tests/**'] clause in .github/codeql/codeql-config.yml. Every histor...

(GITHUB)

🔇 Additional comments (30)

pyproject.toml (1)

270-275: Scoped Ruff fixture ignores are well-contained.

The new per-file ignore block is narrowly targeted to CodeQL Python fixtures and avoids weakening linting globally.

data/runtime_stats.yaml (1)

46-47: Convention gate count update is consistent.

convention_gates now reflects the new enforcement gate added in this PR.

CLAUDE.md (2)

19-19: New mandatory CodeQL sanitiser policy is explicit and actionable.

The rule clearly links sanitiser changes to binding requirements, fixture coverage, and enforcement checks.

46-46: Gate inventory reference is in sync with this PR.

Updating to “38 enforcement gates + meta-gate” matches the introduced convention-gate expansion.

.github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll (1)

1-36: Custom partial-SSRF sanitiser modelling looks correct.

Good choice using a custom ServerSideRequestForgery::Sanitizer subclass for the partial-SSRF path where Models-as-Data barrierNode isn’t available.

.github/codeql/extensions/synthorg-js-sanitisers/js-sanitisers.model.yml (1)

13-39: JS barrier models are appropriately scoped.

The three barrierModel entries are tightly scoped to log-injection and target the intended sanitizer return values.

.github/codeql/extensions/synthorg-js-sanitisers/qlpack.yml (1)

1-7: QL pack wiring is clean and sufficient.

extensionTargets and dataExtensions are correctly configured for the JS sanitiser model pack.

.github/codeql/fixtures/javascript/positive.ts (1)

7-15: Positive fixture cases are clear and purposeful.

These functions provide straightforward unsanitized sink flows to validate that js/log-injection still fires under the extension pack.

.github/codeql/fixtures/javascript/negative.ts (1)

13-33: Negative fixtures now exercise each sanitiser barrier effectively.

The sanitizeWsEnum path correctly includes sink-flow coverage, so this file now validates all three JS models in non-alert scenarios.

.github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql (1)

21-34: Partial-SSRF query wiring looks correct.

The flow/path construction and project-barrier import are consistent with a custom path-problem override.

.github/codeql/queries/synthorg-partial-ssrf/qlpack.yml (1)

1-5: QLPack manifest is clean and valid.

Pack metadata and dependency declaration are correctly structured for this custom query pack.

cli/internal/codeqlfixtures/fixture.go (2)

22-39: Path-injection fixture pair is well-formed.

The negative/positive split is explicit and suitable for deterministic CodeQL sanitizer validation.

As per coding guidelines, CodeQL sanitiser changes should include fixture pairs, and this segment provides that coverage pattern.

41-63: Log-injection fixture pair is clear and testable.

The interface shape and argument choices create a good sanitized-vs-unsanitized contrast for rule assertions.

docs/reference/convention-gates.md (1)

15-51: Convention-gate inventory update is consistent.

The new gate entry and RS total increment are synchronized correctly.

.github/codeql/queries/synthorg-cleartext-logging/qlpack.yml (1)

1-5: Cleartext-logging qlpack manifest looks good.

Metadata and dependency declaration are correctly defined.

scripts/convention_gate_map.yaml (1)

78-81: New mandatory rule registration is correctly wired.

The CLAUDE.md header mapping to scripts/check_codeql_fixtures.py is complete and consistent.

As per coding guidelines, new conventions should be backed by enforcement gates, and this mapping adds that enforcement hook.

.github/codeql/extensions/synthorg-go-sanitisers/qlpack.yml (1)

1-7: Go sanitiser extension pack manifest is correct.

Target binding and model extension registration are properly configured.

.github/codeql/extensions/synthorg-python-sanitisers/qlpack.yml (1)

1-7: Python sanitiser extension pack manifest is sound.

The manifest correctly wires Python target scope to the model file.

.github/codeql/codeql-config.yml (1)

3-33: CodeQL config wiring looks consistent.

The query replacement (query-filters + local custom packs) and fixture exclusions are correctly aligned.

.github/codeql/queries/synthorg-cleartext-logging/Sanitisers.qll (1)

24-38: Sanitiser subclass implementation is well-scoped.

The API graph bindings cover both direct redaction-module calls and the observability re-export path.

.github/codeql/fixtures/expected.json (1)

4-57: Fixture expectation manifest is comprehensive.

Positive/negative assertions are clearly separated per language and should support stable SARIF diffing.

.github/codeql/fixtures/python/negative.py (1)

26-69: Negative fixture paths are correctly constructed for barrier validation.

The three functions cleanly represent sanitized flows for the target Python rules.

.github/workflows/codeql-pack-validate.yml (1)

20-72: Workflow hardening is solid.

Least-privilege permissions, pinned actions, scoped triggers, and failure artifact upload are all in place.

.github/codeql/extensions/synthorg-python-sanitisers/python-sanitisers.model.yml (1)

25-35: Python barrier rows are explicit and well-targeted.

Module-specific _resolve_root modeling is clear and avoids overbroad matching.

.github/codeql/extensions/synthorg-go-sanitisers/go-sanitisers.model.yml (1)

17-33: Go sanitizer barrier entries look correct.

Both SecurePath and wrapper safeStateDir are modeled with the expected Go barrierModel shape and path-injection kind.

.github/workflows/codeql.yml (1)

15-31: Strong hardening in the advanced CodeQL workflow.

Pinned SHAs, root-level empty permissions with per-job grants, and clean init/analyze wiring look solid.

Also applies to: 67-79

.github/codeql/queries/synthorg-cleartext-logging/CleartextLoggingWithBarriers.ql (1)

29-37: Query implementation looks correct for barrier-aware flow reporting.

The source/sink path selection and custom rule metadata are consistent with the intended sanitizer-aware variant.

scripts/run_codeql_pack_validation.sh (1)

56-87: Validation harness structure is clean and maintainable.

run_fixture plus the final expectation diff step gives a clear single-path validation flow for local and CI runs.

Also applies to: 103-133

.github/codeql/fixtures/python/positive.py (1)

16-45: Good positive-fixture coverage for leak paths.

The three unsanitized cases (cleartext logging, path injection, partial SSRF) are well targeted for must-fire regression checks.

scripts/check_codeql_fixtures.py (1)

127-170: The SARIF assertion pipeline is robust.

Scope filtering plus rule/function assertions and aggregated failure reporting make this a strong convention-gate checker.

Also applies to: 181-238

coderabbitai · 2026-05-09T14:59:43Z

+**Surface 1 -- Models-as-Data extension packs** (`.github/codeql/extensions/synthorg-{python,js,go}-sanitisers/`). One pack per target language, each declaring `barrierModel` rows for the queries whose CodeQL standard-library `Customizations.qll` exposes a `barrierNode("<kind>")` Models-as-Data hook. The per-language split is required because `barrierModel` signatures differ across language packs (Python and JS take 3 user-provided columns, Go takes 9), and a single mixed pack with multi-language `extensionTargets` causes CodeQL to validate every YAML row's arity against every target pack's signature -- the Go 9-col rows would then fail the Python and JS signature checks. Today: Python `path-injection`, JavaScript `log-injection`, Go `path-injection`.
+
+**Surface 2 -- custom QL query packs** (`.github/codeql/queries/synthorg-*/`). For queries whose abstract `Sanitizer` class has no `barrierNode` MaD hook upstream, the pack defines a QL sanitiser subclass (in `Sanitisers.qll`) and ships a custom query (re-running the standard flow analysis with the subclass in scope). The subclass extends the upstream abstract base (`CleartextLogging::Sanitizer`, `ServerSideRequestForgery::Sanitizer`, etc.), which uses American spelling because the upstream class is named that way; our subclass names use the British "Sanitiser" form. The standard rule is excluded via `query-filters` in `codeql-config.yml`, and the custom query emits results under a `synthorg/<rule>` ID. Today that covers Python `clear-text-logging-sensitive-data` and `partial-ssrf`.
+
+| Sanitiser | File | Rule | Surface | Runtime guarantee |
+|---|---|---|---|---|
+| `safe_error_description` | `src/synthorg/observability/redaction.py` | `synthorg/clear-text-logging-sensitive-data` | Custom QL query | Strips OAuth tokens, JSON credential values, URI userinfo, `Authorization:` headers, and Fernet ciphertexts from exception messages; truncates to `MAX_SCRUBBED_LENGTH` |
+| `scrub_secret_tokens` | `src/synthorg/observability/redaction.py` | `synthorg/clear-text-logging-sensitive-data` | Custom QL query | Lower-level helper called by `safe_error_description`; pattern-replace credential shapes idempotently |
+| `_resolve_root` (7 sites under `scripts/`) | `scripts/check_*.py` | `py/path-injection` | `barrierModel` | Resolves the candidate, asserts containment via `os.path.commonpath` and `Path.relative_to`; returns `None` on escape |
+| `_validate_repo_prefix` | `scripts/check_image_signatures.py` | `synthorg/partial-ssrf` | Custom QL query | Anchored regex matches the OCI repo grammar; `\A...\Z` rejects newlines and CR |
+| `_validate_image_tag` | `scripts/check_image_signatures.py` | `synthorg/partial-ssrf` | Custom QL query | Anchored regex matches the OCI image-name + tag grammar |
+| `config.SecurePath` | `cli/internal/config/paths.go` | `go/path-injection` | `barrierModel` | `filepath.Clean` + absolute-path assertion |
+| `safeStateDir` | `cli/cmd/root.go` | `go/path-injection` | `barrierModel` | Wrapper around `config.SecurePath` keyed on the loaded `State.DataDir` |
+| `sanitizeForLog` | `web/src/utils/logging.ts` | `js/log-injection` | `barrierModel` | Strips C0/C1 controls + Unicode BIDI overrides; truncates to `maxLen` |
+| `sanitizeWsString` | `web/src/utils/ws-sanitize.ts` | `js/log-injection` | `barrierModel` | Calls `sanitizeForLog` with the WS-payload length cap |
+| `sanitizeWsEnum` | `web/src/utils/ws-sanitize.ts` | `js/log-injection` | `barrierModel` | Allowlist-validated; returns either the input or a literal fallback constant |
+
+Adding a new sanitiser is a four-part change in one PR: (1) the helper itself, (2) a matching binding -- either a row in the language-specific `synthorg-{python,js,go}-sanitisers/*-sanitisers.model.yml` (Surface 1) or a sanitiser subclass under `.github/codeql/queries/synthorg-<rule>/Sanitisers.qll` (Surface 2), (3) a fixture pair (negative + positive) under `.github/codeql/fixtures/` (or `cli/internal/codeqlfixtures/` for Go), and (4) its `expected.json` entry. Pick Surface 1 if the upstream query exposes a `barrierNode` MaD hook for the relevant kind; otherwise Surface 2. The standalone `codeql-pack-validate.yml` workflow runs on every pack-related change and fails if either fixture diverges from its expectation.


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Replace hard-coded numeric claims with runtime-stats markers (or non-numeric phrasing).

Line 153, Line 170, and Line 174 introduce fixed numeric claims in a public doc without  markers. Please source these values via runtime-stats markers or rewrite to avoid explicit counts.

As per coding guidelines: **/*.{md,rst}: Doc Numeric Claims: numerics in README and public docs sourced from data/runtime_stats.yaml via  markers.

Also applies to: 174-179

🧰 Tools

🪛 LanguageTool

[uncategorized] ~153-~153: The official name of this software platform is spelled with a capital “H”.
Context: ... 1 -- Models-as-Data extension packs** (`.github/codeql/extensions/synthorg-{python,js,g...

(GITHUB)

[uncategorized] ~155-~155: The official name of this software platform is spelled with a capital “H”.
Context: ...Surface 2 -- custom QL query packs (.github/codeql/queries/synthorg-*/). For queri...

(GITHUB)

[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...nthorg-*/). For queries whose abstract Sanitizerclass has nobarrierNode` MaD hook up...

(EN_WORD_COHERENCY)

[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...ass extends the upstream abstract base (CleartextLogging::Sanitizer, ServerSideRequestForgery::Sanitizer...

(EN_WORD_COHERENCY)

[uncategorized] ~155-~155: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...ct base (CleartextLogging::Sanitizer, ServerSideRequestForgery::Sanitizer, etc.), which uses American spelling b...

(EN_WORD_COHERENCY)

[uncategorized] ~170-~170: The official name of this software platform is spelled with a capital “H”.
Context: ...urface 1) or a sanitiser subclass under `.github/codeql/queries/synthorg-/Sanitise...

(GITHUB)

[uncategorized] ~170-~170: The official name of this software platform is spelled with a capital “H”.
Context: ...ixture pair (negative + positive) under .github/codeql/fixtures/ (or `cli/internal/cod...

(GITHUB)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/reference/sec-prompt-safety.md` around lines 153 - 170, The doc contains hard-coded numeric claims (e.g., counts referenced near the "Surface 1 -- Models-as-Data extension packs" and "Surface 2 -- custom QL query packs" sections and in the table rows describing sanitiser coverage) that must be replaced with runtime-stats markers or rewritten non-numerically; update the markdown to use  markers pulling from runtime_stats.yaml (or change phrasing to avoid explicit numbers) where the text currently asserts concrete counts, and ensure references to specific sanitiser examples like sanitizeForLog, sanitizeWsString, safe_error_description, scrub_secret_tokens, and the Surface 1/Surface 2 headings reflect the marker usage or non-numeric wording.

… discovery + per-variant SARIF analysis + _resolve_root fixture call)

…r queries lacking MaD hooks (cleartext-logging, partial-ssrf), drop stale _resolve_root entry, fix shellcheck directive

…void go 9-col rows tripping js/py 3-col signature check), remove duplicate codeql language matrix entries + workflow-level queries override, fix sanitizeWsEnum fixture to flow into a sink

…ig (was forcing cross-language barrierModel signature collisions), rename Sanitizers.qll to British Sanitisers.qll

…erModel arity collision blocks every analysis); keep custom QL query packs (cleartext-logging + partial-ssrf) which work via Sanitiser subclass mechanism; defer remaining barriers to follow-up

+            extra={
+                "error_type": type(exc).__name__,
+                "error": safe_error_description(exc),
+            },


+    safe_repo = urllib.parse.quote(f"{repo_prefix}{pair.image}", safe="/")
+    safe_tag = urllib.parse.quote(pair.tag, safe="")
+    url = f"https://ghcr.io/v2/{safe_repo}/manifests/{safe_tag}"
+    req = urllib.request.Request(url, method="HEAD")


+    components flow straight into the URL path.
+    """
+    url = f"https://ghcr.io/v2/{repo}/manifests/{tag}"
+    req = urllib.request.Request(url, method="HEAD")


coderabbitai

Actionable comments posted: 2

♻️ Duplicate comments (4)

CLAUDE.md (1)

19-19: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

The CodeQL sanitiser rule should allow both binding mechanisms.

This wording mandates only the custom Sanitisers.qll path. It should also allow Models-as-Data barrierModel rows when the upstream query exposes a barrierNode hook, otherwise the policy is stricter than the intended implementation model.

Suggested patch

-- **CodeQL Sanitisers (MANDATORY)**: any change to a project sanitiser (helper that closes a CodeQL data-flow rule) requires a sanitiser subclass under `.github/codeql/queries/synthorg-<rule>/Sanitisers.qll` plus the matching custom `.ql` and a fixture pair under `.github/codeql/fixtures/`. Enforced by `.github/workflows/codeql-pack-validate.yml` + `scripts/check_codeql_fixtures.py`. See [docs/reference/sec-prompt-safety.md](docs/reference/sec-prompt-safety.md#codeql-barrier-inventory).
+- **CodeQL Sanitisers (MANDATORY)**: any change to a project sanitiser (helper that closes a CodeQL data-flow rule) requires either a `barrierModel` row in `.github/codeql/extensions/synthorg-*/**/*.model.yml` (when the upstream query exposes a `barrierNode` hook) or a sanitiser subclass under `.github/codeql/queries/synthorg-<rule>/Sanitisers.qll` plus the matching custom `.ql`, and a fixture pair under `.github/codeql/fixtures/` (or `cli/internal/codeqlfixtures/`). Enforced by `.github/workflows/codeql-pack-validate.yml` + `scripts/check_codeql_fixtures.py`. See [docs/reference/sec-prompt-safety.md](docs/reference/sec-prompt-safety.md#codeql-barrier-inventory).

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@CLAUDE.md` at line 19, The CLAUDE.md sentence forces only custom
Sanitisers.qll path; change it to permit either a sanitiser subclass under
Sanitisers.qll OR Models-as-Data barrierModel rows when the upstream query
exposes a barrierNode hook, and clarify that the barrierModel mechanism is
acceptable and equivalent to the Sanitisers.qll requirement; reference
Sanitisers.qll, barrierModel, and barrierNode in the updated wording so
reviewers can verify both binding mechanisms are allowed.

.github/codeql/codeql-config.yml (1)

7-12: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Exclude cli/internal/codeqlfixtures/** from the main CodeQL scan.

Intentionally vulnerable Go fixtures can be reported as real repository alerts if this directory is not ignored.

Suggested patch

 paths-ignore:
   - 'tests/**'
   - 'web/__mocks__/**'
   - 'web/e2e/**'
   - '.github/codeql/fixtures/**'
+  - 'cli/internal/codeqlfixtures/**'

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.github/codeql/codeql-config.yml around lines 7 - 12, Add the
cli/internal/codeqlfixtures/** pattern to the CodeQL paths-ignore list so the
intentionally vulnerable Go fixtures are excluded from scans; update the
paths-ignore block (the existing paths-ignore entry containing 'tests/**',
'web/__mocks__/**', etc.) to include 'cli/internal/codeqlfixtures/**' ensuring
the pattern matches that directory and its contents.

docs/reference/sec-prompt-safety.md (1)

160-168: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use runtime-stats markers (or non-numeric phrasing) for numeric claims in this public doc.

This section hard-codes numeric claims in prose. Please replace with  markers or rewrite to avoid explicit numerics.

As per coding guidelines: {README.md,docs/**/*.md}: Numerics in README + public docs sourced from data/runtime_stats.yaml via  markers per data/README.md.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/reference/sec-prompt-safety.md` around lines 160 - 168, The doc section
currently contains hard-coded numeric claims; replace those explicit numerics
with runtime-stats markers using the exact syntax  (or rephrase to
avoid numeric claims), update the corresponding NAME key in
data/runtime_stats.yaml if missing, and ensure the README.md/docs/**/*.md files
follow the guideline by referencing the same  marker; locate the
text in sec-prompt-safety.md and swap the numerics for the marker(s) and verify
the marker name exists in data/runtime_stats.yaml and is documented in README.md
per the project's coding guidelines.

scripts/run_codeql_pack_validation.sh (1)

81-102: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Fixture validation is currently Python-only, leaving non-Python regressions unchecked.

The script states it validates fixture expectations, but it only runs python-negative/python-positive. Add Go and JavaScript/TypeScript fixture analyses (or drive execution from expected.json) so pack validation actually covers all declared fixture sets.

Suggested direction

 run_fixture python-negative python "." \
     "codeql/python-queries:codeql-suites/python-security-extended.qls" \
     "$PYTHON_CUSTOM_QUERIES"
 run_fixture python-positive python "." \
     "codeql/python-queries:codeql-suites/python-security-extended.qls" \
     "$PYTHON_CUSTOM_QUERIES"

+# Add language fixtures that exist in expected.json
+# (or replace this block with a manifest-driven loop).
+run_fixture javascript-negative javascript-typescript "." \
+    "codeql/javascript-queries:codeql-suites/javascript-security-extended.qls"
+run_fixture javascript-positive javascript-typescript "." \
+    "codeql/javascript-queries:codeql-suites/javascript-security-extended.qls"
+run_fixture go-negative go "cli/internal" \
+    "codeql/go-queries:codeql-suites/go-security-extended.qls"
+run_fixture go-positive go "cli/internal" \
+    "codeql/go-queries:codeql-suites/go-security-extended.qls"

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/run_codeql_pack_validation.sh` around lines 81 - 102, The validation
only runs Python fixtures (PYTHON_CUSTOM_QUERIES + run_fixture calls for
python-negative and python-positive); add equivalent runs for Go and
JavaScript/TypeScript fixture sets (e.g., invoke run_fixture with
go-negative/go-positive and js-negative/js-positive using appropriate CodeQL
suites and any custom queries) or update the harness to iterate expected.json to
drive run_fixture for every declared fixture set; update the script around the
existing run_fixture calls so run_fixture is invoked for Go and JS/TS fixtures
(or replace the hardcoded python calls with a loop over expected.json) and
ensure rule-filtering behavior mirrors the Python path.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/codeql/fixtures/expected.json:
- Around line 4-24: The manifest only lists Python fixtures ("python-negative"
and "python-positive"), so add equivalent fixture objects for
JavaScript/TypeScript and Go to ensure those languages are validated; create
negative and positive entries for JS/TS (e.g., "js-negative"/"js-positive" or
"javascript-negative"/"javascript-positive") and for Go (e.g.,
"go-negative"/"go-positive") that mirror the Python structure, include
appropriate "scan_paths" pointing to the JS/TS and Go fixture files, and set
"must_fire" / "must_not_fire" arrays to the same rule IDs
("synthorg/clear-text-logging-sensitive-data", "synthorg/partial-ssrf") so the
pack validates multi-language sanitiser modelling.

In @.github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll:
- Around line 30-35: Update the binding to also match a fully-qualified import
of the target module: add an alternate pattern that uses
API::moduleImport("scripts.check_image_signatures") with
getMember(["_validate_repo_prefix", "_validate_image_tag"]).getACall() (in
addition to the existing
API::moduleImport("scripts").getMember("check_image_signatures") chain) so both
chained imports and direct moduleImport("scripts.check_image_signatures") cases
are covered for the functions _validate_repo_prefix and _validate_image_tag.

---

Duplicate comments:
In @.github/codeql/codeql-config.yml:
- Around line 7-12: Add the cli/internal/codeqlfixtures/** pattern to the CodeQL
paths-ignore list so the intentionally vulnerable Go fixtures are excluded from
scans; update the paths-ignore block (the existing paths-ignore entry containing
'tests/**', 'web/__mocks__/**', etc.) to include
'cli/internal/codeqlfixtures/**' ensuring the pattern matches that directory and
its contents.

In `@CLAUDE.md`:
- Line 19: The CLAUDE.md sentence forces only custom Sanitisers.qll path; change
it to permit either a sanitiser subclass under Sanitisers.qll OR Models-as-Data
barrierModel rows when the upstream query exposes a barrierNode hook, and
clarify that the barrierModel mechanism is acceptable and equivalent to the
Sanitisers.qll requirement; reference Sanitisers.qll, barrierModel, and
barrierNode in the updated wording so reviewers can verify both binding
mechanisms are allowed.

In `@docs/reference/sec-prompt-safety.md`:
- Around line 160-168: The doc section currently contains hard-coded numeric
claims; replace those explicit numerics with runtime-stats markers using the
exact syntax <!--RS:NAME--> (or rephrase to avoid numeric claims), update the
corresponding NAME key in data/runtime_stats.yaml if missing, and ensure the
README.md/docs/**/*.md files follow the guideline by referencing the same
<!--RS:...--> marker; locate the text in sec-prompt-safety.md and swap the
numerics for the marker(s) and verify the marker name exists in
data/runtime_stats.yaml and is documented in README.md per the project's coding
guidelines.

In `@scripts/run_codeql_pack_validation.sh`:
- Around line 81-102: The validation only runs Python fixtures
(PYTHON_CUSTOM_QUERIES + run_fixture calls for python-negative and
python-positive); add equivalent runs for Go and JavaScript/TypeScript fixture
sets (e.g., invoke run_fixture with go-negative/go-positive and
js-negative/js-positive using appropriate CodeQL suites and any custom queries)
or update the harness to iterate expected.json to drive run_fixture for every
declared fixture set; update the script around the existing run_fixture calls so
run_fixture is invoked for Go and JS/TS fixtures (or replace the hardcoded
python calls with a loop over expected.json) and ensure rule-filtering behavior
mirrors the Python path.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: d2aefa2b-007c-48a2-bd2c-496095557c41

📥 Commits

Reviewing files that changed from the base of the PR and between 04a9fac and aae666b.

📒 Files selected for processing (20)

.github/codeql/codeql-config.yml
.github/codeql/fixtures/expected.json
.github/codeql/fixtures/python/negative.py
.github/codeql/fixtures/python/positive.py
.github/codeql/queries/synthorg-cleartext-logging/CleartextLoggingWithBarriers.ql
.github/codeql/queries/synthorg-cleartext-logging/Sanitisers.qll
.github/codeql/queries/synthorg-cleartext-logging/qlpack.yml
.github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql
.github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll
.github/codeql/queries/synthorg-partial-ssrf/qlpack.yml
.github/workflows/codeql-pack-validate.yml
.github/workflows/codeql.yml
CLAUDE.md
data/runtime_stats.yaml
docs/reference/convention-gates.md
docs/reference/sec-prompt-safety.md
pyproject.toml
scripts/check_codeql_fixtures.py
scripts/convention_gate_map.yaml
scripts/run_codeql_pack_validation.sh

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)

GitHub Check: Build Backend
GitHub Check: Build Fine-Tune (cpu, fine-tune-cpu)
GitHub Check: Build Fine-Tune (gpu, fine-tune-gpu)
GitHub Check: Build Web Assets (melange)
GitHub Check: Dashboard Test
GitHub Check: Test (Python 3.14)
GitHub Check: Lighthouse Site
GitHub Check: CodSpeed Python benchmarks
GitHub Check: Analyze (python)
GitHub Check: Validate sanitiser pack
GitHub Check: Analyze (python)

🧰 Additional context used

📓 Path-based instructions (4)

.github/codeql/**/*.qll

📄 CodeRabbit inference engine (CLAUDE.md)

Any change to project sanitiser requires subclass under .github/codeql/queries/synthorg-<rule>/Sanitisers.qll plus matching custom .ql and fixture pair under .github/codeql/fixtures/ per sec-prompt-safety.md

Files:

.github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll
.github/codeql/queries/synthorg-cleartext-logging/Sanitisers.qll

**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

No from __future__ import annotations (Python 3.14 has PEP 649). Use PEP 758 except for except A, B: which needs parens when binding

Files:

scripts/check_codeql_fixtures.py

{README.md,docs/**/*.md}

📄 CodeRabbit inference engine (CLAUDE.md)

Numerics in README + public docs sourced from data/runtime_stats.yaml via  markers per data/README.md

Files:

docs/reference/sec-prompt-safety.md
docs/reference/convention-gates.md

docs/**/*.md

📄 CodeRabbit inference engine (CLAUDE.md)

Use d2 for architecture / nested containers; use mermaid for flowcharts / sequence / pipelines; markdown tables for tabular data; D2 theme 200 (Dark Mauve), D2 CLI pinned to v0.7.1 in CI

Files:

docs/reference/sec-prompt-safety.md
docs/reference/convention-gates.md

🧠 Learnings (2)

📓 Common learnings

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T15:24:52.675Z
Learning: Read `docs/design/` page before implementing; deviations need approval per DESIGN_SPEC.md

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T15:24:52.675Z
Learning: Present every plan for accept/deny before coding

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T15:24:52.675Z
Learning: No region/currency/locale privileged; use metric units; use British English per `docs/reference/regional-defaults.md`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T15:24:52.675Z
Learning: Every convention PR ships its enforcement gate per `docs/reference/convention-gates.md`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T15:24:52.675Z
Learning: Timeout/slow test failures indicate source-code regression; never edit `tests/baselines/unit_timing.json`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T15:24:52.675Z
Learning: After issue: branch + commit + push (no auto-PR); use `/pre-pr-review`; use `/aurelio-review-pr` for external feedback after PR; fix EVERYTHING valid; no deferring

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T15:24:52.675Z
Learning: xdist `-n 8 --dist=loadfile` auto-applied via pyproject `addopts` (`loadfile` prevents 3.14+Windows ProactorEventLoop leak)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T15:24:52.675Z
Learning: Git commits: `<type>: <description>` (feat/fix/refactor/docs/test/chore/perf/ci); commitizen-enforced

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T15:24:52.675Z
Learning: Signed commits required on protected refs (GPG/SSH or GitHub App via `synthorg-repo-bot`)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T15:24:52.675Z
Learning: Git branches: `<type>/<slug>` from main

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T15:24:52.675Z
Learning: Squash merge with PR body as squash commit; trailers (`Release-As`, `Closes `#N``) must be in PR body

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T15:24:52.675Z
Learning: After every squash merge → `/post-merge-cleanup`

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T15:24:52.675Z
Learning: CLI is Docker-only (init/start/stop/status); features go in dashboard + REST API

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T15:24:52.675Z
Learning: Web: see `web/CLAUDE.md`; CLI: see `cli/CLAUDE.md` (use `go -C cli`, never `cd cli`); Shell: see `~/.claude/rules/common/bash.md` (canonical for `cd` / `git -C` / Bash file-write rules)

Learnt from: CR
Repo: Aureliolo/synthorg

Timestamp: 2026-05-09T15:24:52.675Z
Learning: Use `uv sync` for dependency management and all project commands via UV

📚 Learning: 2026-05-05T09:04:46.195Z

Learnt from: Aureliolo
Repo: Aureliolo/synthorg PR: 1760
File: scripts/_dual_backend_parity_lib.py:215-216
Timestamp: 2026-05-05T09:04:46.195Z
Learning: This repository targets Python 3.14+ and follows PEP 758. Therefore, reviewer tooling should NOT treat unparenthesized multi-exception `except` clauses written without an `as` clause (e.g., `except MemoryError, RecursionError:`) as syntax errors. Only flag `except`-clause problems when they are genuinely invalid for Python 3.14+.

Applied to files:

scripts/check_codeql_fixtures.py

🪛 LanguageTool

CLAUDE.md

[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...le) requires a sanitiser subclass under `.github/codeql/queries/synthorg-/Sanitise...

(GITHUB)

[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ...g custom .ql and a fixture pair under .github/codeql/fixtures/. Enforced by `.github...

(GITHUB)

[uncategorized] ~19-~19: The official name of this software platform is spelled with a capital “H”.
Context: ....github/codeql/fixtures/. Enforced by .github/workflows/codeql-pack-validate.yml + `...

(GITHUB)

docs/reference/sec-prompt-safety.md

[uncategorized] ~151-~151: The official name of this software platform is spelled with a capital “H”.
Context: ...tiser helpers via custom query packs at .github/codeql/queries/synthorg-*/. Each pack ...

(GITHUB)

[uncategorized] ~151-~151: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...ass extends the upstream abstract base (CleartextLogging::Sanitizer, ServerSideRequestForgery::Sanitizer...

(EN_WORD_COHERENCY)

[uncategorized] ~151-~151: Do not mix variants of the same word (‘sanitizer’ and ‘sanitiser’) within a single text.
Context: ...ct base (CleartextLogging::Sanitizer, ServerSideRequestForgery::Sanitizer); the upstream base uses American spel...

(EN_WORD_COHERENCY)

[uncategorized] ~151-~151: The official name of this software platform is spelled with a capital “H”.
Context: ... ID. The bindings are provably precise: .github/workflows/codeql-pack-validate.yml run...

(GITHUB)

[uncategorized] ~160-~160: The official name of this software platform is spelled with a capital “H”.
Context: ... itself, (2) a sanitiser subclass under `.github/codeql/queries/synthorg-/Sanitise...

(GITHUB)

[uncategorized] ~172-~172: The official name of this software platform is spelled with a capital “H”.
Context: ... paths-ignore: ['tests/**'] clause in .github/codeql/codeql-config.yml. Every histor...

(GITHUB)

🔇 Additional comments (8)

pyproject.toml (1)

270-275: Scoped Ruff ignores for CodeQL fixtures look correct.

The new per-file ignore target is narrowly scoped and appropriate for intentionally unsafe fixture patterns.

.github/codeql/queries/synthorg-cleartext-logging/CleartextLoggingWithBarriers.ql (1)

29-36: Query wiring and result projection look solid.

Flow, classification extraction, and path-problem reporting are consistent with the custom-barrier rule intent.

.github/codeql/queries/synthorg-cleartext-logging/qlpack.yml (1)

1-5: Pack manifest is correctly structured for a non-library query pack.

.github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql (1)

26-34: Partial-SSRF query logic is well-formed.

Source/sink/request binding and the fullyControlledRequest exclusion are correctly composed for this variant.

.github/codeql/queries/synthorg-partial-ssrf/qlpack.yml (1)

1-5: Pack declaration is clean and consistent with the custom query setup.

.github/codeql/fixtures/python/negative.py (1)

24-52: Negative fixtures are well-targeted for barrier validation.

Both functions cleanly exercise the intended “must not fire” paths for cleartext logging and partial SSRF.
.github/codeql/queries/synthorg-cleartext-logging/Sanitisers.qll (1)
27-37: ⚡ Quick win

The current sanitizer implementation is correct and already covers both import patterns used throughout the codebase (from synthorg.observability.redaction import ... and from synthorg.observability import ...). The chained getMember() pattern is the standard CodeQL idiom used consistently across this repository's sanitizers. The proposed fully-qualified moduleImport() paths do not appear in any existing CodeQL examples in the codebase and do not represent a valid improvement to the current working implementation.
			> Likely an incorrect or invalid review comment.
.github/codeql/fixtures/python/positive.py (1)

15-35: Positive fixture intent is clear and correctly implemented.

The deliberate unsanitised paths are appropriate for must_fire validation and make regression signals explicit.

coderabbitai · 2026-05-09T15:29:30Z

+    {
+      "name": "python-negative",
+      "language": "python",
+      "source_root": ".",
+      "scan_paths": [".github/codeql/fixtures/python/negative.py"],
+      "must_not_fire": [
+        "synthorg/clear-text-logging-sensitive-data",
+        "synthorg/partial-ssrf"
+      ]
+    },
+    {
+      "name": "python-positive",
+      "language": "python",
+      "source_root": ".",
+      "scan_paths": [".github/codeql/fixtures/python/positive.py"],
+      "must_fire": [
+        "synthorg/clear-text-logging-sensitive-data",
+        "synthorg/partial-ssrf"
+      ]
+    }
+  ]


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Fixture expectation coverage is Python-only.

The manifest currently validates only Python fixtures, so JS/TS and Go sanitiser modelling can regress without failing this gate. Please add fixture entries for JavaScript/TypeScript and Go to keep validation aligned with the multi-language pack objectives.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/codeql/fixtures/expected.json around lines 4 - 24, The manifest only lists Python fixtures ("python-negative" and "python-positive"), so add equivalent fixture objects for JavaScript/TypeScript and Go to ensure those languages are validated; create negative and positive entries for JS/TS (e.g., "js-negative"/"js-positive" or "javascript-negative"/"javascript-positive") and for Go (e.g., "go-negative"/"go-positive") that mirror the Python structure, include appropriate "scan_paths" pointing to the JS/TS and Go fixture files, and set "must_fire" / "must_not_fire" arrays to the same rule IDs ("synthorg/clear-text-logging-sensitive-data", "synthorg/partial-ssrf") so the pack validates multi-language sanitiser modelling.

coderabbitai · 2026-05-09T15:29:30Z

+    this =
+      API::moduleImport("scripts")
+          .getMember("check_image_signatures")
+          .getMember(["_validate_repo_prefix", "_validate_image_tag"])
+          .getACall()
+  }


🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Verify import style used by Python callsites and compare with current QL API binding. # Expected: callsites use scripts.check_image_signatures directly. rg -n --type=py 'from\s+scripts\.check_image_signatures\s+import|import\s+scripts\.check_image_signatures' rg -n --type=ql 'moduleImport\("scripts"\)\s*\.getMember\("check_image_signatures"\)|moduleImport\("scripts\.check_image_signatures"\)' \ .github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll

Repository: Aureliolo/synthorg

Length of output: 89

🏁 Script executed:

#!/bin/bash # Search for Python imports of scripts.check_image_signatures echo "=== Python imports of scripts.check_image_signatures ===" rg -n --type py 'scripts\.check_image_signatures' 2>/dev/null || echo "No Python matches found" echo -e "\n=== Python imports using 'from scripts' ===" rg -n --type py 'from\s+scripts' 2>/dev/null | head -20 echo -e "\n=== Examining the actual Sanitisers.qll file ===" cat -n .github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll 2>/dev/null | head -50 echo -e "\n=== Search for other moduleImport patterns in .qll files ===" find .github/codeql -name '*.qll' -exec grep -l "moduleImport" {} \; 2>/dev/null echo -e "\n=== Show moduleImport patterns from found files ===" find .github/codeql -name '*.qll' -exec grep -n "API::moduleImport" {} + 2>/dev/null | head -30

Repository: Aureliolo/synthorg

Length of output: 4801

🏁 Script executed:

#!/bin/bash echo "=== Search for check_image_signatures definitions and usage ===" rg -n "check_image_signatures" --type py echo -e "\n=== Fixture files for partial-ssrf query ===" find .github/codeql/fixtures -name "*ssrf*" -o -name "*partial*" 2>/dev/null | head -20 echo -e "\n=== Check if there's a check_image_signatures.py file ===" find . -name "check_image_signatures.py" 2>/dev/null echo -e "\n=== Scripts directory structure ===" ls -la scripts/ 2>/dev/null | grep -i "check\|image" || echo "Listing scripts directory..." ls -la scripts/ 2>/dev/null | head -20 echo -e "\n=== Examine the partial-ssrf query file ===" cat -n .github/codeql/queries/synthorg-partial-ssrf/PartialServerSideRequestForgeryWithBarriers.ql 2>/dev/null | head -50

Repository: Aureliolo/synthorg

Length of output: 8648

🏁 Script executed:

#!/bin/bash echo "=== Contents of check_image_signatures.py ===" cat -n scripts/check_image_signatures.py | head -100 echo -e "\n=== Search for _validate_repo_prefix and _validate_image_tag ===" rg -n "_validate_repo_prefix|_validate_image_tag" scripts/check_image_signatures.py

Repository: Aureliolo/synthorg

Length of output: 4924

🏁 Script executed:

#!/bin/bash echo "=== Search for actual imports of check_image_signatures ===" rg -n "from.*check_image_signatures|import.*check_image_signatures" echo -e "\n=== Search for direct calls to _validate_repo_prefix or _validate_image_tag ===" rg -n "_validate_repo_prefix|_validate_image_tag" --type py echo -e "\n=== Check if scripts is a Python package ===" ls -la scripts/__init__.py 2>/dev/null || echo "No __init__.py found - scripts is not a package" echo -e "\n=== Look for fixture files ===" find .github/codeql/fixtures -type f -name "*.py" | head -10

Repository: Aureliolo/synthorg

Length of output: 1677

🏁 Script executed:

#!/bin/bash echo "=== Contents of positive.py fixture ===" cat -n .github/codeql/fixtures/python/positive.py echo -e "\n=== Contents of negative.py fixture ===" cat -n .github/codeql/fixtures/python/negative.py echo -e "\n=== Check for partial-ssrf specific fixtures ===" find .github/codeql/fixtures -type f | xargs grep -l "partial.*ssrf\|check_image_signatures" 2>/dev/null

Repository: Aureliolo/synthorg

Length of output: 3981

🏁 Script executed:

#!/bin/bash echo "=== Examine the cleartext-logging Sanitisers.qll for comparison ===" cat -n .github/codeql/queries/synthorg-cleartext-logging/Sanitisers.qll echo -e "\n=== Check the fixture for cleartext-logging ===" find .github/codeql/fixtures -name "*.py" -exec grep -l "synthorg.observability.redaction\|safe_error_description" {} \; echo -e "\n=== Verify how safe_error_description is imported ===" rg -n "from.*redaction|import.*safe_error_description" --type py

Repository: Aureliolo/synthorg

Length of output: 50374

Consider using fully-qualified module import for defensive binding.

At Line 31, the binding uses a chained member access pattern API::moduleImport("scripts").getMember("check_image_signatures"). While this matches the actual import style in the fixture (from scripts.check_image_signatures import ...), using a fully-qualified module path would be more defensive against future code patterns where someone might import scripts and access check_image_signatures as a member.

The synthorg-cleartext-logging Sanitisers.qll demonstrates this defensive approach by binding both chained and direct patterns separately. If you want stronger future-proofing:

Proposed fix

- API::moduleImport("scripts") - .getMember("check_image_signatures") + API::moduleImport("scripts.check_image_signatures") .getMember(["_validate_repo_prefix", "_validate_image_tag"]) .getACall()

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

this =

API::moduleImport("scripts")

.getMember("check_image_signatures")

.getMember(["_validate_repo_prefix", "_validate_image_tag"])

.getACall()

}

this =

API::moduleImport("scripts.check_image_signatures")

.getMember(["_validate_repo_prefix", "_validate_image_tag"])

.getACall()

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In @.github/codeql/queries/synthorg-partial-ssrf/Sanitisers.qll around lines 30 - 35, Update the binding to also match a fully-qualified import of the target module: add an alternate pattern that uses API::moduleImport("scripts.check_image_signatures") with getMember(["_validate_repo_prefix", "_validate_image_tag"]).getACall() (in addition to the existing API::moduleImport("scripts").getMember("check_image_signatures") chain) so both chained imports and direct moduleImport("scripts.check_image_signatures") cases are covered for the functions _validate_repo_prefix and _validate_image_tag.

Aureliolo · 2026-05-09T15:29:46Z

Closing without merging.

After 5 rounds of fixes, the architecture turned out to cost more than it delivered:

Original scope: model 9 project sanitisers across Python, JS, Go via CodeQL Models-as-Data barrierModel extension packs.
Architectural blocker: CodeQL CLI's source-root scan validates every YAML row's arity against the current language's signature regardless of addsTo.pack. Python/JS take 3 user-cols, Go takes 9. They cannot coexist in one repo's extension surface — every cross-language analysis collides with wrong-language rows.
After per-language pack split + dropping top-level packs: + every other reasonable mitigation, the cross-language collision persisted.
Stripped Models-as-Data entirely (round 5): scope reduced from 9 to 4 sanitisers (only the cleartext-logging + partial-ssrf custom QL query packs survived).
Final blocker: GitHub repo has CodeQL default setup enabled, which is mutually exclusive with the advanced setup workflow this PR ships. Would require a manual repo-settings change to land any of the remaining work.

Net: substantial implementation effort, marginal residual value, and the surviving 4 sanitisers only matter if the advanced setup is enabled (which itself requires manual disabling of default setup).

If there's ever a concrete reason to switch from default to advanced CodeQL setup, the architecture decisions and dead-ends explored here are documented in the babysit-pr round history at _audit/babysit-pr/1839.state.json for the next attempt.

Aureliolo temporarily deployed to atlas May 9, 2026 11:52 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 9, 2026 11:53 — with GitHub Actions Inactive

gemini-code-assist Bot reviewed May 9, 2026

View reviewed changes

coderabbitai Bot requested changes May 9, 2026

View reviewed changes

Aureliolo temporarily deployed to atlas May 9, 2026 13:10 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 9, 2026 13:12 — with GitHub Actions Inactive

coderabbitai Bot requested changes May 9, 2026

View reviewed changes

Comment thread .github/codeql/extensions/synthorg-sanitisers/python-sanitisers.model.yml Outdated

Comment thread scripts/run_codeql_pack_validation.sh Outdated

Aureliolo force-pushed the chore/issue-1836-codeql-advanced branch from 3049fad to ee744f1 Compare May 9, 2026 14:10

Aureliolo temporarily deployed to atlas May 9, 2026 14:10 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 9, 2026 14:12 — with GitHub Actions Inactive

coderabbitai Bot requested changes May 9, 2026

View reviewed changes

Comment thread .github/codeql/fixtures/javascript/negative.ts Outdated

Comment thread .github/workflows/codeql.yml Outdated

Comment thread .github/workflows/codeql.yml

Aureliolo temporarily deployed to atlas May 9, 2026 14:26 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 9, 2026 14:27 — with GitHub Actions Inactive

coderabbitai Bot requested changes May 9, 2026

View reviewed changes

Comment thread CLAUDE.md Outdated

Aureliolo force-pushed the chore/issue-1836-codeql-advanced branch from f01f6c2 to 04a9fac Compare May 9, 2026 14:55

Aureliolo temporarily deployed to atlas May 9, 2026 14:55 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 9, 2026 14:56 — with GitHub Actions Inactive

coderabbitai Bot requested changes May 9, 2026

View reviewed changes

Aureliolo added 7 commits May 9, 2026 17:20

feat: add CodeQL advanced setup + Models-as-Data sanitiser pack (#1836)

9b89217

fix: address pre-PR review findings on #1836 (9 items)

7001ba8

fix: babysit round 1, 7 findings (sanitizerModel switch + codeql PATH…

e16beb8

… discovery + per-variant SARIF analysis + _resolve_root fixture call)

fix: babysit round 2, switch to barrierModel + add custom QL packs fo…

cd4347f

…r queries lacking MaD hooks (cleartext-logging, partial-ssrf), drop stale _resolve_root entry, fix shellcheck directive

fix: babysit round 3, split sanitiser pack into per-language packs (a…

5f0343c

…void go 9-col rows tripping js/py 3-col signature check), remove duplicate codeql language matrix entries + workflow-level queries override, fix sanitizeWsEnum fixture to flow into a sink

fix: babysit round 4, drop redundant top-level packs from codeql-conf…

377368e

…ig (was forcing cross-language barrierModel signature collisions), rename Sanitizers.qll to British Sanitisers.qll

fix: babysit round 5, strip MaD extension packs (cross-language barri…

aae666b

…erModel arity collision blocks every analysis); keep custom QL query packs (cleartext-logging + partial-ssrf) which work via Sanitiser subclass mechanism; defer remaining barriers to follow-up

Aureliolo force-pushed the chore/issue-1836-codeql-advanced branch from 04a9fac to aae666b Compare May 9, 2026 15:23

Aureliolo temporarily deployed to atlas May 9, 2026 15:24 — with GitHub Actions Inactive

Aureliolo temporarily deployed to cloudflare-preview May 9, 2026 15:25 — with GitHub Actions Inactive

github-advanced-security AI found potential problems May 9, 2026

View reviewed changes

coderabbitai Bot requested changes May 9, 2026

View reviewed changes

Aureliolo closed this May 9, 2026

Aureliolo temporarily deployed to cloudflare-preview May 9, 2026 15:29 — with GitHub Actions Inactive

Aureliolo mentioned this pull request May 9, 2026

Adopt CodeQL advanced setup + Models-as-Data pack for project sanitisers #1836

Closed

Aureliolo deleted the chore/issue-1836-codeql-advanced branch May 9, 2026 15:41

		import { sanitizeForLog } from '../../../../web/src/utils/logging'
		import { sanitizeWsString, sanitizeWsEnum } from '../../../../web/src/utils/ws-sanitize'

Conversation

Aureliolo commented May 9, 2026

Why

What changed

Issue FP-table coverage

Test plan

Deploy checklist (BLOCKING — manual repo-settings step)

Review coverage

Out of scope

Uh oh!

github-actions Bot commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Dependency Review

OpenSSF Scorecard

Scanned Files

Uh oh!

coderabbitai Bot commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

codspeed-hq Bot commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will not alter performance

Footnotes

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 9, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

codecov Bot commented May 9, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

github-actions Bot commented May 9, 2026 •

edited

Loading

coderabbitai Bot commented May 9, 2026 •

edited

Loading

codspeed-hq Bot commented May 9, 2026 •

edited

Loading

codecov Bot commented May 9, 2026 •

edited

Loading