Skip to content

tools: ast-grep: add static analysis rules#502

Draft
pzmarzly wants to merge 17 commits intofacebook:mainfrom
pzmarzly:push-urpmooyzuslo
Draft

tools: ast-grep: add static analysis rules#502
pzmarzly wants to merge 17 commits intofacebook:mainfrom
pzmarzly:push-urpmooyzuslo

Conversation

@pzmarzly
Copy link
Copy Markdown
Contributor

@pzmarzly pzmarzly commented Apr 8, 2026

If https://ast-grep.github.io/ is installed (e.g. via pip install ast-grep-cli), we can use it to enforce many of our coding standards. See #510.

In this PR, rules with existing violations were introduced with severity warning, and have a known_issues file that silence known occurrances (otherwise, we'd get a ton of spam on every CI run). As we refactor our codebase, we should change all of them to errors, and remove the ignore_known_issues mechanism.

Test

ast-grep scan --config tools/ast-grep/sgconfig.yml src/bfcli/ src/libbpfilter/
Output: https://gist.github.com/pzmarzly/2e7d84b21a3202d1e87710f39c2d3857

List of rules

  • [warning] assert-pointer-params — Functions with pointer parameters must assert() them at entry.
  • bf-prefix-public-functions — Non-static functions must use the bf_/bfc_ prefix (or _bf_/_bfc_ when static).
  • double-pointer-output — Constructor bf_*_new() must take a TYPE ** output parameter for ownership transfer.
  • [warning] doxygen-prefer-backticks — Prefer `name` over @p name, @ref name, @c name in Doxygen comments.
  • [warning] doxygen-public-functions — Public function declarations in headers must have a preceding /** @brief ... */ comment.
  • enum-bf-prefix — Named enums must use bf_/bfc_ prefix.
  • enum-sentinel-format — Enums must end with a _BF_*_MAX / _BFC_*_MAX sentinel (with listed exceptions for bitmasks and kernel ABI mirrors).
  • fd-init-minus-one — Variables declared with _cleanup_close_ must be initialized to -1 to avoid closing fd 0 on error paths.
  • free-function-patternbf_*_free() / _bf_*_free() must return void and accept a double-pointer parameter.
  • negative-errno-returns — Error returns must be negated (-ENOMEM, not ENOMEM).
  • [warning] no-direct-free — Disallows raw free(); use freep() or a typed bf_*_free() cleanup function instead (with exempt callers).
  • no-fprintf-stderr — Use logging macros (bf_err, bf_warn, …) instead of fprintf(stderr, …) / vfprintf(stderr, …).
  • no-ifndef-guards — Headers must use #pragma once, not #ifndef/#define include guards.
  • [warning] single-line-comment-style — Single-line comments must use //, reserving /* */ for multi-line (SPDX excluded).
  • spdx-license-header — Every source file must contain an SPDX-License-Identifier comment.
  • struct-bf-prefix — Struct definitions (with a body) must use the bf_/bfc_ prefix; external type references are ignored.

@pzmarzly pzmarzly requested a review from qdeslandes as a code owner April 8, 2026 14:27
@meta-cla meta-cla Bot added the cla signed label Apr 8, 2026
@pzmarzly pzmarzly marked this pull request as draft April 8, 2026 14:27
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Apr 8, 2026

Claude review of PR #502 (e3f4fe0)

Suggestions

  • CMake undefined SG_PYYAML_RCtests/check/CMakeLists.txt:70 — variable is uninitialized when ast-grep is not found, relying on fragile CMake coercion (fixed: now pre-initialized to 1)
  • Missing bfc_ in exemption regexestools/ast-grep/rules/no-direct-free.yml:26 — exemption regex omits bfc_ prefixed cleanup functions (fixed: bfc_ patterns now included)
  • JSON parse error handlingtools/ast-grep/ignore_known_issues.py:79json.loads has no error handling for malformed input; a non-JSON diagnostic line from ast-grep would crash the script with an opaque traceback
  • Unpinned ast-grep version in CI.github/workflows/ci.yaml:116pipx install --global ast-grep-cli without a version pin; a breaking upstream change to JSON schema would silently break CI
  • CI runtime installation.github/workflows/ci.yaml:115 — consider baking ast-grep and python3-pyyaml into the Fedora Dockerfile (matching how other build deps are handled) rather than installing on every run
  • assert-pointer-params exemption synctools/ast-grep/rules/assert-pointer-params.yml:37freep is in bf-prefix-public-functions exclusion but not here; however freep is static inline so it never matches the rule anyway (non-issue)
  • negative-errno-returns regex gapstools/ast-grep/rules/negative-errno-returns.yml:10 — regex ^E[A-Z0-9]{2,}$ is broader than just errno names, but no false positives currently exist in the codebase

Nits

  • Dangling reference in notetools/ast-grep/rules/negative-errno-returns.yml:7 — note says "See error-handling standard" but no such document exists; should reference doc/developers/style.rst
  • Incomplete test coverage for no-fprintf-stderrtools/ast-grep/rule-tests/no-fprintf-stderr-test.yml:23 — missing vfprintf invalid case and bf_logger_* valid case
  • README claim about error-severity exit behaviortools/ast-grep/known_issues/README.md:19 — states ast-grep scan exits non-zero on error matches, but --json=stream mode may exit 0 regardless; verify and update
  • no-fprintf-stderr brittle exclusiontools/ast-grep/rules/no-fprintf-stderr.yml:17 — exclusion regex enumerates specific function prefixes (acceptable given the small set)
  • YAML note field style — cosmetic inconsistency in note field quoting style (acceptable given YAML equivalence)
  • enum-sentinel exclusion docstools/ast-grep/rules/enum-sentinel-format.yml:33 — now has inline comment documenting exclusion criteria (fixed)

Workflow run

Comment thread tools/ast-grep/rules/enum-sentinel-format.yml
Comment thread tools/ast-grep/rules/no-direct-free.yml Outdated
Comment thread tools/ast-grep/rules/assert-pointer-params.yml
Comment thread tests/check/CMakeLists.txt Outdated
Comment thread tools/ast-grep/rules/doxygen-public-functions.yml
Comment thread tools/ast-grep/rules/doxygen-prefer-backticks.yml Outdated
Comment thread tools/ast-grep/rule-tests/enum-sentinel-format-test.yml Outdated
Comment thread tools/ast-grep/rules/no-fprintf-stderr.yml Outdated
Comment thread tools/ast-grep/rules/spdx-license-header.yml
Comment thread tools/ast-grep/rules/free-function-pattern.yml Outdated
Comment thread tools/ast-grep/rule-tests/assert-pointer-params-test.yml
Comment thread tools/ast-grep/rule-tests/no-direct-free-test.yml
Comment thread tools/ast-grep/rules/bf-prefix-public-functions.yml Outdated
@pzmarzly pzmarzly force-pushed the push-urpmooyzuslo branch 3 times, most recently from 755c27b to d4f68d2 Compare April 17, 2026 18:55
@yaakov-stein
Copy link
Copy Markdown
Contributor

I have not really looked over the code at all, but one thought - would it be easier (for development, review, and not having an overbearing amount of warnings) to start with a single rule that we can begin warning on? After we've seen how that goes, we can begin adding more at a quicker pace.

@pzmarzly pzmarzly force-pushed the push-urpmooyzuslo branch from d4f68d2 to 02608c0 Compare April 21, 2026 13:45
Comment thread tools/ast-grep/rule-tests/__snapshots__/enum-sentinel-format-snapshot.yml Outdated
Comment thread tools/ast-grep/rule-tests/__snapshots__/doxygen-public-functions-snapshot.yml Outdated
Comment thread .github/workflows/ci.yaml Outdated
run: mount bpffs /sys/fs/bpf -t bpf
- name: Install ast-grep
run: |
dnf install -y pipx
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: pipx install --global ast-grep-cli is unpinned. A new upstream release could change rule semantics or introduce incompatibilities, breaking CI unexpectedly. Consider pinning to a specific version (e.g., pipx install --global ast-grep-cli==0.37.0) and bumping deliberately when needed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's better to fix it when it happens, than to let the pinned version rot.

Comment thread tools/ast-grep/rules/negative-errno-returns.yml Outdated
Comment thread tools/ast-grep/rules/double-pointer-output.yml Outdated
Comment thread tests/check/CMakeLists.txt Outdated
Comment thread tools/ast-grep/rules/enum-sentinel-format.yml Outdated
Comment thread tools/ast-grep/rules/single-line-comment-style.yml
raw = raw.strip()
if not raw:
continue
d = json.loads(raw)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: json.loads(raw) will raise JSONDecodeError on malformed input (e.g., a truncated stream from SIGPIPE or an ast-grep version mismatch mixing diagnostics into the JSON stream). The resulting Python traceback would not indicate which line failed or what the raw content was. Consider wrapping in a try/except json.JSONDecodeError that prints a descriptive message to stderr before re-raising, so CI failures from corrupted streams are easier to triage.

kind: function_declarator
has:
kind: identifier
regex: "^(_bf_print_insn|bf_logger_).*"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: The exclusion regex ^(_bf_print_insn|bf_logger_).* enumerates a single specific function name rather than a pattern. If another function legitimately needs fprintf(stderr, ...) in the future (e.g., a debug dumper), the rule requires a source edit. Consider generalizing to ^(_bf_print_|bf_logger_).* or adding a comment explaining the maintenance expectation.

language: c
severity: error
message: Error returns must be negative (e.g. -ENOMEM, not ENOMEM)
note: "bpfilter uses negative errno convention: return -ENOMEM, not ENOMEM. See error-handling standard."
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: The note field uses a quoted string (note: "...") while most other rules in this PR use >- folded scalar style. The same inconsistency appears in no-fprintf-stderr.yml (line 5) and no-ifndef-guards.yml (line 7). Consider using >- for consistency:

note: >-
  bpfilter uses negative errno convention: return -ENOMEM, not ENOMEM.
  See error-handling standard.

- not:
has:
kind: type_identifier
regex: "^(bf_color|bf_style|bf_nf_inet_hooks|bf_bpf_cmd|bf_bpf_prog_type|bf_bpf_attach_type|bf_bpf_map_type|bf_counter_type)$"
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: The exclusion regex lists 8 specific enum names. Any new kernel-mirror or bitmask enum added to the codebase requires a manual rule edit to this regex. Consider adding a YAML comment above the regex documenting the criteria for inclusion ("bitmask enums and kernel ABI mirrors that intentionally omit sentinels"), so the maintenance contract is clear.

@pzmarzly pzmarzly marked this pull request as draft April 22, 2026 16:03
@pzmarzly pzmarzly force-pushed the push-urpmooyzuslo branch from 49b4b76 to e3f4fe0 Compare May 5, 2026 17:03
@pzmarzly
Copy link
Copy Markdown
Contributor Author

pzmarzly commented May 5, 2026

@claude review

raw = raw.strip()
if not raw:
continue
d = json.loads(raw)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: json.loads(raw) has no error handling. If ast-grep ever writes a non-JSON diagnostic line to stdout (e.g., on a version upgrade or internal warning), the script crashes with an opaque JSONDecodeError traceback. Consider wrapping in try/except:

try:
    d = json.loads(raw)
except json.JSONDecodeError:
    sys.stderr.write(f"warning: skipping non-JSON line: {raw!r}\n")
    continue

Comment thread .github/workflows/ci.yaml
- name: Install ast-grep
run: |
dnf install -y pipx python3-pyyaml
pipx install --global ast-grep-cli
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: pipx install --global ast-grep-cli installs an unpinned version. A breaking upstream change to the JSON stream schema would silently break CI. Consider pinning to a known-good version (e.g., pipx install --global ast-grep-cli==X.Y.Z) and documenting the minimum supported version.

Comment thread .github/workflows/ci.yaml
run: mount bpffs /sys/fs/bpf -t bpf
- name: Install ast-grep
run: |
dnf install -y pipx python3-pyyaml
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: suggestion: Consider baking pipx, python3-pyyaml, and ast-grep-cli into the Fedora Dockerfile rather than installing at CI runtime. Other build dependencies (clang-format, cmocka, etc.) are already pre-installed in the container image. Installing at runtime adds network latency and a failure point on every run.

message: Error returns must be negative (e.g. -ENOMEM, not ENOMEM)
note: >-
bpfilter uses negative errno convention: return -ENOMEM, not ENOMEM.
See error-handling standard.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: The note says "See error-handling standard" but no document with that exact title exists. The relevant reference is the "Error codes" section in doc/developers/style.rst. Consider: "See doc/developers/style.rst, Error codes section."

void bf_handle(void)
{
fprintf(stderr, "error: %s", msg);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: The test only exercises fprintf(stderr, ...) in the invalid cases but the rule also matches vfprintf(stderr, $$$). Similarly, valid cases test the _bf_print_* exclusion but not the bf_logger_* exclusion. Adding test cases for both branches would improve coverage of the rule.

See `../ignore_known_issues.py` for the filter implementation.

## Severity interaction

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude: nit: The README states that ast-grep scan exits non-zero on severity: error matches, causing pipeline failure before the filter runs. However, ast-grep scan --json=stream typically exits 0 regardless of findings (JSON output modes are designed for piping to tools). If this is incorrect, the documented guarantee about error-severity rules being zero-tolerance may not hold. Please verify the actual exit behavior and update accordingly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants