
Skill reviewer enhancements#38

Merged
amir-yogev-gh merged 6 commits into flightctl:main from amir-yogev-gh:skill-reviewer-enhancements
May 6, 2026

Conversation


@amir-yogev-gh amir-yogev-gh commented May 5, 2026

This pull request enhances the documentation and structure of the skill-reviewer workflow to clarify the use of deterministic scripts and prompt templates, and to better separate automated and judgment-based review steps. It introduces new directories for scripts and prompts, updates workflow documentation to reflect these additions, and details how automated checks and sub-agent delegation work in the review process.

Key updates to workflow structure and documentation:

Workflow directory structure and conventions:

  • Added explicit scripts/ and prompts/ directories to the recommended workflow layout in AGENTS.md, explaining their roles in deterministic operations and sub-agent delegation.
  • Updated CONTRIBUTING.md to document conventions for scripts/ (deterministic, code-based checks) and prompts/ (templates for delegated sub-agent tasks), including usage patterns and language requirements.

skill-reviewer workflow enhancements:

  • Added scripts/pre-review-checks.py for automated structural, frontmatter, reference, and sequencing checks before LLM evaluation, and documented its invocation and output (file-hashes.json).
  • Added prompts/analyze-skill.md as a prompt template for an Explore sub-agent to generate a structured skill map for large skills, with detailed instructions and output format.
  • Updated workflow documentation (README.md, SKILL.md, guidelines.md) to describe the revised review process, clarifying the order of automated checks, sub-agent delegation, and manual evaluation, as well as how findings are classified and validated.

Artifacts and outputs:

  • Expanded the list of artifacts for the skill-reviewer workflow to include file-hashes.json and skill-map.md alongside the review report, and updated the documentation to reflect these new outputs.

These changes clarify the boundaries between deterministic (scripted) checks and LLM-driven evaluation, improve the scalability of the review process for large skills, and provide clear guidance and templates for future workflow development.

Summary by CodeRabbit

  • New Features

    • Automated pre-review validation checks for skill structure, metadata, and integrity (new pre-review script)
    • Delegation to an Explore sub-agent for large skills (15+ files)
    • New artifacts: structured skill map and per-file change hashes for tracking
  • Documentation

    • Updated workflow docs and README to reflect the new pre-review stage, artifact outputs, and adjusted review phases
    • Expanded contribution guidelines covering optional scripts/prompts directories and sub-agent conventions

This offloads verifiable work to scripts instead of trusting the LLM to notice issues. Several of the review dimensions have mechanically checkable aspects:
- Orphaned references: grep SKILL.md for file paths, diff against actual files on disk
- Step numbering gaps: parse skills/*.md for ## Step N headings, check for gaps/duplicates
- Schema field consistency: extract field names from producer files, check they appear in consumer files
- README vs implementation drift: extract feature claims from README, grep for them in skills/

A scripts/ directory with small shell/python checkers could run before the LLM review, feeding concrete findings into the evaluation rather than relying on the LLM to notice them. The LLM then focuses on judgment calls (clarity, cognitive load, edge cases) where it adds real value.
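As one sketch of such a checker, here is the step-numbering case. The `## Step N` heading format and the `skills/*.md` layout are assumptions taken from the bullets above, not the script's actual implementation:

```python
import re
from pathlib import Path

def check_step_sequence(skills_dir: str) -> list[str]:
    """Report step-heading gaps or duplicates in skills/*.md (illustrative sketch)."""
    problems = []
    for f in sorted(Path(skills_dir).glob("*.md")):
        # Collect "## Step N" headings in document order.
        nums = [int(n) for n in re.findall(r"^## Step (\d+)", f.read_text(), re.M)]
        if not nums:
            continue
        # A clean sequence is consecutive starting from the first heading.
        expected = list(range(nums[0], nums[0] + len(nums)))
        if nums != expected:
            problems.append(f"{f.name}: found steps {nums}, expected {expected}")
    return problems
```

A wrapper could run this before the LLM pass and append its findings to the evaluation context as concrete, pre-verified facts.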
This separates the analyze-skill.md prompt into its own file and delegates deep reading to an Explore agent.
The skill currently reads everything in-context. For large skills (15+ files — a case the workflow already flags), delegating the initial structured read to a sub-agent with a dedicated prompt template would:
- Protect context window
- Produce a structured intermediate artifact (like eval-analyze's YAML analysis block)
- Let the reviewer LLM focus on evaluation rather than comprehension
The pre-review script (scripts/pre-review-checks.py):
- Computes SHA-256 hashes of all .md files in the target skill
- Compares against stored hashes in .artifacts/skill-reviewer/{skill-name}/file-hashes.json
- Reports changed, new, removed, and unchanged files
- Writes current hashes for next comparison
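That hash-comparison loop can be sketched roughly as follows. The store path is taken as a parameter and the report keys mirror the bullets above, but the exact names are assumptions rather than the script's real output:

```python
import hashlib
import json
from pathlib import Path

def compare_hashes(skill_dir: str, store_file: str) -> dict:
    """Hash all .md files, diff against stored hashes, persist the new ones."""
    current = {
        str(p.relative_to(skill_dir)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in Path(skill_dir).rglob("*.md")
    }
    store = Path(store_file)
    previous = json.loads(store.read_text()) if store.exists() else {}
    report = {
        "changed": sorted(f for f in current if f in previous and previous[f] != current[f]),
        "new": sorted(f for f in current if f not in previous),
        "removed": sorted(f for f in previous if f not in current),
        "unchanged": sorted(f for f in current if previous.get(f) == current[f]),
    }
    # Write current hashes so the next run diffs against this state.
    store.parent.mkdir(parents=True, exist_ok=True)
    store.write_text(json.dumps(current, indent=2, sort_keys=True))
    return report
```

The first run reports every file as new; subsequent runs narrow the reviewer's attention to what actually changed.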
For comparison, eval-analyze produces a structured YAML analysis (purpose, inputs, outputs, pipeline, quality criteria) before generating its final config.
The current reviewer jumps straight from reading to evaluating. Adding an intermediate "skill map" step — a structured summary of what was read (routing graph, schema fields, step sequence, file inventory) — would:
  - Make the review auditable ("here's what I understood before judging")
  - Catch comprehension errors before they become wrong findings
  - Serve as the input to both automated checks and LLM evaluation
@amir-yogev-gh amir-yogev-gh self-assigned this May 5, 2026

coderabbitai Bot commented May 5, 2026

Walkthrough

Adds an automated pre-review checks script and an Explore-agent prompt for large-skill analysis, updates workflow steps to generate and consume a persisted skill-map and file-hashes artifacts, and documents the new dirs/conventions across AGENTS.md, CONTRIBUTING.md, and multiple skill-reviewer docs.

Changes

Skill-Reviewer: pre-review automation & skill-map

Layer / File(s) / Summary

  • Conventions / repo layout (AGENTS.md, CONTRIBUTING.md): Document optional scripts/ and prompts/ directories, describe invocation/placeholder rules, and add skill-reviewer/prompts/ and skill-reviewer/scripts/ to the repo tree.
  • Prompt template, Explore agent (skill-reviewer/prompts/analyze-skill.md): New prompt instructs an Explore sub-agent to read the skill directory in order and emit a structured skill-map.md with File Inventory, Routing Graph, Phase Pipeline, Schema Fields, Step Sequences, Command Names, and Constraints.
  • Pre-review automation (skill-reviewer/scripts/pre-review-checks.py): New Python Checker class and main() implement SHA-256 per-file hashing persisted to .artifacts/.../file-hashes.json, change reporting, structure validation (SKILL.md, skills/, commands/), frontmatter checks, orphan/dangling reference detection, and step-sequence integrity checks; prints PASS/WARN/FAIL (exit 0).
  • Workflow integration (skill-reviewer/skills/review.md, skill-reviewer/README.md, skill-reviewer/SKILL.md, skill-reviewer/guidelines.md): Workflow updated to (a) delegate large directories (15+ files) to Explore using the new prompt, or handle them inline, and persist .artifacts/.../skill-map.md; (b) run scripts/pre-review-checks.py and interpret FAIL/WARN/PASS; and (c) renumber subsequent review steps and pair review.md with the generated skill-map.md and file-hashes.json.

Sequence Diagram(s)

sequenceDiagram
    participant Invoker
    participant Workflow
    participant ExploreAgent
    participant PreReviewScript
    participant ArtifactStore
    participant ReviewerLLM

    Invoker->>Workflow: start skill-review for <skill-dir>
    Workflow->>Workflow: detect size (15+ files)?
    alt large
        Workflow->>ExploreAgent: run prompt `analyze-skill.md` on <skill-dir>
        ExploreAgent->>ArtifactStore: write `.artifacts/.../skill-map.md`
    else small
        Workflow->>Workflow: inline read files -> generate in-memory skill-map
        Workflow->>ArtifactStore: optionally write `.artifacts/.../skill-map.md`
    end
    Workflow->>PreReviewScript: run `pre-review-checks.py <skill-dir>`
    PreReviewScript->>ArtifactStore: write `.artifacts/.../file-hashes.json`
    PreReviewScript->>Workflow: return PASS/WARN/FAIL (info only)
    Workflow->>ReviewerLLM: provide skill-map + pre-review outputs + files
    ReviewerLLM->>ArtifactStore: write `.artifacts/.../review.md`
    Workflow->>Invoker: complete, artifacts written (skill-map, file-hashes, review.md)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage — ⚠️ Warning: Docstring coverage is 20.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
  • Title check — ❓ Inconclusive: The title 'Skill reviewer enhancements' is vague and generic, using a broad term ('enhancements') that does not convey meaningful specifics about the changeset. Resolution: consider a more specific title, such as 'Add automated pre-review checks and sub-agent delegation to skill-reviewer workflow' or 'Restructure skill-reviewer with scripts, prompts, and improved scalability'.

✅ Passed checks (3 passed)

  • Description Check — ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Linked Issues check — ✅ Passed: check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check — ✅ Passed: check skipped because no linked issues were found for this pull request.



@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
skill-reviewer/SKILL.md (1)

11-66: 🛠️ Refactor suggestion | 🟠 Major | 🏗️ Heavy lift

SKILL.md should stay thin and under 30 lines; detailed flow belongs in skills/.

This file currently embeds detailed procedural content that exceeds the SKILL.md size rule.

As per coding guidelines, “Keep SKILL.md under 30 lines to follow progressive disclosure pattern, with detailed guidance living in guidelines.md and skills/ directory.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skill-reviewer/SKILL.md` around lines 11 - 66, SKILL.md is too long (exceeds
the 30-line rule) and embeds detailed procedural content that belongs in the
skills and guidelines files; trim SKILL.md to a short header/one-paragraph Quick
Start plus a pointer to the detailed workflow, then move the step-by-step
procedure (steps 1–6), the 8-dimension checklist, and the File Layout block into
the detailed skill document (e.g., skills/review.md) and policy guidance
(guidelines.md); ensure commands/review.md or prompts/analyze-skill.md reference
the detailed locations, and leave only a concise executable hint in SKILL.md so
the file is under 30 lines.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@AGENTS.md`:
- Around line 349-351: Update the repository tree in AGENTS.md so the
skill-reviewer section includes the missing prompts/ directory alongside
scripts/ (i.e., change the snippet under the skill-reviewer node from only
"scripts/" to include both "prompts/" and "scripts/") to match the workflow
layout described elsewhere; ensure the tree shows "prompts/" as a child of
"skill-reviewer/" for consistency with the rest of the document.

In `@CONTRIBUTING.md`:
- Around line 104-115: Update the CONTRIBUTING.md examples that show absolute
paths: replace the two occurrences of "~/.ai-workflows/workflow-name/scripts/"
and "~/.ai-workflows/workflow-name/prompts/" with symlink-safe relative examples
such as "./scripts/" and "./prompts/" (or "workflow-name/scripts/" and
"workflow-name/prompts/") so the "Scripts" and "Prompts" sections use relative
paths; ensure the text around those examples (the bullets under "Scripts" and
the last bullet under "Prompts") remains consistent with the repo guideline to
use relative paths for symlink compatibility.

In `@skill-reviewer/scripts/pre-review-checks.py`:
- Around line 255-259: When handling "../" refs the code currently accepts
(f.parent / ref_path).resolve() as a candidate even if it lies outside the skill
root; change the logic that builds candidates for ref.startswith("../") so that
after resolving the candidate you verify it is contained within self.skill_dir
(use self.skill_dir.resolve() and Path.is_relative_to or try
candidate.relative_to(self.skill_dir.resolve())) and only include it in
candidates if that containment check passes; keep the subsequent
any(c.is_file()) and f.relative_to(self.skill_dir) logic unchanged so external
files are not mistakenly treated as valid.
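
A minimal sketch of that containment guard, under the assumption that candidates are pathlib.Path objects (the helper name is illustrative, not the script's):

```python
from pathlib import Path

def within_skill_root(candidate: Path, skill_dir: Path) -> bool:
    """True only if the resolved candidate stays inside the skill root."""
    root = skill_dir.resolve()
    try:
        # Path.is_relative_to exists on Python 3.9+; relative_to works everywhere.
        candidate.resolve().relative_to(root)
        return True
    except ValueError:
        return False
```

A reference such as ../../other-skill/SKILL.md then resolves outside the root and is rejected before the existing any(c.is_file()) check runs.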
- Around line 229-236: The orphan detection uses a substring check (f.name in
content) which causes false matches; update the referenced computation to
perform reference-aware matching by checking exact filename tokens and common
reference forms: test for the file's full relative path
(str(f.relative_to(self.skill_dir))), the filename stem (f.stem) and filename
(f.name) with word-boundary/regex matches, and common import/asset patterns like
"./{name}", "/{rel_path}", "{rel_path}", "from .{stem}", or "{stem}.py" using
re.search to avoid substring collisions; replace the current any(...) loop that
uses file_contents and f.name with this regex/token-aware check and keep calling
self.fail(rel) when no reference is found.
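
The token-aware matching could be sketched like this; the pattern list is illustrative, not the script's actual set:

```python
import re

def is_referenced(name: str, rel_path: str, content: str) -> bool:
    """Match real references to a file, not accidental substrings."""
    stem = name.rsplit(".", 1)[0]
    patterns = [
        re.escape(rel_path),                    # skills/review.md
        r"\./" + re.escape(name),               # ./review.md
        r"(?<![\w./-])" + re.escape(name),      # review.md, but not pre-review.md
        r"from \." + re.escape(stem) + r"\b",   # from .review (module-style)
    ]
    return any(re.search(p, content) for p in patterns)
```

The negative lookbehind is what the plain substring check lacks: it refuses matches where the filename is merely the tail of a longer name.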

In `@skill-reviewer/SKILL.md`:
- Around line 16-17: Replace the brittle example command string `python3
scripts/pre-review-checks.py {target-dir}` in SKILL.md with a
repository-root-safe invocation and guidance: recommend running the script using
a path that forces a relative lookup (e.g., prefix the script with ./) or
instruct users to run it from the repository root or via a module invocation
(python -m ...) so the pre-check works regardless of the current working
directory; update the wording around the example to show the safer invocation
and mention the alternative of changing into the repo root before running.

---

Outside diff comments:
In `@skill-reviewer/SKILL.md`:
- Around line 11-66: SKILL.md is too long (exceeds the 30-line rule) and embeds
detailed procedural content that belongs in the skills and guidelines files;
trim SKILL.md to a short header/one-paragraph Quick Start plus a pointer to the
detailed workflow, then move the step-by-step procedure (steps 1–6), the
8-dimension checklist, and the File Layout block into the detailed skill
document (e.g., skills/review.md) and policy guidance (guidelines.md); ensure
commands/review.md or prompts/analyze-skill.md reference the detailed locations,
and leave only a concise executable hint in SKILL.md so the file is under 30
lines.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 2e8489e8-f4af-4666-bacb-8d92d09cb95e

📥 Commits

Reviewing files that changed from the base of the PR and between 9af1516 and f305d8a.

📒 Files selected for processing (8)
  • AGENTS.md
  • CONTRIBUTING.md
  • skill-reviewer/README.md
  • skill-reviewer/SKILL.md
  • skill-reviewer/guidelines.md
  • skill-reviewer/prompts/analyze-skill.md
  • skill-reviewer/scripts/pre-review-checks.py
  • skill-reviewer/skills/review.md

@amir-yogev-gh amir-yogev-gh requested a review from adalton May 5, 2026 18:33

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
skill-reviewer/SKILL.md (2)

9-17: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

SKILL.md does not reference guidelines.md.

The file contains no link to guidelines.md, but the coding guideline requires it. Adding a single line (e.g., For principles and hard limits, see [guidelines.md](guidelines.md).) satisfies the rule and also helps the LLM load behavioral rules without re-reading the full document.

As per coding guidelines: "**/SKILL.md: SKILL.md must reference guidelines.md and optionally skills/controller.md using relative paths from the same directory".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skill-reviewer/SKILL.md` around lines 9 - 17, SKILL.md currently lacks the
required reference to guidelines.md; add a single relative-link line in SKILL.md
(e.g., near the top or under "Quick Start") that reads something like: For
principles and hard limits, see [guidelines.md](guidelines.md). Ensure the link
uses the relative path "guidelines.md" from SKILL.md and optionally mention
skills/controller.md if relevant so the file meets the guideline that SKILL.md
must reference guidelines.md (and may reference skills/controller.md).

1-66: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

SKILL.md exceeds the mandatory 30-line limit at 66 lines.

The report output template (lines 30–49) and the file-layout diagram (lines 51–66) add ~36 lines of detail that belong in skills/review.md or guidelines.md. SKILL.md should only contain a thin overview per the progressive-disclosure principle.

🛠️ Suggested slim-down

Move the full report template and file-layout diagram to skills/review.md (or guidelines.md), then reduce SKILL.md to just the numbered steps with a pointer:

-4. Classify each finding by severity — **CRITICAL** / **HIGH** (blockers) or **MEDIUM** / **LOW** (suggestions).
-5. Validate findings: verify each finding cites a specific file, includes a concrete suggestion, and that blocker/suggestion counts are accurate. Drop any finding you cannot substantiate from the files you read.
-6. Produce a structured report and write it to `.artifacts/skill-reviewer/{skill-name}/review.md`:
-
-```
-## Skill Review: {skill-name}
-...
-```
+4. Classify, validate, and produce a report — see `skills/review.md` for the full output format and validation rules.

And trim or replace the File Layout block with a one-line pointer to README.md.

As per coding guidelines: "**/SKILL.md: SKILL.md must be under 30 lines to maintain progressive disclosure (details live in guidelines.md and skills/)".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skill-reviewer/SKILL.md` around lines 1 - 66, SKILL.md currently exceeds the
30-line limit by including the full report template and File Layout block; trim
SKILL.md to a concise overview by removing the detailed "## Skill Review:
{skill-name}" report template and the File Layout diagram, move those details
into skills/review.md (or guidelines.md) and update the numbered steps
(specifically step 4) to point to skills/review.md for the full output format
and validation rules, and replace the File Layout block with a single-line
pointer to README.md so SKILL.md remains under 30 lines.
🧹 Nitpick comments (1)
skill-reviewer/scripts/pre-review-checks.py (1)

224-246: ⚡ Quick win

EXEMPT_DIRS asymmetry is a defensive improvement, not a current issue.

The orphan check does not mirror EXEMPT_DIRS (unlike the dangling-reference check), but prompts/analyze-skill.md is currently NOT flagged as orphaned because skills/review.md references it as ../prompts/analyze-skill.md in non-fenced text, which matches the reference pattern.

While the structural asymmetry could theoretically cause false positives if a new prompt file is added without references, it is not currently a problem. Applying the suggested fix would be a good defensive refactor for consistency and future-proofing.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@skill-reviewer/scripts/pre-review-checks.py` around lines 224 - 246, The
orphaned-file loop currently only skips EXEMPT_FILES but not directories in
EXEMPT_DIRS; update the check in the loop that iterates self.md_files (the block
using target_rel, target_name, REF_PATTERN and self._strip_code_blocks) to also
skip any file whose path is inside an exempt directory (e.g., compute rel =
f.relative_to(self.skill_dir) and if rel.parts[0] in EXEMPT_DIRS continue),
mirroring the dangling-reference check behavior so files under EXEMPT_DIRS are
not treated as orphaned.
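
The suggested skip could be as small as this (directory names are illustrative, mirroring whatever the real EXEMPT_DIRS constant holds):

```python
from pathlib import Path

EXEMPT_DIRS = {"prompts", "scripts"}  # illustrative; mirror the script's constant

def in_exempt_dir(f: Path, skill_dir: Path) -> bool:
    """True if the file lives under a directory exempt from orphan checks."""
    rel = f.relative_to(skill_dir)
    return len(rel.parts) > 1 and rel.parts[0] in EXEMPT_DIRS
```

Inside the orphan loop, a `continue` when this returns true mirrors the dangling-reference check's behavior.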
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@skill-reviewer/SKILL.md`:
- Around line 9-17: SKILL.md currently lacks the required reference to
guidelines.md; add a single relative-link line in SKILL.md (e.g., near the top
or under "Quick Start") that reads something like: For principles and hard
limits, see [guidelines.md](guidelines.md). Ensure the link uses the relative
path "guidelines.md" from SKILL.md and optionally mention skills/controller.md
if relevant so the file meets the guideline that SKILL.md must reference
guidelines.md (and may reference skills/controller.md).
- Around line 1-66: SKILL.md currently exceeds the 30-line limit by including
the full report template and File Layout block; trim SKILL.md to a concise
overview by removing the detailed "## Skill Review: {skill-name}" report
template and the File Layout diagram, move those details into skills/review.md
(or guidelines.md) and update the numbered steps (specifically step 4) to point
to skills/review.md for the full output format and validation rules, and replace
the File Layout block with a single-line pointer to README.md so SKILL.md
remains under 30 lines.

---

Nitpick comments:
In `@skill-reviewer/scripts/pre-review-checks.py`:
- Around line 224-246: The orphaned-file loop currently only skips EXEMPT_FILES
but not directories in EXEMPT_DIRS; update the check in the loop that iterates
self.md_files (the block using target_rel, target_name, REF_PATTERN and
self._strip_code_blocks) to also skip any file whose path is inside an exempt
directory (e.g., compute rel = f.relative_to(self.skill_dir) and if rel.parts[0]
in EXEMPT_DIRS continue), mirroring the dangling-reference check behavior so
files under EXEMPT_DIRS are not treated as orphaned.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 7f47a704-a275-428c-a51e-28bf00719f84

📥 Commits

Reviewing files that changed from the base of the PR and between f305d8a and e03624f.

📒 Files selected for processing (4)
  • AGENTS.md
  • CONTRIBUTING.md
  • skill-reviewer/SKILL.md
  • skill-reviewer/scripts/pre-review-checks.py

@amir-yogev-gh amir-yogev-gh merged commit 72c2210 into flightctl:main May 6, 2026
4 checks passed
@amir-yogev-gh amir-yogev-gh deleted the skill-reviewer-enhancements branch May 6, 2026 14:11