feat(taskspawner): add filePatterns filtering for github based tasks#891
feat(taskspawner): add filePatterns filtering for github based tasks#891knechtionscoding wants to merge 1 commit intokelos-dev:mainfrom
Conversation
|
@gjkim42 the only thing of note is the changedFiles is new line separated rather than allowing it to be a range. I'm ambivalent to this really, so wanted to leave it up to you. |
|
@gjkim42 I think this is ready for your review |
|
/kelos review |
|
🤖 Kelos Task Status Task |
94f10ba to
e24fc22
Compare
There was a problem hiding this comment.
🤖 Kelos Reviewer Agent @gjkim42
Review Summary
Verdict: COMMENT
Scope: Adds filePatterns glob filtering for GitHub PR and webhook TaskSpawner sources with a new {{.ChangedFiles}} template variable.
Overall this is a well-structured feature with thorough tests, good lazy-enrichment patterns, and clear API design. I have two findings I'd like maintainer input on before approving.
Findings
Correctness
-
spawnerNeedsChangedFilesmissing metadata check —internal/webhook/handler.go:493-508only checksPromptTemplateandBranchfor"ChangedFiles", but the equivalenttemplateReferencesChangedFilesincmd/kelos-spawner/main.go:737-756also checksMetadata.LabelsandMetadata.Annotations. If a webhook spawner references{{.ChangedFiles}}only in a metadata template, the webhook handler will not fetch the file list and the template variable will render empty. These two functions should have the same detection logic — consider extracting a shared helper or mirroring the metadata checks inspawnerNeedsChangedFiles. -
Hardcoded
defaultGitHubAPIBaseURLin webhook file fetching —internal/webhook/github_files.go:19useshttps://api.github.com. The polling source (internal/source/github_pr.go) respects a configurableBaseURLwhich can be a GHProxy or GitHub Enterprise URL. Webhook file fetching will not work correctly for GitHub Enterprise or GHProxy deployments. Consider plumbing the API base URL through (e.g., from the spawner's configuration or from the webhook handler's config), similar to how the polling path does it.
Tests
- Test coverage is comprehensive: unit tests for
MatchesFilePatterns(13 cases), integration tests forDiscoverwith file patterns (4 scenarios), push event file extraction with deduplication, webhook filter matching, template rendering, andtemplateReferencesChangedFiles. All tests pass.
Conventions
- Generated files (
zz_generated.deepcopy.go, CRD YAMLs) are in sync —make verifypasses. go vetis clean, no lint issues.- PR description follows the template format correctly.
Code Quality
- The
doublestar.Matcherror is silently discarded (match, _ := doublestar.Match(...)) in multiple places inMatchesFilePatterns(internal/source/github_pr.go:50,57,69). Whiledoublestar.Matchonly returns errors for invalid patterns (not runtime data issues), logging or validating patterns at config load time would be more user-friendly — a typo in a pattern would silently not match rather than producing a clear error.
Suggestions (optional)
- Consider validating
filePatternsglob syntax at admission/reconcile time so users get immediate feedback on malformed patterns rather than silent non-matching behavior. - The contributor noted
ChangedFilesis newline-separated. This works well for prompt templates but could be limiting for metadata labels (which typically cannot contain newlines). If metadata use-cases are anticipated, consider whether a different separator or a list type would be more appropriate there. - Minor:
extractPushEventFiles(internal/webhook/github_filter.go:449-476) iteratescommit.Added,commit.Removed,commit.Modifiedseparately with duplicated dedup logic. A small helper or iterating a combined slice would reduce repetition, but this is purely cosmetic.
/kelos needs-input
70d2fcd to
de8d70b
Compare
…s and webhooks
Allow TaskSpawners to filter GitHub pull requests and webhook events by
changed file paths using doublestar glob patterns. This enables
use cases like reviewing only PRs that touch Go source files, skipping
documentation-only PRs, or triggering security reviews when auth code
changes.
- Add FilePatternFilter type with include, exclude, and excludeOnly
semantics to the TaskSpawner API
- Add filePatterns field to githubPullRequests and githubWebhook filters
- Expose {{.ChangedFiles}} template variable for prompts and branches
- For polling sources, fetch PR file lists from the GitHub API after
cheap label/author/draft filters but before expensive review fetches
- For webhook push events, extract changed files from the payload
- For webhook PR events, lazily fetch files from the GitHub API when
a filter or template requires them
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
de8d70b to
978d083
Compare
|
@gjkim42 comments/issues resolved. |
Add "consistent parallel implementation paths" convention to project config and agent configs, based on recurring review feedback in PRs #880 and #891 where polling and webhook code paths diverged in data structures and field checks. Fix gh pr edit in kelos-workers step 7a: {{.Number}} is the issue number in the workers context, not the PR number. Remove the explicit number so gh pr edit defaults to the current branch's PR. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
What type of PR is this?
/kind feature
What this PR does / why we need it:
Adds
filePatternsfiltering to GitHub pull request and webhook TaskSpawner sources, allowing users to filter work items by changed file paths using doublestar glob patterns.This enables use cases like:
include: ["**/*.go"])exclude: ["docs/**", "*.md"]withexcludeOnly: true)include: ["internal/auth/**"])Key changes:
FilePatternFilterwithinclude,exclude, andexcludeOnlyfields.excludeOnlyinverts exclude logic so items are only rejected when all changed files match exclude patterns (e.g., skip docs-only PRs).filePatternsfield ongithubPullRequests— file lists are fetched from the GitHub API after cheap label/author/draft filters but before expensive per-PR review and comment fetches.filePatternsfield ongithubWebhookfilters — for push events, changed files are extracted directly from the payload (no API call). For pull_request events, files are lazily fetched from the GitHub API using the workspace's secretRef.{{.ChangedFiles}}template variable — available inpromptTemplate,branch, and metadata templates for both polling and webhook sources. File lists are automatically fetched when the template referencesChangedFiles, even withoutfilePatternsconfigured.Which issue(s) this PR is related to:
fixes: #778
Special notes for your reviewer:
doublestar/v4dependency was already present as an indirect dependency; this PR promotes it to a direct dependency.source.MatchesFilePatternsas a shared implementation used by both the polling source and webhook filter paths.ChangedFilesonly when a spawner actually needs them (hasfilePatternsin a filter or referencesChangedFilesin a template), avoiding unnecessary API calls.Does this PR introduce a user-facing change?
Summary by cubic
Adds file-based filtering for GitHub PRs and webhooks using doublestar globs, and exposes
{{.ChangedFiles}}in templates. This lets spawners run only when relevant files change and pass file context to agents.New Features
FilePatternFilterwithinclude,exclude, andexcludeOnly.filePatternsongithubPullRequestsandgithubWebhookfilters.{{.ChangedFiles}}available in prompt, branch, and metadata templates (newline-joined).ChangedFiles.secretRef; auto-fetch when filters usefilePatternsor templates referenceChangedFiles.source.MatchesFilePatterns.examples/12-taskspawner-file-patterns.Dependencies
github.com/bmatcuk/doublestar/v4to a direct dependency.Written for commit 978d083. Summary will update on new commits.