Skip to content

Add an option to generate a reference configuration from a running cluster#267

Open
dcritch wants to merge 12 commits intoopenshift:mainfrom
dcritch:generate-reference
Open

Add an option to generate a reference configuration from a running cluster#267
dcritch wants to merge 12 commits intoopenshift:mainfrom
dcritch:generate-reference

Conversation

@dcritch
Copy link
Copy Markdown

@dcritch dcritch commented Mar 17, 2026

This PR implements a method to generate a reference config from a running cluster or must-gather output. It addresses #262

Assisted by Claude Code/Cursor

Summary by CodeRabbit

  • New Features

    • Added a reference-generation mode to capture resources from live clusters or must-gather archives, with configurable output location, per-resource selection, sanitization of names/paths, omission of runtime metadata, and generation statistics/warnings for missing specs.
  • Documentation

    • Added an example configuration manifest demonstrating generate-mode options, resource selection, and optional omission of annotation/label keys.
  • Tests

    • Added comprehensive unit and end-to-end tests covering config parsing, resource fetching, sanitization, and generation outputs.

@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Mar 17, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign natifridman for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 17, 2026
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Mar 17, 2026

Hi @dcritch. Thanks for your PR.

I'm waiting for a openshift member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Tip

We noticed you've done this a few times! Consider joining the org to skip this step and gain /lgtm and other bot rights. We recommend asking approvers on your previous PRs to sponsor you.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@imiller0
Copy link
Copy Markdown
Contributor

/ok-to-test

@openshift-ci openshift-ci Bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 17, 2026
Copy link
Copy Markdown
Contributor

@imiller0 imiller0 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR and extended functionality. A lot of good stuff here. I've left a few comments in the code but also a couple high level thoughts.

  1. Can you add go test code to cover the new functionality
  2. As much as possible we should use common CLI flags for functionality to make the feature appear more integrated.
  3. There's a couple places where additional user configurability will (I think) be required to maintain a generate / re-generate workflow where the reference is periodically updated and they won't want to have to make the same updates/edits to the captured reference (eg default fields to omit).
    Thanks!

Comment thread pkg/compare/compare.go Outdated
Comment thread pkg/compare/compare.go Outdated
Comment thread pkg/generate/config.go
Required bool `json:"required"`
Namespace string `json:"namespace,omitempty"`
Names []string `json:"names,omitempty"`
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect that users will want to extend the list of ignored labels (defaults you have below) explicitly in the config file. This will allow them to capture any changes in the config file and re-generate after changes to the cluster, new versions, etc.

Comment thread pkg/generate/generator.go Outdated
Comment on lines +85 to +87
if ann, ok := metadata["annotations"].(map[string]any); ok && len(ann) == 0 {
delete(metadata, "annotations")
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are times when the annotations are important to correctness. eg operatorframework.io/bundle-unpack-min-retry-interval on OperatorGroup CRs. I don't think we can suppress all annotations but maybe there are some we can add to the default fieldToOmit?

Comment thread pkg/generate/generator.go Outdated
}

func (g *Generator) writeCRFiles(spec *ResourceSpec, resources []*unstructured.Unstructured) error {
kindDir := filepath.Join(g.outputDir, spec.Kind)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This passes the Kind and outputDir directly from user input into a path that will be opened. The filenames should be sanitized to eliminate possibility of going outside current directory

@dcritch
Copy link
Copy Markdown
Author

dcritch commented Mar 20, 2026

Thanks for the PR and extended functionality. A lot of good stuff here. I've left a few comments in the code but also a couple high level thoughts.

  1. Can you add go test code to cover the new functionality
  2. As much as possible we should use common CLI flags for functionality to make the feature appear more integrated.
  3. There's a couple places where additional user configurability will (I think) be required to maintain a generate / re-generate workflow where the reference is periodically updated and they won't want to have to make the same updates/edits to the captured reference (eg default fields to omit).
    Thanks!

Thanks Ian! I'll pushed some changes to address 2, and will continue to work on your other points over the next few days.

@dcritch dcritch marked this pull request as draft March 24, 2026 16:54
@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 24, 2026
@dcritch dcritch marked this pull request as ready for review March 25, 2026 15:23
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 25, 2026
@dcritch
Copy link
Copy Markdown
Author

dcritch commented Mar 25, 2026

hey @imiller0 I think I covered most of your comments. please take a look and let me know what you think

@openshift-ci openshift-ci Bot requested a review from lack March 25, 2026 15:24
@dcritch
Copy link
Copy Markdown
Author

dcritch commented Apr 16, 2026

/retest

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 17, 2026

Warning

Rate limit exceeded

@dcritch has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 34 minutes and 14 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 34 minutes and 14 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 72e97625-fb02-4a89-bcfb-1e5d670d37f1

📥 Commits

Reviewing files that changed from the base of the PR and between a4df30a and c65401e.

📒 Files selected for processing (2)
  • pkg/generate/fetcher.go
  • pkg/generate/generator.go

Walkthrough

Adds a new "generate" mode to produce reference directories from live clusters or must-gather archives, plus config parsing, resource fetchers (cluster and must-gather), YAML reference generation with metadata omission rules, and an example config manifest.

Changes

Cohort / File(s) Summary
Documentation
docs/example/generate-config.yaml
New example configuration demonstrating refgen/v1, outputDir, optional omitAnnotations/omitLabels, and a resources list capturing Namespace objects.
Command integration
pkg/compare/compare.go
Added -g/--generate-config and --output-dir flags; changed CLI flow to construct and run generate.Options (early return) when generate mode is set; added generate-mode validations (mutual exclusivity with -r, single -f, forbid -k) and updated help text.
Configuration loader & tests
pkg/generate/config.go, pkg/generate/config_test.go
New RefgenConfig / ResourceSpec types and LoadConfig() with strict YAML unmarshalling, defaults (refgen/v1, ./generated-reference), apiVersion validation, require ≥1 resource, and omit-key validation (reject empty or keys containing "). Tests cover valid configs and multiple error cases.
Resource fetching & tests
pkg/generate/fetcher.go, pkg/generate/fetcher_test.go
Added Fetcher interface with ClusterFetcher (GVK→GVR resolution, dynamic client listing with pagination, name/namespace filtering) and MustGatherFetcher (must-gather dir discovery, multi-doc YAML and list expansion, deduplication, caching). Tests exercise must-gather parsing, list expansion, namespace filtering, and error conditions.
Generation engine & tests
pkg/generate/generator.go, pkg/generate/generator_test.go
Added Generator to write per-kind directories and resource YAMLs, sanitize names/paths, remove runtime-managed metadata and configured annotation/label keys (exact and prefix), prevent path traversal, handle filename collisions, and emit metadata.yaml. Tests validate sanitization, omission rules, filename collisions, and metadata layout.
Orchestration
pkg/generate/generate.go
Added generate.Options and Run() to load config, determine output dir, pick fetcher (cluster vs must-gather), fetch resources per spec (warnings vs required failures), aggregate results, invoke generator, and print stats/warnings.
Misc tests
pkg/compare/compare_test.go
Added TestOmitFieldsLabelPrefixRemovesKeyedEntries validating label-prefix omission behavior used by compare logic.
Dependencies
go.mod
Promoted gopkg.in/yaml.v3 to a direct dependency (same version).

Sequence Diagram(s)

sequenceDiagram
    participant CLI as CLI
    participant Gen as generate.Options.Run()
    participant Cfg as LoadConfig()
    participant Fetch as Fetcher (Cluster / MustGather)
    participant Genr as Generator

    CLI->>Gen: invoke with -g config.yaml [--output-dir]
    Gen->>Cfg: LoadConfig(configPath)
    Cfg-->>Gen: RefgenConfig (validated, defaults applied)
    Gen->>Fetch: NewClusterFetcher() or NewMustGatherFetcher(MustGatherDir)
    Fetch-->>Gen: Fetcher instance
    Gen->>Fetch: For each ResourceSpec: FetchResources(spec)
    Fetch-->>Gen: []*unstructured.Unstructured (or error)
    Gen->>Genr: NewGenerator(config, outputDir).Generate(resourcesBySpec)
    Genr->>Genr: sanitize names, omit fields, ensure safe paths
    Genr->>Genr: write per-kind YAML files and metadata.yaml
    Genr-->>Gen: output path / result
    Gen-->>CLI: print stats and warnings
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 8 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 32.35% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Test Structure And Quality ❓ Inconclusive This PR adds only standard Go tests with testify assertions, not Ginkgo tests. The custom check explicitly requests review of Ginkgo-specific constructs like Describe, Context, It, BeforeEach, AfterEach, and Eventually/Consistently, which are not present in the added test files. Clarify whether the custom check applies to all test code or only Ginkgo-based tests. If reviewing all tests, adapt check requirements for standard Go testing patterns; if specifically Ginkgo, this check should pass as not applicable.
✅ Passed checks (8 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly describes the main functionality added: generating a reference configuration from a running cluster, which aligns with the core changes across multiple new files in pkg/generate/ package.
Stable And Deterministic Test Names ✅ Passed The custom check for test names applies to Ginkgo syntax. This repository uses standard Go testing with deterministic, descriptive static test names.
Microshift Test Compatibility ✅ Passed This PR does not add any Ginkgo e2e tests; all new tests are standard Go unit tests using the testing package.
Single Node Openshift (Sno) Test Compatibility ✅ Passed The pull request contains no Ginkgo e2e tests, only standard Go unit tests. The custom check applies only to Ginkgo e2e tests and is not applicable here.
Topology-Aware Scheduling Compatibility ✅ Passed This pull request adds a CLI feature to generate reference configurations from OpenShift clusters or must-gather archives for use with the kubectl cluster-compare tool. The code consists entirely of configuration parsing, resource fetching, and YAML file writing utilities. It does not introduce any deployment manifests, operator specifications, controllers, or workload-related code that would involve scheduling constraints such as affinity rules, tolerations, nodeSelector targeting, topology spread constraints, or replica count configurations. The topology-aware scheduling compatibility check is therefore not applicable to this change.
Ote Binary Stdout Contract ✅ Passed The repository is a kubectl plugin, not an OTE binary. klog is properly configured to write to stderr by default, and human-readable output appropriately goes to stdout.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed PR adds only unit tests using Go testing package; no Ginkgo e2e tests with IPv4 assumptions or external connectivity detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

🧹 Nitpick comments (11)
pkg/generate/config.go (1)

53-61: Optional: validate apiVersion rather than just defaulting.

If a user supplies a typo (e.g., refgen/v2), it's silently accepted. A whitelist check would surface config mistakes early:

Suggested addition
 	if config.APIVersion == "" {
 		config.APIVersion = "refgen/v1"
+	} else if config.APIVersion != "refgen/v1" {
+		return nil, fmt.Errorf("unsupported apiVersion %q (expected refgen/v1)", config.APIVersion)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/generate/config.go` around lines 53 - 61, Add validation for
config.APIVersion instead of silently accepting unknown values: introduce a
whitelist (e.g., a slice/const like validAPIVersions) and, after applying the
default ("refgen/v1") in the same block that sets config.APIVersion, check that
config.APIVersion is one of the allowed values; if not, return an error (similar
style to the Resources check) so typos like "refgen/v2" are surfaced early. Use
the existing symbol config.APIVersion and return fmt.Errorf on invalid values.
pkg/generate/generator.go (2)

84-107: mergeFieldsToOmit silently drops user omits if the default structure changes.

If items or defaults isn't the expected Go type (e.g., someone edits defaultFieldsToOmit to return []any entries), the early return fto swallows the user's OmitAnnotations/OmitLabels. Since this is all internal, a safer pattern is to panic or at minimum log loudly if the invariant breaks, since defaultFieldsToOmit and this function must remain in lock-step.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/generate/generator.go` around lines 84 - 107, The mergeFieldsToOmit
function silently returns the unmodified default when internal type assertions
for items/defaults fail; instead, make the failure loud so invariants between
defaultFieldsToOmit and mergeFieldsToOmit are enforced: in mergeFieldsToOmit
(referencing RefgenConfig, OmitAnnotations, OmitLabels, items, defaults) replace
the early returns on failed type assertions with explicit panics or a log.Fatal
that include a clear message and the actual unexpected types (e.g.,
"defaultFieldsToOmit changed: expected items to be map[string]any but was %T"),
so misalignment is detected immediately rather than silently dropping user
omits.

254-279: Redundant filepath.Abs and swallowed error at end of Generate.

g.outputDir is already absolute after line 259, so lines 274–278 re-do work and, on an (unlikely) error, return g.outputDir, nil — dropping the error. Simplify:

Suggested cleanup
-	if err := g.writeMetadata(); err != nil {
-		return "", err
-	}
-	absPath, err := filepath.Abs(g.outputDir)
-	if err != nil {
-		return g.outputDir, nil
-	}
-	return absPath, nil
+	if err := g.writeMetadata(); err != nil {
+		return "", err
+	}
+	return g.outputDir, nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/generate/generator.go` around lines 254 - 279, The Generate function
redundantly calls filepath.Abs on g.outputDir again and swallows any error by
returning (g.outputDir, nil); remove the final absPath, err block (the second
filepath.Abs and its error branch) and simply return g.outputDir, nil at the end
of Generate (or return the already-resolved outputAbs earlier) so that
g.outputDir remains the canonical absolute path without discarding errors;
update references in Generate and ensure g.outputDir is used consistently after
you set it from outputAbs.
pkg/generate/generator_test.go (1)

178-422: Add a case for two specs sharing the same Kind.

The suite covers duplicate resource names within one spec but not two specs with identical Kind (e.g., one Namespace spec with Required: true and another with Required: false, each contributing different resources). That scenario triggers the g.files[safeKind] = nil clobber bug flagged in generator.go. Adding that case would pin the expected behavior and guard against regressions once the generator is fixed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/generate/generator_test.go` around lines 178 - 422, Add a new subtest
inside TestGeneratorGenerate that creates two ResourceSpec instances with the
same Kind (e.g., "Namespace") but different Required values, calls
NewGenerator(...).Generate with a map containing both specs (each mapping to
their own resource list), and asserts that both resource files (e.g.,
Namespace/<name>.yaml for each spec's resource) and metadata.yaml are created;
this will exercise the code path that previously set g.files[safeKind] = nil
(see generator.Generate and the safeKind/sanitizePathSegment logic) to ensure
the generator no longer clobbers files when multiple specs share a Kind.
pkg/compare/compare.go (1)

181-181: Nit: update Use to reflect generate mode.

The one-line usage still reads as if -r is the only primary flag. Consider acknowledging the new mode, e.g.:

Suggested change
-		Use:                   "cluster-compare -r <Reference File>",
+		Use:                   "cluster-compare (-r <Reference File> | -g <Generate Config>)",
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/compare/compare.go` at line 181, The command's Use string (in the Cobra
command declaration in compare.go) still mentions only the reference flag;
update the Use value to reflect the new "generate" mode and both primary flags
so help/usage is accurate (e.g., include wording for generate mode and the
-r/--reference flag or --generate flag). Locate the command definition that sets
Use (the Cobra command variable in compare.go) and change the one-line usage to
mention both modes/flags and their expected argument syntax.
pkg/generate/generate.go (2)

64-85: Consider per-spec error handling for non-required specs.

A single fetch failure (e.g., a CRD not installed, transient API error) aborts the whole generation. For Required: false specs, users would likely prefer warn-and-continue so the re-generate workflow isn't blocked by one missing type. If you keep the fail-fast behavior, that's fine for now — just something to revisit as part of the configurability feedback.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/generate/generate.go` around lines 64 - 85, The loop calling
fetcher.FetchResources for each spec should treat non-required specs as
warn-and-continue instead of aborting: inside the for loop where you call
fetcher.FetchResources(spec) (and currently return fmt.Errorf on error), check
spec.Required (or similar field) and if it's false, log a warning with klog
(include spec.Kind/APIVersion and the error), set resources to an empty slice
(or nil) and continue the loop (also append spec to missingSpecs and do not
increment totalResources); if spec.Required is true keep the existing
return-on-error behavior to preserve fail-fast for required specs.

31-34: Remove redundant if o.Verbose guards around klog.V(1) calls.

The generate command runs within the compare command's context where klog verbosity is initialized to 1 when the --verbose flag is set (see compare.go lines 187-194). Since klog.V(1).Infof is a no-op unless verbosity is at least 1, the guards at lines 31, 43, 51, 66, and 82 are unnecessary and can be removed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/generate/generate.go` around lines 31 - 34, Remove the redundant if
o.Verbose guards that wrap klog.V(1).Infof calls in generate.go: simply delete
the surrounding if o.Verbose { ... } blocks and leave the klog.V(1).Infof(...)
calls (they are already no-ops when verbosity < 1); specifically update the
occurrences around the klog.V(1).Infof invocations at the locations matching the
shown diff (the call sites that report "Loaded configuration...", "Resources to
capture...", and the other klog.V(1).Infof uses referenced in the review) so the
logging calls stand alone without the extra if guard.
pkg/generate/fetcher.go (4)

220-226: Avoid the string(data) round-trip.

strings.NewReader(string(data)) copies the whole file contents into a new string. Use bytes.NewReader(data) directly:

-	dec := yamlv3.NewDecoder(strings.NewReader(string(data)))
+	dec := yamlv3.NewDecoder(bytes.NewReader(data))

(Requires adding "bytes" to the imports.)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/generate/fetcher.go` around lines 220 - 226, The code in
loadResourcesFromFile uses strings.NewReader(string(data)) which unnecessarily
copies the file bytes into a string; replace that with bytes.NewReader(data) and
add "bytes" to the imports so the yamlv3.NewDecoder reads directly from the byte
slice without an extra allocation. Ensure you update the import block (add
bytes) and change the decoder initialization in loadResourcesFromFile to use
bytes.NewReader(data).

151-196: loadAllResources cache invalidation nit.

If the walk yields zero resources, f.cache stays nil (Line 194) and every subsequent FetchResources call re-walks the entire must-gather tree. Either initialize to an empty (non-nil) slice before assignment, or guard with a separate loaded bool flag:

-	f.cache = loaded
+	if loaded == nil {
+		loaded = []*unstructured.Unstructured{}
+	}
+	f.cache = loaded

Not a correctness bug, but worth fixing since FetchResources is called once per ResourceSpec and the walk is the most expensive operation in this path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/generate/fetcher.go` around lines 151 - 196, The cache can remain nil
when no resources are found causing repeated expensive walks; in
loadAllResources ensure f.cache is always a non-nil slice even when zero results
are loaded by initializing or assigning a zero-length slice before returning
(e.g. ensure the local variable loaded is a non-nil []
*unstructured.Unstructured or set f.cache = []*unstructured.Unstructured{} when
no objects were appended); update the assignment to f.cache = loaded (or the
empty slice) so subsequent FetchResources uses the cached empty result instead
of re-walking.

198-218: findDataRoots swallows walk errors and re-walks matched subtrees.

Two small issues:

  1. At Line 214-216 any walk error results in returning nil (no roots), which then surfaces to the user as the misleading "no must-gather data found under %s" error from Line 157 — e.g., a permission error on a subdirectory looks identical to an empty directory. Prefer propagating the error, or at least logging it via klog.
  2. When a cluster-scoped-resources/namespaces directory is matched at Line 205, the walk continues descending into it. loadAllResources then walks the same subtree again in a second pass. Returning filepath.SkipDir after recording the parent avoids this redundant traversal (the dedupe seen map keeps it correct, just wasteful).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/generate/fetcher.go` around lines 198 - 218, findDataRoots currently
swallows any Walk error and returns nil and also continues descending into
matched "cluster-scoped-resources"/"namespaces" subtrees causing a redundant
second traversal by loadAllResources; update findDataRoots to (1) propagate or
log Walk errors instead of returning nil (e.g., return the error to caller or
klog.Errorf with context) so permission/IO errors aren't hidden, and (2) when
you detect a matched dir (in the filepath.Walk callback where
filepath.Base(path) == "cluster-scoped-resources" || "namespaces"), record the
parent into seen/roots and return filepath.SkipDir from the callback to avoid
descending into that subtree (the existing seen map will prevent duplicates).

51-72: Use a bounded context and consider paginating List.

context.TODO() at Line 66 and Line 68 gives the request no timeout or cancellation hook, so a slow/stuck API server will hang generate indefinitely. Plumb a context.Context through Fetcher.FetchResources (or at minimum wrap with context.WithTimeout) so callers can cancel.

Relatedly, the unpaginated List can be problematic for resources with thousands of items (e.g., Events, Secrets in large clusters). Consider setting ListOptions.Limit + continuation, or using a field/label selector when spec.Names/spec.Namespace is set to narrow the server-side result instead of filtering client-side at Line 77-89.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/generate/fetcher.go` around lines 51 - 72, FetchResources is using
context.TODO() and an unpaginated List which can hang or OOM for large clusters;
change the signature of ClusterFetcher.FetchResources to accept a
context.Context (or create a bounded context via context.WithTimeout inside the
function) and replace context.TODO() calls with that context when calling
f.dynamicClient.Resource(gvr).Namespace(...).List/ .List(...); additionally,
implement server-side pagination by using metav1.ListOptions{Limit:
<reasonableLimit>, Continue: continueToken} and loop to collect items (or apply
ListOptions with fieldSelector/labelSelector when spec.Namespace or spec.Names
are set to narrow results) so you don’t fetch all items client-side.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/generate/fetcher.go`:
- Around line 227-259: The YAML decode loop currently swallows non-EOF decode
errors (dec.Decode) and continues, which can cause infinite loops or silently
drop malformed documents; change the loop in the decoder handling (the block
using dec, raw, and result) to stop iterating and return or propagate the decode
error when err != io.EOF and err != nil (or at minimum log the error via
klog/processLogger) instead of continue, so loadResourcesFromFile (the caller)
receives the failure and can decide to skip the file; ensure you preserve
existing handling for EOF and valid parsed objects (items / kind+apiVersion).
- Around line 237-254: The current items-expansion blindly treats any document
with a top-level "items" key as a list and drops the outer object; update the
heuristic in the block handling raw["items"] (the code that casts raw to
map[string]any, reads items, builds unstructured.Unstructured and appends to
result) to first check the document's "kind" (raw["kind"]) and only treat/expand
items when kind is a List-type (e.g., kind == "List" or
strings.HasSuffix(kindStr, "List")); if the kind is not a List-type then do not
consume/expand the items and instead fall through to register the outer object
normally (i.e., add the unstructured object for the outer doc when it has
kind/apiVersion).

In `@pkg/generate/generator.go`:
- Around line 296-333: writeCRFiles resets g.files per safeKind which clobbers
entries when multiple distinct ResourceSpec values share the same Kind; remove
the g.files[safeKind] = nil and instead preserve/append to existing slice, or
better change g.files to key by the unique spec (e.g.,
map[*ResourceSpec][]fileEntry or a composite key of specID+safeKind) so each
spec has its own entry; update writeCRFiles to append fileEntry to g.files using
the new key (use spec pointer or generated spec key) and then update
writeMetadata to iterate over those spec-keyed entries (not just per safeKind)
so Required/allOf vs anyOf and per-spec file lists are emitted correctly.
- Around line 44-49: The generator currently treats entries in the defaults
slice as exact keys (pathToKey) but some entries are intended as prefix
patterns; update the generator so prefix semantics are explicit and exercised:
either mark intended prefix entries by adding an isPrefix: true field on the
corresponding ManifestPathV1-like structures and extend cleanResource to detect
and delete keys by prefix (use strings.HasPrefix on extracted map keys when
ManifestPathV1.isPrefix is true), or replace those prefix-styled pathToKey
entries with the full concrete keys you actually want removed; locate the
defaults and the cleanResource function in pkg/generate/generator.go and modify
them and the unit/integration tests to validate prefix deletions for labels such
as operators.coreos.com/* and pod-security.kubernetes.io/*.

---

Nitpick comments:
In `@pkg/compare/compare.go`:
- Line 181: The command's Use string (in the Cobra command declaration in
compare.go) still mentions only the reference flag; update the Use value to
reflect the new "generate" mode and both primary flags so help/usage is accurate
(e.g., include wording for generate mode and the -r/--reference flag or
--generate flag). Locate the command definition that sets Use (the Cobra command
variable in compare.go) and change the one-line usage to mention both
modes/flags and their expected argument syntax.

In `@pkg/generate/config.go`:
- Around line 53-61: Add validation for config.APIVersion instead of silently
accepting unknown values: introduce a whitelist (e.g., a slice/const like
validAPIVersions) and, after applying the default ("refgen/v1") in the same
block that sets config.APIVersion, check that config.APIVersion is one of the
allowed values; if not, return an error (similar style to the Resources check)
so typos like "refgen/v2" are surfaced early. Use the existing symbol
config.APIVersion and return fmt.Errorf on invalid values.

In `@pkg/generate/fetcher.go`:
- Around line 220-226: The code in loadResourcesFromFile uses
strings.NewReader(string(data)) which unnecessarily copies the file bytes into a
string; replace that with bytes.NewReader(data) and add "bytes" to the imports
so the yamlv3.NewDecoder reads directly from the byte slice without an extra
allocation. Ensure you update the import block (add bytes) and change the
decoder initialization in loadResourcesFromFile to use bytes.NewReader(data).
- Around line 151-196: The cache can remain nil when no resources are found
causing repeated expensive walks; in loadAllResources ensure f.cache is always a
non-nil slice even when zero results are loaded by initializing or assigning a
zero-length slice before returning (e.g. ensure the local variable loaded is a
non-nil [] *unstructured.Unstructured or set f.cache =
[]*unstructured.Unstructured{} when no objects were appended); update the
assignment to f.cache = loaded (or the empty slice) so subsequent FetchResources
uses the cached empty result instead of re-walking.
- Around line 198-218: findDataRoots currently swallows any Walk error and
returns nil and also continues descending into matched
"cluster-scoped-resources"/"namespaces" subtrees causing a redundant second
traversal by loadAllResources; update findDataRoots to (1) propagate or log Walk
errors instead of returning nil (e.g., return the error to caller or klog.Errorf
with context) so permission/IO errors aren't hidden, and (2) when you detect a
matched dir (in the filepath.Walk callback where filepath.Base(path) ==
"cluster-scoped-resources" || "namespaces"), record the parent into seen/roots
and return filepath.SkipDir from the callback to avoid descending into that
subtree (the existing seen map will prevent duplicates).
- Around line 51-72: FetchResources is using context.TODO() and an unpaginated
List which can hang or OOM for large clusters; change the signature of
ClusterFetcher.FetchResources to accept a context.Context (or create a bounded
context via context.WithTimeout inside the function) and replace context.TODO()
calls with that context when calling
f.dynamicClient.Resource(gvr).Namespace(...).List/ .List(...); additionally,
implement server-side pagination by using metav1.ListOptions{Limit:
<reasonableLimit>, Continue: continueToken} and loop to collect items (or apply
ListOptions with fieldSelector/labelSelector when spec.Namespace or spec.Names
are set to narrow results) so you don’t fetch all items client-side.

In `@pkg/generate/generate.go`:
- Around line 64-85: The loop calling fetcher.FetchResources for each spec
should treat non-required specs as warn-and-continue instead of aborting: inside
the for loop where you call fetcher.FetchResources(spec) (and currently return
fmt.Errorf on error), check spec.Required (or similar field) and if it's false,
log a warning with klog (include spec.Kind/APIVersion and the error), set
resources to an empty slice (or nil) and continue the loop (also append spec to
missingSpecs and do not increment totalResources); if spec.Required is true keep
the existing return-on-error behavior to preserve fail-fast for required specs.
- Around line 31-34: Remove the redundant if o.Verbose guards that wrap
klog.V(1).Infof calls in generate.go: simply delete the surrounding if o.Verbose
{ ... } blocks and leave the klog.V(1).Infof(...) calls (they are already no-ops
when verbosity < 1); specifically update the occurrences around the
klog.V(1).Infof invocations at the locations matching the shown diff (the call
sites that report "Loaded configuration...", "Resources to capture...", and the
other klog.V(1).Infof uses referenced in the review) so the logging calls stand
alone without the extra if guard.

In `@pkg/generate/generator_test.go`:
- Around line 178-422: Add a new subtest inside TestGeneratorGenerate that
creates two ResourceSpec instances with the same Kind (e.g., "Namespace") but
different Required values, calls NewGenerator(...).Generate with a map
containing both specs (each mapping to their own resource list), and asserts
that both resource files (e.g., Namespace/<name>.yaml for each spec's resource)
and metadata.yaml are created; this will exercise the code path that previously
set g.files[safeKind] = nil (see generator.Generate and the
safeKind/sanitizePathSegment logic) to ensure the generator no longer clobbers
files when multiple specs share a Kind.

In `@pkg/generate/generator.go`:
- Around line 84-107: The mergeFieldsToOmit function silently returns the
unmodified default when internal type assertions for items/defaults fail;
instead, make the failure loud so invariants between defaultFieldsToOmit and
mergeFieldsToOmit are enforced: in mergeFieldsToOmit (referencing RefgenConfig,
OmitAnnotations, OmitLabels, items, defaults) replace the early returns on
failed type assertions with explicit panics or a log.Fatal that include a clear
message and the actual unexpected types (e.g., "defaultFieldsToOmit changed:
expected items to be map[string]any but was %T"), so misalignment is detected
immediately rather than silently dropping user omits.
- Around line 254-279: The Generate function redundantly calls filepath.Abs on
g.outputDir again and swallows any error by returning (g.outputDir, nil); remove
the final absPath, err block (the second filepath.Abs and its error branch) and
simply return g.outputDir, nil at the end of Generate (or return the
already-resolved outputAbs earlier) so that g.outputDir remains the canonical
absolute path without discarding errors; update references in Generate and
ensure g.outputDir is used consistently after you set it from outputAbs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: cb9c2c89-f924-46ea-a61c-1b6a9d5a6536

📥 Commits

Reviewing files that changed from the base of the PR and between 670a676 and 65dc104.

📒 Files selected for processing (9)
  • docs/example/generate-config.yaml
  • pkg/compare/compare.go
  • pkg/generate/config.go
  • pkg/generate/config_test.go
  • pkg/generate/fetcher.go
  • pkg/generate/fetcher_test.go
  • pkg/generate/generate.go
  • pkg/generate/generator.go
  • pkg/generate/generator_test.go

Comment thread pkg/generate/fetcher.go
Comment thread pkg/generate/fetcher.go
Comment thread pkg/generate/generator.go
Comment thread pkg/generate/generator.go Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (1)
pkg/generate/generator.go (1)

231-239: sanitizeFilename lacks the ../dot-only guard that sanitizePathSegment has.

A resource name of .. (or something that reduces to .. after the regex/trim) produces filename ..yaml via sanitizeFilename(name) + ".yaml". Since pathWithinOutput still constrains the final path under g.outputDir, this isn't a traversal bug today, but it yields a surprising filename and diverges from the stricter rules in sanitizePathSegment. Consider applying the same dot-only / contains-.. fallback for consistency.

♻️ Suggested tweak
 func sanitizeFilename(name string) string {
 	safe := sanitizePathChars.ReplaceAllString(name, "-")
 	safe = sanitizePathDashes.ReplaceAllString(safe, "-")
-	safe = strings.Trim(safe, "-")
-	if safe == "" {
+	safe = strings.Trim(safe, "-.")
+	if safe == "" || safe == "." || safe == ".." || strings.Contains(safe, "..") {
 		return "unnamed"
 	}
 	return safe
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/generate/generator.go` around lines 231 - 239, sanitizeFilename currently
can return "." or ".." (leading to filenames like "..yaml") and should follow
the same rules as sanitizePathSegment; update sanitizeFilename to detect
dot-only names or any remaining ".." sequences after sanitization (e.g. safe ==
"." || safe == ".." || strings.Contains(safe, "..")) and fallback to "unnamed"
in those cases so it never returns a dot-only or dot-traversal-like filename;
modify sanitizeFilename (the function shown) to perform that additional check
and return "unnamed" when triggered to match sanitizePathSegment behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/generate/generator.go`:
- Around line 374-382: The current collision-probe loop around crPath using
os.Stat silently ignores non-NotExist errors and creates a TOCTOU; replace the
Stat loop with an atomic create attempt using os.OpenFile(...,
os.O_CREATE|os.O_EXCL, 0o644) on the candidate crPath (built from
sanitizeFilename(r.GetName())/kindDir and the counter) and treat a successful
open as claiming the filename (close it and proceed to write), treat EEXIST by
incrementing counter and retrying, and return any other error immediately
instead of looping; this removes the race and prevents infinite loops on other
filesystem errors.
- Around line 258-289: cleanResource currently performs a shallow copy of
obj.Object so mutations to nested maps (metadata, annotations, labels) modify
the caller's Unstructured; to fix, make a deep copy before mutating (e.g., use
obj.DeepCopy().Object or explicitly deep-copy the metadata, annotations and
labels maps) and then remove
resourceVersion/uid/creationTimestamp/generation/managedFields/selfLink and
status from that copy; update references in cleanResource to operate on the
deep-copied maps (metadata, ann, lbl) so callers' obj is not mutated.

---

Nitpick comments:
In `@pkg/generate/generator.go`:
- Around line 231-239: sanitizeFilename currently can return "." or ".."
(leading to filenames like "..yaml") and should follow the same rules as
sanitizePathSegment; update sanitizeFilename to detect dot-only names or any
remaining ".." sequences after sanitization (e.g. safe == "." || safe == ".." ||
strings.Contains(safe, "..")) and fallback to "unnamed" in those cases so it
never returns a dot-only or dot-traversal-like filename; modify sanitizeFilename
(the function shown) to perform that additional check and return "unnamed" when
triggered to match sanitizePathSegment behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 830ba1cc-ec36-4b78-84ae-9c4b25ef8ac5

📥 Commits

Reviewing files that changed from the base of the PR and between 65dc104 and 99ff3e4.

📒 Files selected for processing (9)
  • go.mod
  • pkg/compare/compare.go
  • pkg/compare/compare_test.go
  • pkg/generate/config.go
  • pkg/generate/fetcher.go
  • pkg/generate/fetcher_test.go
  • pkg/generate/generate.go
  • pkg/generate/generator.go
  • pkg/generate/generator_test.go
✅ Files skipped from review due to trivial changes (1)
  • pkg/generate/generator_test.go
🚧 Files skipped from review as they are similar to previous changes (4)
  • pkg/compare/compare.go
  • pkg/generate/fetcher_test.go
  • pkg/generate/generate.go
  • pkg/generate/fetcher.go

Comment thread pkg/generate/generator.go
Comment thread pkg/generate/generator.go
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/generate/fetcher.go`:
- Around line 202-252: loadAllResources checks f.cache but never assigns it, so
every call re-walks and decodes files and an empty archive keeps hitting the
error path; fix by populating f.cache before returning: when len(roots)==0 set
f.cache = []*unstructured.Unstructured{} and return that with the error (to
avoid repeated work), and at the end of successful loading assign f.cache =
loaded (even if empty) and return f.cache; update the function loadAllResources
to set f.cache in these places (and consider any existing mutex protecting
f.cache if present).

In `@pkg/generate/generator.go`:
- Around line 370-403: The reserved-empty-file bug happens because
cleanResource(r, g.fieldsToOmit) and yaml.Marshal are done after the
exclusive-create loop that creates a zero-byte file; move cleanResource and
yaml.Marshal to before the filename-reservation loop (i.e., compute clean :=
cleanResource(...) and data, err := yaml.Marshal(clean) before you call the loop
that uses sanitizeFilename and g.pathWithinOutput to reserve crPath), then enter
the existing reservation loop to reserve a unique crPath and finally write data
with os.WriteFile(crPath, data, 0o600) and append the fileEntry to g.files; keep
using g.pathWithinOutput, sanitizeFilename, yaml.Marshal, os.WriteFile, and
g.files exactly as in the diff.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 6f38ec9a-e79c-4139-9791-0dcdd96c01b3

📥 Commits

Reviewing files that changed from the base of the PR and between 99ff3e4 and a4df30a.

📒 Files selected for processing (5)
  • pkg/generate/config_test.go
  • pkg/generate/fetcher.go
  • pkg/generate/fetcher_test.go
  • pkg/generate/generator.go
  • pkg/generate/generator_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/generate/fetcher_test.go
  • pkg/generate/config_test.go
  • pkg/generate/generator_test.go

Comment thread pkg/generate/fetcher.go
Comment thread pkg/generate/generator.go
@openshift-ci
Copy link
Copy Markdown

openshift-ci Bot commented Apr 17, 2026

@dcritch: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

ok-to-test Indicates a non-member PR verified by an org member that is safe to test.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants