TRT-2518: Add variant OS and JobTier to promotion check#2763
TRT-2518: Add variant OS and JobTier to promotion check#2763xueqzhan wants to merge 1 commit intoopenshift:masterfrom
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
@xueqzhan: This pull request references TRT-2518 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Hello @xueqzhan! Some important instructions when contributing to openshift/api: |
📝 WalkthroughWalkthroughThe pull request modifies the feature gate test analyzer to implement a two-tier validation system distinguishing blocking errors from non-blocking warnings. The JobVariant struct is expanded with OS, JobTiers, and Optional fields to model OS-specific variants and configurable job tiers. Function signatures throughout the codebase are updated to propagate these new parameters: checkIfTestingIsSufficient now returns ValidationResult objects, buildHTMLFeatureGateData accepts blockingErrors explicitly, and Sippy query functions accept os and jobTiers parameters. HTML and Markdown output rendering is updated to display OS and JobTiers information. A new optional RHEL-10 variant is introduced to demonstrate non-blocking variants. 🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
📝 Coding Plan
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.3)Error: build linters: unable to load custom analyzer "kubeapilinter": tools/_output/bin/kube-api-linter.so, plugin: not implemented Comment Tip CodeRabbit can scan for known vulnerabilities in your dependencies using OSV Scanner.OSV Scanner will automatically detect and report security vulnerabilities in your project's dependencies. No additional configuration is required. |
ⓘ You are approaching your monthly quota for Qodo. Upgrade your plan Review Summary by QodoAdd OS variants and JobTier filtering to feature gate promotion analyzer
WalkthroughsDescription• Add OS variant and JobTier support to promotion check validation • Distinguish blocking errors from non-blocking warnings for optional variants • Implement JobTier validation and filtering in Sippy queries • Update HTML output to display OS and JobTier information Diagramflowchart LR
A["Feature Gate Validation"] --> B["Check Testing Sufficiency"]
B --> C["Separate Blocking vs Warnings"]
C --> D["Optional Variants"]
C --> E["Required Variants"]
D --> F["Non-blocking Warnings"]
E --> G["Blocking Errors"]
B --> H["Validate JobTiers"]
H --> I["Generate Sippy Queries"]
I --> J["Query per JobTier"]
B --> K["Update HTML Output"]
K --> L["Display OS and JobTiers"]
File Changes1. tools/codegen/cmd/featuregate-test-analyzer.go
|
Code Review by Qodo
1. JobTier results overwritten
|
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tools/codegen/cmd/featuregate-test-analyzer.go (1)
614-636: Consider updatingOrderedJobVariants.Lessto include new fields.The sorting implementation for
OrderedJobVariantsdoesn't consider the newOS,JobTiers, orOptionalfields. If two variants differ only by these fields, their relative ordering will be non-deterministic. This could lead to inconsistent output ordering across runs.♻️ Suggested fix to include new fields in sorting
func (a OrderedJobVariants) Less(i, j int) bool { if strings.Compare(a[i].Topology, a[j].Topology) < 0 { return true } else if strings.Compare(a[i].Topology, a[j].Topology) > 0 { return false } if strings.Compare(a[i].Cloud, a[j].Cloud) < 0 { return true } else if strings.Compare(a[i].Cloud, a[j].Cloud) > 0 { return false } if strings.Compare(a[i].Architecture, a[j].Architecture) < 0 { return true } else if strings.Compare(a[i].Architecture, a[j].Architecture) > 0 { return false } + if strings.Compare(a[i].NetworkStack, a[j].NetworkStack) < 0 { + return true + } else if strings.Compare(a[i].NetworkStack, a[j].NetworkStack) > 0 { + return false + } + + if strings.Compare(a[i].OS, a[j].OS) < 0 { + return true + } else if strings.Compare(a[i].OS, a[j].OS) > 0 { + return false + } + + if strings.Compare(a[i].JobTiers, a[j].JobTiers) < 0 { + return true + } else if strings.Compare(a[i].JobTiers, a[j].JobTiers) > 0 { + return false + } + return false }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/codegen/cmd/featuregate-test-analyzer.go` around lines 614 - 636, OrderedJobVariants.Less currently only compares Topology, Cloud, and Architecture, causing non-deterministic ordering when OS, JobTiers, or Optional differ; update the Less method to perform additional comparisons after Architecture in this order: compare OS (string), then compare JobTiers (ensure deterministic comparison by joining/sorting slices or comparing lengths then elements), then compare Optional (boolean: false < true), returning true/false appropriately at each step, and finally return false if all compared fields are equal. Ensure you reference the OrderedJobVariants type and its Less method when making changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tools/codegen/cmd/featuregate-test-analyzer_test.go`:
- Line 61: The test calls listTestResultFor() which performs real HTTP requests;
restore the t.Skip (uncomment the existing t.Skip("this is for ease of manual
testing")) or replace external calls with a mocked HTTP transport so CI does not
make network requests; specifically change the test in
featuregate-test-analyzer_test.go to either call t.Skip at the top of the test
or inject a mock http.Client/transport into listTestResultFor (or wrap
listTestResultFor with a helper that accepts an http.Client) and use a fake
roundtripper returning deterministic responses for the dummy feature gate names.
---
Nitpick comments:
In `@tools/codegen/cmd/featuregate-test-analyzer.go`:
- Around line 614-636: OrderedJobVariants.Less currently only compares Topology,
Cloud, and Architecture, causing non-deterministic ordering when OS, JobTiers,
or Optional differ; update the Less method to perform additional comparisons
after Architecture in this order: compare OS (string), then compare JobTiers
(ensure deterministic comparison by joining/sorting slices or comparing lengths
then elements), then compare Optional (boolean: false < true), returning
true/false appropriately at each step, and finally return false if all compared
fields are equal. Ensure you reference the OrderedJobVariants type and its Less
method when making changes.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 41be5d92-a7a4-40ea-90ef-2353a5840568
📒 Files selected for processing (4)
tools/codegen/cmd/featuregate-test-analyzer.gotools/codegen/cmd/featuregate-test-analyzer_test.gotools/codegen/pkg/sippy/json_types.gotools/codegen/pkg/utils/html.go
| for _, tt := range tests { | ||
| t.Run(tt.name, func(t *testing.T) { | ||
| t.Skip("this is for ease of manual testing") | ||
| //t.Skip("this is for ease of manual testing") |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Check git history to understand if this test was always meant to be skipped
git log --oneline -5 -- tools/codegen/cmd/featuregate-test-analyzer_test.go
# Check if there are any comments explaining when this test should run
rg -B2 -A2 "manual testing" tools/codegen/cmd/featuregate-test-analyzer_test.goRepository: openshift/api
Length of output: 324
🏁 Script executed:
# Read the test function to understand what it does
sed -n '57,80p' tools/codegen/cmd/featuregate-test-analyzer_test.go
# Check if listTestResultFor makes actual HTTP requests
rg -B5 -A10 "func listTestResultFor" tools/codegen/cmd/featuregate-test-analyzer_test.go
# Check if there's any mocking or network request setup in the test file
rg "http|mock|Mock|client|Client" tools/codegen/cmd/featuregate-test-analyzer_test.go | head -20Repository: openshift/api
Length of output: 763
🏁 Script executed:
# Find the implementation of listTestResultFor
rg -B2 -A20 "^func listTestResultFor" tools/codegen/cmd/featuregate-test-analyzer_test.go
# Also check if there's a non-test version
rg -B2 -A20 "^func listTestResultFor" tools/codegen/cmd/
# Check the actual commit diff
git show --no-patch --format="%B" cece552dRepository: openshift/api
Length of output: 4226
🏁 Script executed:
# Check the full commit diff to see if the skip was intentionally removed
git show cece552d -- tools/codegen/cmd/featuregate-test-analyzer_test.go | head -100
# Check if there are any test skip markers or CI configuration
rg "skip|Skip" tools/codegen/cmd/featuregate-test-analyzer_test.goRepository: openshift/api
Length of output: 2568
Restore t.Skip or add mocking to prevent external network requests in CI.
The t.Skip comment was added (while commented out) in the initial commit of this new test file. This test calls listTestResultFor(), which makes actual HTTP requests to sippy.dptools.openshift.org using real TLS configuration. The comment indicates this test is "for ease of manual testing only," yet the skip is commented out, causing the test to execute in CI and make real network calls with dummy feature gate names. This creates risk of CI flakiness or unexpected behavior. Either restore the skip or mock the HTTP transport.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/codegen/cmd/featuregate-test-analyzer_test.go` at line 61, The test
calls listTestResultFor() which performs real HTTP requests; restore the t.Skip
(uncomment the existing t.Skip("this is for ease of manual testing")) or replace
external calls with a mocked HTTP transport so CI does not make network
requests; specifically change the test in featuregate-test-analyzer_test.go to
either call t.Skip at the top of the test or inject a mock http.Client/transport
into listTestResultFor (or wrap listTestResultFor with a helper that accepts an
http.Client) and use a fake roundtripper returning deterministic responses for
the dummy feature gate names.
| // Parse JobTiers - comma-separated string, default to standard/informing/blocking if empty | ||
| var jobTiersList []string | ||
| if jobTiers == "" { | ||
| jobTiersList = []string{"standard", "informing", "blocking"} | ||
| } else { | ||
| // Split by comma and trim whitespace | ||
| for _, tier := range strings.Split(jobTiers, ",") { | ||
| if trimmed := strings.TrimSpace(tier); trimmed != "" { | ||
| jobTiersList = append(jobTiersList, trimmed) | ||
| } | ||
| } | ||
| } | ||
|
|
||
| // Generate one query per JobTier (to work around API's single LinkOperator limitation) | ||
| var queries []*SippyQueryStruct | ||
| for _, jobTier := range jobTiersList { | ||
| // Copy base items for this query | ||
| items := make([]SippyQueryItem, len(baseItems)) | ||
| copy(items, baseItems) | ||
|
|
||
| // Add JobTier filter | ||
| items = append(items, SippyQueryItem{ | ||
| ColumnField: "variants", | ||
| Not: false, | ||
| OperatorValue: "contains", | ||
| Value: fmt.Sprintf("JobTier:%s", jobTier), | ||
| }) | ||
|
|
||
| queries = append(queries, &SippyQueryStruct{ | ||
| Items: items, | ||
| LinkOperator: "and", | ||
| }) | ||
| } |
There was a problem hiding this comment.
1. Jobtier results overwritten 🐞 Bug ✓ Correctness
With JobTier support, a single variant now triggers multiple Sippy queries (one per tier), but listTestResultForVariant overwrites per-test run counts when CurrentRuns meets the threshold, so the final totals depend on query order and can be wrong. This can incorrectly pass/fail the promotion sufficiency checks for a feature gate.
Agent Prompt
### Issue description
`QueriesFor` now emits multiple Sippy queries per variant (one per JobTier). `listTestResultForVariant` iterates over those queries but overwrites `TotalRuns/SuccessfulRuns/...` when `CurrentRuns >= requiredNumberOfTestRunsPerVariant`, causing incorrect totals across tiers.
### Issue Context
This PR intentionally queries each JobTier separately and merges results. The merge logic must therefore be additive across queries.
### Fix Focus Areas
- tools/codegen/cmd/featuregate-test-analyzer.go[876-897]
- tools/codegen/pkg/sippy/json_types.go[101-133]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| func validateJobTiers(jobVariant JobVariant) error { | ||
| if jobVariant.JobTiers == "" { | ||
| return nil // Empty is valid - will default to standard,informing,blocking | ||
| } | ||
|
|
||
| validTiers := map[string]bool{ | ||
| "standard": true, | ||
| "informing": true, | ||
| "blocking": true, | ||
| } | ||
|
|
||
| for _, tier := range strings.Split(jobVariant.JobTiers, ",") { | ||
| tier = strings.TrimSpace(tier) | ||
| if tier != "" && !validTiers[tier] { | ||
| return fmt.Errorf("invalid JobTier %q in variant %+v - must be one of: standard, informing, blocking", tier, jobVariant) | ||
| } | ||
| } | ||
| return nil | ||
| } |
There was a problem hiding this comment.
2. Empty jobtier skips queries 🐞 Bug ✓ Correctness
JobTier strings containing only commas/whitespace (e.g. "," or " , ") pass validateJobTiers but produce an empty jobTiersList, causing QueriesFor to return zero queries and listTestResultForVariant to make no API calls. This silently returns empty test results and can wrongly report insufficient testing.
Agent Prompt
### Issue description
Malformed `JobTiers` strings like `","` or whitespace-only values currently pass validation but result in *zero* JobTier filters, producing zero Sippy queries and silently empty results.
### Issue Context
`validateJobTiers` only rejects unknown non-empty tiers. `QueriesFor` drops empty trimmed tiers and does not apply a default if the resulting list is empty.
### Fix Focus Areas
- tools/codegen/cmd/featuregate-test-analyzer.go[667-685]
- tools/codegen/pkg/sippy/json_types.go[101-113]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| //t.Skip("this is for ease of manual testing") | ||
|
|
||
| got, err := listTestResultFor(tt.args.featureGate, sets.New[string](tt.args.clusterProfile)) | ||
| if (err != nil) != tt.wantErr { |
There was a problem hiding this comment.
3. Network test enabled 🐞 Bug ⛯ Reliability
Test_listTestResultFor is no longer skipped and calls listTestResultFor, which performs live HTTPS requests to sippy.dptools.openshift.org. This makes unit tests non-hermetic and likely flaky/slow in CI environments.
Agent Prompt
### Issue description
A test that performs live network calls to Sippy is now enabled by default, which is unsuitable for unit test suites and CI.
### Issue Context
The tested functions call out to `https://sippy.dptools.openshift.org/api/tests`.
### Fix Focus Areas
- tools/codegen/cmd/featuregate-test-analyzer_test.go[59-66]
- tools/codegen/cmd/featuregate-test-analyzer.go[801-864]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
|
@xueqzhan: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| for _, tt := range tests { | ||
| t.Run(tt.name, func(t *testing.T) { | ||
| t.Skip("this is for ease of manual testing") | ||
| //t.Skip("this is for ease of manual testing") |
There was a problem hiding this comment.
Did we mean to comment this out?
| Architecture: "amd64", | ||
| Topology: "ha", | ||
| OS: "rhel10", | ||
| Optional: true, // RHEL 10 is optional in 4.22, will be required in OCP 5 |
There was a problem hiding this comment.
Will optionality be represented on the rendered HTML page?
Adds support for OS variants (RHEL 10) and JobTier filtering to the feature gate promotion analyzer.