OCPBUGS-49738: fix handling of host conflict #736

Open
jcmoraisjr wants to merge 1 commit into openshift:master from jcmoraisjr:OCPBUGS-49738-fix-routemap

Conversation

@jcmoraisjr
Member

@jcmoraisjr jcmoraisjr commented Feb 23, 2026

When allowWildcardRoutes is enabled, routes are grouped by host in three distinct hashmaps. These hashmaps are updated whenever routes are added, modified, or deleted, and their content is used to evaluate host conflicts. The hashmaps are handled using the current host of the route resource as the key, so if the host value changes, the router cannot find the existing entry: it looks under the current host, which has not been stored yet, instead of the former one.

The approach taken in this update is to ignore the host carried by the route and remove the resource from all the hosts instead, since the only match across the hashmaps should be found under the key corresponding to the former host value.
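
The stale-key problem described above can be illustrated with a minimal sketch. It is not the router's code: the `route` and `routeMap` types and the `removeByKey` and `removeFromAllKeys` helpers are hypothetical stand-ins, reduced to a single map so the effect of looking up the wrong key is visible.

```go
package main

import (
	"fmt"
	"slices"
)

// route is a hypothetical stand-in for routev1.Route; only the fields
// needed for the illustration are kept.
type route struct {
	UID  string
	Host string
}

// routeMap groups routes by host, loosely mirroring the maps the PR describes.
type routeMap map[string][]*route

// insert records r under the given host key.
func (m routeMap) insert(key string, r *route) {
	m[key] = append(m[key], r)
}

// removeByKey deletes r only from the list stored under key. If key is the
// route's new host, the entry stored under the old host is never found.
func (m routeMap) removeByKey(key string, r *route) {
	m[key] = slices.DeleteFunc(m[key], func(e *route) bool { return e.UID == r.UID })
	if len(m[key]) == 0 {
		delete(m, key)
	}
}

// removeFromAllKeys deletes r from every host list, so it does not matter
// which host the route carried when it was inserted.
func (m routeMap) removeFromAllKeys(r *route) {
	for key := range m {
		m.removeByKey(key, r)
	}
}

func main() {
	claimed := routeMap{}
	r := &route{UID: "1", Host: "old.example.com"}
	claimed.insert(r.Host, r)

	// The route is modified: the resource now carries the new host.
	r.Host = "new.example.com"

	claimed.removeByKey(r.Host, r)
	fmt.Println(len(claimed["old.example.com"])) // prints 1: the stale entry survives

	claimed.removeFromAllKeys(r)
	fmt.Println(len(claimed["old.example.com"])) // prints 0: the stale entry is gone
}
```

The trade-off of the second helper is that it visits every key in the map, which is the cost the reviewers discuss later in the thread.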

Summary by CodeRabbit

  • Bug Fixes

    • Prevented stale host and wildcard conflicts when routes are updated, removed, or reassigned.
  • Refactor

    • Centralized and simplified route/host cleanup so claiming and release behavior is more consistent.
  • Tests

    • Added coverage for wildcard host updates and lifecycle (add → modify → delete) to ensure reclaiming hosts works without errors.

@openshift-ci-robot openshift-ci-robot added jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Feb 23, 2026
@openshift-ci-robot
Contributor

@jcmoraisjr: This pull request references Jira Issue OCPBUGS-49738, which is invalid:

  • expected the bug to target the "4.22.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot requested review from gcs278 and knobunc February 23, 2026 22:12
@jcmoraisjr
Member Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Feb 24, 2026
@openshift-ci-robot
Contributor

@jcmoraisjr: This pull request references Jira Issue OCPBUGS-49738, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.22.0) matches configured target version for branch (4.22.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @melvinjoseph86

@alebedev87
Contributor

/assign @gcs278

@melvinjoseph86

melvinjoseph86 commented Feb 25, 2026

I was able to reproduce the 'HostAlreadyClaimed' issue on an unfixed build, and I also ran the same test on a cluster built from this PR.

➜  oc get clusterversion
NAME      VERSION                                                AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.22.0-0-2026-02-25-140247-test-ci-ln-8iitiib-latest   True        False         130m    Cluster version is 4.22.0-0-2026-02-25-140247-test-ci-ln-8iitiib-latest

➜  oc patch ingresscontroller default -n openshift-ingress-operator --type=merge -p '{"spec":{"routeAdmission":{"wildcardPolicy":"WildcardsAllowed"}}}'
ingresscontroller.operator.openshift.io/default patched

➜  oc create -f webrc.yaml 
replicationcontroller/web-server-rc created
service/service-secure created
service/service-unsecure created

➜  oc expose service service-unsecure --name httpd-cm1                                                                                            
route.route.openshift.io/httpd-cm1 exposed
➜  oc expose service service-unsecure --name httpd-cm2
route.route.openshift.io/httpd-cm2 exposed
➜  oc get svc
NAME               TYPE           CLUSTER-IP      EXTERNAL-IP                            PORT(S)     AGE
kubernetes         ClusterIP      172.30.0.1      <none>                                 443/TCP     168m
openshift          ExternalName   <none>          kubernetes.default.svc.cluster.local   <none>      163m
service-secure     ClusterIP      172.30.68.65    <none>                                 27443/TCP   106s
service-unsecure   ClusterIP      172.30.131.71   <none>                                 27017/TCP   105s

➜  oc get routes
NAME        HOST/PORT                                                           PATH   SERVICES           PORT   TERMINATION   WILDCARD
httpd-cm1   httpd-cm1-default.apps.ci-ln-8iitiib-76ef8.aws-4.ci.openshift.org          service-unsecure   http                 None
httpd-cm2   httpd-cm2-default.apps.ci-ln-8iitiib-76ef8.aws-4.ci.openshift.org          service-unsecure   http                 None

➜  oc patch route httpd-cm1 --type=merge -p '{"spec":{"host":"httpd-cm2-default.apps.ci-ln-8iitiib-76ef8.aws-4.ci.openshift.org"}}' 
route.route.openshift.io/httpd-cm1 patched
➜  oc patch route httpd-cm2 --type=merge -p '{"spec":{"host":"httpd-cm1-default.apps.ci-ln-8iitiib-76ef8.aws-4.ci.openshift.org"}}'
route.route.openshift.io/httpd-cm2 patched

➜  oc get routes  
NAME        HOST/PORT                                                           PATH   SERVICES           PORT   TERMINATION   WILDCARD
httpd-cm1   httpd-cm2-default.apps.ci-ln-8iitiib-76ef8.aws-4.ci.openshift.org          service-unsecure   http                 None
httpd-cm2   httpd-cm1-default.apps.ci-ln-8iitiib-76ef8.aws-4.ci.openshift.org          service-unsecure   http                 None
➜  oc get routes -o json | jq -r '.items[] |.metadata.name,.spec.host,.status.ingress[].conditions[].message'
httpd-cm1
httpd-cm2-default.apps.ci-ln-8iitiib-76ef8.aws-4.ci.openshift.org
null
httpd-cm2
httpd-cm1-default.apps.ci-ln-8iitiib-76ef8.aws-4.ci.openshift.org

Hence marking as verified
/verified by @mjoseph
@jcmoraisjr Will the existing e2e tests cover this change, or do we need to modify or add one?

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Feb 25, 2026
@openshift-ci-robot
Contributor

@melvinjoseph86: This PR has been marked as verified by @mjoseph.

@jcmoraisjr
Member Author

@melvinjoseph86 I'm not sure we have an e2e test that covers this scenario; it was unknown to us until now. One would need to be added if we want the coverage.

@gcs278
Contributor

gcs278 commented Apr 22, 2026

@CodeRabbit can you review this PR?

@coderabbitai

coderabbitai Bot commented Apr 22, 2026

@gcs278 Sure, I'll start reviewing this PR right away!

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai

coderabbitai Bot commented Apr 22, 2026

Warning

Rate limit exceeded

@jcmoraisjr has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 0 minutes and 5 seconds before requesting another review.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 05ef1d1a-6e06-4d84-a4c5-cce0f6c96db4

📥 Commits

Reviewing files that changed from the base of the PR and between 5f8894c and 31a6c97.

📒 Files selected for processing (2)
  • pkg/router/controller/host_admitter.go
  • pkg/router/controller/host_admitter_test.go

Walkthrough

Route host tracking and removal were reworked: RouteMap.RemoveRoute no longer returns a boolean and uses slices.DeleteFunc. HostAdmitter gained a routeHosts cache keyed by types.UID; addRoute now pre-removes previously cached hosts from claimed/blocked maps before (re)claiming, and tests were expanded to cover updates.

Changes

Cohort / File(s) and summary:

  • Host admitter implementation (pkg/router/controller/host_admitter.go): Changed RouteMap.RemoveRoute signature/impl to use slices.DeleteFunc and drop the boolean return. Added routeHosts map[types.UID]string to HostAdmitter. HostAdmitter.addRoute now checks routeHosts for prior host (by UID) and proactively removes that host (and its wildcard key) from claimedHosts, blockedWildcards, and claimedWildcards before performing current policy claiming. Removed prior per-policy cleanup lines since cache-driven pre-removal handles cross-map removals.
  • Host admitter tests (pkg/router/controller/host_admitter_test.go): Switched to require assertions. Reworked TestWildcardHostDeny to use wildcardAdmitter(true) and fully populate test RouteSpec (including ObjectMeta.UID, Host, Path, WildcardPolicy). Added TestWildcardHostUpdate to exercise add→modify→delete flows and verify a host can be reclaimed after updates/deletes, asserting no errors throughout.
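
The cache-driven pre-removal summarized in this walkthrough can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the PR's implementation: the `route` and `admitter` types are hypothetical, a plain string stands in for types.UID, and only one claim map is modeled instead of three. The one real API relied on is `slices.DeleteFunc` from the Go standard library (Go 1.21 or later).

```go
package main

import (
	"fmt"
	"slices"
)

// route is a hypothetical stand-in for routev1.Route.
type route struct {
	UID  string // plays the role of types.UID
	Host string
}

// admitter is a simplified, hypothetical HostAdmitter with a single claim map.
type admitter struct {
	claimedHosts map[string][]route
	routeHosts   map[string]string // UID -> host last claimed by that route
}

func newAdmitter() *admitter {
	return &admitter{
		claimedHosts: map[string][]route{},
		routeHosts:   map[string]string{},
	}
}

// removeRoute drops every entry with the given UID from the list under key.
func (a *admitter) removeRoute(key, uid string) {
	routes := slices.DeleteFunc(a.claimedHosts[key], func(e route) bool { return e.UID == uid })
	if len(routes) == 0 {
		delete(a.claimedHosts, key)
		return
	}
	a.claimedHosts[key] = routes
}

// addRoute releases the host cached for the route's UID, then claims the
// host the route carries now and refreshes the cache.
func (a *admitter) addRoute(r route) {
	if old, ok := a.routeHosts[r.UID]; ok {
		a.removeRoute(old, r.UID)
	}
	a.claimedHosts[r.Host] = append(a.claimedHosts[r.Host], r)
	a.routeHosts[r.UID] = r.Host
}

// deleteRoute releases the cached host and forgets the route.
func (a *admitter) deleteRoute(r route) {
	if old, ok := a.routeHosts[r.UID]; ok {
		a.removeRoute(old, r.UID)
		delete(a.routeHosts, r.UID)
	}
}

func main() {
	a := newAdmitter()
	a.addRoute(route{UID: "1", Host: "a.example.com"})
	a.addRoute(route{UID: "1", Host: "b.example.com"}) // host changed
	fmt.Println(len(a.claimedHosts["a.example.com"]), len(a.claimedHosts["b.example.com"])) // prints "0 1"
	a.deleteRoute(route{UID: "1", Host: "b.example.com"})
	fmt.Println(len(a.claimedHosts)) // prints 0
}
```

Because the old host comes from the cache rather than from the incoming resource, the lookup touches one key per update instead of scanning the whole map.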

Sequence Diagram(s)

sequenceDiagram
    participant K8sWatch
    participant HostAdmitter
    participant RouteMap
    participant ClaimMaps as claimed/blocked maps

    K8sWatch->>HostAdmitter: Added/Modified Route (with UID, Host, WildcardPolicy)
    HostAdmitter->>HostAdmitter: lookup routeHosts[UID] -> oldHost?
    alt oldHost exists
        HostAdmitter->>RouteMap: RemoveRoute(keyFor(oldHost), route)
        HostAdmitter->>ClaimMaps: remove oldHost and wildcardKey from claimedHosts/blockedWildcards/claimedWildcards
    end
    HostAdmitter->>HostAdmitter: apply current WildcardPolicy -> attempt claim
    HostAdmitter->>RouteMap: Insert/claim new host entry
    HostAdmitter->>HostAdmitter: update routeHosts[UID] = newHost
    Note over HostAdmitter,ClaimMaps: On delete events, similar pre-removal uses cached host then clears routeHosts[UID]

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 10 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

  • Docstring Coverage (⚠️ Warning): Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. Resolution: Write docstrings for the functions missing them to satisfy the coverage threshold.
  • Test Structure And Quality (❓ Inconclusive): Test file uses standard Go testing package with require assertions, not Ginkgo BDD framework with Describe/Context/It blocks. Resolution: Clarify if custom check applies to standard Go unit tests or Ginkgo tests. Current file contains standard Go tests, not Ginkgo tests as required by check instructions.
✅ Passed checks (10 passed)
  • Description Check (✅ Passed): Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check (✅ Passed): The PR title directly addresses the main change: fixing host conflict handling in the route admitter by improving how hosts are removed when they change.
  • Linked Issues check (✅ Passed): Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check (✅ Passed): Check skipped because no linked issues were found for this pull request.
  • Stable And Deterministic Test Names (✅ Passed): The codebase uses Go's standard testing package, not Ginkgo. Test names are static and deterministic without dynamic information.
  • Microshift Test Compatibility (✅ Passed): The custom check applies only to new Ginkgo e2e tests. The PR adds standard Go unit tests using the testing package with no Ginkgo patterns, so the check is not applicable.
  • Single Node Openshift (Sno) Test Compatibility (✅ Passed): This PR adds unit tests to pkg/router/controller/host_admitter_test.go using standard Go testing framework, not Ginkgo e2e tests, so the custom check does not apply.
  • Topology-Aware Scheduling Compatibility (✅ Passed): PR modifies only route admission control logic in host_admitter.go with no deployment manifests, scheduling constraints, or topology-aware scheduling code.
  • Ote Binary Stdout Contract (✅ Passed): Modified files are standard Go package code and unit tests using the testing package with no stdout writes at process level.
  • Ipv6 And Disconnected Network Test Compatibility (✅ Passed): The changes to host_admitter_test.go are unit tests using standard Go testing package, not Ginkgo e2e tests. No IPv4 assumptions, hardcoded IPs, or external connectivity requirements were found.

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.



@openshift-ci-robot openshift-ci-robot added jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. and removed jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. labels Apr 22, 2026
@openshift-ci-robot
Contributor

@jcmoraisjr: This pull request references Jira Issue OCPBUGS-49738, which is invalid:

  • expected the bug to target either version "5.0." or "openshift-5.0.", but it targets "4.22.0" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

Comment thread pkg/router/controller/host_admitter.go

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/router/controller/host_admitter.go (1)

172-210: ⚠️ Potential issue | 🟠 Major

Move stale-entry cleanup before conflict detection.

Line 174 can return before Lines 207-210 run. If a route changes host and the new host is rejected as already claimed, its old host entry remains in the maps, so the original stale-claim bug still survives on the rejection path.

🐛 Proposed fix
 func (p *HostAdmitter) addRoute(route *routev1.Route) error {
+	// Remove old entries for this route from all hosts before conflict checks;
+	// this handles host changes even when the new host is rejected.
+	p.claimedHosts.RemoveRouteFromAllKeys(route)
+	p.blockedWildcards.RemoveRouteFromAllKeys(route)
+	p.claimedWildcards.RemoveRouteFromAllKeys(route)
+
 	// Find displaced routes (or error if an existing route displaces us)
 	displacedRoutes, err, ownerNamespace := p.displacedRoutes(route)
 	if err != nil {
 		msg := fmt.Sprintf("a route in another namespace holds host %s", route.Spec.Host)
 		if ownerNamespace == route.Namespace {
@@
 		p.plugin.HandleRoute(watch.Deleted, displacedRoute)
 	}
 
-	// Remove old entries for this route from all hosts, this handles the change of the host value
-	p.claimedHosts.RemoveRouteFromAllKeys(route)
-	p.blockedWildcards.RemoveRouteFromAllKeys(route)
-	p.claimedWildcards.RemoveRouteFromAllKeys(route)
-
 	if len(route.Spec.WildcardPolicy) == 0 {
 		route.Spec.WildcardPolicy = routev1.WildcardPolicyNone
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/router/controller/host_admitter.go` around lines 172 - 210, The stale
host entries cleanup must run before checking/early-returning on displaced
routes so a rejected new-host doesn't leave the old host claimed; move the calls
to p.claimedHosts.RemoveRouteFromAllKeys(route),
p.blockedWildcards.RemoveRouteFromAllKeys(route), and
p.claimedWildcards.RemoveRouteFromAllKeys(route) to just before invoking
p.displacedRoutes(route) (i.e., run cleanup on the incoming route first), then
call p.displacedRoutes(route) and proceed with the existing error handling and
displacedRoute removals as-is; ensure you reference all three
RemoveRouteFromAllKeys methods and keep the displacedRoutes/error flow unchanged
besides relocating the cleanup.
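
The ordering concern raised in this comment can be reproduced in a small model. The sketch below is an assumption-laden simplification, not the router's code: the `admitter` type keeps one owner per host, `cleanFirst` is a hypothetical switch used only to compare the two orderings, and the conflict rule is reduced to "a different UID already owns the host".

```go
package main

import (
	"errors"
	"fmt"
)

// route is a hypothetical stand-in for routev1.Route.
type route struct {
	UID  string
	Host string
}

// admitter keeps a single owner per host, a simplification of the real maps.
type admitter struct {
	owner      map[string]string // host -> UID of the owning route
	cleanFirst bool              // hypothetical switch: release old claims before the conflict check
}

var errClaimed = errors.New("host already claimed")

// release drops every host currently owned by uid.
func (a *admitter) release(uid string) {
	for host, owner := range a.owner {
		if owner == uid {
			delete(a.owner, host)
		}
	}
}

// addRoute claims r.Host for r.UID unless another route owns it.
func (a *admitter) addRoute(r route) error {
	if a.cleanFirst {
		a.release(r.UID)
	}
	if owner, ok := a.owner[r.Host]; ok && owner != r.UID {
		// Early return: without cleanFirst, the old host stays claimed.
		return errClaimed
	}
	a.release(r.UID)
	a.owner[r.Host] = r.UID
	return nil
}

// staleAfterRejection reports whether route 1 still owns its old host after
// its move onto route 2's host has been rejected.
func staleAfterRejection(cleanFirst bool) bool {
	a := &admitter{owner: map[string]string{}, cleanFirst: cleanFirst}
	_ = a.addRoute(route{UID: "1", Host: "a.example.com"})
	_ = a.addRoute(route{UID: "2", Host: "b.example.com"})
	err := a.addRoute(route{UID: "1", Host: "b.example.com"}) // rejected in both modes
	_, stale := a.owner["a.example.com"]
	return err != nil && stale
}

func main() {
	fmt.Println(staleAfterRejection(false)) // prints true: cleanup after the check leaves a stale claim
	fmt.Println(staleAfterRejection(true))  // prints false: cleanup before the check releases it
}
```

Whether a rejected route should give up its former host is a design decision for the maintainers; the sketch only shows that the position of the cleanup determines the outcome.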
🧹 Nitpick comments (1)
pkg/router/controller/host_admitter.go (1)

38-42: Consider avoiding full-map scans on every route admission.

This helper is O(total routes) per call; used on every add/modify, bulk syncs can become O(n²). If route volume is high, consider tracking previous keys per namespace/name or limiting the full scan to update paths that can actually have stale entries.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/router/controller/host_admitter.go` around lines 38 - 42,
RemoveRouteFromAllKeys currently scans the entire RouteMap on every call which
is O(total routes) and leads to quadratic behavior during bulk syncs; change the
implementation to avoid full-map scans by tracking previously associated keys
per route (e.g., maintain a map[string]map[string]struct{} keyed by
namespace/name -> set of keys) so
RemoveRouteFromAllKeys(RouteMap.RemoveRouteFromAllKeys) can iterate only the
known keys for the given route (routev1.Route) and call RouteMap.RemoveRoute for
those entries, or alternatively update callers to pass the specific keys to
remove rather than relying on a global scan.
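
One way to read the suggestion above is a reverse index from route to the keys it occupies. The following is a minimal sketch under that reading: `indexedRouteMap`, its fields, and the "namespace/name" identifier format are all hypothetical, and the router's real RouteMap stores route objects rather than identifiers.

```go
package main

import "fmt"

// indexedRouteMap pairs the host -> routes map with a reverse index.
// Both the type and the "namespace/name" identifier format are hypothetical.
type indexedRouteMap struct {
	byHost  map[string]map[string]struct{} // host key -> set of route ids
	byRoute map[string]map[string]struct{} // route id -> set of host keys
}

func newIndexedRouteMap() *indexedRouteMap {
	return &indexedRouteMap{
		byHost:  map[string]map[string]struct{}{},
		byRoute: map[string]map[string]struct{}{},
	}
}

// insert records the route id under key and remembers the key for the route.
func (m *indexedRouteMap) insert(key, id string) {
	if m.byHost[key] == nil {
		m.byHost[key] = map[string]struct{}{}
	}
	m.byHost[key][id] = struct{}{}
	if m.byRoute[id] == nil {
		m.byRoute[id] = map[string]struct{}{}
	}
	m.byRoute[id][key] = struct{}{}
}

// removeFromAllKeys removes id from every key it was recorded under and
// returns how many keys were visited.
func (m *indexedRouteMap) removeFromAllKeys(id string) int {
	visited := 0
	for key := range m.byRoute[id] {
		visited++
		delete(m.byHost[key], id)
		if len(m.byHost[key]) == 0 {
			delete(m.byHost, key)
		}
	}
	delete(m.byRoute, id)
	return visited
}

func main() {
	m := newIndexedRouteMap()
	for i := 0; i < 1000; i++ {
		m.insert(fmt.Sprintf("host-%d.example.com", i), fmt.Sprintf("ns/route-%d", i))
	}
	m.insert("old.example.com", "ns/moved")
	// Only the single key recorded for ns/moved is visited, not all 1001 hosts.
	fmt.Println(m.removeFromAllKeys("ns/moved"), len(m.byHost)) // prints "1 1000"
}
```

The cost is a second map that must be kept consistent with the first on every insert and removal.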

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: a4014cee-4bb7-40ef-87a0-3c4f7d28ca9b

📥 Commits

Reviewing files that changed from the base of the PR and between d8ed355 and d6ed14d.

📒 Files selected for processing (1)
  • pkg/router/controller/host_admitter.go

@jcmoraisjr jcmoraisjr force-pushed the OCPBUGS-49738-fix-routemap branch from d6ed14d to daab4aa Compare April 23, 2026 13:05
@openshift-ci-robot openshift-ci-robot removed the verified Signifies that the PR passed pre-merge verification criteria label Apr 23, 2026
@jcmoraisjr
Member Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels Apr 23, 2026
@openshift-ci-robot
Contributor

@jcmoraisjr: This pull request references Jira Issue OCPBUGS-49738, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @melvinjoseph86

@openshift-ci
Contributor

openshift-ci Bot commented Apr 23, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from gcs278. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/router/controller/host_admitter_test.go (1)

157-259: ⚠️ Potential issue | 🟡 Minor

Avoid masking host conflicts with zero UIDs after enabling wildcard tracking.

Line 159 now enables the RouteMap conflict path, but these table fixtures still share the empty UID. HostAdmitter.displacedRoutes skips same-UID routes in pkg/router/controller/host_admitter.go:259-262, so duplicate-host cases can pass as “same route” rather than validating admission behavior.

🧪 Proposed test isolation fix
 func TestWildcardHostDeny(t *testing.T) {
-	p := &fakePlugin{}
-	admitter := NewHostAdmitter(p, wildcardAdmitter, true, false, LogRejections)
 	tests := []struct {
 		name   string
 		host   string
@@
 
 	for _, tc := range tests {
-		route := &routev1.Route{
-			ObjectMeta: metav1.ObjectMeta{
-				Name:      tc.name,
-				Namespace: "deny",
-			},
-			Spec: routev1.RouteSpec{Host: tc.host, WildcardPolicy: tc.policy},
-		}
+		t.Run(tc.name, func(t *testing.T) {
+			p := &fakePlugin{}
+			admitter := NewHostAdmitter(p, wildcardAdmitter, true, false, LogRejections)
+			route := &routev1.Route{
+				ObjectMeta: metav1.ObjectMeta{
+					Name:      tc.name,
+					Namespace: "deny",
+					UID:       types.UID(tc.name),
+				},
+				Spec: routev1.RouteSpec{Host: tc.host, WildcardPolicy: tc.policy},
+			}
 
-		err := admitter.HandleRoute(watch.Added, route)
-		if tc.errors {
-			if err == nil {
-				t.Fatalf("Test case %s expected errors, got none", tc.name)
-			}
-		} else {
-			if err != nil {
-				t.Fatalf("Test case %s expected no errors, got %v", tc.name, err)
+			err := admitter.HandleRoute(watch.Added, route)
+			if tc.errors {
+				if err == nil {
+					t.Fatalf("Test case %s expected errors, got none", tc.name)
+				}
+			} else {
+				if err != nil {
+					t.Fatalf("Test case %s expected no errors, got %v", tc.name, err)
+				}
 			}
-		}
+		})
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/router/controller/host_admitter_test.go` around lines 157 - 259, The test
uses multiple Route fixtures with empty ObjectMeta.UIDs, which masks conflict
detection because HostAdmitter.displacedRoutes ignores routes with the same UID;
update TestWildcardHostDeny to give each test route a unique UID (e.g.,
route.ObjectMeta.UID = types.UID(fmt.Sprintf("%s-uid", tc.name)) or similar)
before calling NewHostAdmitter/HandleRoute so duplicate-host cases are evaluated
as distinct routes and the wildcard conflict path is exercised correctly.
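
The masking effect described in this comment can be shown with a reduced conflict check. The `conflicts` function below is a hypothetical simplification: it only models the "skip routes with the same UID" rule that the review attributes to HostAdmitter.displacedRoutes, and ignores namespaces, paths, timestamps, and wildcard policy.

```go
package main

import "fmt"

// route is a hypothetical stand-in for routev1.Route.
type route struct {
	Name string
	UID  string
	Host string
}

// conflicts reports whether candidate collides with an already admitted route
// on the same host. Routes sharing the candidate's UID are treated as the
// same resource and skipped.
func conflicts(admitted []route, candidate route) bool {
	for _, r := range admitted {
		if r.UID == candidate.UID {
			continue // same UID: assumed to be an update of the same route
		}
		if r.Host == candidate.Host {
			return true
		}
	}
	return false
}

func main() {
	// Fixtures without UIDs: both routes carry the empty UID, so the
	// duplicate host is not reported.
	admitted := []route{{Name: "first", Host: "www.example.com"}}
	fmt.Println(conflicts(admitted, route{Name: "second", Host: "www.example.com"})) // prints false

	// Fixtures with distinct UIDs: the duplicate host is reported.
	admitted = []route{{Name: "first", UID: "first", Host: "www.example.com"}}
	fmt.Println(conflicts(admitted, route{Name: "second", UID: "second", Host: "www.example.com"})) // prints true
}
```

Deriving the UID from the test case name, as the proposed fix does, is one simple way to keep fixtures distinct.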
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/router/controller/host_admitter_test.go`:
- Around line 262-329: TestWildcardHostUpdate reuses one admitter and duplicate
UIDs across subtests and doesn't validate cleanup of route2's old host key;
update the test to create a fresh NewHostAdmitter per t.Run and assign unique
UIDs for each route (avoid reusing "1"/"2"), and add a third route (e.g.,
route3) that uses tc.oldHostRoute2 before modifying route2, call
admitter.HandleRoute(watch.Added, route3) to ensure the map entry for
oldHostRoute2 is exercised, then perform the Modified updates for route1/route2
with unique UIDs to assert the cleanup behavior implemented around
RemoveRouteFromAllKeys (host_admitter.go lines ~211-213) is covered.

ℹ️ Review info
📥 Commits

Reviewing files that changed from the base of the PR and between d6ed14d and daab4aa.

📒 Files selected for processing (2)
  • pkg/router/controller/host_admitter.go
  • pkg/router/controller/host_admitter_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/router/controller/host_admitter.go

Comment thread pkg/router/controller/host_admitter_test.go
@jcmoraisjr jcmoraisjr force-pushed the OCPBUGS-49738-fix-routemap branch from daab4aa to fe0db75 Compare April 23, 2026 13:40
@melvinjoseph86

/retest-required

Contributor

@gcs278 gcs278 left a comment

Nice PR - generally looks good, I just have a concern about performance and a minor unit test coverage addition.

require.NoError(t, err)
err = admitter.HandleRoute(watch.Modified, route2)
require.NoError(t, err)
})
Contributor

nit: For complete coverage, consider adding a delete-then-recreate step here, with another route that uses the same hostname, since the bug https://redhat.atlassian.net/browse/OCPBUGS-49738 called out a "phantom" hostname claim that persisted even after the route was deleted.

Suggested change
})
route3name := tc.namePrefix + "-route3"
route3 := &routev1.Route{
ObjectMeta: metav1.ObjectMeta{
Name: route3name,
Namespace: "host-update",
UID: types.UID(route3name),
},
Spec: routev1.RouteSpec{Host: tc.oldHostRoute1, WildcardPolicy: tc.policy},
}
[...]
// Delete both routes
err = admitter.HandleRoute(watch.Deleted, route1)
require.NoError(t, err)
err = admitter.HandleRoute(watch.Deleted, route2)
require.NoError(t, err)
// New route should be able to claim route1's original host
err = admitter.HandleRoute(watch.Added, route3)
require.NoError(t, err)
})

Member Author

Good idea, just added.

Comment thread pkg/router/controller/host_admitter.go Outdated
Comment on lines +210 to +212
p.claimedHosts.RemoveRouteFromAllKeys(route)
p.blockedWildcards.RemoveRouteFromAllKeys(route)
p.claimedWildcards.RemoveRouteFromAllKeys(route)
Contributor

I had Claude do some rough benchmarking for me.

| | Master (before) | PR (after) | Ratio |
|---|---|---|---|
| routes-100 | ~3,800 ns/op | ~7,600 ns/op | 2x slower |
| routes-1,000 | ~8,500 ns/op | ~85,500 ns/op | 10x slower |
| routes-10,000 | ~170,000 ns/op | ~1,100,000 ns/op | 6x slower |

Nothing unexpected here - iterating over the routes can be expensive for large clusters.

In my experience the router is particularly performance sensitive, and we sometimes have customers with an enormous number of routes (10k+).

The alternative to scanning every key is to keep a small "bookkeeping" data structure that maps each route (key) to its previous hostname (value), so that you can look up the route's previous hostname quickly and remove only that entry.

Could look something like:

Suggested change
-	p.claimedHosts.RemoveRouteFromAllKeys(route)
-	p.blockedWildcards.RemoveRouteFromAllKeys(route)
-	p.claimedWildcards.RemoveRouteFromAllKeys(route)
+	if oldHost, ok := p.routeHosts[route.UID]; ok {
+		oldWildcardKey := routeapihelpers.GetDomainForHost(oldHost)
+		p.claimedHosts.RemoveRoute(oldHost, route)
+		p.blockedWildcards.RemoveRoute(oldWildcardKey, route)
+		p.claimedWildcards.RemoveRoute(oldWildcardKey, route)
+	}
+	p.routeHosts[route.UID] = route.Spec.Host

Did you consider adding a cache of route hostnames like this for better performance?
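
The bookkeeping idea can be sketched in isolation. This is a minimal model, not the router's code: `Route`, `Admitter`, `Upsert`, and `NewAdmitter` are hypothetical names, `Route` is a stand-in for `routev1.Route`, and only one of the three maps is modelled. The point it illustrates is that the previous host is found with a single map lookup instead of a scan over every key.

```go
package main

import "fmt"

// Route is a simplified stand-in for routev1.Route (an assumption for this sketch).
type Route struct {
	UID  string
	Host string
}

// RouteMap groups routes by key, loosely modelled on the RouteMap type in the PR.
type RouteMap map[string][]*Route

// InsertRoute stores the route under key, replacing any earlier copy of it.
func (m RouteMap) InsertRoute(key string, r *Route) {
	m.RemoveRoute(key, r)
	m[key] = append(m[key], r)
}

// RemoveRoute drops the route from key and deletes the key once it is empty.
func (m RouteMap) RemoveRoute(key string, r *Route) {
	var kept []*Route
	for _, existing := range m[key] {
		if existing.UID != r.UID {
			kept = append(kept, existing)
		}
	}
	if len(kept) > 0 {
		m[key] = kept
	} else {
		delete(m, key)
	}
}

// Admitter keeps one RouteMap plus the UID-to-host bookkeeping from the review.
type Admitter struct {
	claimedHosts RouteMap
	routeHosts   map[string]string // route UID -> host it was last stored under
}

// NewAdmitter returns an Admitter with empty maps (hypothetical constructor).
func NewAdmitter() *Admitter {
	return &Admitter{claimedHosts: RouteMap{}, routeHosts: map[string]string{}}
}

// Upsert removes the route from its previously recorded host with a single
// map lookup, then stores it under its current host.
func (a *Admitter) Upsert(r *Route) {
	if oldHost, ok := a.routeHosts[r.UID]; ok {
		a.claimedHosts.RemoveRoute(oldHost, r)
	}
	a.claimedHosts.InsertRoute(r.Host, r)
	a.routeHosts[r.UID] = r.Host
}

func main() {
	a := NewAdmitter()
	r := &Route{UID: "uid-1", Host: "a.example.com"}
	a.Upsert(r)
	r.Host = "b.example.com"
	a.Upsert(r)
	_, stale := a.claimedHosts["a.example.com"]
	fmt.Println("stale claim on a.example.com:", stale)
}
```

The trade-off is one extra map entry per route in exchange for a removal cost that no longer grows with the number of hosts.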

Member Author

This is a fair point; in fact I was focusing more on the problem than on the solution. I just changed it pretty much as you suggested, with small changes to the naming and location of the code.

@jcmoraisjr jcmoraisjr force-pushed the OCPBUGS-49738-fix-routemap branch from fe0db75 to 5f8894c Compare April 28, 2026 13:26

@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/router/controller/host_admitter.go (1)

145-150: ⚠️ Potential issue | 🟠 Major

Memory leak: routeHosts cache not cleaned up on route deletion.

When a route is deleted, the entry in p.routeHosts[route.UID] is never removed. This causes a memory leak over time as routes are deleted, and could cause stale behavior if a route UID is reused.

🔧 Proposed fix
 		case watch.Deleted:
 			p.claimedHosts.RemoveRoute(route.Spec.Host, route)
 			wildcardKey := routeapihelpers.GetDomainForHost(route.Spec.Host)
 			p.claimedWildcards.RemoveRoute(wildcardKey, route)
 			p.blockedWildcards.RemoveRoute(wildcardKey, route)
+			delete(p.routeHosts, route.UID)
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/router/controller/host_admitter.go` around lines 145 - 150, The Deleted
case in the watch handler is missing cleanup of the per-route cache entry in
p.routeHosts; add code in the watch.Deleted branch to remove the entry for the
deleted route (e.g., delete(p.routeHosts, route.UID) or
p.routeHosts.Delete(route.UID) depending on the container type) and, if
p.routeHosts is protected by a mutex, perform the deletion under the same lock
used elsewhere to avoid races.
🧹 Nitpick comments (1)
pkg/router/controller/host_admitter.go (1)

24-35: Misleading variable name: removed contains remaining elements, not removed ones.

slices.DeleteFunc returns the modified slice with matching elements removed, so removed actually holds the elements that were not removed. Consider renaming to remaining for clarity.

♻️ Suggested rename for clarity
 func (srm RouteMap) RemoveRoute(key string, route *routev1.Route) {
 	source := srm[key]
-	removed := slices.DeleteFunc(source, func(r *routev1.Route) bool {
+	remaining := slices.DeleteFunc(source, func(r *routev1.Route) bool {
 		return r.Namespace == route.Namespace && r.Name == route.Name
 	})
 
-	if len(removed) > 0 {
-		srm[key] = removed
+	if len(remaining) > 0 {
+		srm[key] = remaining
 	} else {
 		delete(srm, key)
 	}
 }
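
The behaviour behind this rename can be checked with a tiny standalone program. It is only an illustration: `dropName` is a hypothetical helper, and it assumes Go 1.21 or later, where the standard `slices` package is available. `slices.DeleteFunc` returns the slice that is left after the matching elements are dropped, and it modifies its argument in place, so the sketch clones the input first.

```go
package main

import (
	"fmt"
	"slices"
)

// dropName returns the names that remain after removing every occurrence of
// target. The input is cloned because slices.DeleteFunc works in place.
func dropName(names []string, target string) []string {
	return slices.DeleteFunc(slices.Clone(names), func(n string) bool {
		return n == target
	})
}

func main() {
	remaining := dropName([]string{"route1", "route2", "route3"}, "route2")
	fmt.Println(remaining) // prints [route1 route3]: the survivors, not the removed element
}
```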
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/router/controller/host_admitter.go` around lines 24 - 35, The variable
name `removed` in RouteMap.RemoveRoute is misleading because slices.DeleteFunc
returns the slice after deletion (the remaining elements); rename `removed` to
`remaining` (or similar) and update uses in the function (the assignment
srm[key] = remaining and the else branch) to reflect that it contains the
elements that remain rather than the ones removed—this keeps semantics clear in
the RemoveRoute function and matches the behavior of slices.DeleteFunc.

ℹ️ Review info
📥 Commits

Reviewing files that changed from the base of the PR and between daab4aa and 5f8894c.

📒 Files selected for processing (2)
  • pkg/router/controller/host_admitter.go
  • pkg/router/controller/host_admitter_test.go

@jcmoraisjr jcmoraisjr force-pushed the OCPBUGS-49738-fix-routemap branch from 5f8894c to 704aaf6 Compare April 28, 2026 14:26
Routes are grouped by host in three distinct hashmaps when
allowWildcardRoutes is enabled. These hashmaps are updated whenever
routes are added, modified or deleted, and their content is used to
evaluate host conflicts. The hashmaps are keyed by the current host of
the route resource, so if the host value changes, the router cannot
find the existing entry: it looks it up under the new host value, which
has not been stored yet.

The approach used in this update is to cache the host of the route
when it is added, and to remove the resource from the cached host, which
is the key of the hashmaps used to evaluate conflicts.
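
The failure mode described in this commit message can be reproduced with a very small model. This is a sketch under simplifying assumptions, not the router's actual code: `claimed` and `updateByCurrentHost` are hypothetical names, and a plain map from host to route UIDs stands in for the real hashmaps.

```go
package main

import "fmt"

// claimed maps a host to the UIDs of the routes claiming it (simplified stand-in).
type claimed map[string][]string

// add records uid as a claimant of host.
func (c claimed) add(host, uid string) {
	c[host] = append(c[host], uid)
}

// remove drops uid from host and deletes the key once it is empty.
func (c claimed) remove(host, uid string) {
	var kept []string
	for _, u := range c[host] {
		if u != uid {
			kept = append(kept, u)
		}
	}
	if len(kept) > 0 {
		c[host] = kept
	} else {
		delete(c, host)
	}
}

// updateByCurrentHost mimics the problematic flow: cleanup is keyed by the
// route's new host, so the entry under the former host is never found.
func updateByCurrentHost(c claimed, uid, newHost string) {
	c.remove(newHost, uid) // no-op: the route is still stored under its former host
	c.add(newHost, uid)
}

func main() {
	c := claimed{}
	c.add("old.example.com", "uid-1")
	updateByCurrentHost(c, "uid-1", "new.example.com")
	fmt.Println(c["old.example.com"]) // prints [uid-1]: the stale claim survives the update
}
```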
@jcmoraisjr jcmoraisjr force-pushed the OCPBUGS-49738-fix-routemap branch from 704aaf6 to 31a6c97 Compare April 28, 2026 14:27
@openshift-ci
Contributor

openshift-ci Bot commented Apr 28, 2026

@jcmoraisjr: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-upgrade | 31a6c97 | link | true | /test e2e-upgrade |
| ci/prow/e2e-aws-serial-1of2 | 31a6c97 | link | true | /test e2e-aws-serial-1of2 |


@@ -153,6 +147,7 @@ func (p *HostAdmitter) HandleRoute(eventType watch.EventType, route *routev1.Rou
wildcardKey := routeapihelpers.GetDomainForHost(route.Spec.Host)
Contributor

I think there is another nuanced variation of this bug with delete behavior that we should consider fixing while we are here. Consider the scenario:

  1. RouteX added with hostA: maps have entries under A, routeHosts[UID] = A
  2. RouteX modified to hostB, but rejected by displacedRoutes: maps still have entries under hostA, routeHosts[UID] still = A
  3. RouteX deleted: route.Spec.Host = B, but maps have the route under A: RemoveRoute("B", route) is a no-op, stale entry leaks

In this scenario, we'd be attempting to clean up entries in claimedHosts, claimedWildcards, and blockedWildcards using a host that was never admitted. routeHosts is effectively the "last successfully admitted host", while the route might have failed to admit its current host.

Should we also use routeHosts here to fix that bug? Also, if you do end up fixing this, consider a unit test, like TestDeleteAfterRejectedHostChange, to verify the bug I'm describing and the fix.
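
The three-step scenario above can be modelled in a few lines. This is a sketch under stated assumptions rather than the PR's implementation: `Route`, `Admitter`, `Admit`, `Delete`, and `ErrHostClaimed` are hypothetical, each host has a single owner, and wildcard handling is left out. It shows a delete that releases the last admitted host from the bookkeeping instead of the host carried by the delete event.

```go
package main

import (
	"errors"
	"fmt"
)

// Route is a simplified stand-in for routev1.Route (an assumption for this sketch).
type Route struct{ UID, Host string }

// ErrHostClaimed is a hypothetical rejection error.
var ErrHostClaimed = errors.New("host already claimed")

// Admitter tracks one owner per host plus the last admitted host per route.
type Admitter struct {
	owners     map[string]string // host -> UID of the route that claimed it
	routeHosts map[string]string // UID -> last successfully admitted host
}

// NewAdmitter returns an Admitter with empty maps.
func NewAdmitter() *Admitter {
	return &Admitter{owners: map[string]string{}, routeHosts: map[string]string{}}
}

// Admit handles add and modify events. On rejection the bookkeeping is left
// untouched, so routeHosts keeps pointing at the last admitted host.
func (a *Admitter) Admit(r Route) error {
	if owner, ok := a.owners[r.Host]; ok && owner != r.UID {
		return ErrHostClaimed
	}
	if old, ok := a.routeHosts[r.UID]; ok {
		delete(a.owners, old)
	}
	a.owners[r.Host] = r.UID
	a.routeHosts[r.UID] = r.Host
	return nil
}

// Delete releases the last admitted host rather than r.Host, and drops the
// bookkeeping entry so it does not accumulate.
func (a *Admitter) Delete(r Route) {
	if old, ok := a.routeHosts[r.UID]; ok {
		delete(a.owners, old)
		delete(a.routeHosts, r.UID)
	}
}

func main() {
	a := NewAdmitter()
	_ = a.Admit(Route{UID: "x", Host: "a.example.com"})
	_ = a.Admit(Route{UID: "y", Host: "b.example.com"})
	err := a.Admit(Route{UID: "x", Host: "b.example.com"}) // rejected: y owns the host
	a.Delete(Route{UID: "x", Host: "b.example.com"})       // delete event carries the rejected host
	fmt.Println("modify rejected:", err != nil)
	fmt.Println("new claim on a.example.com:", a.Admit(Route{UID: "z", Host: "a.example.com"}))
}
```

A test such as the proposed TestDeleteAfterRejectedHostChange could follow the same sequence as `main` here: add, rejected modify, delete, then a fresh claim on the original host.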

p.blockedWildcards.RemoveRoute(oldWildcardKey, route)
p.claimedWildcards.RemoveRoute(oldWildcardKey, route)
}
p.routeHosts[route.UID] = route.Spec.Host
Contributor

Should this go below the switch statement? The switch statement's default case is a rejection for an unsupported wildcard policy, which means we'd be storing a route host in our map that was rejected.

This is probably more of a defensive-coding comment; I don't know whether API validation even allows the default case to be triggered.


Labels

jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
