Skip to content

4.x: PrivateLink Support Phase 3: Query Implementation, DNS Resolution & Unit Tests 🔌#807

Open
nikagra wants to merge 6 commits intoscylladb:scylla-4.xfrom
nikagra:4.x-privatelink-support-queries
Open

4.x: PrivateLink Support Phase 3: Query Implementation, DNS Resolution & Unit Tests 🔌#807
nikagra wants to merge 6 commits intoscylladb:scylla-4.xfrom
nikagra:4.x-privatelink-support-queries

Conversation

@nikagra
Copy link

@nikagra nikagra commented Feb 26, 2026

4.x: PrivateLink Support Phase 3: Query Implementation, DNS Resolution & Tests 🔌

Note: Depends on #804 (Phase 2 — Core Infrastructure). ❗

Overview

Completes the runtime implementation of PrivateLink/client-routes support for ScyllaDB Cloud. The
driver can now query system.client_routes at startup and on control-connection reconnect, resolve
node hostnames to InetSocketAddress via a TTL-aware caching DNS resolver, and dynamically
translate addresses used for every CQL connection.

Server-side feature requires scylladb/scylladb#27323.


Changes

Event Negotiation Fix — ProtocolInitHandler (SCYLLADB-850)

  • Introduced a mutable registerEventTypes list (copy of options.eventTypes) so the REGISTER
    step can be retried with a narrowed event set without mutating the original channel options.
  • When the REGISTER step receives a PROTOCOL_ERROR whose message contains
    CLIENT_ROUTES_CHANGE, the driver logs a warning, removes that event type, and retries REGISTER
    with the remaining types (SCHEMA_CHANGE, STATUS_CHANGE, TOPOLOGY_CHANGE).
  • If no event types remain after stripping, the handler calls setConnectSuccess() directly,
    skipping REGISTER altogether.
  • This makes the driver backward-compatible with ScyllaDB Enterprise < 2026.1: client-routes table
    queries may still work on those versions, but live push-updates via the event are disabled.

DNS Resolution — CachingDnsResolver (new)

  • TTL-based in-memory cache backed by ConcurrentHashMap<String, CacheEntry> with per-hostname
    Semaphore(1) to prevent concurrent re-resolution storms.
  • Fast-path unlocked read; double-checked locking only when the entry is absent or stale.
  • Last-known-good fallback: on UnknownHostException, the previous successful resolution is
    returned and a warning is logged.
  • clearCache() preserves last-known-good entries so a full session re-init survives transient DNS
    failures.

Client Routes Query — ClientRoutesHandler

  • queryAndResolveRoutes() implemented: executes
    SELECT connection_id, host_id, address, port, tls_port FROM %s WHERE connection_id IN :connectionIds ALLOW FILTERING
    via AdminRequestHandler.
  • CachingDnsResolver wired in, replacing the Phase 2 no-op stub.
  • Route map atomically swapped via AtomicReference.set() so in-flight translations see a
    consistent snapshot.

Session Lifecycle — DefaultSession

  • initClientRoutes() inserted between topologyMonitor.init() and
    metadataManager.refreshNodes() in the startup chain.
  • Failure is non-fatal: if the client-routes query fails at startup, the session still opens and
    falls back to direct addressing.

Reconnect Integration — ControlConnection

  • On a successful reconnect, clientRoutesHandler.refresh() is awaited before refreshNodes(),
    so translated addresses are current before nodes are re-evaluated.

Tests

Unit Tests — 22 new tests across 3 suites

Suite # Coverage
CachingDnsResolverTest 7 Cache hit/miss, TTL expiry, concurrency, last-known-good fallback, clearCache()
ClientRoutesHandlerTest 8 Query construction, route-map population, translate with/without SSL, refresh no-op when disabled
ClientRoutesAddressTranslatorTest 7 Address translation pipeline, null pass-through, SSL port selection

Integration Tests — 6 new tests across 2 suites

Suite Requirement # Coverage
ClientRoutesIT ScyllaDB Enterprise ≥ 2026.1 5 Routes loaded on init; refresh() picks up new rows; session opens with empty table; TLS port selection; auto-refresh on control-connection reconnect
ClientRoutesUnsupportedVersionIT ScyllaDB Enterprise < 2026.1 1 Session opens and handler returns null when system.client_routes is absent (graceful degradation)

Both integration test suites are @Category(IsolatedTests.class) and @ScyllaOnly. Positive tests
use a user-space mirror table (test_client_routes.client_routes) to avoid requiring privileged
write access to system.

Additional notes

This PR also uses workaround for SCYLLADB-850: older ScyllaDB Enterprise versions (< 2026.1) reject the CLIENT_ROUTES_CHANGE REGISTER event with a PROTOCOL_ERROR, which previously caused connection initialisation to fail. The driver now gracefully handles this by stripping the unsupported event and retrying REGISTER with the remaining types, so sessions succeed on any server version.


Related to DRIVER-86, DRIVER-88, SCYLLADB-850

nikagra and others added 5 commits February 18, 2026 13:13
…RoutesEndpoint for PrivateLink deployments ⚙️

feat: update client routes configuration details for mutual exclusivity with AddressTranslator and cloud bundles

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Protocol layer: CLIENT_ROUTES_CHANGE event registration
- API: AddressTranslator V2 with Host ID, datacenter, and rack parameters
- Infrastructure: ClientRoutesHandler, ClientRoutesAddressTranslator (stubs)
- Integration: DefaultDriverContext, DefaultSession, DefaultTopologyMonitor
- Tests: ClientRoutesConfigTest, ClientRoutesSessionBuilderTest

Phase 2 establishes API. Phase 3 implements queries and lifecycle.

Related: DRIVER-88

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@nikagra nikagra force-pushed the 4.x-privatelink-support-queries branch from a3acff2 to 57574a3 Compare February 27, 2026 13:16
…r, and unit tests

- Implement CachingDnsResolver with configurable TTL (default 500ms),
  per-hostname Semaphore(1) concurrency limit, double-checked locking
  fast-path, and last-known-good fallback on UnknownHostException
- Implement queryAndResolveRoutes() in ClientRoutesHandler: queries
  system.client_routes filtered by configured connection IDs, parses
  AdminResult rows into ResolvedClientRoute map, atomically swaps via
  AtomicReference
- Wire CachingDnsResolver into ClientRoutesHandler (replaces no-op stub)
- Integrate handler.init() into DefaultSession startup chain, after
  topology monitor init and before refreshNodes(); non-fatal on failure
- Trigger handler.refresh() on control connection reconnect before
  refreshNodes() to prevent stale routes after reconnection
- Add 22 unit tests across three suites:
  CachingDnsResolverTest (7), ClientRoutesHandlerTest (8), ClientRoutesAddressTranslatorTest (7)

Related to DRIVER-86, DRIVER-88
@nikagra nikagra force-pushed the 4.x-privatelink-support-queries branch from 57574a3 to 26aebdd Compare February 27, 2026 14:26
@nikagra nikagra requested review from Copilot and dkropachev and removed request for Copilot February 27, 2026 14:55
@nikagra nikagra marked this pull request as ready for review February 27, 2026 14:55
@nikagra nikagra requested a review from Copilot February 27, 2026 14:56
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR completes Phase 3 of PrivateLink support for the Java driver, implementing the runtime components required for ScyllaDB Cloud deployments. Building on Phase 2's infrastructure (#804), it adds DNS resolution, query execution, and complete lifecycle integration for the client routes feature.

Changes:

  • Implements TTL-based caching DNS resolver with last-known-good fallback and concurrency control
  • Adds query execution for system.client_routes table with hostname resolution and route map management
  • Integrates client routes initialization into session startup and control-connection reconnect lifecycle
  • Implements backward-compatible event negotiation to handle older ScyllaDB versions that don't support CLIENT_ROUTES_CHANGE events

Reviewed changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 9 comments.

Show a summary per file
File Description
pom.xml Upgrades native protocol to 1.5.2.2 to support CLIENT_ROUTES_CHANGE event
manual/core/address_resolution/README.md Documents client routes configuration and usage with examples
core/.../util/AddressParser.java New utility for parsing contact point addresses with IPv4/IPv6/hostname support
core/.../clientroutes/DnsResolver.java Interface defining DNS resolution with caching requirements
core/.../clientroutes/CachingDnsResolver.java TTL-based DNS caching implementation with semaphore-based concurrency control
core/.../clientroutes/ClientRouteInfo.java Data structure representing raw client_routes table rows
core/.../clientroutes/ResolvedClientRoute.java Resolved route with DNS resolution capability
core/.../clientroutes/ClientRoutesHandler.java Main coordinator implementing query execution and route management
core/.../clientroutes/ClientRoutesAddressTranslator.java AddressTranslator implementation delegating to handler
core/.../session/DefaultSession.java Integrates client routes initialization into session startup
core/.../metadata/DefaultTopologyMonitor.java Passes node metadata (hostId, datacenter, rack) to address translator
core/.../control/ControlConnection.java Adds CLIENT_ROUTES_CHANGE event handling and reconnect integration
core/.../context/DefaultDriverContext.java Wires client routes handler into driver context
core/.../context/InternalDriverContext.java Adds getClientRoutesHandler() method
core/.../channel/ProtocolInitHandler.java Implements backward-compatible event negotiation for CLIENT_ROUTES_CHANGE
core/.../session/SessionBuilder.java Adds withClientRoutesConfig() with mutual exclusivity validation
core/.../session/ProgrammaticArguments.java Stores client routes config in programmatic arguments
core/.../config/ClientRoutesConfig.java Configuration class for client routes with builder
core/.../config/ClientRoutesEndpoint.java Represents a single client routes endpoint
core/.../addresstranslation/AddressTranslator.java Adds overloaded translate() method accepting node metadata
core/.../config/DefaultDriverOption.java Minor formatting cleanup (blank line removal)
core/src/main/resources/reference.conf Minor formatting cleanup (blank line removal)
core/src/test/.../util/AddressParserTest.java Comprehensive unit tests for address parsing
core/src/test/.../clientroutes/CachingDnsResolverTest.java Unit tests for DNS resolver cache behavior
core/src/test/.../clientroutes/ClientRoutesHandlerTest.java Unit tests for handler translate() method
core/src/test/.../clientroutes/ClientRoutesAddressTranslatorTest.java Unit tests for address translator delegation
core/src/test/.../session/ClientRoutesSessionBuilderTest.java Unit tests for session builder configuration
core/src/test/.../config/ClientRoutesConfigTest.java Unit tests for configuration validation
core/src/test/.../metadata/DefaultTopologyMonitorTest.java Updates test verification counts for additional getString() calls

Comment on lines +372 to +382
// This is expected on ScyllaDB versions older than 2025.4 (OSS) / 2025.4 (Enterprise),
// which do not implement the client_routes feature at the protocol level. The driver
// registers for this event only when ClientRoutesConfig is set, so this branch fires
// on any pre-2025.4 cluster that the user has (mis)configured with client routes.
//
// Behavior: strip CLIENT_ROUTES_CHANGE and retry REGISTER with the remaining event
// types (SCHEMA_CHANGE, STATUS_CHANGE, TOPOLOGY_CHANGE). The session connects
// successfully; client routes table queries may still work if the server exposes a
// compatible table, but live push-updates via this event will be absent.
LOG.warn(
"[{}] Server does not support {} event (requires ScyllaDB ≥ 2025.4);"
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment mentions ScyllaDB version "2025.4" but the PR description states this requires "ScyllaDB Enterprise ≥ 2026.1". This version mismatch should be corrected to ensure consistency across the codebase. The correct version requirement (2026.1 based on the PR description) should be used.

Suggested change
// This is expected on ScyllaDB versions older than 2025.4 (OSS) / 2025.4 (Enterprise),
// which do not implement the client_routes feature at the protocol level. The driver
// registers for this event only when ClientRoutesConfig is set, so this branch fires
// on any pre-2025.4 cluster that the user has (mis)configured with client routes.
//
// Behavior: strip CLIENT_ROUTES_CHANGE and retry REGISTER with the remaining event
// types (SCHEMA_CHANGE, STATUS_CHANGE, TOPOLOGY_CHANGE). The session connects
// successfully; client routes table queries may still work if the server exposes a
// compatible table, but live push-updates via this event will be absent.
LOG.warn(
"[{}] Server does not support {} event (requires ScyllaDB ≥ 2025.4);"
// This is expected on ScyllaDB Enterprise versions older than 2026.1,
// which do not implement the client_routes feature at the protocol level. The driver
// registers for this event only when ClientRoutesConfig is set, so this branch fires
// on any pre-2026.1 Enterprise cluster that the user has (mis)configured with client routes.
//
// Behavior: strip CLIENT_ROUTES_CHANGE and retry REGISTER with the remaining event
// types (SCHEMA_CHANGE, STATUS_CHANGE, TOPOLOGY_CHANGE). The session connects
// successfully; client routes table queries may still work if the server exposes a
// compatible table, but live push-updates via this event will be absent.
LOG.warn(
"[{}] Server does not support {} event (requires ScyllaDB Enterprise ≥ 2026.1);"

Copilot uses AI. Check for mistakes.
Comment on lines +372 to +382
// This is expected on ScyllaDB versions older than 2025.4 (OSS) / 2025.4 (Enterprise),
// which do not implement the client_routes feature at the protocol level. The driver
// registers for this event only when ClientRoutesConfig is set, so this branch fires
// on any pre-2025.4 cluster that the user has (mis)configured with client routes.
//
// Behavior: strip CLIENT_ROUTES_CHANGE and retry REGISTER with the remaining event
// types (SCHEMA_CHANGE, STATUS_CHANGE, TOPOLOGY_CHANGE). The session connects
// successfully; client routes table queries may still work if the server exposes a
// compatible table, but live push-updates via this event will be absent.
LOG.warn(
"[{}] Server does not support {} event (requires ScyllaDB ≥ 2025.4);"
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment mentions ScyllaDB version "≥ 2025.4" but the PR description states this requires "ScyllaDB Enterprise ≥ 2026.1". This version mismatch should be corrected to ensure consistency across the codebase. The correct version requirement (2026.1 based on the PR description) should be used.

Suggested change
// This is expected on ScyllaDB versions older than 2025.4 (OSS) / 2025.4 (Enterprise),
// which do not implement the client_routes feature at the protocol level. The driver
// registers for this event only when ClientRoutesConfig is set, so this branch fires
// on any pre-2025.4 cluster that the user has (mis)configured with client routes.
//
// Behavior: strip CLIENT_ROUTES_CHANGE and retry REGISTER with the remaining event
// types (SCHEMA_CHANGE, STATUS_CHANGE, TOPOLOGY_CHANGE). The session connects
// successfully; client routes table queries may still work if the server exposes a
// compatible table, but live push-updates via this event will be absent.
LOG.warn(
"[{}] Server does not support {} event (requires ScyllaDB ≥ 2025.4);"
// This is expected on ScyllaDB versions older than 2026.1 (Enterprise),
// which do not implement the client_routes feature at the protocol level. The driver
// registers for this event only when ClientRoutesConfig is set, so this branch fires
// on any pre-2026.1 cluster that the user has (mis)configured with client routes.
//
// Behavior: strip CLIENT_ROUTES_CHANGE and retry REGISTER with the remaining event
// types (SCHEMA_CHANGE, STATUS_CHANGE, TOPOLOGY_CHANGE). The session connects
// successfully; client routes table queries may still work if the server exposes a
// compatible table, but live push-updates via this event will be absent.
LOG.warn(
"[{}] Server does not support {} event (requires ScyllaDB Enterprise ≥ 2026.1);"

Copilot uses AI. Check for mistakes.
Comment on lines +164 to +165
**Note:** As of the current version, the client routes configuration API is available, but the full handler implementation
(DNS resolution, address translation, event handling) is still under development.
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This note states that "the full handler implementation (DNS resolution, address translation, event handling) is still under development" but this PR (Phase 3) completes that implementation. This note should be removed or updated to reflect that the feature is now fully implemented.

Suggested change
**Note:** As of the current version, the client routes configuration API is available, but the full handler implementation
(DNS resolution, address translation, event handling) is still under development.
**Note:** The client routes configuration API and handler (including DNS resolution, address translation, and event handling) are fully implemented.

Copilot uses AI. Check for mistakes.
@Override
public void clearCache() {
cache.clear();
semaphores.clear();
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling clearCache() while concurrent resolve() operations are in progress could cause race conditions. If a thread has acquired a semaphore (line 58-63) and another thread calls clearCache() which clears the semaphores map (line 126), the first thread will release a semaphore that's no longer in the map. Subsequent calls to resolve() for the same hostname will create a new semaphore, potentially allowing multiple concurrent resolutions for the same hostname. Consider either: (1) not clearing semaphores, or (2) using a lock during clearCache() to ensure no resolve operations are in progress.

Suggested change
semaphores.clear();
// semaphores are intentionally not cleared to avoid interfering with in-flight resolutions

Copilot uses AI. Check for mistakes.
verify(peer2, times(2)).getUuid("host_id");
verify(peer2).getString("data_center");
// Note: getUuid("host_id") called three times total (once in findInPeers, once in
// nodeInfoBuilder, once in buildNodeEndPoint)
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment is slightly inaccurate. It should say "getUuid('host_id') called three times (once in findInPeers, once in buildNodeEndPoint, once in nodeInfoBuilder)" to correctly reflect the call order. The current comment says "once in nodeInfoBuilder, once in buildNodeEndPoint" but nodeInfoBuilder calls buildNodeEndPoint first, so the order is: findInPeers → buildNodeEndPoint → nodeInfoBuilder.

Suggested change
// nodeInfoBuilder, once in buildNodeEndPoint)
// buildNodeEndPoint, once in nodeInfoBuilder)

Copilot uses AI. Check for mistakes.
Comment on lines +109 to +115
// Build the IN clause with literal UUID values — AdminRequestHandler does not support
// List<UUID> as a named parameter, so we inline the values directly.
String connectionIdsCsv =
endpoints.stream()
.map(ep -> ep.getConnectionId().toString())
.collect(Collectors.joining(", "));
String query = String.format(SELECT_ROUTES_TEMPLATE, config.getTableName(), connectionIdsCsv);
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The UUID values are being directly interpolated into the SQL query without proper escaping. While UUIDs from toString() are safe (they only contain hex digits and hyphens), this pattern is generally discouraged as it could set a precedent for unsafe query construction. Consider using proper parameter binding if AdminRequestHandler supports it, or at minimum add a comment explaining why direct interpolation is safe for UUIDs.

Copilot uses AI. Check for mistakes.
Comment on lines +37 to +164
@RunWith(MockitoJUnitRunner.class)
public class ClientRoutesHandlerTest {

@Mock private InternalDriverContext context;

private TestableClientRoutesHandler handler;

/**
* Subclass exposing package-private {@code resolvedRoutesRef} so tests can inject test data
* without actually executing admin queries.
*/
@SuppressWarnings("NewClassNamingConvention")
static class TestableClientRoutesHandler extends ClientRoutesHandler {
TestableClientRoutesHandler(InternalDriverContext ctx, ClientRoutesConfig cfg) {
super(ctx, cfg);
}

void setRoutes(Map<UUID, ResolvedClientRoute> routes) {
resolvedRoutesRef.set(new ConcurrentHashMap<>(routes));
}
}

@Before
public void setup() {
when(context.getSessionName()).thenReturn("test-session");
ClientRoutesConfig config =
ClientRoutesConfig.builder()
.addEndpoint(new ClientRoutesEndpoint(UUID.randomUUID(), "host1:9042"))
.build();
handler = new TestableClientRoutesHandler(context, config);
}

// ---- translate() -------------------------------------------------------

@Test
public void should_return_null_for_unknown_host_id() {
assertThat(handler.translate(UUID.randomUUID(), false)).isNull();
}

@Test
public void should_translate_known_host_id_non_ssl() {
UUID hostId = UUID.randomUUID();
handler.setRoutes(
ImmutableMap.of(hostId, new ResolvedClientRoute(hostId, "127.0.0.1", 9042, 9142)));

InetSocketAddress result = handler.translate(hostId, false);

assertThat(result).isNotNull();
assertThat(result.getPort()).isEqualTo(9042);
}

@Test
public void should_select_tls_port_when_ssl() {
UUID hostId = UUID.randomUUID();
handler.setRoutes(
ImmutableMap.of(hostId, new ResolvedClientRoute(hostId, "127.0.0.1", 9042, 9142)));

InetSocketAddress result = handler.translate(hostId, true);

assertThat(result).isNotNull();
assertThat(result.getPort()).isEqualTo(9142);
}

@Test
public void should_fall_back_to_non_ssl_port_when_tls_port_absent() {
UUID hostId = UUID.randomUUID();
// tls_port is null — should warn and fall back to non-SSL port
handler.setRoutes(
ImmutableMap.of(hostId, new ResolvedClientRoute(hostId, "127.0.0.1", 9042, null)));

InetSocketAddress result = handler.translate(hostId, true);

assertThat(result).isNotNull();
assertThat(result.getPort()).isEqualTo(9042);
}

@Test
public void should_return_null_when_no_port_configured() {
UUID hostId = UUID.randomUUID();
// Both ports null → IllegalStateException → translate() catches it and returns null
handler.setRoutes(
ImmutableMap.of(hostId, new ResolvedClientRoute(hostId, "127.0.0.1", null, null)));

assertThat(handler.translate(hostId, false)).isNull();
}

@Test
public void should_return_null_after_close() {
UUID hostId = UUID.randomUUID();
handler.setRoutes(
ImmutableMap.of(hostId, new ResolvedClientRoute(hostId, "127.0.0.1", 9042, null)));

handler.close();

assertThat(handler.translate(hostId, false)).isNull();
}

@Test
public void should_return_null_for_unresolvable_hostname() {
UUID hostId = UUID.randomUUID();
// Use a hostname guaranteed not to resolve
handler.setRoutes(
ImmutableMap.of(
hostId,
new ResolvedClientRoute(hostId, "this.host.does.not.exist.invalid", 9042, null)));

// Should not throw; returns null and logs a warning
assertThat(handler.translate(hostId, false)).isNull();
}

@Test
public void should_refresh_updates_routes() {
UUID hostId1 = UUID.randomUUID();
UUID hostId2 = UUID.randomUUID();

handler.setRoutes(
ImmutableMap.of(hostId1, new ResolvedClientRoute(hostId1, "127.0.0.1", 9042, null)));
assertThat(handler.translate(hostId1, false)).isNotNull();
assertThat(handler.translate(hostId2, false)).isNull();

// Simulate a refresh that swaps in a different set of routes
handler.setRoutes(
ImmutableMap.of(hostId2, new ResolvedClientRoute(hostId2, "127.0.0.2", 9042, null)));

assertThat(handler.translate(hostId1, false)).isNull();
assertThat(handler.translate(hostId2, false)).isNotNull();
}
}
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unit tests for ClientRoutesHandler only test the translate() method using a TestableClientRoutesHandler subclass that bypasses the actual query execution. There are no unit tests covering the queryAndResolveRoutes() method, init(), or refresh() methods. While these may be covered by integration tests, having unit tests for error handling, query construction, and row parsing would improve confidence in the implementation.

Copilot uses AI. Check for mistakes.
Comment on lines +434 to +436
UUID hostId = row.getUuid("host_id");
String datacenter = row.getString("data_center");
String rack = row.getString("rack");
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The datacenter and rack strings are being read from the AdminRow twice: once in buildNodeEndPoint (lines 435-436) and once later in nodeInfoBuilder. Since nodeInfoBuilder calls buildNodeEndPoint first (line 315), these getString() calls in buildNodeEndPoint happen before the values are read again later in nodeInfoBuilder. While AdminRow may cache parsed values, this creates unnecessary coupling and could impact performance. Consider refactoring to pass these values as parameters to buildNodeEndPoint, or retrieve them once in nodeInfoBuilder and pass them to both buildNodeEndPoint and the builder.

Copilot uses AI. Check for mistakes.
private static final Logger LOG = LoggerFactory.getLogger(CachingDnsResolver.class);

private final long cacheDurationNanos;
private final ConcurrentHashMap<String, Semaphore> semaphores = new ConcurrentHashMap<>();
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The semaphores ConcurrentHashMap grows unboundedly as new hostnames are resolved, but is never cleaned up except during clearCache(). For long-running applications with many unique hostnames (e.g., due to changing DNS records or different endpoints), this could lead to a memory leak. Consider implementing a bounded cache with eviction (e.g., LRU) or periodically removing entries for hostnames that haven't been resolved recently.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants