Feat: New race strategy by marsteg · Pull Request #133 · cloudfoundry/bosh-dns-release

marsteg · 2026-03-06T08:00:41Z

This PR is for adding a new Recursor Selection Option. This option should allow to forward requests to all recursors in parallel and take the fastest, good response. This mimics the behavior or bind and dnsmasq to improve dns resiliency against a set of recursors.

## Overview Add a new `"race"` recursor selection strategy that sends DNS queries to all configured recursors in parallel and uses the best/fastest response. ## Motivation **Benefits:** - **Lower latency**: Always get the fastest response regardless of which recursor is fastest - **Better reliability**: Single slow/failing recursor doesn't slow down queries - **More accurate results**: Can prefer successful responses over failures (SERVFAIL/NXDOMAIN) - **No manual preference tuning needed**: Automatically adapts to current network conditions **Tradeoffs:** - **Higher network bandwidth**: Sends N queries instead of 1-2 - **More load on upstream recursors**: Each query hits all recursors - **Resource usage**: More goroutines and network connections ## Implementation Details ### 1. Configuration Add `"race"` as a valid value for `recursor_selection`: ```json { "recursor_selection": "race" // Valid values: "serial", "smart", "race" } ``` ### 2. Add Constant Add to `config/config.go`: ```go const ( SmartRecursorSelection = "smart" SerialRecursorSelection = "serial" RaceRecursorSelection = "race" // NEW RFCFormatting = "rfc3339" ) ``` ### 3. Core Implementation Add new `raceRecursorPool` type in `failover_recursor_pool.go`: ```go type raceRecursorPool struct { recursors []string logger logger.Logger logTag string recursorRetrySettings recursorRetrySettings } func newRaceRecursorPool(recursors []string, recursorSettings recursorRetrySettings, logger logger.Logger) RecursorPool { if recursors == nil { recursors = []string{} } return &raceRecursorPool{ recursors: recursors, logger: logger, logTag: "RaceRecursor", recursorRetrySettings: recursorSettings, } } ``` ### 4. Race Logic with Response Quality Ranking The key insight: **Not all responses are equal**. We should prefer: 1. ✅ **NOERROR** (successful resolution) - USE IMMEDIATELY 2. ⚠️ **NXDOMAIN** (name doesn't exist) - WAIT for better answer 3. ❌ **SERVFAIL** (server failure) - WAIT for better answer 4. ❌ **Network errors** - WAIT for better answer ```go func (q *raceRecursorPool) PerformStrategically(work func(string) error) error { if len(q.recursors) == 0 { return ErrNoRecursorResponse } type result struct { recursor string err error priority int // Lower is better: 0=success, 1=NXDOMAIN, 2=SERVFAIL, 3=network error } // Buffered channel sized to number of recursors - prevents goroutine blocking results := make(chan result, len(q.recursors)) // Start all queries in parallel for _, recursor := range q.recursors { go func(r string) { err := performWithRetryLogic(work, r, q.recursorRetrySettings.maxRetries, q.logTag, q.logger) // Classify the error by priority priority := q.classifyError(err) results <- result{ recursor: r, err: err, priority: priority, } }(recursor) } // Collect all results var bestResult *result receivedCount := 0 totalRecursors := len(q.recursors) for receivedCount < totalRecursors { res := <-results receivedCount++ q.logger.Debug(q.logTag, fmt.Sprintf( "received response %d/%d from %s (priority=%d, err=%v)", receivedCount, totalRecursors, res.recursor, res.priority, res.err)) // Keep track of best result so far if bestResult == nil || res.priority < bestResult.priority { bestResult = &res } // If we got a perfect response (NOERROR), return immediately // Don't wait for other responses if res.priority == 0 { q.logger.Info(q.logTag, fmt.Sprintf( "recursor %s returned successful response (received %d/%d responses)", res.recursor, receivedCount, totalRecursors)) // Drain remaining responses to prevent goroutine leaks go q.drainResults(results, totalRecursors-receivedCount) return nil } // For non-success responses, continue collecting to see if we get a better one } // All responses received, return best one if bestResult == nil { return ErrNoRecursorResponse } if bestResult.err != nil { q.logger.Info(q.logTag, fmt.Sprintf( "all %d recursors responded, best result from %s (priority=%d): %s", totalRecursors, bestResult.recursor, bestResult.priority, bestResult.err.Error())) } else { q.logger.Info(q.logTag, fmt.Sprintf( "all %d recursors responded, using result from %s", totalRecursors, bestResult.recursor)) } return bestResult.err } // classifyError assigns priority to errors // Lower number = better response func (q *raceRecursorPool) classifyError(err error) int { if err == nil { return 0 // NOERROR - perfect response } // Check if it's a DNS error with rcode if dnsErr, ok := err.(server.DnsError); ok { switch dnsErr.Rcode() { case dns.RcodeNameError: // NXDOMAIN return 1 // Name doesn't exist - could be correct, but prefer NOERROR case dns.RcodeServerFailure: // SERVFAIL return 2 // Server failure - prefer NXDOMAIN over this default: return 2 // Other DNS errors } } // Network errors are worst if _, ok := err.(net.Error); ok { return 3 } // Unknown errors return 3 } // drainResults reads remaining results to prevent goroutine blocking func (q *raceRecursorPool) drainResults(results chan result, remaining int) { for i := 0; i < remaining; i++ { <-results } } ``` ### 5. Update Factory Function Modify `NewFailoverRecursorPool` in `failover_recursor_pool.go`: ```go func NewFailoverRecursorPool(recursors []string, recursorSelection string, RecursorMaxRetries int, logger logger.Logger) RecursorPool { recursorSettings := recursorRetrySettings{ maxRetries: RecursorMaxRetries, } switch recursorSelection { case config.SmartRecursorSelection: return newSmartFailoverRecursorPool(recursors, recursorSettings, logger) case config.RaceRecursorSelection: return newRaceRecursorPool(recursors, recursorSettings, logger) default: // serial return newSerialFailoverRecursorPool(recursors, recursorSettings, logger) } } ``` ### 6. Response Handling in ForwardHandler The current implementation in `forward_handler.go` works correctly because: 1. The `work` function (line 67-96) writes the response immediately when called 2. When `PerformStrategically` returns `nil`, it means a response was already written 3. If it returns an error, no response was written and we create one from the error **For race strategy**: - First NOERROR causes immediate response write and return - For NXDOMAIN/SERVFAIL, we wait for all responses, then use the best one - The response is only written once by the winning recursor's work function ✅ **No changes needed to ForwardHandler** ## Connection Leak Prevention ### Analysis 1. **Goroutines**: Each recursor query runs in its own goroutine 2. **Channel**: Buffered to `len(recursors)` - no goroutine blocks on send 3. **Cleanup paths**: - **Fast path** (NOERROR): We spawn a goroutine to drain remaining results - **Slow path** (all responses): We read all results in the loop ✅ **No goroutine leaks**: All goroutines complete naturally and all results are consumed ### Why the draining goroutine is safe When we get a NOERROR and return early: ```go go q.drainResults(results, totalRecursors-receivedCount) ``` - The draining goroutine reads remaining results from the channel - The query goroutines complete their DNS exchanges and send to the buffered channel - Once all results are drained, the draining goroutine exits - The channel is garbage collected when no references remain **No resource leaks**: Goroutines terminate, connections close, memory is freed. ## Response Quality Logic ### Scenario Examples #### Scenario 1: Mixed Responses - Recursor A: SERVFAIL (server is down) - Recursor B: NXDOMAIN (name not found) ← **Best result** - Recursor C: Network timeout **Result**: Returns NXDOMAIN (priority 1 beats 2 and 3) #### Scenario 2: One Good Response - Recursor A: NOERROR ← **Returns immediately** - Recursor B: (still processing) - Recursor C: (still processing) **Result**: Returns NOERROR immediately, doesn't wait for B and C #### Scenario 3: Split-Brain DNS - Recursor A: NXDOMAIN (stale cache, thinks name doesn't exist) - Recursor B: NOERROR ← **Best result** - Recursor C: SERVFAIL (misconfigured) **Result**: Returns NOERROR (priority 0 beats everything) #### Scenario 4: All Fail - Recursor A: SERVFAIL - Recursor B: SERVFAIL ← **Best result (all tied)** - Recursor C: Network timeout **Result**: Returns SERVFAIL (priority 2 beats 3) ## Testing Considerations Test scenarios needed: 1. **Performance**: - Verify fastest NOERROR wins - Measure latency improvement vs serial/smart 2. **Correctness**: - One NOERROR + others NXDOMAIN → returns NOERROR - All NXDOMAIN → returns NXDOMAIN - One NXDOMAIN + others SERVFAIL → returns NXDOMAIN - All SERVFAIL → returns SERVFAIL 3. **Resource safety**: - No goroutine leaks (use runtime.NumGoroutine()) - No connection leaks - Memory usage under load 4. **Edge cases**: - Zero recursors - One recursor (should behave like serial) - All timeout - Artificially delay responses to test early return ## Performance Impact **Latency**: - ✅ Best case: ~50-70% reduction with multiple recursors - ✅ Common case: Returns immediately on first NOERROR - ⚠️ Worst case: Waits for all responses if no NOERROR (slower than serial) **Network bandwidth**: - ❌ Always sends N queries (number of recursors) - For typical query: 64 bytes × N recursors - For typical response: 512 bytes × N recursors - Example with 3 recursors: ~1.7KB per DNS query **Recursor load**: - ❌ Every recursor receives every query - May violate rate limits on public DNS servers - Not suitable for large-scale deployments without considering upstream capacity ## Documentation Updates Update `jobs/bosh-dns/spec`: ```yaml recursor_selection: description: | Recursor selection strategy. Options: - "serial": Try recursors in order, no failure tracking - "smart": Track failures, prefer reliable recursors (default, DNS spec compliant) - "race": Query all recursors in parallel, use best/fastest response * Prefers successful responses (NOERROR) over failures * Returns immediately on first successful response * Waits for all responses if only errors received * Higher network bandwidth (N× queries) default: smart ``` ## Recommendation **When to use race strategy:** - ✅ Low-latency requirements (milliseconds matter) - ✅ Recursors have varying performance characteristics - ✅ Bandwidth and recursor load are not concerns - ✅ Split-brain DNS scenarios need resolution - ✅ Small to medium scale deployments **When NOT to use:** - ❌ Large-scale deployments (bandwidth cost) - ❌ Rate-limited public DNS servers - ❌ Bandwidth-constrained networks - ❌ Metered connections Implement with the response quality ranking to ensure correct DNS behavior.

marsteg · 2026-03-06T09:13:09Z

Adding some basic dnsperf test results to compare performance between "smart" and "race":

Smart:
Statistics:

Queries sent: 504851
Queries completed: 504851 (100.00%)
Queries lost: 0 (0.00%)

Response codes: NOERROR 504851 (100.00%)
Average packet size: request 32, response 63
Run time (s): 30.000126
Queries per second: 16828.295988

Average Latency (s): 0.003507 (min 0.000021, max 2.074524)
Latency StdDev (s): 0.077404

Race:
Statistics:

Queries sent: 889711
Queries completed: 889711 (100.00%)
Queries lost: 0 (0.00%)

Response codes: NOERROR 889711 (100.00%)
Average packet size: request 33, response 65
Run time (s): 30.000800
Queries per second: 29656.242500

Average Latency (s): 0.000707 (min 0.000023, max 0.204968)
Latency StdDev (s): 0.003805

beyhan · 2026-03-06T09:45:00Z

Thank you @marsteg for this contribution! IMO this is only relevant for Ubuntu Jammy which is reaching EOL in April 2027 because the switch to Ubuntu Noble introduces a major change how BOSH DNS is invoked. In Noble BOSH DNS will be only responsible for resolving its own domain and the OS resolver will handle everything else. For more details please check the discussions in this pr. With that BOSH DNS won't need to delegate resolutions to recursors and require a selection strategy.

beyhan · 2026-03-30T10:05:32Z

@marsteg Given the feedback in the comment above, is the feature still valuable and will help you being added to bosh-dns?

marsteg · 2026-03-30T11:52:48Z

Hi @beyhan,
AFAIK the migration to Noble-Stemcell makes this improvement useless for my Team. I think it might be valuable in the current use-case but not hard required anymore,

beyhan · 2026-03-30T14:38:26Z

Hi @marsteg,

Thank you for the update! In that case, we'd prefer to close the PR and avoid adding functionality to BOSH DNS that won't actually be used. Should someone decide to support Jammy for a longer period down the road, they're always welcome to reopen it and we can revisit the discussion then.

marsteg added 3 commits March 5, 2026 13:36

adding configuration validation

52d96cd

cleaned up comments

77efd2f

cf-foundation-community-automation Bot added this to Foundational Infrastructure Working Group Mar 6, 2026

cf-foundation-community-automation Bot moved this to Inbox in Foundational Infrastructure Working Group Mar 6, 2026

aramprice requested review from a team, a-hassanin and ragaskar and removed request for a team March 12, 2026 16:18

aramprice moved this from Inbox to Pending Review | Discussion in Foundational Infrastructure Working Group Mar 12, 2026

beyhan closed this Mar 30, 2026

github-project-automation Bot moved this from Pending Review | Discussion to Done in Foundational Infrastructure Working Group Mar 30, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feat: New race strategy#133

Feat: New race strategy#133
marsteg wants to merge 3 commits intocloudfoundry:masterfrom
marsteg:dns/newRaceStrategy

marsteg commented Mar 6, 2026 •

edited

Loading

Uh oh!

marsteg commented Mar 6, 2026

Uh oh!

beyhan commented Mar 6, 2026 •

edited

Loading

Uh oh!

beyhan commented Mar 30, 2026

Uh oh!

marsteg commented Mar 30, 2026

Uh oh!

beyhan commented Mar 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

marsteg commented Mar 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

marsteg commented Mar 6, 2026

Uh oh!

beyhan commented Mar 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

beyhan commented Mar 30, 2026

Uh oh!

marsteg commented Mar 30, 2026

Uh oh!

beyhan commented Mar 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

marsteg commented Mar 6, 2026 •

edited

Loading

beyhan commented Mar 6, 2026 •

edited

Loading