fix: prevent xray goroutine leak via background sweeper (#79)
Conversation
Reviewer's Guide

Introduces a draining lifecycle for cached Xray servers with a background sweeper goroutine, so Close/CloseAll mark instances for deferred shutdown instead of closing them immediately, preventing goroutine leaks while allowing in-flight work to finish.

Sequence diagram for deferred Close with background sweeper

sequenceDiagram
participant Caller
participant Functions
participant GlobalState
participant Server
participant Sweeper
participant CoreInstance as core.Instance
Caller->>Functions: Close(proxyURL)
Functions->>GlobalState: Lock mu
Functions->>GlobalState: lookup servers[proxyURL]
alt server exists and DrainedAt is zero
Functions->>Server: set DrainedAt = time.Now()
else server missing or already draining
Note over Functions,Server: No change
end
Functions->>GlobalState: Unlock mu
loop every sweepInterval
Sweeper->>GlobalState: Sleep(sweepInterval)
Sweeper->>GlobalState: Lock mu
Sweeper->>GlobalState: iterate servers
alt server DrainedAt != zero and now - DrainedAt > drainTimeout
Sweeper->>CoreInstance: Instance.Close()
Sweeper->>GlobalState: delete servers[proxyURL]
else not ready to close
Note over Sweeper,Server: Keep server in draining or active state
end
Sweeper->>GlobalState: Unlock mu
end
Sequence diagram for getServer reviving a draining instance

sequenceDiagram
participant Caller
participant Functions
participant GlobalState
participant Server
Caller->>Functions: getServer(proxyURL)
Functions->>GlobalState: Lock mu
Functions->>GlobalState: lookup servers[proxyURL]
alt server exists
alt Server.DrainedAt is non_zero
Functions->>Server: set DrainedAt = zero_time
else Server.DrainedAt is zero
Note over Functions,Server: Already active
end
Functions-->>Caller: *Server
else server missing
Functions-->>Caller: nil
end
Functions->>GlobalState: Unlock mu
Class diagram for updated Xray server lifecycle management

classDiagram
class Server {
core.Instance Instance
int SocksPort
time.Time DrainedAt
}
class GlobalState {
sync.Mutex mu
map~string,*Server~ servers
bool sweeping
const time.Duration drainTimeout
const time.Duration sweepInterval
}
class Functions {
+getServer(proxyURL string) *Server
+setServer(proxyURL string, instance *core.Instance, port int)
+Close(proxyURL string)
+CloseAll()
+sweeper()
+init()
}
GlobalState --> Server : manages
Functions --> Server : creates_and_updates
Functions --> GlobalState : reads_writes
State diagram for Server draining lifecycle

stateDiagram-v2
[*] --> Active
Active: DrainedAt = zero
Draining: DrainedAt = non_zero
Closed: removed from servers map
Active --> Draining: Close or CloseAll
Draining --> Active: getServer called
Draining --> Closed: sweeper after drainTimeout
Closed --> [*]
Hey - I've found 2 issues, and left some high level feedback:

- Holding `mu` while calling `srv.Instance.Close()` in `sweeper` risks blocking all map operations if close is slow; consider first collecting the instances to close under the lock, then releasing the lock and closing them outside the critical section.
- The `sweeping` package-level variable is declared but never used; it can be removed or repurposed (e.g., to guard multiple sweeper startups) to avoid confusion.
## Individual Comments
### Comment 1
<location path="xray/xray.go" line_range="120-123" />
<code_context>
defer mu.Unlock()
servers[proxyURL] = &Server{
- Instance: instance,
+ Instance: instance,
SocksPort: port,
+ DrainedAt: time.Time{},
}
}
</code_context>
<issue_to_address>
**issue (bug_risk):** Overwriting an existing draining server entry can leak the old instance because the sweeper no longer sees it.
With the new draining design, `setServer` now unconditionally overwrites `servers[proxyURL]`. If a server has been marked draining by `Close`/`CloseAll` but the sweeper hasn’t run yet, a new `setServer` for the same URL will remove the draining `*Server` from the map, so the sweeper never sees or closes it. To avoid this leak, ensure any existing entry is explicitly `Close()`d before overwriting, or always close the previous instance when replacing it, regardless of draining state.
</issue_to_address>
### Comment 2
<location path="xray/xray.go" line_range="142-148" />
<code_context>
+// CloseAll marks all servers as draining immediately. The sweeper will close
+// each one after drainTimeout.
func CloseAll() {
mu.Lock()
defer mu.Unlock()
- for url, server := range servers {
- server.Instance.Close() //nolint: errcheck
- delete(servers, url)
+ now := time.Now()
+ for _, srv := range servers {
+ if srv.DrainedAt.IsZero() {
+ srv.DrainedAt = now
+ }
</code_context>
<issue_to_address>
**question (bug_risk):** CloseAll no longer guarantees synchronous shutdown of instances, which may surprise callers depending on previous semantics.
Previously this called `Instance.Close()` and removed each server entry before returning, so callers could rely on all instances being fully shut down. Now it only sets `DrainedAt` and returns, leaving `Close()` and deletion to the sweeper after `drainTimeout`, which can introduce races or resource conflicts for callers expecting immediate shutdown (e.g., port reuse, goroutine termination). If that synchronous guarantee is required, consider either keeping `CloseAll` blocking for shutdown (with an internal helper for async draining) or introducing a separate async API and preserving the original semantics here.
</issue_to_address>
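If the synchronous guarantee is required, the reviewer's first option can be sketched like this: keep `CloseAll` blocking, closing and removing every entry before returning, and leave async draining to a separate helper. This is an illustrative sketch (with `instance` as a hypothetical stand-in for `core.Instance`), not the PR's code.

```go
package main

import (
	"fmt"
	"sync"
)

// instance is a hypothetical stand-in for core.Instance.
type instance struct{ closed bool }

func (i *instance) Close() error { i.closed = true; return nil }

type Server struct{ Instance *instance }

var (
	mu      sync.Mutex
	servers = map[string]*Server{}
)

// CloseAll keeps the original blocking semantics: every instance is closed
// and removed before the call returns, so callers can rely on ports being
// free and goroutines being asked to stop. Deferred draining would live in
// a separate helper instead of changing this function's contract.
func CloseAll() {
	mu.Lock()
	defer mu.Unlock()
	for url, srv := range servers {
		srv.Instance.Close() //nolint:errcheck
		delete(servers, url)
	}
}

func main() {
	a, b := &instance{}, &instance{}
	servers["a"] = &Server{Instance: a}
	servers["b"] = &Server{Instance: b}
	CloseAll()
	fmt.Println(a.closed && b.closed, len(servers)) // prints: true 0
}
```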
Codecov Report

❌ Patch coverage is
Additional details and impacted files

@@ Coverage Diff @@
## main #79 +/- ##
========================================
+ Coverage 8.42% 8.79% +0.37%
========================================
Files 27 27
Lines 1508 1523 +15
========================================
+ Hits 127 134 +7
- Misses 1367 1374 +7
- Partials 14 15 +1
Flags with carried forward coverage won't be shown.
- Add DrainedAt field to Server struct for graceful draining
- Add background sweeper goroutine that closes drained xray instances after 30s grace period (gives in-flight ops time to finish)
- sweeper runs every 10s to close expired draining instances
- Close() marks instance as draining instead of immediate close
- CloseAll() marks all instances as draining for sweeper to clean up
- This prevents goroutine accumulation when xray-core's Close() does not fully terminate all internal goroutines on short-lived proxy tests
Force-pushed from fead215 to 6fbbbe0
Background
The `xray.go` servers map caches Xray instances keyed by proxy URL. When `Close()` is called, it immediately calls `Instance.Close()` and removes the entry. However, xray-core's goroutines may not fully terminate on short-lived proxy tests, causing goroutine accumulation over time. After 18 days of sustained proxy testing the process holds 1.5GB RSS with 4GB swap full.
Changes
- Add `DrainedAt` field to `Server` struct (zero = active, non-zero = draining since)
- Add `sweeper()` goroutine that runs every 10s and closes instances that have been draining for more than 30s
- `Close()` now marks the instance as draining instead of closing immediately
- `CloseAll()` marks all instances as draining for the sweeper to clean up
- `getServer()` revives a draining instance if accessed again within the grace period

This gives in-flight operations a graceful period to finish while ensuring no goroutine leaks accumulate over time.
Acceptance Criteria
- `Close()` and `CloseAll()` still work correctly

Summary by Sourcery
Introduce a delayed-drain mechanism for Xray server instances to prevent goroutine leaks and allow graceful shutdown of in-flight operations.
Bug Fixes:
Enhancements: