Merged
21 changes: 19 additions & 2 deletions xray/xray.go
@@ -86,15 +86,32 @@ func init() {
 func sweeper() {
 	for {
 		time.Sleep(sweepInterval)
+
+		// Collect expired URLs under lock, then release the lock before closing
+		// to avoid blocking all map operations while Instance.Close() runs.
+		var expired []struct {
+			url string
+			srv *Server
+		}
 		mu.Lock()
 		now := time.Now()
 		for url, srv := range servers {
 			if !srv.DrainedAt.IsZero() && now.Sub(srv.DrainedAt) > drainTimeout {
-				srv.Instance.Close() //nolint: errcheck
-				delete(servers, url)
+				expired = append(expired, struct {
+					url string
+					srv *Server
+				}{url, srv})
 			}
 		}
 		mu.Unlock()
+
+		// Close instances outside the critical section.
+		for _, e := range expired {
+			e.srv.Instance.Close() //nolint: errcheck
+			mu.Lock()
+			delete(servers, e.url)
+			mu.Unlock()
+		}
 	}
 }

Comment on lines +92 to +101

issue (bug_risk): Reviving a draining server can race with the sweeper and still result in the instance being closed and deleted.

The sweeper computes expiry under the lock, releases it, and later re-enters the lock only to delete. In that window, getServer can revive a draining server by resetting DrainedAt to zero, but the sweeper still closes and deletes it because it never re-checks the expiry condition.

When re-entering the critical section, re-validate expiry before deletion. For example, store only URLs in expired, then on delete re-fetch srv from the map and confirm DrainedAt is still non-zero and past drainTimeout; if the server was revived, skip the close and delete.
