Skip to content

fix(client): wrap notify_stream in with_stale_connection_retry#155

Merged
mhenrixon merged 1 commit into
mainfrom
fix/notify-stream-stale-connection-retry
May 14, 2026
Merged

fix(client): wrap notify_stream in with_stale_connection_retry#155
mhenrixon merged 1 commit into
mainfrom
fix/notify-stream-stale-connection-retry

Conversation

@mhenrixon
Copy link
Copy Markdown
Owner

@mhenrixon mhenrixon commented May 14, 2026

Summary

  • notify_stream was the only @pgmq.* call site on Pgbus::Client that bypassed with_stale_connection_retry, causing transient idle-socket TLS errors (SSL error: unexpected eof, SSL SYSCALL error) to propagate to callers — typically Rails requests, after_commit callbacks, or event handlers using the ephemeral broadcast path
  • Wrapped notify_stream body in with_stale_connection_retry with synchronized inside the retry, matching the pattern used by send_message, send_batch, and every other method on Pgbus::Client
  • Added 4 specs covering: matched retry succeeds, non-matching error propagates, double-failure re-raises, and warning log emitted on retry

No public API change, no configuration change, no migration.

Test plan

  • bundle exec rspec spec/pgbus/client/notify_stream_spec.rb — 8 examples, 0 failures
  • bundle exec rspec spec/pgbus/client_spec.rb — 116 examples, 0 failures
  • bundle exec rubocop lib/pgbus/client/notify_stream.rb spec/pgbus/client/notify_stream_spec.rb — clean
  • Verify in staging with ephemeral Turbo broadcasts under idle-pool conditions

Summary by CodeRabbit

  • Bug Fixes

    • Notification delivery now includes automatic retry logic for stale PostgreSQL connections, improving reliability when transient connection issues occur. The system will attempt to recover from idle-socket SSL EOF conditions automatically.
  • Tests

    • Added test coverage for stale connection recovery scenarios, validating retry attempts, error propagation for non-recoverable failures, and warning notifications during recovery operations.

Review Change Stack

notify_stream was the only PGMQ call site on Pgbus::Client that
bypassed with_stale_connection_retry, causing transient idle-socket
TLS errors (SSL EOF, SSL SYSCALL) to propagate to callers instead
of being absorbed by the one-shot retry that every other method uses.

This wraps the body of notify_stream in with_stale_connection_retry
with synchronized inside the retry (matching send_message and the
rest of Client), so the ephemeral broadcast path has the same
resilience as every other producer.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 14, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI (base), Organization UI (inherited)

Review profile: ASSERTIVE

Plan: Pro

Run ID: e7bf9126-4680-4b46-8f7e-d2a690ed1f1b

📥 Commits

Reviewing files that changed from the base of the PR and between d6c7e42 and 502403a.

📒 Files selected for processing (2)
  • lib/pgbus/client/notify_stream.rb
  • spec/pgbus/client/notify_stream_spec.rb

📝 Walkthrough

Walkthrough

Wraps the pg_notify execution in NotifyStream#notify_stream with with_stale_connection_retry to handle stale PostgreSQL connections that fail with idle-socket SSL EOF errors. Tests validate single retry on matching errors, non-retry behavior for other connection errors, re-raising after double failure, and warning log emission.

Changes

Stale Connection Retry in notify_stream

Layer / File(s) Summary
Stale connection retry wrapper and validation tests
lib/pgbus/client/notify_stream.rb, spec/pgbus/client/notify_stream_spec.rb
notify_stream wraps its pg_notify execution with with_stale_connection_retry to retry once on idle-socket SSL EOF connection errors. Tests cover single retry on matching errors, non-retry on mismatched messages, re-raise after second failure, and warning log emission.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • mhenrixon/pgbus#152: Both PRs modify Pgbus::Client::NotifyStream#notify_stream in lib/pgbus/client/notify_stream.rb, changing the control flow around the connection/pg_notify execution.
  • mhenrixon/pgbus#140: Both PRs introduce and apply the same with_stale_connection_retry logic around enqueue/publishing operations when PGMQ::Errors::ConnectionError indicates a stale/killed PG connection.

Suggested labels

bug

Poem

🐰 A connection once stale, now retries with care,
One gentle attempt before raising despair,
The logger warns softly when seconds winds round,
PostgreSQL's quirks now gracefully sound. 🌟

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'fix(client): wrap notify_stream in with_stale_connection_retry' accurately describes the main change—wrapping the notify_stream method with retry logic for stale connections.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/notify-stream-stale-connection-retry

Comment @coderabbitai help to get the list of available commands and usage tips.

@mhenrixon mhenrixon merged commit 216a32a into main May 14, 2026
9 checks passed
@mhenrixon mhenrixon deleted the fix/notify-stream-stale-connection-retry branch May 14, 2026 11:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant