Skip to content

feat: Implement activity heartbeat#73

Open
timl3136 wants to merge 1 commit intocadence-workflow:mainfrom
timl3136:heartbeat
Open

feat: Implement activity heartbeat#73
timl3136 wants to merge 1 commit intocadence-workflow:mainfrom
timl3136:heartbeat

Conversation

@timl3136
Copy link
Member

What changed?
Implement the activity heartbeat feature

Why?
Heartbeat is a crucial component that will inform the server about client status, especially useful in long-running activities

How did you test it?
Unit test

Potential risks

Release notes

Documentation Changes

Signed-off-by: Tim Li <ltim@uber.com>
@gitar-bot
Copy link

gitar-bot bot commented Mar 18, 2026

🔍 CI failure analysis for f9e8396: Code formatting check failed: the new test file tests/integration_tests/workflow/test_heartbeat.py violates Black code style rules and needs to be reformatted before merge.

Overview

CI analysis found 1 code formatting violation in the activity heartbeat feature PR. The build completed successfully through dependency installation and protobuf generation, but failed at the final Black code style validation step.

Failures

Black Code Formatter Violation (confidence: high)

  • Type: configuration
  • Affected jobs: 67647412522
  • Related to PR: yes
  • Root cause: The newly added test file tests/integration_tests/workflow/test_heartbeat.py does not conform to the project's Black code formatting standards. The CI job runs Black in check mode (--check) and exits with code 1 when formatting violations are detected.
  • Suggested fix: Run black tests/integration_tests/workflow/test_heartbeat.py locally to auto-format the file according to project standards, then commit and push the changes. Alternatively, run the full formatting check (uv run black --check .) before pushing to catch any violations early.

Summary

  • PR-related failures: 1 — code formatting violation in newly added integration test file
  • Infrastructure/flaky failures: None
  • Recommended action: Author should format the test file using Black and push the corrected changes. This is a straightforward fix that does not indicate any logic errors in the heartbeat implementation itself.
Code Review ⚠️ Changes requested 0 resolved / 1 findings

Activity heartbeat implementation fires heartbeat RPCs without error handling, silently swallowing failures in both async and sync contexts, which could mask connectivity issues in production.

⚠️ Bug: Fire-and-forget heartbeat silently swallows RPC errors

📄 cadence/_internal/activity/_context.py:42 📄 cadence/_internal/activity/_context.py:71-74

_Context.heartbeat() uses asyncio.ensure_future() which creates a fire-and-forget task. If the heartbeat RPC fails (e.g., activity cancelled by the server, network error, or timeout exceeded), the exception is silently discarded — the user gets no feedback that their heartbeat failed.

In Cadence, a heartbeat failure (specifically CancellationAlreadyRequested) is the primary mechanism to notify a long-running activity that it has been cancelled. Silently dropping these errors means activities cannot respond to cancellation requests, which defeats a key purpose of heartbeating.

Additionally, Python will log Task exception was never retrieved warnings for unhandled exceptions in fire-and-forget tasks, polluting logs.

Suggested fix
Consider awaiting the heartbeat or adding an exception handler callback:

    def heartbeat(self, *details: Any) -> None:
        task = asyncio.ensure_future(
            self._heartbeat_sender.send_heartbeat(*details)
        )
        task.add_done_callback(self._on_heartbeat_done)

    @staticmethod
    def _on_heartbeat_done(task: asyncio.Task) -> None:
        if task.exception() is not None:
            logger.warning("Heartbeat failed: %s", task.exception())
🤖 Prompt for agents
Code Review: Activity heartbeat implementation fires heartbeat RPCs without error handling, silently swallowing failures in both async and sync contexts, which could mask connectivity issues in production.

1. ⚠️ Bug: Fire-and-forget heartbeat silently swallows RPC errors
   Files: cadence/_internal/activity/_context.py:42, cadence/_internal/activity/_context.py:71-74

   `_Context.heartbeat()` uses `asyncio.ensure_future()` which creates a fire-and-forget task. If the heartbeat RPC fails (e.g., activity cancelled by the server, network error, or timeout exceeded), the exception is silently discarded — the user gets no feedback that their heartbeat failed.
   
   In Cadence, a heartbeat failure (specifically `CancellationAlreadyRequested`) is the primary mechanism to notify a long-running activity that it has been cancelled. Silently dropping these errors means activities cannot respond to cancellation requests, which defeats a key purpose of heartbeating.
   
   Additionally, Python will log `Task exception was never retrieved` warnings for unhandled exceptions in fire-and-forget tasks, polluting logs.

   Suggested fix:
   Consider awaiting the heartbeat or adding an exception handler callback:
   
       def heartbeat(self, *details: Any) -> None:
           task = asyncio.ensure_future(
               self._heartbeat_sender.send_heartbeat(*details)
           )
           task.add_done_callback(self._on_heartbeat_done)
   
       @staticmethod
       def _on_heartbeat_done(task: asyncio.Task) -> None:
           if task.exception() is not None:
               logger.warning("Heartbeat failed: %s", task.exception())

Tip

Comment Gitar fix CI or enable auto-apply: gitar auto-apply:on

Options

Auto-apply is off → Gitar will not commit updates to this branch.
Display: compact → Showing less information.

Comment with these commands to change:

Auto-apply Compact
gitar auto-apply:on         
gitar display:verbose         

Was this helpful? React with 👍 / 👎 | Gitar

return self._info

def heartbeat(self, *details: Any) -> None:
asyncio.ensure_future(self._heartbeat_sender.send_heartbeat(*details))
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Bug: Fire-and-forget heartbeat silently swallows RPC errors

_Context.heartbeat() uses asyncio.ensure_future() which creates a fire-and-forget task. If the heartbeat RPC fails (e.g., activity cancelled by the server, network error, or timeout exceeded), the exception is silently discarded — the user gets no feedback that their heartbeat failed.

In Cadence, a heartbeat failure (specifically CancellationAlreadyRequested) is the primary mechanism to notify a long-running activity that it has been cancelled. Silently dropping these errors means activities cannot respond to cancellation requests, which defeats a key purpose of heartbeating.

Additionally, Python will log Task exception was never retrieved warnings for unhandled exceptions in fire-and-forget tasks, polluting logs.

Suggested fix:

Consider awaiting the heartbeat or adding an exception handler callback:

    def heartbeat(self, *details: Any) -> None:
        task = asyncio.ensure_future(
            self._heartbeat_sender.send_heartbeat(*details)
        )
        task.add_done_callback(self._on_heartbeat_done)

    @staticmethod
    def _on_heartbeat_done(task: asyncio.Task) -> None:
        if task.exception() is not None:
            logger.warning("Heartbeat failed: %s", task.exception())

Was this helpful? React with 👍 / 👎 | Reply gitar fix to apply this suggestion

return _Context(self._client, info, activity_def, heartbeat_sender)
else:
return _SyncContext(self._client, info, activity_def, self._thread_pool)
loop = asyncio.get_running_loop()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need to pass it in. You can get loop anywhere inside the loop

class _HeartbeatSender:
def __init__(
self,
worker_stub: Any,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Use the Client type rather than Any

from cadence.data_converter import DataConverter


class _HeartbeatSender:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that this is a separate class, we'll probably want two main things in the future:

  • Automatic heartbeating
  • Handling the heartbeat response to cancel the activity execution.

So the scope of this will definitely grow.

return self._info

def heartbeat(self, *details: Any) -> None:
asyncio.ensure_future(self._heartbeat_sender.send_heartbeat(*details))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: We should log errors. Long term we need to handle errors to terminate the activity execution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants