Skip to content

Add WFT Chunking v2#1281

Open
mjameswh wants to merge 1 commit into
mainfrom
wft-chunking-v2
Open

Add WFT Chunking v2#1281
mjameswh wants to merge 1 commit into
mainfrom
wft-chunking-v2

Conversation

@mjameswh
Copy link
Copy Markdown
Contributor

What was changed

  • Add Workflow Task Chunking v2. This is a complete rewrite of the Workflow Task Chunking algorithm following a stricter, grammar-like design.

Why?

  • Fix NDE on WFT heartbeats with speculative Updates
  • Fix possibility of inconsistent reported WFT's time
  • Adopt an algorithmic style that makes it significantly harder to introduce bugs resulting from unexpected interactions between commands requiring look ahead (i.e. Local Activities and Updates), Workflow Task heartbeat, and Workflow History pagination.

Rollout

  • Given that the chunking v2 algorithm is not strictly compatible with v1, the new algorithm is gated behind an SDK Flag that has to be set on the first WFTCompleted of a given Workflow Execution (i.e. this flag can't be turned on midway of an existing Workflow).
  • The flag is not yet turned on by default; it will be turned on at a later time, once it is known that at least one release of each Core-based SDK has been released. This is in accordance with usual Temporal SDK development practice, to ensure that users have a safe rollback path to a prior release of the SDK if needed.
  • Users may choose to immediately turn on WFT Chunking v2 on new Workflow Executions by setting the TEMPORAL_USE_WFT_CHUNKING_V2 environment variable to true.

Checklist

  1. How was this tested:
  • Added new tests
  • Diverted one case of the CI test matrix to run ALL tests using WFT Chunking v2. This confirms that no unwanted visible changes results from running workflows with v1 vs v2.
  1. Any docs updates needed?
  • Complete update of the "workflow_task_chunking.md" file.

@mjameswh mjameswh requested a review from a team as a code owner May 19, 2026 14:44
@mjameswh mjameswh requested a review from Copilot May 19, 2026 14:44
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces Workflow Task Chunking v2 to the Rust Core SDK, gated behind a new internal flag (WftChunkingV2) that is opt-in for new workflow executions via TEMPORAL_USE_WFT_CHUNKING_V2. It adds a stricter, grammar/state-machine-based chunking algorithm intended to prevent nondeterminism in heartbeat/speculative-update scenarios, and expands test coverage (unit + integration) while updating CI to exercise the new mode.

Changes:

  • Add WftChunkingV2 internal flag + env-var opt-in logic, and update flag-writing behavior for first-WFT-only vs cumulative defaults.
  • Implement and wire a new v2 chunking algorithm (with time-sensitive event classification + speculative-update handling), plus extensive unit tests.
  • Add integration tests validating replay behavior/timestamps under v2, update CI to run one Linux matrix variant with v2 enabled, and expand architecture docs.

Reviewed changes

Copilot reviewed 8 out of 9 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
crates/sdk-core/tests/integ_tests/workflow_tests/replay.rs Adds integration tests to ensure v2 chunking behavior propagates through replay and preserves per-WFT observed timestamps.
crates/sdk-core/src/worker/workflow/machines/workflow_machines.rs Adjusts internal-flag initialization and flag writing behavior to support first-WFT-only defaults.
crates/sdk-core/src/worker/workflow/history_update.rs Adds v2 chunking algorithm, new paginator/update plumbing for speculative updates + v2 selection, and extensive test expansions.
crates/sdk-core/src/replay/history_builder.rs Extends test history builder to auto-stamp the first WFTCompleted with WftChunkingV2 and adds helpers for flag manipulation.
crates/sdk-core/src/internal_flags.rs Introduces WftChunkingV2 flag, env-var opt-in, and refactors default-enabled flag handling.
crates/sdk-core/src/core_tests/workflow_tasks.rs Updates internal-flag expectations to include first-WFT-only defaults when enabled.
crates/common/src/protos/mod.rs Adds HistoryEvent::is_wft_time_sensitive_event used by v2 chunking to prevent unsafe collapsing.
arch_docs/workflow_task_chunking.md Major doc rewrite describing LWFT model, v1/v2 behavior, and rollout plan (currently includes unfinished sections).
.github/workflows/per-pr.yml Runs one Linux CI matrix entry with TEMPORAL_USE_WFT_CHUNKING_V2=true for both unit and integration tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 195 to 210
@@ -183,13 +205,16 @@ impl HistoryPaginator {
event_queue: Default::default(),
next_page_token: NextPageToken::FetchFromStart,
final_events: req.original_wft.work.update.events,
has_pending_speculative_updates: !req.original_wft.work.messages.is_empty(),
has_wft_chunking_v2: req.original_wft.paginator.has_wft_chunking_v2,
};
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a valid concern 🤔

Comment thread crates/sdk-core/src/worker/workflow/history_update.rs
Comment thread crates/sdk-core/src/worker/workflow/history_update.rs
Comment on lines +60 to +77
// FIXME: What are absolute must-haves for LWFT?

## Workflow History Pagination

There are various ways The Worker obtains Workflow history from the server.

## Chunking history into Logical Workflow Tasks

It is critical that, on replay, Workflow history be split into LWFTs that correctly align with the
LWFTs processed during the original execution. Differences in how events are grouped affect how and
when those events are presented to workflow code. That, in turn, may affect which commands the
workflow produces, or when it produces them, resulting in non-determinism errors.

It is also critical that

Note that we said that LWFTs have to _align_ with the original execution, not that they have to be
_strictly the same_. In some cases, it is

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants