Context
The current Rust port of core/qbft (crates/core/src/qbft/mod.rs) is a faithful translation of the Go reference but built on synchronous, thread-based primitives:
- crossbeam::channel for transport
- std::thread::scope for spawning per-process workers
- A blocking pub fn run(...) entry point
- cancellation::CancellationToken (third-party crate, not tokio_util)
This is at odds with the rest of Pluto, which is tokio-async end to end. Every consumer that will eventually wire QBFT in (core/consensus, core/scheduler, the duty pipeline) is async. The current shape forces either (a) bridging via spawn_blocking and ad-hoc channel adapters, or (b) blocking inside async tasks. Neither is acceptable: (a) adds an adapter layer at every call site, and (b) stalls the runtime's worker threads.
Closes the gap left by #13, which produced the initial sync port.
Goal
Rewrite core/qbft with an async-Rust-native API that integrates cleanly with the rest of the workspace and preserves the Byzantine-safety guarantees of the Go reference.
Scope
API
- run becomes pub async fn run(...) -> Result<Decision, QbftError>
- Transport trait methods return impl Future<Output = Result<...>> + Send (or are declared as async fn directly in the trait)
- Cancellation uses tokio_util::sync::CancellationToken (already used elsewhere in Pluto)
- Channels: tokio::sync::mpsc (and oneshot where appropriate)
- No std::thread, no crossbeam, no std::sync::mpsc anywhere in the public or internal API
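The Transport item above could look roughly like the following. This is a minimal sketch, not the actual trait: Msg, TransportError, Recorder, the single broadcast method and the propose helper are all hypothetical names chosen for illustration, and the real trait in crates/core/src/qbft will carry more methods and richer types. It uses only the standard library and needs Rust 1.75 or later for impl Future in trait method signatures.

```rust
use std::future::Future;
use std::sync::Mutex;

/// Hypothetical message type; the real QBFT message carries more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct Msg {
    pub round: u32,
    pub value: u64,
}

/// Hypothetical transport error.
#[derive(Debug, PartialEq)]
pub enum TransportError {
    Closed,
}

/// Transport whose method returns a `Send` future, so callers can await it
/// inside tasks spawned on a multi-threaded runtime.
pub trait Transport {
    fn broadcast(&self, msg: Msg) -> impl Future<Output = Result<(), TransportError>> + Send;
}

/// In-memory transport that records every broadcast message (illustration only).
pub struct Recorder {
    sent: Mutex<Vec<Msg>>,
    closed: bool,
}

impl Recorder {
    pub fn new(closed: bool) -> Self {
        Self { sent: Mutex::new(Vec::new()), closed }
    }

    /// Snapshot of the messages broadcast so far.
    pub fn sent(&self) -> Vec<Msg> {
        self.sent.lock().unwrap().clone()
    }
}

impl Transport for Recorder {
    // An `async fn` in the impl satisfies the `impl Future + Send` signature
    // as long as the future it produces is actually `Send`.
    async fn broadcast(&self, msg: Msg) -> Result<(), TransportError> {
        if self.closed {
            return Err(TransportError::Closed);
        }
        self.sent.lock().unwrap().push(msg);
        Ok(())
    }
}

/// Generic caller: its own future is `Send` because `broadcast` promises one.
pub async fn propose<T: Transport + Sync>(t: &T, value: u64) -> Result<(), TransportError> {
    t.broadcast(Msg { round: 1, value }).await
}
```

Spelling out the + Send bound in the trait, rather than relying on a bare async fn, is what lets generic callers such as propose be handed to tokio::spawn without knowing the concrete transport.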
Internals
- Replace thread::scope with tokio::spawn + JoinSet (or select! for per-instance loops)
- Replace blocking sleeps/timers with tokio::time::sleep / Interval
- Clock abstraction (fake_clock.rs) becomes async-aware so deterministic tests still work, likely using tokio::time::pause() + advance() rather than the current fake clock
- Keep the algorithm shape (rounds, justifications, message bookkeeping) bit-for-bit identical to the Go reference — only the concurrency primitives change
Safety net (mandatory)
- Port charon/core/consensus/qbft/strategysim_internal_test.go (~1040 LOC) as the parity gate. This is the test that catches BFT correctness regressions; without it we cannot trust the rewrite.
- All current crates/core/src/qbft/internal_test.rs cases pass (or are replaced by equivalent async versions)
Acceptance criteria
- No pub fn run blocking entry point remains
- No crossbeam, std::thread, or third-party cancellation crate usage in crates/core/src/qbft/
- cargo clippy --workspace --all-targets --all-features -- -D warnings clean
- cargo +nightly fmt --all --check clean
- missing_docs warning re-enabled for the module
References
- crates/core/src/qbft/mod.rs (1305 LOC)
Notes
This is on the critical path for the duty-flow pipeline. core/consensus (#157) and downstream modules (core/scheduler #176, etc.) will consume the new async API directly, so this should land before consensus framework integration begins.