Refactor/globalindex external kv #330
Open
maning00 wants to merge 2 commits into
Consolidate the parallel ExternalKvBlockIndex into GlobalBlockIndex by
adding a LocationOwner discriminator (UMBP_OWNED vs EXTERNAL_HICACHE)
on every Location entry. Storage-backed blocks and external sglang
bind/unbind notifications now share one index, one lookup path, and
one eviction picker, removing a long-standing source of drift between
the two views.
- types: add LocationOwner enum, Location::SameIdentity, FullSyncScope
- global_block_index: owner-aware ApplyEvents / ReplaceNodeLocations /
  MatchExternal / FindEvictionCandidates
- master_client: ack-retained bundle outbox with seq numbers;
  owner-scoped full sync; bind/unbind/clear/flush external hashes APIs
- master_server: drop ReportExternalKv{Add,Remove,Clear} RPC handlers;
  MatchExternalKv now reads from the unified GlobalBlockIndex
- pybind: expose bind_external_hashes / unbind_external_hashes /
  unbind_all_external_hashes_at_tier / flush_external_queue
- proto: remove deprecated mutation RPCs (events ship via heartbeat
  bundles instead)
- delete external_kv_block_index.{h,cpp} and its dedicated unit tests;
  refresh test_global_block_index_events / test_router_dedup /
  test_peer_dram_allocator coverage
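
The owner discriminator described above can be pictured with a small Python model. This is only a sketch under stated assumptions, not the C++ implementation: `ToyGlobalBlockIndex`, its method names, the `"DRAM"` tier label, and the node ids are all hypothetical. It shows how one index can hold both kinds of entry while an owner-scoped replace touches only the entries of the given owner.

```python
from dataclasses import dataclass
from enum import Enum


class LocationOwner(Enum):
    UMBP_OWNED = "umbp_owned"
    EXTERNAL_HICACHE = "external_hicache"


@dataclass(frozen=True)
class Location:
    node_id: str
    tier: str  # tier names such as "DRAM" are placeholders for illustration
    owner: LocationOwner


class ToyGlobalBlockIndex:
    """Hypothetical stand-in for a unified index; not the real class."""

    def __init__(self):
        self._locations = {}  # block hash -> set of Location

    def add(self, block_hash, location):
        self._locations.setdefault(block_hash, set()).add(location)

    def replace_node_locations(self, node_id, owner, entries):
        """Drop only this node's entries with the given owner, then re-add."""
        for block_hash in list(self._locations):
            kept = {loc for loc in self._locations[block_hash]
                    if not (loc.node_id == node_id and loc.owner is owner)}
            if kept:
                self._locations[block_hash] = kept
            else:
                del self._locations[block_hash]
        for block_hash, tier in entries:
            self.add(block_hash, Location(node_id, tier, owner))

    def match_external(self, hashes):
        """Return node ids holding each hash as an external entry."""
        matched = {}
        for block_hash in hashes:
            external = sorted(
                loc.node_id for loc in self._locations.get(block_hash, ())
                if loc.owner is LocationOwner.EXTERNAL_HICACHE)
            if external:
                matched[block_hash] = external
        return matched

    def owners_of(self, block_hash):
        return {loc.owner for loc in self._locations.get(block_hash, ())}


index = ToyGlobalBlockIndex()
index.add("h1", Location("node-a", "DRAM", LocationOwner.UMBP_OWNED))
index.add("h1", Location("node-a", "DRAM", LocationOwner.EXTERNAL_HICACHE))
index.add("h2", Location("node-a", "DRAM", LocationOwner.EXTERNAL_HICACHE))

# An external-only full sync replaces h1/h2 external entries with h3,
# leaving the storage-backed entry for h1 untouched.
index.replace_node_locations(
    "node-a", LocationOwner.EXTERNAL_HICACHE, [("h3", "DRAM")])
matched = index.match_external(["h1", "h2", "h3"])
```

In this model the storage-backed entry for `h1` survives the external full sync, which is the property an owner-scoped sync is meant to provide.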
Restore the pre-v2.5 ReportExternalKvBlocks / RevokeExternalKvBlocks /
RevokeAllExternalKvBlocksAtTier RPC surface as a thin compatibility
layer on top of the unified GlobalBlockIndex. The deleted
ExternalKvBlockIndex class is NOT brought back — the restored handlers
delegate directly to GlobalBlockIndex::ApplyEvents with
LocationOwner::EXTERNAL_HICACHE, sharing the same backing store as the
v2.5 heartbeat bundle outbox path.
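
As a rough illustration of the "thin compatibility layer" idea, the following Python sketch has three handlers that do nothing except build events and pass them to a single `apply_events` call. Everything here is an assumption made for illustration: `ToyIndex`, `CompatHandlers`, the event tuple shape, the argument lists, and the `"DRAM"` tier label are hypothetical and are not taken from the actual proto or C++ code.

```python
from enum import Enum


class EventKind(Enum):
    ADD = 1
    REMOVE = 2
    CLEAR_AT_TIER = 3


EXTERNAL_HICACHE = "EXTERNAL_HICACHE"  # stands in for LocationOwner::EXTERNAL_HICACHE


class ToyIndex:
    """Hypothetical single backing store shared by every write path."""

    def __init__(self):
        self.entries = set()  # (block_hash, node_id, tier, owner)

    def apply_events(self, node_id, owner, events):
        """Apply events and return how many entries were removed."""
        removed = 0
        for kind, tier, block_hash in events:
            key = (block_hash, node_id, tier, owner)
            if kind is EventKind.ADD:
                self.entries.add(key)
            elif kind is EventKind.REMOVE:
                if key in self.entries:
                    self.entries.remove(key)
                    removed += 1
            elif kind is EventKind.CLEAR_AT_TIER:
                doomed = {e for e in self.entries
                          if e[1:] == (node_id, tier, owner)}
                self.entries -= doomed
                removed += len(doomed)
        return removed


class CompatHandlers:
    """Restored RPC names, each delegating straight to apply_events."""

    def __init__(self, index):
        self._index = index

    def report_external_kv_blocks(self, node_id, hashes, tier):
        events = [(EventKind.ADD, tier, h) for h in hashes]
        self._index.apply_events(node_id, EXTERNAL_HICACHE, events)

    def revoke_external_kv_blocks(self, node_id, hashes, tier):
        events = [(EventKind.REMOVE, tier, h) for h in hashes]
        return self._index.apply_events(node_id, EXTERNAL_HICACHE, events)

    def revoke_all_external_kv_blocks_at_tier(self, node_id, tier):
        events = [(EventKind.CLEAR_AT_TIER, tier, None)]
        return self._index.apply_events(node_id, EXTERNAL_HICACHE, events)


index = ToyIndex()
handlers = CompatHandlers(index)
handlers.report_external_kv_blocks("node-a", ["h1", "h2", "h3"], "DRAM")

# A direct apply_events call stands in for the heartbeat bundle path;
# it lands in the same store as the compat handlers.
index.apply_events("node-a", EXTERNAL_HICACHE, [(EventKind.ADD, "DRAM", "h4")])

revoked = handlers.revoke_external_kv_blocks("node-a", ["h1"], "DRAM")
cleared = handlers.revoke_all_external_kv_blocks_at_tier("node-a", "DRAM")
```

Because both write paths share one store in this model, the revoke-all call also clears the entry that arrived through the simulated bundle path.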
Two distinct surfaces, both consistent with pre-v2.5 signatures:
* UMBPMasterClient — 3-arg explicit-node_id sync RPCs for
  schedulers / sidecars that report on behalf of a registered
  worker. Report requires node_id alive in ClientRegistry; revoke
  paths skip the alive check (index delete is always allowed).
  Empty node_id / empty hashes return INVALID_ARGUMENT.
* mori.cpp.UMBPClient — 2-arg implicit-self aliases that route
  through BindExternalHashes / UnbindExternalHashes /
  UnbindAllExternalHashesAtTier + FlushExternalQueue, so the
  entries are tracked in external_current_set_ and survive a
  subsequent FULL_SYNC_EXTERNAL_HICACHE replay.
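
The validation rules for the explicit-node_id surface can be sketched as follows. This is a simplified model, not the server code: `ToyMaster`, the `Status` enum, and in particular the `NODE_NOT_ALIVE` status are hypothetical (the commit message does not say which status a report for a dead node returns), and applying the empty-argument check to both report and revoke is one reading of the text above.

```python
from enum import Enum


class Status(Enum):
    OK = 0
    INVALID_ARGUMENT = 1
    NODE_NOT_ALIVE = 2  # hypothetical; the real status code is not stated


class ToyMaster:
    """Hypothetical model of the explicit-node_id report/revoke checks."""

    def __init__(self, alive_nodes):
        self.alive_nodes = set(alive_nodes)  # stands in for ClientRegistry
        self.external = {}  # node_id -> set of (block_hash, tier)

    def report(self, node_id, hashes, tier):
        if not node_id or not hashes:
            return Status.INVALID_ARGUMENT
        if node_id not in self.alive_nodes:
            # Only the report path requires a live, registered node.
            return Status.NODE_NOT_ALIVE
        self.external.setdefault(node_id, set()).update(
            (h, tier) for h in hashes)
        return Status.OK

    def revoke(self, node_id, hashes, tier):
        if not node_id or not hashes:
            return Status.INVALID_ARGUMENT
        # No alive check: deleting index entries is always allowed.
        self.external.get(node_id, set()).difference_update(
            (h, tier) for h in hashes)
        return Status.OK


master = ToyMaster(["worker-1"])
first_report = master.report("worker-1", ["h1", "h2"], "DRAM")

master.alive_nodes.discard("worker-1")  # worker drops out of the registry
late_report = master.report("worker-1", ["h3"], "DRAM")
late_revoke = master.revoke("worker-1", ["h1", "h2"], "DRAM")

empty_node = master.report("", ["h1"], "DRAM")
empty_hashes = master.revoke("worker-1", [], "DRAM")
```

Skipping the alive check on revoke means a scheduler can still clean up entries for a worker that has already disappeared, which is the asymmetry the first bullet describes.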
Other changes:
* proto: restore 3 request/response messages and service entries
  byte-identical to pre-v2.5 (existing master and new master are
  wire-compatible for these RPCs and MatchExternalKv)
* global_block_index: RemoveLocationsLocked now returns the count
  of removed locations; ApplyEvents accumulates it for
  CLEAR_AT_TIER so RevokeAllExternalKvBlocksAtTier reports a
  truthful BLOCKS_TOTAL metric
* master_server: reuse existing MORI_UMBP_METRIC_EXT_KV_*
  constants instead of inventing new names; consistent with
  existing dashboards
* IUMBPClient / DistributedClient / PoolClient / StandaloneClient:
  propagate the 2-arg API down the data-plane stack (no-op stub
  on standalone)
* docs/api/umbp.rst: restore method-table rows and add a
  "Where to call from" subsection explaining the two write paths
  and their visibility / lifecycle differences
* src/umbp/doc/design-external-kv-report-revoke-bc.md: full
  design notes (background, API surface, alive-check rationale,
  coexistence with bundle outbox, test plan)
* tests: 12 new Python tests covering both surfaces, including a
  regression guard (test_umbpclient_two_arg_alias_survives_external_full_sync)
  that prevents the 2-arg alias from accidentally being changed
  to a non-outbox path
* test_global_block_index_events: assert the new
  CLEAR_AT_TIER mutated-count return value
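
The reason the regression guard matters can be shown with a toy model of the outbox path. The sketch below is an assumption-laden illustration, not the real client: `ToyClient`, `ToyMaster`, `direct_report`, and the `"DRAM"` tier label are hypothetical, and the real full sync is driven by the master/heartbeat protocol rather than a direct method call. It contrasts an alias that goes through bind + flush (and is therefore remembered client-side) with a write that bypasses the outbox.

```python
class ToyMaster:
    """Hypothetical master holding one node's external entries."""

    def __init__(self):
        self.external = set()  # (block_hash, tier)

    def apply_bundle(self, events):
        for op, entry in events:
            if op == "bind":
                self.external.add(entry)
            else:
                self.external.discard(entry)

    def full_sync_external(self, snapshot):
        # A full sync replaces the node's external entries wholesale.
        self.external = set(snapshot)

    def direct_report(self, hashes, tier):
        # A non-outbox write: the client keeps no record of these entries.
        self.external.update((h, tier) for h in hashes)


class ToyClient:
    def __init__(self, master):
        self._master = master
        self._current = set()  # plays the role of external_current_set_
        self._queue = []       # pending events awaiting flush

    def bind_external_hashes(self, hashes, tier):
        for h in hashes:
            self._current.add((h, tier))
            self._queue.append(("bind", (h, tier)))

    def flush_external_queue(self):
        self._master.apply_bundle(self._queue)
        self._queue = []

    def report_external_kv_blocks(self, hashes, tier):
        # 2-arg alias: bind then flush, so entries are tracked locally.
        self.bind_external_hashes(hashes, tier)
        self.flush_external_queue()

    def full_sync(self):
        self._master.full_sync_external(self._current)


master = ToyMaster()
client = ToyClient(master)
client.report_external_kv_blocks(["h1", "h2"], "DRAM")
master.direct_report(["h9"], "DRAM")

before = set(master.external)
client.full_sync()
after = set(master.external)
```

In this model the entries written through the alias survive the replay, while the entry written outside the outbox is dropped, which is the failure the regression test is described as guarding against.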
No description provided.