
feat(internal, cmd): protocol-migrate current-state#553

Open
aristidesstaffieri wants to merge 7 commits into feature/protocol-migrate-history from feature/protocol-migrate-current-state

@aristidesstaffieri aristidesstaffieri commented Mar 30, 2026

Closes #516

What

Implements the protocol-migrate current-state CLI subcommand and extracts a shared migration engine to eliminate duplication between the history and current-state migration paths.

  • protocol-migrate current-state — builds protocol current state from a user-specified --start-ledger forward to the tip, converging with live ingestion via CAS-gated cursors on protocol_{ID}_current_state_cursor. Mirrors the history migration lifecycle: validate → process batches → converge → complete.
  • Shared protocolMigrateEngine with migrationStrategy struct — parameterizes the 5 substitution points between history and current-state (status field, status update method, cursor name function, persist method, start ledger source), eliminating ~1100 lines of duplication. Both history and current-state services are now thin wrappers (~100 lines each) over the shared engine.
  • UpdateCurrentStateMigrationStatus added to ProtocolsModel for current-state migration lifecycle tracking.
  • Shared CLI helpers (buildMigrationCommand/runMigration) extracted in cmd/protocol_migrate.go to deduplicate flag definitions and setup logic between subcommands.
  • Integration test (TestCurrentStateMigrationThenLiveIngestionHandoff) verifying current-state migration hands off cleanly to live ingestion via CAS cursor convergence.
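
To make the five substitution points concrete, here is a minimal sketch of what a strategy struct of this shape could look like. It is not the code in this PR: the field names, signatures, status column names, and the history cursor name are assumptions for illustration. Only the `protocol_{ID}_current_state_cursor` pattern comes from the description above.

```go
package main

import "fmt"

// migrationStrategy groups the five points where the history and
// current-state migrations differ. Field names and signatures are
// illustrative assumptions, not the code in this PR.
type migrationStrategy struct {
	statusField  string                                       // status column tracked on the protocol row
	updateStatus func(protocolID, status string) error        // status update method
	cursorName   func(protocolID string) string               // cursor name function
	persist      func(protocolID string, ledger uint32) error // persist method
	startLedger  func() (uint32, error)                       // start ledger source
}

// statusLog records status updates in memory, standing in for ProtocolsModel.
type statusLog map[string]string

// currentStateStrategy starts from a caller-supplied ledger (the --start-ledger flag).
func currentStateStrategy(statuses statusLog, startLedger uint32) migrationStrategy {
	return migrationStrategy{
		statusField: "current_state_migration_status", // assumed column name
		updateStatus: func(id, status string) error {
			statuses[id] = status
			return nil
		},
		cursorName: func(id string) string {
			return fmt.Sprintf("protocol_%s_current_state_cursor", id)
		},
		persist:     func(id string, ledger uint32) error { return nil }, // no-op placeholder
		startLedger: func() (uint32, error) { return startLedger, nil },
	}
}

// historyStrategy reuses the same shape and swaps the strategy-specific parts.
func historyStrategy(statuses statusLog, oldestRetainedLedger uint32) migrationStrategy {
	s := currentStateStrategy(statuses, oldestRetainedLedger)
	s.statusField = "history_migration_status" // assumed column name
	s.cursorName = func(id string) string {
		return fmt.Sprintf("protocol_%s_history_cursor", id) // assumed cursor name
	}
	return s
}
```

With this shape, an engine only ever calls the strategy's fields, so adding a third migration kind would mean supplying another set of five values rather than another copy of the loop.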

Why

Protocol current-state production requires building a complete snapshot of protocol state from each protocol's deployment ledger forward — unlike history migration which only covers the retention window. Without this command, protocols only have current state from the point live ingestion started, missing all prior state. The shared engine extraction was necessary because the history and current-state migration paths are structurally identical (validate, batch-process, CAS-advance, converge, complete) with only 5 strategy-specific substitution points.
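
The shared lifecycle (validate, batch-process, CAS-advance, converge, complete) can be sketched roughly as follows. This is a single-threaded, in-memory illustration and not the engine from this PR: the `cursor` type, the `migrate` signature, and the status names are hypothetical, and a failed CAS is assumed here to mean that live ingestion has taken over the cursor.

```go
package main

import "errors"

// Status values are assumed names for the migration lifecycle states.
const (
	statusSuccess = "success"
	statusFailed  = "failed"
)

// cursor is an in-memory stand-in for one ingest_store row.
type cursor struct{ value uint32 }

// compareAndSwap advances the cursor only if it still holds the expected value.
func (c *cursor) compareAndSwap(expected, next uint32) bool {
	if c.value != expected {
		return false
	}
	c.value = next
	return true
}

// migrate walks the phases: validate, process batches with a CAS advance
// after each one, stop once converged with the tip, then report completion.
func migrate(c *cursor, start, batchSize uint32, tip func() uint32,
	processBatch func(from, to uint32) error) (string, error) {
	// Validate inputs before touching any state.
	if start == 0 || batchSize == 0 {
		return statusFailed, errors.New("start ledger and batch size must be positive")
	}
	c.value = start - 1 // the cursor holds the last ledger already written
	for {
		latest := tip()
		if c.value >= latest {
			return statusSuccess, nil // converged with the tip
		}
		from := c.value + 1
		to := from + batchSize - 1
		if to > latest {
			to = latest
		}
		if err := processBatch(from, to); err != nil {
			return statusFailed, err
		}
		if !c.compareAndSwap(from-1, to) {
			// Another writer moved the cursor first; assumed to be live
			// ingestion, so the migration hands off instead of retrying.
			return statusSuccess, nil
		}
	}
}
```

In a real deployment the cursor lives in the database and the swap would be a conditional update inside the same transaction as the batch write, which is what makes the handoff safe under concurrency.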

Known limitations

No protocols are implemented yet.

Issue that this PR addresses

#516

Checklist

PR Structure

  • It is not possible to break this PR down into smaller PRs.
  • This PR does not mix refactoring changes with feature changes.
  • This PR's title starts with the name of the package that is most changed in the PR, or all if the changes are broad or impact many packages.

Thoroughness

  • This PR adds tests for the new functionality or fixes.
  • All updated queries have been tested (refer to this check if the data set returned by the updated query is expected to be same as the original one).

Release

  • This is not a breaking change.
  • This is ready to be tested in development.
  • The new functionality is gated with a feature flag if this is not ready for production.

Copilot AI left a comment

Pull request overview

Adds protocol state-production migration tooling and supporting infrastructure: a new protocol-migrate current-state command plus a shared migration engine, alongside protocol WASM/contract tracking, classification (“protocol-setup”), and live-ingestion handoff via CAS cursors.

Changes:

  • Introduce shared protocolMigrateEngine with history/current-state strategies and add protocol-migrate current-state CLI subcommand.
  • Add protocol setup/classification pipeline (WASM spec extraction + validator interface/registry) and DB models/migrations for protocols, protocol_wasms, protocol_contracts.
  • Extend ingestion (live + backfill) and indexer buffer/processors to collect and persist protocol WASMs/contracts, plus new protocol-state metrics and caching.

Reviewed changes

Copilot reviewed 72 out of 73 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| internal/utils/retry.go | New generic retry helper used for ledger fetch retries. |
| internal/utils/retry_test.go | Unit tests for RetryWithBackoff. |
| internal/utils/collections.go | New BuildMap helper for slice→map with dup detection. |
| internal/utils/collections_test.go | Unit tests for BuildMap. |
| internal/utils/ingestion_utils.go | Add protocol cursor key helper functions. |
| internal/services/validator_registry.go | New validator registry for protocol-setup. |
| internal/services/validator_registry_test.go | Tests for validator registry behavior and concurrency. |
| internal/services/processor_registry.go | New processor registry for protocol-migrate/ingest wiring. |
| internal/services/processor_registry_test.go | Tests for processor registry behavior and concurrency. |
| internal/services/protocol_processor.go | Define ProtocolProcessor + input contract for state production. |
| internal/services/protocol_migrate.go | Shared migration engine with CAS cursor convergence/handoff. |
| internal/services/protocol_migrate_history.go | History migration wrapper wiring the shared engine. |
| internal/services/protocol_migrate_history_test.go | Tests for history strategy-specific behavior. |
| internal/services/protocol_migrate_current_state.go | Current-state migration wrapper wiring the shared engine. |
| internal/services/protocol_migrate_current_state_test.go | Tests for current-state strategy-specific behavior. |
| internal/services/protocol_validator.go | New validator/spec-extractor interfaces + wazero implementation. |
| internal/services/wasm_spec_extractor_test.go | Tests extracting specs from real WASMs + SEP41 validator check. |
| internal/services/protocol_setup.go | Protocol setup service to classify unclassified WASMs via RPC. |
| internal/services/protocol_setup_test.go | Unit tests for protocol setup service behavior. |
| internal/services/contract_validator.go | Refactor SEP-41 detection helpers + add SEP41ProtocolValidator. |
| internal/services/contract_validator_test.go | Update tests to use refactored helper functions. |
| internal/services/ingest.go | Add protocol processors + checkpoint service wiring; remove old retry helper. |
| internal/services/ingest_live.go | Add protocol state production pipeline, CAS gating, and contract caching. |
| internal/services/ingest_backfill.go | Persist protocol WASMs/contracts during catchup; use shared retry helper. |
| internal/services/mocks.go | Add mocks for ProtocolProcessor/Validator/WasmSpecExtractor and ChangeReader; remove obsolete token method mock. |
| internal/services/token_ingestion_test.go | Refactor tests for new TokenIngestionService config API and add helpers. |
| internal/indexer/processors/protocol_wasms.go | New processor extracting ContractCode WASM hashes. |
| internal/indexer/processors/protocol_wasms_test.go | Tests for protocol WASM extraction processor. |
| internal/indexer/processors/protocol_contracts.go | New processor mapping contract instances to WASM hashes. |
| internal/indexer/processors/protocol_contracts_test.go | Tests for protocol contract extraction processor. |
| internal/indexer/indexer.go | Wire new processors into per-operation processing and buffer. |
| internal/indexer/mocks.go | Add mocks for new ledger change processors. |
| internal/indexer/indexer_test.go | Update indexer tests to include protocol WASM/contract processors. |
| internal/indexer/indexer_buffer.go | Add buffer support for protocol WASMs/contracts with merge/clear semantics. |
| internal/indexer/indexer_buffer_test.go | Tests for new buffer protocol WASM/contract behavior. |
| internal/data/protocols.go | New protocols model + status update helpers. |
| internal/data/protocols_test.go | Tests for protocols model operations. |
| internal/data/protocol_wasms.go | New protocol_wasms model for batch insert + classification update. |
| internal/data/protocol_wasms_test.go | Tests for protocol_wasms batch insert/update. |
| internal/data/protocol_contracts.go | New protocol_contracts model for upsert + protocol lookup. |
| internal/data/protocol_contracts_test.go | Tests for protocol_contracts inserts/upserts/FK behavior. |
| internal/data/models.go | Add new protocol models to shared Models struct. |
| internal/data/mocks.go | Add mocks for new protocol models and TrustlineAssetModel. |
| internal/data/ingest_store.go | Add CAS method and cursor name constants. |
| internal/data/ingest_store_test.go | Tests for CompareAndSwap behavior. |
| internal/db/migrations/2026-03-09.0-protocols.sql | Create protocols table with classification + migration statuses. |
| internal/db/migrations/2026-02-20.0-protocol_wasms.sql | Create protocol_wasms table. |
| internal/db/migrations/2026-03-09.2-protocol_wasms_fk.sql | Add FK from protocol_wasms.protocol_id → protocols.id. |
| internal/db/migrations/2026-03-09.0-protocol_contracts.sql | Create protocol_contracts table referencing protocol_wasms. |
| internal/db/migrations/protocols/main.go | Add embedded “protocol registration” SQL runner. |
| internal/db/migrations/protocols/000_placeholder.sql | Placeholder for embedded protocol registration SQL files. |
| internal/db/migrate.go | Add RunProtocolMigrations entrypoint for embedded protocol SQL. |
| internal/metrics/metrics.go | Add protocol state production + cache metrics. |
| internal/metrics/metrics_test.go | Tests for newly added protocol metrics. |
| internal/metrics/mocks.go | Extend metrics mock with new protocol metric methods. |
| internal/ingest/ingest.go | Wire checkpoint service and processor registry into ingest dependency setup. |
| internal/loadtest/runner.go | Update loadtest wiring for new TokenIngestionService config API. |
| internal/integrationtests/main_test.go | Add DataMigrationTestSuite to integration test runner. |
| cmd/root.go | Register new protocol-setup and protocol-migrate commands. |
| cmd/protocol_setup.go | Add protocol-setup CLI command to classify WASMs via validators. |
| cmd/protocol_migrate.go | Add protocol-migrate CLI with history/current-state subcommands and shared flag setup. |
| docs/feature-design/data-migrations.md | Update feature design doc for protocol WASM/contract and migration flows. |
| internal/serve/graphql/generated/generated.go | Import ordering adjustment (generated). |
| internal/serve/graphql/resolvers/account.resolvers.go | Import ordering adjustment. |
| internal/serve/graphql/resolvers/mutations.resolvers.go | Import ordering adjustment. |
| internal/serve/graphql/resolvers/queries.resolvers.go | Import ordering adjustment. |
| internal/serve/graphql/resolvers/statechange.resolvers.go | Flatten resolver type declarations (no functional change). |

@aristidesstaffieri aristidesstaffieri force-pushed the feature/protocol-migrate-history branch from 262156b to 1812163 Compare March 30, 2026 18:44
@aristidesstaffieri aristidesstaffieri force-pushed the feature/protocol-migrate-current-state branch 2 times, most recently from d2f13ab to ca19b56 Compare March 30, 2026 19:17
@aristidesstaffieri aristidesstaffieri marked this pull request as ready for review March 30, 2026 19:17
@aristidesstaffieri aristidesstaffieri force-pushed the feature/protocol-migrate-history branch from 13f61d7 to a22f382 Compare April 2, 2026 21:59
…s tracking

  - Add known_wasms table (migration, model, mock, and data layer tests) for tracking WASM hashes during checkpoint population
  - Add KnownWasm field to Models struct
  - Create WasmIngestionService (wasm_ingestion.go) that runs protocol validators against WASM bytecode and batch-persists hashes to known_wasms
  - Create CheckpointService (checkpoint.go) that orchestrates single-pass checkpoint population, delegating ContractCode entries to both WasmIngestionService and
  TokenProcessor, and all other entries to TokenProcessor
  - Extract readerFactory on checkpointService for injectable checkpoint reader creation
  - Extract TokenProcessor interface and NewTokenProcessor from TokenIngestionService, moving checkpoint iteration logic out of token_ingestion.go into checkpoint.go
  - Remove db, archive, and PopulateAccountTokens from TokenIngestionService interface and struct
  - Remove dbPool parameter from NewTokenIngestionServiceForLoadtest
  - Wire CheckpointService into IngestServiceConfig and ingestService
  - Update ingest_live.go to call checkpointService.PopulateFromCheckpoint instead of tokenIngestionService.PopulateAccountTokens
  - Update ingest.go setupDeps to construct WasmIngestionService and CheckpointService
  - Add ContractValidatorMock, ProtocolValidatorMock, ChangeReaderMock, CheckpointServiceMock, WasmIngestionServiceMock, TokenProcessorMock, and TokenIngestionServiceMock
  updates to mocks.go
  - Add unit tests for WasmIngestionService (10 cases covering ProcessContractCode and PersistKnownWasms)
  - Add unit tests for CheckpointService (16 cases covering entry routing, error propagation, and context cancellation)

  Introduces the infrastructure for protocol processors to produce and
  persist protocol-specific state during live ledger ingestion, gated by
  per-protocol compare-and-swap cursors that coordinate with concurrent
  migration processes.

  Key changes:
  - ProtocolProcessor interface and ProtocolProcessorInput for protocol-
    specific ledger analysis and state persistence
  - Processor registry (RegisterProcessor/GetAllProcessors) for protocol
    processor discovery at startup
  - Dual CAS gating in PersistLedgerData (step 5.5): per-protocol history
    and current_state cursors ensure exactly-once writes even when live
    ingestion and migration run concurrently
  - Protocol contract cache with periodic refresh to avoid per-ledger DB
    queries for classified contracts
  - Data layer additions: IngestStoreModel.GetTx, CompareAndSwap,
    ProtocolContractsModel.GetByProtocolID, ProtocolsModel.GetClassified

  Tests:
  - Unit tests for processor registry (concurrent safety, overwrite, etc.)
  - 5 subtests for PersistLedgerData CAS gating (win, lose, behind, no
    cursor, no processors) using a real test DB and sentinel-writing
    testProtocolProcessor
  - Docker integration test (ProtocolStateProductionTestSuite) exercising
    CAS gating against a live ingest container's DB in three phases
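
  As a rough illustration of the per-cursor gate on the live ingestion side, the sketch below classifies one ledger against one cursor. It is an in-memory model, not the PR's PersistLedgerData: the `ingestStore` type and `gateLedger` function are hypothetical, and the mapping of the subtest names (win, lose, behind, no cursor) to cursor positions is one plausible reading of the commit message.

```go
package main

type gateResult string

const (
	gateWin      gateResult = "win"       // CAS succeeded: this writer persists the ledger
	gateLose     gateResult = "lose"      // cursor already at or past the ledger
	gateBehind   gateResult = "behind"    // cursor below ledger-1: migration has not caught up
	gateNoCursor gateResult = "no_cursor" // cursor missing: protocol not ready for production
)

// ingestStore is an in-memory stand-in for the ingest_store table.
type ingestStore map[string]uint32

// compareAndSwap sets key to next only if it currently equals expected.
func (s ingestStore) compareAndSwap(key string, expected, next uint32) bool {
	cur, ok := s[key]
	if !ok || cur != expected {
		return false
	}
	s[key] = next
	return true
}

// gateLedger decides whether the caller may write protocol state for ledger.
func gateLedger(s ingestStore, cursorKey string, ledger uint32) gateResult {
	cur, ok := s[cursorKey]
	switch {
	case !ok:
		return gateNoCursor
	case cur >= ledger:
		return gateLose
	case cur < ledger-1:
		return gateBehind
	}
	// cur == ledger-1. Against a real database this swap is the atomic step
	// and can still fail if a concurrent writer advances the cursor first.
	if s.compareAndSwap(cursorKey, ledger-1, ledger) {
		return gateWin
	}
	return gateLose
}
```

  Running the gate once for the history cursor and once for the current_state cursor would give the dual gating described above, with each kind of write happening only on a win.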
   Combine protocol setup and protocol state tests into a shared
   DataMigrationTestSuite. Use real SEP41 setup classification plus
   manual cursor seeding to verify live ingestion produces protocol
   history/current state only when the protocol cursors are ready,
   and stays inert when they are absent.

  When any GetByProtocolID call fails during cache refresh, lastRefreshLedger
  was never updated, causing the staleness check to trigger on every ledger
  instead of every 100th, a 100x query amplification. Make the ledger update
  unconditional since the cache already preserves previous entries on partial
  failure, so data integrity is not at risk. Add warn-level logging to
  distinguish partial from full refreshes.
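
  The shape of that fix can be sketched as below. This is an in-memory approximation rather than the code in ingest_live.go: `contractCache`, `maybeRefresh`, and the `fetch` callback are hypothetical names, and the 100-ledger interval is taken from the commit message.

```go
package main

import "errors"

// refreshInterval is the number of ledgers between refreshes.
const refreshInterval = 100

// errFetch is a sample error a fetch callback can return.
var errFetch = errors.New("fetch failed")

// contractCache maps a protocol ID to its classified contract IDs.
type contractCache struct {
	loaded            bool
	lastRefreshLedger uint32
	contracts         map[string][]string
}

func newContractCache() *contractCache {
	return &contractCache{contracts: map[string][]string{}}
}

// maybeRefresh reloads every protocol's contracts when the cache is stale.
// It reports whether a refresh ran and how many protocols failed to load.
func (c *contractCache) maybeRefresh(ledger uint32, protocolIDs []string,
	fetch func(protocolID string) ([]string, error)) (refreshed bool, failures int) {
	if c.loaded && ledger < c.lastRefreshLedger+refreshInterval {
		return false, 0
	}
	for _, id := range protocolIDs {
		got, err := fetch(id)
		if err != nil {
			failures++ // keep the previous entries for this protocol
			continue
		}
		c.contracts[id] = got
	}
	// Updated even on partial failure; otherwise every later ledger would
	// look stale and trigger another full round of queries.
	c.loaded = true
	c.lastRefreshLedger = ledger
	return true, failures
}
```

  A caller could log at warn level whenever `failures > 0`, which separates partial refreshes from full ones.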
…on engine

  Implement the current-state migration CLI command and underlying service,
  then extract a shared protocolMigrateEngine parameterized by a
  migrationStrategy struct to eliminate ~1100 lines of duplication between
  the history and current-state implementations.

  The current-state migration builds protocol state from a user-specified
  --start-ledger forward to the tip, converging with live ingestion via
  CAS-gated cursors on protocol_{ID}_current_state_cursor. It mirrors the
  history migration with 5 substitutions: status field, status update
  method, cursor name function, persist method, and start ledger source.

  Changes:
  - Add UpdateCurrentStateMigrationStatus to ProtocolsModel and interface
  - Create protocolMigrateEngine with migrationStrategy in protocol_migrate.go
  - Rewrite history and current-state services as thin wrappers (~100 lines each)
  - Consolidate tests into shared engine suite (18 cases) + thin strategy tests
  - Extract buildMigrationCommand/runMigration helpers in CLI
  - Add integration test: TestCurrentStateMigrationThenLiveIngestionHandoff
…sy-loops

  Return a clear error upfront when maxRetries <= 0 or maxBackoff <= 0
  instead of silently producing a confusing "failed after 0 attempts: <nil>"
  or spinning with zero backoff. Add tests for both validation cases.
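
  A minimal sketch of upfront validation in a retry helper is shown below. The real RetryWithBackoff signature is not visible in this conversation, so the parameter list, the doubling schedule, the 10ms initial delay, and the error wording are assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient is a sample error for callers that want to simulate a retryable failure.
var errTransient = errors.New("transient failure")

// retryWithBackoff calls fn up to maxRetries times, sleeping between attempts
// with a doubling delay capped at maxBackoff.
func retryWithBackoff(maxRetries int, maxBackoff time.Duration, fn func() error) error {
	// Reject bad configuration before the loop, so the caller gets a clear
	// message instead of "failed after 0 attempts: <nil>" or a zero-delay spin.
	if maxRetries <= 0 {
		return fmt.Errorf("maxRetries must be positive, got %d", maxRetries)
	}
	if maxBackoff <= 0 {
		return fmt.Errorf("maxBackoff must be positive, got %s", maxBackoff)
	}
	backoff := 10 * time.Millisecond // assumed initial delay
	var lastErr error
	for attempt := 1; attempt <= maxRetries; attempt++ {
		if lastErr = fn(); lastErr == nil {
			return nil
		}
		if attempt < maxRetries {
			if backoff > maxBackoff {
				backoff = maxBackoff
			}
			time.Sleep(backoff)
			backoff *= 2
		}
	}
	return fmt.Errorf("failed after %d attempts: %w", maxRetries, lastErr)
}
```

  A production version would likely also accept a context.Context so that a long backoff can be cancelled; that is left out here to keep the sketch short.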
@aristidesstaffieri aristidesstaffieri force-pushed the feature/protocol-migrate-current-state branch 2 times, most recently from a0acdff to 3b89302 Compare April 6, 2026 15:56
  Add IngestStoreModel.GetMany to fetch multiple ingest_store keys in a
  single WHERE key = ANY($1) query. Refactor
  protocolProcessorsEligibleForProduction to collect all cursor names
  upfront and issue one GetMany call instead of 2N individual Get calls
  (one history + one current-state cursor per protocol per ledger).
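
  The batching idea can be illustrated as follows. This is an in-memory sketch, not the PR's data layer: `ingestStore`, `cursorPair`, and `loadCursors` are hypothetical, and the history cursor name is assumed by analogy with `protocol_{ID}_current_state_cursor`. The `queries` counter stands in for database round trips.

```go
package main

import "fmt"

func historyCursorName(id string) string {
	return fmt.Sprintf("protocol_%s_history_cursor", id) // assumed name
}

func currentStateCursorName(id string) string {
	return fmt.Sprintf("protocol_%s_current_state_cursor", id)
}

// ingestStore is an in-memory stand-in for the ingest_store table.
type ingestStore struct {
	rows    map[string]uint32
	queries int // counts simulated round trips
}

// GetMany models a single "WHERE key = ANY($1)" lookup; missing keys are omitted.
func (s *ingestStore) GetMany(keys []string) map[string]uint32 {
	s.queries++
	out := make(map[string]uint32, len(keys))
	for _, k := range keys {
		if v, ok := s.rows[k]; ok {
			out[k] = v
		}
	}
	return out
}

// cursorPair holds both cursors for one protocol, with presence flags.
type cursorPair struct {
	history, currentState       uint32
	hasHistory, hasCurrentState bool
}

// loadCursors collects all 2N cursor names upfront and issues one GetMany call.
func loadCursors(s *ingestStore, protocolIDs []string) map[string]cursorPair {
	keys := make([]string, 0, 2*len(protocolIDs))
	for _, id := range protocolIDs {
		keys = append(keys, historyCursorName(id), currentStateCursorName(id))
	}
	values := s.GetMany(keys)
	pairs := make(map[string]cursorPair, len(protocolIDs))
	for _, id := range protocolIDs {
		var p cursorPair
		p.history, p.hasHistory = values[historyCursorName(id)]
		p.currentState, p.hasCurrentState = values[currentStateCursorName(id)]
		pairs[id] = p
	}
	return pairs
}
```

  With N protocols this costs one round trip per ledger instead of 2N, at the price of handling missing keys in the caller.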
@aristidesstaffieri aristidesstaffieri force-pushed the feature/protocol-migrate-current-state branch from 3b89302 to 702bef3 Compare April 6, 2026 16:18