Implement in-place seed for assembly pipeline #323

Open
folbricht wants to merge 10 commits into master from inplace-seed

Summary

Refactors the assembly pipeline into a plan-based architecture and adds in-place seed support, enabling efficient file reconstruction when the target file already contains chunk data at different offsets.

  • Introduces AssemblePlan that pre-computes all chunk placements into a DAG of steps with explicit dependencies, replacing the interleaved sequencer approach
  • Adds InPlaceSeed, which rearranges chunks already present in the target file, using Tarjan's SCC algorithm for cycle detection and buffer-break resolution
  • Splits assembly sources into composable types: skipInPlace, inPlaceCopy, fileSeedSource, selfSeedSegment, copyFromStore
  • Self-seed matching now uses longestMatchFrom to find longer contiguous sequences
  • Plan validation detects stale file seeds before execution starts
  • Runs plan validation and the initial in-place scan concurrently
  • Integration test exercising all source types together (in-place skip, in-place copy with cycles, file seed, self-seed, store fetch) with variable-size chunks
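For reviewers unfamiliar with the cycle problem in in-place rearrangement, here is a minimal sketch of the idea, not the code in this PR. It assumes a hypothetical `move` type (copy `size` bytes from `src` to `dst` within the same file) and hypothetical helpers `buildDeps`, `tarjanSCC`, and `cyclicGroups`. A move depends on every other move whose source range it would overwrite. Components with more than one member are cycles, which presumably is where a chunk has to be buffered in memory to break the loop.

```go
package main

// move is a hypothetical in-place copy: the bytes currently at src
// must end up at dst in the same file. Not a type from this PR.
type move struct {
	src, dst, size uint64
}

// overlaps reports whether [a, a+an) and [b, b+bn) intersect.
func overlaps(a, an, b, bn uint64) bool {
	return a < b+bn && b < a+an
}

// buildDeps returns deps[i], the moves that must read their source
// before move i writes, because move i overwrites that source.
func buildDeps(moves []move) [][]int {
	deps := make([][]int, len(moves))
	for i, w := range moves {
		for j, r := range moves {
			if i != j && overlaps(w.dst, w.size, r.src, r.size) {
				deps[i] = append(deps[i], j)
			}
		}
	}
	return deps
}

// tarjanSCC returns the strongly connected components of adj.
// Components come out dependencies-first (reverse topological order).
func tarjanSCC(adj [][]int) [][]int {
	n := len(adj)
	index := make([]int, n)
	low := make([]int, n)
	onStack := make([]bool, n)
	for i := range index {
		index[i] = -1 // -1 marks an unvisited node
	}
	var stack []int
	var sccs [][]int
	next := 0

	var visit func(v int)
	visit = func(v int) {
		index[v], low[v] = next, next
		next++
		stack = append(stack, v)
		onStack[v] = true
		for _, w := range adj[v] {
			if index[w] == -1 {
				visit(w)
				if low[w] < low[v] {
					low[v] = low[w]
				}
			} else if onStack[w] && index[w] < low[v] {
				low[v] = index[w]
			}
		}
		if low[v] == index[v] { // v is the root of a component
			var comp []int
			for {
				w := stack[len(stack)-1]
				stack = stack[:len(stack)-1]
				onStack[w] = false
				comp = append(comp, w)
				if w == v {
					break
				}
			}
			sccs = append(sccs, comp)
		}
	}
	for v := 0; v < n; v++ {
		if index[v] == -1 {
			visit(v)
		}
	}
	return sccs
}

// cyclicGroups keeps only components with more than one move,
// i.e. the groups that cannot be ordered without buffering.
func cyclicGroups(sccs [][]int) [][]int {
	var out [][]int
	for _, c := range sccs {
		if len(c) > 1 {
			out = append(out, c)
		}
	}
	return out
}
```

Because the components are emitted dependencies-first, the same pass that finds cycles also yields a safe execution order for the acyclic moves. Two chunks that swap places form a two-member component; a simple chain of moves does not.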

Closes #312

Introduce AssemblePlan that separates planning from execution in file
assembly. The plan pre-computes all chunk placements (self-seed, file
seeds, store fetches, skip-in-place) into a DAG of steps with explicit
dependencies, replacing the interleaved sequencer approach.
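To illustrate what "a DAG of steps with explicit dependencies" allows, the following sketch orders steps with Kahn's algorithm. The `step` type and `executionOrder` function are hypothetical and only stand in for the plan types in this change. The sketch also assumes every dependency index is valid.

```go
package main

import "errors"

// step is a hypothetical plan step; deps holds the indices of the
// steps that must complete before this one may run.
type step struct {
	name string
	deps []int
}

// executionOrder returns one valid order in which the steps can run,
// or an error if the dependencies contain a cycle.
func executionOrder(steps []step) ([]int, error) {
	pending := make([]int, len(steps))      // unmet dependencies per step
	dependents := make([][]int, len(steps)) // reverse edges
	for i, s := range steps {
		pending[i] = len(s.deps)
		for _, d := range s.deps {
			dependents[d] = append(dependents[d], i)
		}
	}

	// Steps without dependencies are ready immediately.
	var ready []int
	for i, p := range pending {
		if p == 0 {
			ready = append(ready, i)
		}
	}

	order := make([]int, 0, len(steps))
	for len(ready) > 0 {
		i := ready[0]
		ready = ready[1:]
		order = append(order, i)
		// Completing step i may unblock the steps that depend on it.
		for _, d := range dependents[i] {
			pending[d]--
			if pending[d] == 0 {
				ready = append(ready, d)
			}
		}
	}
	if len(order) != len(steps) {
		return nil, errors.New("plan contains a dependency cycle")
	}
	return order, nil
}
```

With the dependencies known up front, independent steps (for example two store fetches) can run in parallel, while a self-seed copy that reads a region written by an earlier step waits for it. A real executor would dispatch ready steps to workers instead of building a flat order.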

This lays the groundwork for #312 (destination-as-seed) by making
assembly sources composable and the planning phase extensible.

Key changes:
- New AssemblePlan with functional options and step-based execution
- Split assembly sources into separate files (fileseed, selfseed, store,
  skip)
- Self-seed matching now uses longestMatchFrom for longer sequences
- Plan validation detects stale file seeds before execution
- Comprehensive tests for plan generation and in-place detection
- Remove sequencer.go, selfseed.go in favor of new plan types
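For readers new to the functional options pattern mentioned in the first item, a generic sketch follows. Every name in it (`planConfig`, `PlanOption`, `WithConcurrency`, `WithInPlace`, `WithFileSeed`, `newPlanConfig`) is made up for illustration and is not the API added here.

```go
package main

// planConfig is a hypothetical settings struct built from options.
type planConfig struct {
	concurrency int
	inPlace     bool
	seeds       []string
}

// PlanOption mutates the configuration while it is being built.
type PlanOption func(*planConfig)

// WithConcurrency sets the number of parallel workers (hypothetical).
func WithConcurrency(n int) PlanOption {
	return func(c *planConfig) { c.concurrency = n }
}

// WithInPlace enables use of the target file as a seed (hypothetical).
func WithInPlace() PlanOption {
	return func(c *planConfig) { c.inPlace = true }
}

// WithFileSeed adds a seed file path (hypothetical).
func WithFileSeed(path string) PlanOption {
	return func(c *planConfig) { c.seeds = append(c.seeds, path) }
}

// newPlanConfig starts from defaults and applies options in order.
func newPlanConfig(opts ...PlanOption) planConfig {
	cfg := planConfig{concurrency: 1}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}
```

The pattern keeps the constructor signature stable as new source types are added: a caller that does not care about a feature simply omits its option and gets the default.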

Closes #312

Development

Successfully merging this pull request may close these issues.

desync: use destination chunks as an additional seed source
