From 486efd567ceb7cab59b45989026724d2584d8853 Mon Sep 17 00:00:00 2001 From: Claude Date: Wed, 17 Dec 2025 18:20:52 +0000 Subject: [PATCH 1/3] docs: add WAL vs changeset replication tradeoffs analysis Evaluates Litestream and WAL-based replication as alternatives to SyndDB's changeset-based approach. Documents why changesets are preferred for validator verification while acknowledging WAL's strengths for disaster recovery. --- docs/WAL_VS_CHANGESETS.md | 204 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 204 insertions(+) create mode 100644 docs/WAL_VS_CHANGESETS.md diff --git a/docs/WAL_VS_CHANGESETS.md b/docs/WAL_VS_CHANGESETS.md new file mode 100644 index 00000000..6f00b103 --- /dev/null +++ b/docs/WAL_VS_CHANGESETS.md @@ -0,0 +1,204 @@ +# WAL vs Changeset Replication: Tradeoffs for SyndDB + +This document evaluates two approaches to SQLite replication and explains why SyndDB uses changeset-based replication via the SQLite Session Extension. + +## Background + +SyndDB needs to capture database changes and publish them for validator verification. Two primary approaches exist: + +1. **WAL-based replication** - Capture Write-Ahead Log frames (physical page changes) +2. **Changeset-based replication** - Capture logical changes via Session Extension + +Tools like [Litestream](https://litestream.io/) use WAL-based replication for disaster recovery. This document evaluates whether that approach would work for SyndDB. 
+ +## Comparison Summary + +| Aspect | WAL-Based | Changeset-Based | +|--------|-----------|-----------------| +| **Capture level** | Physical (pages) | Logical (rows) | +| **Integration** | External daemon | In-process library | +| **What's captured** | All page writes | INSERT/UPDATE/DELETE | +| **Auditability** | Opaque bytes | Inspectable operations | +| **Payload size** | Larger (full pages) | Smaller (changed values) | +| **Determinism** | Architecture-dependent | Architecture-independent | +| **Complexity** | Monitor files | Session lifecycle | + +## WAL-Based Replication + +### How It Works + +SQLite's Write-Ahead Log (WAL) mode writes changes to a `-wal` file before checkpointing them to the main database. WAL replication: + +1. Monitors the WAL file for new frames +2. Copies frames to replica storage +3. Periodically captures full database snapshots +4. Replays WAL frames on top of snapshots for recovery + +### Advantages + +| Benefit | Description | +|---------|-------------| +| **Zero application integration** | Just monitor file changes, no code modifications | +| **Complete capture** | Gets everything: data, schema, pragmas, vacuum | +| **No runtime overhead** | SQLite writes WAL anyway; just copy the frames | +| **Stable format** | WAL format is well-documented and rarely changes | +| **Point-in-time recovery** | Can restore to any WAL frame | + +### Disadvantages + +| Drawback | Description | +|----------|-------------| +| **Physical, not logical** | Contains page bytes, not "INSERT INTO users" | +| **Architecture-dependent** | Page layout may differ across platforms | +| **Larger payloads** | Full 4KB pages vs just changed column values | +| **Opaque to validators** | Can't easily inspect "what operation occurred" | +| **Checkpoint coordination** | Must prevent SQLite from checkpointing before capture | +| **Schema changes implicit** | Must parse page content to detect DDL | + +### Litestream Specifics + +[Litestream](https://litestream.io/) is a 
well-maintained WAL replication tool that: + +- Runs as an external daemon or sidecar +- Supports S3, GCS, Azure, SFTP, and other backends +- Uses "generations" (WAL eras) with frame indices +- Provides point-in-time recovery within retention window + +**What Litestream lacks for SyndDB:** + +- Global sequence numbers (has generation + frame index, not monotonic sequence) +- Cryptographic signing (no COSE_Sign1 or similar) +- TEE attestation integration +- Logical change visibility for validators + +## Changeset-Based Replication + +### How It Works + +The [SQLite Session Extension](https://www.sqlite.org/sessionintro.html) hooks into SQLite's internal change tracking: + +1. Attach a session to a database connection +2. Session records logical changes (INSERT/UPDATE/DELETE with values) +3. Generate changeset blob on commit +4. Apply changesets to replicas for deterministic reconstruction + +### Advantages + +| Benefit | Description | +|---------|-------------| +| **Logical operations** | "UPDATE users SET balance=100 WHERE id=5" | +| **Auditable** | Validators can inspect exactly what changed | +| **Compact** | Only changed columns, not full pages | +| **Deterministic** | Same changesets produce same results everywhere | +| **Schema-aware** | Session extension knows about DDL changes | +| **Row-level granularity** | Can filter, inspect, or reject individual changes | + +### Disadvantages + +| Drawback | Description | +|----------|-------------| +| **Requires integration** | Must attach session to connection in application | +| **Thread-local state** | Sessions bound to creating thread | +| **Some operations missed** | PRAGMA, ATTACH, VACUUM not captured | +| **Memory overhead** | Tracks pending changes until changeset generated | +| **Lifecycle management** | Must handle session enable/disable around transactions | + +## Why SyndDB Uses Changesets + +SyndDB's architecture requires validators to verify operations, not just reconstruct state. 
Key requirements: + +### 1. SQL Operations as Audit Trail + +From the [SPEC](../SPEC.md): + +> "SQL operations themselves become the verifiable audit trail" + +Validators need to see logical operations to verify business rules (e.g., "withdrawals don't exceed balance"). WAL pages are opaque—you'd need to parse SQLite's internal B-tree format to extract logical changes. + +### 2. Cross-Architecture Determinism + +Validators may run on different hardware than the application. Changesets are architecture-independent (logical values), while WAL pages may have different layouts due to: + +- Endianness differences +- Alignment/padding variations +- Page size configurations + +### 3. Compact Wire Format + +SyndDB's CBOR wire format achieves ~40% size reduction. Changesets contain only changed values, while WAL frames contain full 4KB pages even for single-column updates. + +### 4. Schema Change Detection + +SyndDB triggers immediate snapshots on schema changes to ensure validators can reconstruct the database. The Session Extension detects DDL operations directly; WAL replication would require parsing page content to detect schema changes. + +### 5. Sequencing and Signing + +Every message (changeset or snapshot) gets: + +- A monotonic sequence number from the sequencer +- A COSE_Sign1 signature +- Optional TEE attestation + +WAL frames don't have this metadata. Adapting Litestream would require significant additions to provide sequencing and signing. + +## Extracting Logical Changes from WAL + +Could you parse WAL to get changeset-like output? Yes, but it's complex: + +``` +WAL Frame → Page Content → B-tree Parsing → Row Extraction → Logical Diff +``` + +This requires understanding: + +1. **WAL frame headers** - Documented, manageable +2. **SQLite page format** - B-tree interior/leaf pages, overflow pages +3. **B-tree structure** - Cell format, pointer maps, free lists +4. **Schema mapping** - Column types from sqlite_schema +5. 
**Change detection** - Diff before/after page content + +The Session Extension already does this correctly by hooking into SQLite's internal change tracking. Reimplementing it from the physical layer is substantial work with potential for subtle bugs. + +## When to Use Each Approach + +### Use WAL-Based (Litestream) When: + +- Disaster recovery / backup is the primary goal +- You need zero application code changes +- Validators just reconstruct state (don't verify operations) +- Point-in-time recovery to arbitrary moments is needed +- You're replicating to read-only replicas + +### Use Changeset-Based (Session Extension) When: + +- Validators need to audit logical operations +- Cross-architecture determinism is required +- Compact payloads matter +- You need schema change awareness +- Operations need sequencing and signing + +## Hybrid Approach + +These approaches can coexist for different purposes: + +``` +Application (TEE) +├── SQLite Database +├── SyndDB Client (changesets) → Sequencer → DA Layer → Validators +└── Litestream (WAL) → S3 → Disaster Recovery (not for validators) +``` + +Use changesets for the validator verification path and WAL for application-level backup. However, this adds operational complexity. + +## Conclusion + +SyndDB's requirements—auditable operations, deterministic replay, compact payloads, sequencing, and signing—align better with changeset-based replication. WAL-based tools like Litestream are excellent for backup and disaster recovery but don't provide the logical visibility that validators need. + +The Session Extension's complexity (thread-local state, lifecycle management) is the cost of getting logical change capture. This complexity is bounded and well-understood, whereas parsing WAL to extract logical changes would be reimplementing Session Extension functionality from scratch. 
+ +## References + +- [SQLite Session Extension](https://www.sqlite.org/sessionintro.html) +- [SQLite WAL Mode](https://www.sqlite.org/wal.html) +- [Litestream Documentation](https://litestream.io/how-it-works/) +- [SyndDB SPEC](../SPEC.md) From f39e05871d1ba996a4c8706f60eef5495ce88318 Mon Sep 17 00:00:00 2001 From: Claude Date: Wed, 17 Dec 2025 18:26:23 +0000 Subject: [PATCH 2/3] docs: expand WAL vs changeset doc into design document Restructure as a discussion-focused design doc with: - Clear options (A/B/C/D) for decision making - Key decision factors with specific questions - Effort estimates and risk levels - 10 open questions for team discussion - Suggested experiments to validate assumptions - Tentative recommendation with rationale --- docs/WAL_VS_CHANGESETS.md | 360 ++++++++++++++++++++++++-------------- 1 file changed, 231 insertions(+), 129 deletions(-) diff --git a/docs/WAL_VS_CHANGESETS.md b/docs/WAL_VS_CHANGESETS.md index 6f00b103..e71c42db 100644 --- a/docs/WAL_VS_CHANGESETS.md +++ b/docs/WAL_VS_CHANGESETS.md @@ -1,204 +1,306 @@ -# WAL vs Changeset Replication: Tradeoffs for SyndDB +# Design Doc: WAL vs Changeset Replication -This document evaluates two approaches to SQLite replication and explains why SyndDB uses changeset-based replication via the SQLite Session Extension. +**Status:** Draft - Open for Discussion +**Author:** (auto-generated) +**Last Updated:** 2025-12 -## Background +## Context -SyndDB needs to capture database changes and publish them for validator verification. Two primary approaches exist: +SyndDB currently uses changeset-based replication via the SQLite Session Extension. This document evaluates whether WAL-based replication (e.g., Litestream) would be a better approach, and explores hybrid options. -1. **WAL-based replication** - Capture Write-Ahead Log frames (physical page changes) -2. 
**Changeset-based replication** - Capture logical changes via Session Extension +### What Prompted This Discussion -Tools like [Litestream](https://litestream.io/) use WAL-based replication for disaster recovery. This document evaluates whether that approach would work for SyndDB. +- Litestream is a mature, well-maintained tool for SQLite replication +- WAL-based replication requires zero application integration +- Session Extension has lifecycle complexity (thread-local state, enable/disable timing) +- Question: Are we overcomplicating capture when simpler tools exist? -## Comparison Summary +## Current Architecture + +``` +Application (TEE) +└── SQLite + Session Extension + └── Changesets → HTTP → Sequencer → COSE Sign → DA Layer → Validators +``` + +**Key characteristics:** +- Logical changes (INSERT/UPDATE/DELETE with values) +- In-process, same TEE as application +- Thread-local session state +- Automatic schema change detection triggers snapshots +- ~250 lines in `snapshot_sender.rs` + session integration + +## Options Under Consideration + +### Option A: Keep Current (Changeset-Based) + +Continue using SQLite Session Extension for change capture. + +### Option B: Replace with Litestream (WAL-Based) + +Use Litestream or similar WAL replication, adding sequencing/signing layer. + +### Option C: Hybrid Approach + +Use changesets for validator path, add Litestream for disaster recovery. + +### Option D: Custom WAL Parser + +Build our own WAL-to-logical-changes converter for more control. 
+ +--- + +## Technical Comparison | Aspect | WAL-Based | Changeset-Based | |--------|-----------|-----------------| -| **Capture level** | Physical (pages) | Logical (rows) | +| **Capture level** | Physical (4KB pages) | Logical (row changes) | | **Integration** | External daemon | In-process library | -| **What's captured** | All page writes | INSERT/UPDATE/DELETE | +| **What's captured** | Everything (data, pragmas, vacuum) | INSERT/UPDATE/DELETE only | | **Auditability** | Opaque bytes | Inspectable operations | | **Payload size** | Larger (full pages) | Smaller (changed values) | | **Determinism** | Architecture-dependent | Architecture-independent | -| **Complexity** | Monitor files | Session lifecycle | +| **Complexity location** | Checkpoint coordination | Session lifecycle | +| **Maturity** | Litestream is battle-tested | Session Extension is SQLite core | + +### WAL-Based: Detailed Analysis + +**How it works:** +1. SQLite writes changes to `-wal` file +2. External daemon monitors WAL for new frames +3. Copies frames to replica storage before checkpoint +4. Recovery = snapshot + WAL replay + +**What Litestream provides:** +- S3, GCS, Azure, SFTP backend support +- Generation-based WAL sequencing +- Point-in-time recovery +- Automatic snapshot scheduling + +**What Litestream lacks:** +- Global monotonic sequence numbers +- Cryptographic signing (COSE_Sign1) +- TEE attestation integration +- Logical change visibility -## WAL-Based Replication +**Effort to adapt Litestream:** +- Fork and modify, or wrap with signing layer +- Add sequence number assignment (per-frame? per-commit?) +- Integrate TEE attestation +- Either: validators understand WAL, or parse WAL → logical changes -### How It Works +### Changeset-Based: Detailed Analysis -SQLite's Write-Ahead Log (WAL) mode writes changes to a `-wal` file before checkpointing them to the main database. WAL replication: +**How it works:** +1. Attach session to SQLite connection +2. 
Session hooks into SQLite's change tracking +3. Generate changeset blob containing logical changes +4. Changeset is deterministic and architecture-independent -1. Monitors the WAL file for new frames -2. Copies frames to replica storage -3. Periodically captures full database snapshots -4. Replays WAL frames on top of snapshots for recovery +**Current pain points:** +- Thread-local state requires careful lifecycle management +- Must enable/disable session around certain operations +- Some operations not captured (PRAGMA, ATTACH, VACUUM) +- Memory overhead from tracking pending changes -### Advantages +**What works well:** +- Validators see exactly what changed +- Compact payloads (only changed columns) +- Schema change detection is automatic +- Deterministic replay across architectures -| Benefit | Description | -|---------|-------------| -| **Zero application integration** | Just monitor file changes, no code modifications | -| **Complete capture** | Gets everything: data, schema, pragmas, vacuum | -| **No runtime overhead** | SQLite writes WAL anyway; just copy the frames | -| **Stable format** | WAL format is well-documented and rarely changes | -| **Point-in-time recovery** | Can restore to any WAL frame | +--- -### Disadvantages +## Key Decision Factors -| Drawback | Description | -|----------|-------------| -| **Physical, not logical** | Contains page bytes, not "INSERT INTO users" | -| **Architecture-dependent** | Page layout may differ across platforms | -| **Larger payloads** | Full 4KB pages vs just changed column values | -| **Opaque to validators** | Can't easily inspect "what operation occurred" | -| **Checkpoint coordination** | Must prevent SQLite from checkpointing before capture | -| **Schema changes implicit** | Must parse page content to detect DDL | +### 1. Validator Verification Model -### Litestream Specifics +**Question:** Do validators need to see logical operations, or just reconstruct state? 
-[Litestream](https://litestream.io/) is a well-maintained WAL replication tool that: +| If validators need to... | Then use... | +|--------------------------|-------------| +| Verify "balance >= withdrawal" | Changesets (logical) | +| Check "no self-trading" | Changesets (logical) | +| Just replay to same state | Either works | +| Audit specific row changes | Changesets (logical) | -- Runs as an external daemon or sidecar -- Supports S3, GCS, Azure, SFTP, and other backends -- Uses "generations" (WAL eras) with frame indices -- Provides point-in-time recovery within retention window +**Current SPEC says:** "SQL operations themselves become the verifiable audit trail" -**What Litestream lacks for SyndDB:** +This implies validators inspect operations, not just replay them. **Does this requirement still hold?** -- Global sequence numbers (has generation + frame index, not monotonic sequence) -- Cryptographic signing (no COSE_Sign1 or similar) -- TEE attestation integration -- Logical change visibility for validators +### 2. Cross-Architecture Determinism -## Changeset-Based Replication +WAL pages may differ across: +- Endianness (big vs little endian) +- Alignment/padding +- Page size configuration +- SQLite compile options -### How It Works +Changesets are architecture-independent by design. -The [SQLite Session Extension](https://www.sqlite.org/sessionintro.html) hooks into SQLite's internal change tracking: +**Question:** Will validators always run on identical architecture to the application? -1. Attach a session to a database connection -2. Session records logical changes (INSERT/UPDATE/DELETE with values) -3. Generate changeset blob on commit -4. Apply changesets to replicas for deterministic reconstruction +### 3. 
Payload Size -### Advantages +Rough comparison for a single-column UPDATE: +- WAL: 4KB page (minimum) +- Changeset: ~50-200 bytes (column value + metadata) -| Benefit | Description | -|---------|-------------| -| **Logical operations** | "UPDATE users SET balance=100 WHERE id=5" | -| **Auditable** | Validators can inspect exactly what changed | -| **Compact** | Only changed columns, not full pages | -| **Deterministic** | Same changesets produce same results everywhere | -| **Schema-aware** | Session extension knows about DDL changes | -| **Row-level granularity** | Can filter, inspect, or reject individual changes | +**Question:** Is bandwidth/storage cost a significant concern? -### Disadvantages +### 4. Operational Complexity -| Drawback | Description | -|----------|-------------| -| **Requires integration** | Must attach session to connection in application | -| **Thread-local state** | Sessions bound to creating thread | -| **Some operations missed** | PRAGMA, ATTACH, VACUUM not captured | -| **Memory overhead** | Tracks pending changes until changeset generated | -| **Lifecycle management** | Must handle session enable/disable around transactions | +| Approach | Application complexity | Infrastructure complexity | +|----------|------------------------|---------------------------| +| Changesets | Session lifecycle | None (in-process) | +| Litestream | None | Sidecar daemon, checkpoint coordination | +| Hybrid | Session lifecycle | Sidecar daemon | -## Why SyndDB Uses Changesets +**Question:** Where do we prefer complexity to live? -SyndDB's architecture requires validators to verify operations, not just reconstruct state. Key requirements: +### 5. What Operations Need Capturing? -### 1. 
SQL Operations as Audit Trail +| Operation | Changeset captures | WAL captures | +|-----------|-------------------|--------------| +| INSERT/UPDATE/DELETE | Yes | Yes | +| Schema changes (DDL) | Yes (triggers snapshot) | Yes (in pages) | +| PRAGMA changes | No | Yes | +| VACUUM | No | Yes | +| ATTACH/DETACH | No | Yes | -From the [SPEC](../SPEC.md): +**Question:** Do we need to capture PRAGMAs or VACUUM? -> "SQL operations themselves become the verifiable audit trail" +--- -Validators need to see logical operations to verify business rules (e.g., "withdrawals don't exceed balance"). WAL pages are opaque—you'd need to parse SQLite's internal B-tree format to extract logical changes. +## Hybrid Architecture (Option C) -### 2. Cross-Architecture Determinism +``` +Application (TEE) +├── SQLite Database +│ +├── SyndDB Client (changesets) +│ └── Sequencer → DA Layer → Validators +│ (validator verification path) +│ +└── Litestream (WAL) + └── S3/GCS → Disaster Recovery + (application backup, not for validators) +``` -Validators may run on different hardware than the application. Changesets are architecture-independent (logical values), while WAL pages may have different layouts due to: +**Use cases for WAL backup:** +- Application crashes before sending changesets +- Need to recover local state quickly +- Debug/forensics on raw database state +- Belt-and-suspenders redundancy -- Endianness differences -- Alignment/padding variations -- Page size configurations +**Downsides:** +- Two replication systems to operate +- WAL backup not useful for validator bootstrap +- Additional infrastructure (Litestream sidecar) -### 3. Compact Wire Format +--- -SyndDB's CBOR wire format achieves ~40% size reduction. Changesets contain only changed values, while WAL frames contain full 4KB pages even for single-column updates. +## Effort Estimates -### 4. 
Schema Change Detection +| Option | Estimated Effort | Risk Level | +|--------|------------------|------------| +| A: Keep current | None | Low | +| B: Replace with Litestream | 2-4 weeks + ongoing maintenance | High | +| C: Add Litestream for DR | 1 week | Low | +| D: Custom WAL parser | 4-8 weeks | Very High | -SyndDB triggers immediate snapshots on schema changes to ensure validators can reconstruct the database. The Session Extension detects DDL operations directly; WAL replication would require parsing page content to detect schema changes. +**Option B breakdown:** +- Fork Litestream or build wrapper: 1 week +- Add sequencing/signing: 1 week +- Modify validators to handle WAL or parse to logical: 1-2 weeks +- Testing and edge cases: 1 week -### 5. Sequencing and Signing +**Option D risks:** +- SQLite page format is documented but low-level, and is not designed as a change-capture interface +- B-tree parsing is complex (overflow pages, pointer maps) +- Essentially reimplementing Session Extension from scratch -Every message (changeset or snapshot) gets: +--- -- A monotonic sequence number from the sequencer -- A COSE_Sign1 signature -- Optional TEE attestation +## Open Questions for Discussion -WAL frames don't have this metadata. Adapting Litestream would require significant additions to provide sequencing and signing. +### Architecture Questions -## Extracting Logical Changes from WAL +1. **Do validators need logical auditability?** + If validators only replay state (not inspect operations), WAL becomes viable. -Could you parse WAL to get changeset-like output? Yes, but it's complex: +2. **Is cross-architecture determinism required?** + If app and validators always share architecture, WAL determinism concerns go away. -``` -WAL Frame → Page Content → B-tree Parsing → Row Extraction → Logical Diff -``` +3. **Should snapshots be WAL-based or remain as full DB copies?** + Current: full SQLite file. Could be: WAL generation + frames. -This requires understanding: +### Operational Questions -1. 
**WAL frame headers** - Documented, manageable -2. **SQLite page format** - B-tree interior/leaf pages, overflow pages -3. **B-tree structure** - Cell format, pointer maps, free lists -4. **Schema mapping** - Column types from sqlite_schema -5. **Change detection** - Diff before/after page content +4. **What are the actual pain points with Session Extension today?** + Thread-local state? Memory overhead? Something else? -The Session Extension already does this correctly by hooking into SQLite's internal change tracking. Reimplementing it from the physical layer is substantial work with potential for subtle bugs. +5. **Is application-level disaster recovery (Option C) valuable?** + Do we have a recovery gap if app crashes before changeset send? -## When to Use Each Approach +6. **Where should complexity live - application or infrastructure?** + Session Extension = app complexity. Litestream = infra complexity. -### Use WAL-Based (Litestream) When: +### Performance Questions -- Disaster recovery / backup is the primary goal -- You need zero application code changes -- Validators just reconstruct state (don't verify operations) -- Point-in-time recovery to arbitrary moments is needed -- You're replicating to read-only replicas +7. **Is changeset generation a performance bottleneck?** + Have we measured Session Extension overhead? -### Use Changeset-Based (Session Extension) When: +8. **Is payload size a concern for DA layer costs?** + WAL would increase payload size significantly. -- Validators need to audit logical operations -- Cross-architecture determinism is required -- Compact payloads matter -- You need schema change awareness -- Operations need sequencing and signing +### Future Questions -## Hybrid Approach +9. **Do we anticipate needing PRAGMA/VACUUM capture?** + Currently not captured by changesets. -These approaches can coexist for different purposes: +10. 
**Would WAL simplify the FFI story for other languages?** + External daemon vs in-process library integration. -``` -Application (TEE) -├── SQLite Database -├── SyndDB Client (changesets) → Sequencer → DA Layer → Validators -└── Litestream (WAL) → S3 → Disaster Recovery (not for validators) -``` +--- + +## Experiments We Could Run + +### Experiment 1: Measure Session Extension Overhead +- Benchmark with/without session attached +- Measure memory usage during large transactions +- Identify actual (not theoretical) pain points + +### Experiment 2: Prototype Litestream Integration +- Run Litestream alongside existing system +- Measure WAL payload sizes vs changeset sizes +- Test checkpoint coordination in TEE environment + +### Experiment 3: WAL-to-Changeset Feasibility +- Prototype parsing WAL frames +- Assess complexity of extracting logical changes +- Determine if this is 2 weeks or 2 months of work + +--- + +## Recommendation -Use changesets for the validator verification path and WAL for application-level backup. However, this adds operational complexity. +**Tentative recommendation: Option A (keep current) with possible Option C (add Litestream for DR)** -## Conclusion +Rationale: +- Validator auditability requirement favors changesets +- Session Extension complexity is bounded and understood +- WAL adaptation would be significant effort for unclear benefit +- Litestream for DR is low-effort and provides safety net -SyndDB's requirements—auditable operations, deterministic replay, compact payloads, sequencing, and signing—align better with changeset-based replication. WAL-based tools like Litestream are excellent for backup and disaster recovery but don't provide the logical visibility that validators need. +**However, this should be validated against the open questions above.** -The Session Extension's complexity (thread-local state, lifecycle management) is the cost of getting logical change capture. 
This complexity is bounded and well-understood, whereas parsing WAL to extract logical changes would be reimplementing Session Extension functionality from scratch. +--- ## References - [SQLite Session Extension](https://www.sqlite.org/sessionintro.html) - [SQLite WAL Mode](https://www.sqlite.org/wal.html) -- [Litestream Documentation](https://litestream.io/how-it-works/) +- [Litestream How It Works](https://litestream.io/how-it-works/) - [SyndDB SPEC](../SPEC.md) +- Current implementation: `crates/synddb-client/src/session.rs`, `snapshot_sender.rs` From 464d1de172499525c2369e907db91b9ef4c13707 Mon Sep 17 00:00:00 2001 From: Will Papper Date: Fri, 26 Dec 2025 11:17:12 -0500 Subject: [PATCH 3/3] docs: add changeset inversion section to WAL vs changeset comparison MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Add section 6 covering changeset inversion as a key differentiator: - Document sqlite3changeset_invert() API and how it transforms operations - Explain why WAL cannot support inversion (forward-only, checkpointing) - List SyndDB use cases: validator rollback, dispute resolution, optimistic execution - Add inversion row to comparison table - Update recommendation rationale 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 --- docs/WAL_VS_CHANGESETS.md | 66 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 66 insertions(+) diff --git a/docs/WAL_VS_CHANGESETS.md b/docs/WAL_VS_CHANGESETS.md index e71c42db..58b1819e 100644 --- a/docs/WAL_VS_CHANGESETS.md +++ b/docs/WAL_VS_CHANGESETS.md @@ -62,6 +62,7 @@ Build our own WAL-to-logical-changes converter for more control. 
| **Determinism** | Architecture-dependent | Architecture-independent | | **Complexity location** | Checkpoint coordination | Session lifecycle | | **Maturity** | Litestream is battle-tested | Session Extension is SQLite core | +| **Inversion/undo** | Not possible (forward-only) | Native support via `sqlite3changeset_invert()` | ### WAL-Based: Detailed Analysis @@ -170,6 +171,69 @@ Rough comparison for a single-column UPDATE: **Question:** Do we need to capture PRAGMAs or VACUUM? +### 6. Changeset Inversion + +**This is a capability unique to changesets that WAL cannot provide.** + +The Session Extension provides `sqlite3changeset_invert()` to reverse any changeset: + +```c +int sqlite3changeset_invert( + int nIn, const void *pIn, // Input changeset + int *pnOut, void **ppOut // OUT: Inverse of input +); +``` + +**How inversion works:** +- **INSERT** becomes **DELETE** (removes the inserted row) +- **DELETE** becomes **INSERT** (re-inserts the deleted row with original values) +- **UPDATE** swaps old/new values (reverts to previous column values) + +If changeset `C+` is the inverse of `C`, then applying `C` followed by `C+` leaves the database unchanged. + +**Why WAL cannot support inversion:** + +WAL is a forward-only, append-only logging mechanism: + +1. **No logical operations**: WAL contains raw page images, not row-level changes. There's no concept of "the row that was inserted" - just binary page data. + +2. **Checkpointing destroys history**: When WAL frames are checkpointed back to the main database, the WAL is truncated or overwritten. Previous states are not preserved for reversal. + +3. **Undo requires full restore**: To "undo" with WAL, you must restore from a previous snapshot. There's no incremental inverse operation. 
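The inversion rules described earlier in this section (INSERT becomes DELETE, DELETE becomes INSERT, UPDATE swaps old and new values) can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions, not the Session Extension's implementation: real changesets are binary blobs, and `Change`, `invert`, and `apply_changeset` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Toy model only: real SQLite changesets are binary blobs produced by the
# Session Extension. Change, invert, and apply_changeset are hypothetical.


@dataclass(frozen=True)
class Change:
    op: str                      # "INSERT", "DELETE", or "UPDATE"
    pk: int                      # primary key of the affected row
    old: Optional[dict] = None   # values before the change (None for INSERT)
    new: Optional[dict] = None   # values after the change (None for DELETE)


def invert(changeset):
    """Return the inverse: INSERT<->DELETE, UPDATE swaps old and new values."""
    flipped = {"INSERT": "DELETE", "DELETE": "INSERT", "UPDATE": "UPDATE"}
    return [
        Change(op=flipped[c.op], pk=c.pk, old=c.new, new=c.old)
        for c in reversed(changeset)
    ]


def apply_changeset(table, changeset):
    """Apply changes to a table modeled as {pk: row_dict}; returns a new table."""
    result = {pk: dict(row) for pk, row in table.items()}
    for c in changeset:
        if c.op == "INSERT":
            result[c.pk] = dict(c.new)
        elif c.op == "DELETE":
            del result[c.pk]
        else:  # UPDATE: overwrite only the changed columns
            result[c.pk].update(c.new)
    return result


users = {1: {"name": "alice", "balance": 50}, 2: {"name": "bob", "balance": 10}}
c = [
    Change("UPDATE", 1, old={"balance": 50}, new={"balance": 100}),
    Change("DELETE", 2, old={"name": "bob", "balance": 10}),
    Change("INSERT", 3, new={"name": "carol", "balance": 0}),
]

after = apply_changeset(users, c)
restored = apply_changeset(after, invert(c))  # C followed by its inverse
```

Here `restored` equals the original `users` table, which is the "apply `C`, then `C+`" property stated above. A WAL frame has no comparable per-row old/new pair to swap, which is the point of the list above.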
+ +**Benefits of inversion for SyndDB:** + +| Use Case | How Inversion Helps | +|----------|---------------------| +| **Validator rollback** | If invalid state detected at seq N, apply inverse of changeset N to revert | +| **Dispute resolution** | Surgically revert specific transactions without full restore | +| **Optimistic execution** | Apply tentatively, roll back if sequencer rejects/reorders | +| **Point-in-time recovery** | Store changesets + inverses for bidirectional replay | +| **Testing** | Apply changes, verify, then revert - no database reset needed | + +**Example: Validator rollback scenario** +``` +Sequence 100: Changeset C (valid) +Sequence 101: Changeset D (later found to violate constraint) +Sequence 102: Changeset D_inverse (surgical rollback) +Sequence 103: Changeset E (corrected operation) +``` + +With WAL, the validator would need to restore a full snapshot from before sequence 101. + +**Is inversion worth the Session Extension complexity?** + +Strong yes when: +- Validators may need to propose rollbacks for invalid transitions +- System requires point-in-time recovery without full snapshots +- Optimistic execution patterns are used (apply then verify) + +Less critical when: +- Forward-only replication is sufficient +- Full checkpoint restore is acceptable for all rollback scenarios + +**Question:** Do we anticipate validators needing fine-grained rollback, or is snapshot restore acceptable? 
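The validator rollback scenario above (sequences 100 to 103) can be walked through with a small sketch. It assumes a deliberately simplified model in which each change is an `(account, old_balance, new_balance)` tuple; `apply_changes`, `invert_changes`, the account names, and the balances are all hypothetical and chosen only for illustration. The old-value check loosely mirrors the idea that applying a changeset can detect a conflict when the current row does not match the recorded old values.

```python
# Toy model of the rollback scenario: a sequenced log of balance updates.
# Each change is (account, old_balance, new_balance); all names are hypothetical.


def apply_changes(balances, changes):
    """Apply updates to a copy of the balances, checking the recorded old value."""
    result = dict(balances)
    for account, old, new in changes:
        if result[account] != old:
            raise ValueError(f"conflict on {account}: expected {old}, found {result[account]}")
        result[account] = new
    return result


def invert_changes(changes):
    """Inverse of an update-only changeset: swap old/new, reverse the order."""
    return [(account, new, old) for account, old, new in reversed(changes)]


snapshot_99 = {"alice": 100, "bob": 20}

log = {
    100: [("alice", 100, 70), ("bob", 20, 50)],  # C: valid transfer of 30
    101: [("bob", 50, -10)],                     # D: overdraft, violates balance >= 0
}
log[102] = invert_changes(log[101])              # D_inverse: surgical rollback
log[103] = [("bob", 50, 40)]                     # E: corrected withdrawal

state = snapshot_99
history = {}
for seq in sorted(log):
    state = apply_changes(state, log[seq])
    history[seq] = state

# Sequences whose resulting state breaks the "balance >= 0" rule.
violations = [seq for seq, s in history.items() if min(s.values()) < 0]
```

After sequence 102 the state matches the state after sequence 100, so the validator never needs the full snapshot from before sequence 101. The log stays append-only: the rollback is itself a sequenced message rather than a rewrite of history.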
+ --- ## Hybrid Architecture (Option C) @@ -289,6 +353,7 @@ Application (TEE) Rationale: - Validator auditability requirement favors changesets +- Changeset inversion enables surgical rollback that WAL cannot provide - Session Extension complexity is bounded and understood - WAL adaptation would be significant effort for unclear benefit - Litestream for DR is low-effort and provides safety net @@ -302,5 +367,6 @@ Rationale: - [SQLite Session Extension](https://www.sqlite.org/sessionintro.html) - [SQLite WAL Mode](https://www.sqlite.org/wal.html) - [Litestream How It Works](https://litestream.io/how-it-works/) +- [sqlite3changeset_invert() API](https://sqlite.org/session/sqlite3changeset_invert.html) - [SyndDB SPEC](../SPEC.md) - Current implementation: `crates/synddb-client/src/session.rs`, `snapshot_sender.rs`