52 changes: 52 additions & 0 deletions README.md
@@ -351,6 +351,58 @@ This is a thin composition of `GetDataAsync` followed by `WaitForIdleAsync`. The

`WaitForIdleAsync()` provides race-free synchronization with background operations for tests. It uses "was idle at some point" semantics: it does not guarantee the cache is still idle after the call returns. See `docs/invariants.md` (Activity tracking invariants).

## Multi-Layer Cache

For workloads with high-latency data sources, you can compose multiple `WindowCache` instances into a layered stack. Each layer uses the layer below it as its data source, allowing you to trade memory for reduced data-source I/O.

```csharp
await using var cache = LayeredWindowCacheBuilder<int, byte[], IntegerFixedStepDomain>
.Create(realDataSource, domain)
.AddLayer(new WindowCacheOptions( // L2: deep background cache
leftCacheSize: 10.0,
rightCacheSize: 10.0,
readMode: UserCacheReadMode.CopyOnRead,
leftThreshold: 0.3,
rightThreshold: 0.3))
.AddLayer(new WindowCacheOptions( // L1: user-facing cache
leftCacheSize: 0.5,
rightCacheSize: 0.5,
readMode: UserCacheReadMode.Snapshot))
.Build();

var result = await cache.GetDataAsync(range, ct);
```

`LayeredWindowCache` implements `IWindowCache` and is `IAsyncDisposable` — it owns and disposes all layers when you dispose it.

**Recommended layer configuration pattern:**
- **Inner layers** (closest to the data source): `CopyOnRead`, large buffer sizes (5–10×), handle the heavy prefetching
- **Outer (user-facing) layer**: `Snapshot`, small buffer sizes (0.3–1.0×), zero-allocation reads

> **Important — buffer ratio requirement:** Inner layer buffers must be **substantially** larger
> than outer layer buffers, not merely slightly larger. When the outer layer rebalances, it
> fetches missing ranges from the inner layer via `GetDataAsync`. Each fetch publishes a
> rebalance intent on the inner layer. If the inner layer's `NoRebalanceRange` is not wide
> enough to contain the outer layer's full `DesiredCacheRange`, the inner layer will also
> rebalance — and re-center toward only one side of the outer layer's gap, leaving it poorly
> positioned for the next rebalance. With undersized inner buffers this becomes a continuous
> cycle (cascading rebalance thrashing). Use a 5–10× ratio and `leftThreshold`/`rightThreshold`
> of 0.2–0.3 on inner layers to ensure the inner layer's stability zone absorbs the outer
> layer's rebalance fetches. See `docs/architecture.md` (Cascading Rebalance Behavior) and
> `docs/scenarios.md` (Scenarios L6 and L7) for the full explanation.

**Three-layer example:**
```csharp
await using var cache = LayeredWindowCacheBuilder<int, byte[], IntegerFixedStepDomain>
.Create(realDataSource, domain)
.AddLayer(l3Options) // L3: 10× CopyOnRead — network/disk absorber
.AddLayer(l2Options) // L2: 2× CopyOnRead — mid-level buffer
.AddLayer(l1Options) // L1: 0.5× Snapshot — user-facing
.Build();
```

For detailed guidance see `docs/storage-strategies.md`.

## License

MIT
154 changes: 154 additions & 0 deletions docs/architecture.md
@@ -316,6 +316,160 @@ Disposal respects the single-writer architecture:

---

## Multi-Layer Caches

### Overview

Multiple `WindowCache` instances can be stacked into a cache pipeline where each layer's
`IDataSource` is the layer below it. This is built into the library via three public types:

- **`WindowCacheDataSourceAdapter`** — adapts any `IWindowCache` as an `IDataSource` so it can
serve as a backing store for an outer `WindowCache`.
- **`LayeredWindowCacheBuilder`** — fluent builder that wires the layers together and returns a
`LayeredWindowCache` that owns and disposes all of them.
- **`LayeredWindowCache`** — thin `IWindowCache` wrapper that delegates `GetDataAsync` to the
outermost layer, awaits all layers sequentially (outermost-to-innermost) on `WaitForIdleAsync`,
and disposes all layers outermost-first on disposal.

### Architectural Properties

**Each layer is an independent `WindowCache`.**
Every layer obeys the full single-writer architecture, decision-driven execution, and smart
eventual consistency model described in this document. There is no shared state between layers.

**Data flows inward on miss, outward on return.**
When the outermost layer does not have data in its window, it calls the adapter's `FetchAsync`,
which calls `GetDataAsync` on the next inner layer. This cascades inward until the real data
source is reached. Each layer then caches the data it fetched and returns it up the chain.
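
The miss path can be modeled with plain delegates (an illustration only; the real types are `IWindowCache`, `IDataSource`, and the adapter, and the inner read may itself cascade further inward):

```csharp
using System;
using System.Threading.Tasks;

// Model of the miss path: the inner layer's GetDataAsync, reduced to a delegate.
// Here it just returns fixed bytes; in the real stack it serves from its own
// window or cascades to its own data source.
Func<(long Start, long End), Task<ReadOnlyMemory<byte>>> innerGetDataAsync =
    range => Task.FromResult<ReadOnlyMemory<byte>>(new byte[] { 1, 2, 3 });

// The adapter's essential job: an outer-layer fetch is a delegated inner read.
Func<(long Start, long End), Task<ReadOnlyMemory<byte>>> adapterFetchAsync =
    range => innerGetDataAsync(range);

var data = await adapterFetchAsync((0, 3));
Console.WriteLine(data.Length); // 3
```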

**Full-stack convergence via `WaitForIdleAsync`.**
`WaitForIdleAsync` on `LayeredWindowCache` awaits all layers sequentially, outermost to innermost.
The outermost layer must be awaited first, because its rebalance drives fetch requests (via the
adapter) into inner layers — only once the outer layer is idle can inner layers be known to have
received all pending work. This guarantees that calling `GetDataAndWaitForIdleAsync` on a
`LayeredWindowCache` waits for the entire cache stack to converge, not just the user-facing layer.
Each inner layer independently manages its own idle state via `AsyncActivityCounter`.
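
The ordering argument can be sketched as a reverse loop over the layer list (a model, not the library's implementation; layers are assumed stored innermost-first, matching the builder's `AddLayer` order):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Model of the layered idle-wait. layers[0] = innermost, layers[^1] = outermost;
// each entry stands in for that layer's WaitForIdleAsync.
var order = new List<string>();
var layers = new List<Func<Task>>
{
    () => { order.Add("L2-inner"); return Task.CompletedTask; },
    () => { order.Add("L1-outer"); return Task.CompletedTask; },
};

// Outermost first: its in-flight rebalance may still push fetches into inner
// layers, so inner layers are awaited only after it has gone idle.
for (int i = layers.Count - 1; i >= 0; i--)
    await layers[i]();

Console.WriteLine(string.Join(",", order)); // L1-outer,L2-inner
```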

**Consistent model — not strong consistency between layers.**
The adapter uses `GetDataAsync` (eventual consistency), not `GetDataAndWaitForIdleAsync`. Inner
layers are not forced to converge before serving the outer layer. Each layer serves correct data
immediately; prefetch optimization propagates asynchronously at each layer independently.
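
When a caller does need the whole stack converged, the opt-in strong-consistency extension still applies, because `LayeredWindowCache` implements `IWindowCache` (`cache`, `range`, and `ct` are assumed to be in scope):

```csharp
// Default: eventual consistency. Correct data now; prefetch propagates later.
var fast = await cache.GetDataAsync(range, ct);

// Opt-in per call: waits for every layer, outermost to innermost, to go idle.
var converged = await cache.GetDataAndWaitForIdleAsync(range, ct);
```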

**No new concurrency model.** A layered cache is not a multi-consumer scenario. All user
requests flow through the single outermost layer, which remains the sole logical consumer of the
next inner layer (via the adapter). The single-consumer model holds at every layer boundary.

**Disposal order.** `LayeredWindowCache.DisposeAsync` disposes layers outermost-first:
the user-facing layer is stopped first (no new requests flow into inner layers), then each inner
layer is disposed in turn. This mirrors the single-writer disposal sequence at each layer.

### Recommended Layer Configuration

| Layer | `UserCacheReadMode` | Buffer size | Purpose |
|---------------------------------------------|---------------------|-------------|----------------------------------------|
| Innermost (deepest, closest to data source) | `CopyOnRead` | 5–10× | Wide prefetch window; absorbs I/O cost |
| Intermediate (optional) | `CopyOnRead` | 1–3× | Narrows window toward working set |
| Outermost (user-facing) | `Snapshot` | 0.3–1.0× | Zero-allocation reads; minimal memory |

Inner layers with `CopyOnRead` make cache writes cheap (growable list, no copy on write) while
outer `Snapshot` layers make reads cheap (single contiguous array, zero per-read allocation).

### Cascading Rebalance Behavior

This is the most important configuration concern in a layered cache setup.

#### Mechanism

When L1 rebalances, its `CacheDataExtensionService` computes missing ranges
(`DesiredCacheRange \ AssembledRangeData`) and calls the batch `FetchAsync(IEnumerable<Range>, ct)`
on the `WindowCacheDataSourceAdapter`. Because the adapter only implements the single-range
`FetchAsync` overload, the default `IDataSource` interface implementation dispatches one
parallel call per missing range via `Task.WhenAll`.
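
The default dispatch can be modeled like this (an illustration of the mechanism, not the library's `IDataSource` source; the single-range fetch is reduced to a delegate):

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

// Single-range fetch: the only overload the adapter implements. Modeled here
// as returning the range endpoints.
Func<(long Start, long End), Task<long[]>> fetchOne =
    r => Task.FromResult(new[] { r.Start, r.End });

// Default batch behavior: one parallel single-range call per missing range.
// Two gap ranges from L1 become two concurrent GetDataAsync calls on L2.
var missing = new[] { (Start: 0L, End: 10L), (Start: 20L, End: 30L) };
long[][] results = await Task.WhenAll(missing.Select(r => fetchOne(r)));

Console.WriteLine(results.Length); // 2
```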

Each call reaches L2's `GetDataAsync`, which:
1. Serves the data immediately (from L2's cache or by fetching from L2's own data source)
2. **Publishes a rebalance intent on L2** with that individual range

When L1's `DesiredCacheRange` extends beyond L2's current window on both sides, L1's rebalance
produces two gap ranges (left and right). Both `GetDataAsync` calls on L2 happen in parallel.
L2's intent loop processes whichever intent it sees last ("latest wins"), and if that range
falls outside L2's `NoRebalanceRange`, L2 schedules its own background rebalance.

This is a **cascading rebalance**: L1's rebalance triggers L2's rebalance. Under sequential
access with correct configuration this should be rare. Under misconfiguration it becomes a
continuous cycle — every L1 rebalance triggers an L2 rebalance, which re-centers L2 toward
just one gap side, leaving L2 poorly positioned for L1's next rebalance.

#### Natural Mitigations Already in Place

The system provides several natural defenses against cascading rebalances, even before
configuration is considered:

- **"Latest wins" semantics**: When two parallel `GetDataAsync` calls publish intents on L2,
the intent loop processes only the surviving (latest) intent. At most one L2 rebalance is
triggered per L1 rebalance burst, regardless of how many gap ranges L1 fetched.
- **Debounce delay**: L2's debounce delay further coalesces rapid sequential intent publications.
Parallel intents from a single L1 rebalance will typically be absorbed into one debounce window.
- **Decision engine work avoidance**: If the surviving intent range falls within L2's
`NoRebalanceRange`, L2's Decision Engine rejects rebalance at Stage 1 (fast path). No L2
rebalance is triggered at all. This is the **desired steady-state** under correct configuration.

#### Configuration Requirements

The natural mitigations are only effective when L2's buffer is substantially larger than L1's.
The goal is that L1's full `DesiredCacheRange` fits comfortably within L2's `NoRebalanceRange`
during normal sequential access — making Stage 1 rejection the norm, not the exception.

**Buffer ratio rule of thumb:**

| Layer | `leftCacheSize` / `rightCacheSize` | `leftThreshold` / `rightThreshold` |
|----------------|------------------------------------|--------------------------------------------|
| L1 (outermost) | 0.3–1.0× | 0.1–0.2 (can be tight — L2 absorbs misses) |
| L2 (inner) | 5–10× L1's buffer | 0.2–0.3 (wider stability zone) |
| L3+ (deeper) | 3–5× the layer above | 0.2–0.3 |

With these ratios, L1's `DesiredCacheRange` (which expands L1's buffer around the request)
typically falls well within L2's `NoRebalanceRange` (which is L2's buffer shrunk by its
thresholds). L2's Decision Engine skips rebalance at Stage 1, and no cascading occurs.

**Why the ratio matters more than the absolute size:**

Suppose L1 has `leftCacheSize=1.0, rightCacheSize=1.0` and `requestedRange` has length 100.
L1's `DesiredCacheRange` will be approximately `[request - 100, request + 100]` (length 300).
For L2's Stage 1 to reject the rebalance, L2's `NoRebalanceRange` must contain that
`[request - 100, request + 100]` interval. L2's `NoRebalanceRange` is derived from
`CurrentCacheRange` by applying L2's thresholds inward. So L2 needs a `CurrentCacheRange`
substantially larger than L1's `DesiredCacheRange`.
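
A worked check of these numbers (plain arithmetic; the hypothetical request interval and the proportional threshold shrink follow the description above, while the real computation lives inside the library):

```csharp
using System;

// Request [1000, 1100): length 100 (hypothetical numbers).
const double reqStart = 1000, reqEnd = 1100;
const double reqLen = reqEnd - reqStart;

// L1 with leftCacheSize = rightCacheSize = 1.0 (x request length):
double l1DesiredStart = reqStart - 1.0 * reqLen; // 900
double l1DesiredEnd   = reqEnd   + 1.0 * reqLen; // 1200 (length 300)

// L2 with 5x buffers centered on the same request, thresholds 0.25:
double l2Start = reqStart - 5.0 * reqLen;        // 500
double l2End   = reqEnd   + 5.0 * reqLen;        // 1600
double l2Len   = l2End - l2Start;                // 1100
double noRebStart = l2Start + 0.25 * l2Len;      // 775
double noRebEnd   = l2End   - 0.25 * l2Len;      // 1325

// Stage 1 rejects L2's rebalance iff L1's desired range fits inside:
bool stage1Rejects = noRebStart <= l1DesiredStart && l1DesiredEnd <= noRebEnd;
Console.WriteLine(stage1Rejects); // True: no cascading rebalance
```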

#### Anti-Pattern: Buffers Too Close in Size

**What goes wrong when L2's buffer is similar to L1's:**

1. User scrolls → L1 rebalances, extending to `[50, 300]`
2. L1 fetches left gap `[50, 100)` and right gap `(250, 300]` from L2 in parallel
3. Both ranges fall outside L2's `NoRebalanceRange` (L2's buffer isn't large enough to cover them)
4. L2 re-centers toward the last-processed gap — say, `(250, 300]`
5. L2's `CurrentCacheRange` is now `[200, 380]`
6. User scrolls again → L1 rebalances to `[120, 370]`
7. Left gap `[120, 200)` falls outside L2's window — L2 must fetch from its own data source
8. L2 re-centers again → oscillation

**Symptoms:** `l2.RebalanceExecutionCompleted` count approaches `l1.RebalanceExecutionCompleted`.
The inner layer provides no meaningful buffering benefit. Data source I/O per user request is
not reduced compared to a single-layer cache.

**Resolution:** Increase L2's `leftCacheSize` and `rightCacheSize` to 5–10× L1's values, and
set L2's `leftThreshold` / `rightThreshold` to 0.2–0.3.
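
Applied to the example above, a compliant pairing might look like this (values illustrative; constructor parameters as shown in the README example):

```csharp
var l1Options = new WindowCacheOptions(      // outermost, user-facing
    leftCacheSize: 0.5,
    rightCacheSize: 0.5,
    readMode: UserCacheReadMode.Snapshot,
    leftThreshold: 0.15,
    rightThreshold: 0.15);

var l2Options = new WindowCacheOptions(      // inner: 10x L1's buffer
    leftCacheSize: 5.0,
    rightCacheSize: 5.0,
    readMode: UserCacheReadMode.CopyOnRead,
    leftThreshold: 0.25,                     // wide stability zone absorbs
    rightThreshold: 0.25);                   // L1's rebalance fetches
```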

### See Also

- `README.md` — Multi-Layer Cache usage examples and configuration warning
- `docs/scenarios.md` — Scenarios L6 (cascading rebalance mechanics) and L7 (anti-pattern)
- `docs/storage-strategies.md` — Storage strategy trade-offs for layered configs
- `docs/components/public-api.md` — API reference for the three new public types

---

## Invariants

This document explains the model; the formal guarantees live in `docs/invariants.md`.
29 changes: 29 additions & 0 deletions docs/components/overview.md
@@ -18,6 +18,7 @@ The system is easier to reason about when components are grouped by:

- Public facade: `WindowCache<TRange, TData, TDomain>`
- Public extensions: `WindowCacheExtensions` — opt-in strong consistency mode (`GetDataAndWaitForIdleAsync`)
- Multi-layer support: `WindowCacheDataSourceAdapter`, `LayeredWindowCacheBuilder`, `LayeredWindowCache`
- User Path: assembles requested data and publishes intent
- Intent loop: observes latest intent and runs analytical validation
- Execution: performs debounced, cancellable rebalance work and mutates cache state
@@ -54,6 +55,34 @@ The system is easier to reason about when components are grouped by:
├── 🟦 RebalanceExecutor<TRange, TData, TDomain>
└── 🟦 CacheDataExtensionService<TRange, TData, TDomain>
└── uses → 🟧 IDataSource<TRange, TData> (user-provided)

──────────────────────────── Multi-Layer Support ────────────────────────────

🟦 LayeredWindowCacheBuilder<TRange, TData, TDomain> [Fluent Builder]
│ Static Create(dataSource, domain) → builder
│ AddLayer(options, diagnostics?) → builder (fluent chain)
│ Build() → LayeredWindowCache
│ internally wires:
│ IDataSource → WindowCache → WindowCacheDataSourceAdapter
│ │
│ ▼
│ WindowCache → WindowCacheDataSourceAdapter → ...
│ │
│ ▼ (outermost)
└─────────────────────────────────► WindowCache
(user-facing layer, index = LayerCount-1)

🟦 LayeredWindowCache<TRange, TData, TDomain> [IWindowCache wrapper]
│ LayerCount: int
│ GetDataAsync() → delegates to outermost WindowCache
│ WaitForIdleAsync() → awaits all layers sequentially, outermost to innermost
│ DisposeAsync() → disposes all layers outermost-first

🟦 WindowCacheDataSourceAdapter<TRange, TData, TDomain> [IDataSource adapter]
│ Wraps IWindowCache as IDataSource
│ FetchAsync() → calls inner cache's GetDataAsync()
│ wraps ReadOnlyMemory<TData> in ReadOnlyMemoryEnumerable<TData> for RangeChunk (avoids temp TData[] alloc)
```

**Component Type Legend:**
50 changes: 49 additions & 1 deletion docs/components/public-api.md
@@ -145,7 +145,55 @@ Composes `GetDataAsync` + `WaitForIdleAsync` into a single call. Returns the sam

**See**: `README.md` (Strong Consistency Mode section) and `docs/architecture.md` for broader context.

## See Also
## Multi-Layer Cache

Three classes support building layered cache stacks where each layer's data source is the layer below it:

### WindowCacheDataSourceAdapter\<TRange, TData, TDomain\>

**File**: `src/SlidingWindowCache/Public/WindowCacheDataSourceAdapter.cs`

**Type**: `sealed class` implementing `IDataSource<TRange, TData>`

Wraps an `IWindowCache` as an `IDataSource`, allowing any `WindowCache` to act as the data source for an outer `WindowCache`. Data is retrieved using eventual consistency (`GetDataAsync`).

- Wraps `ReadOnlyMemory<TData>` (returned by `IWindowCache.GetDataAsync`) in a `ReadOnlyMemoryEnumerable<TData>` to satisfy the `IEnumerable<TData>` contract of `IDataSource.FetchAsync`. This avoids allocating a temporary `TData[]` copy — the wrapper holds only a reference to the existing backing array via `ReadOnlyMemory<TData>`, and the data is enumerated lazily in a single pass during the outer cache's rematerialization.
- Does **not** own the wrapped cache; the caller is responsible for disposing it.
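
Conceptually, such a wrapper just enumerates the memory in place. The BCL exposes the same copy-avoiding trick as `MemoryMarshal.ToEnumerable`, shown here for illustration (the library ships its own `ReadOnlyMemoryEnumerable<TData>` type):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Runtime.InteropServices;

// Wrap existing memory as a lazy IEnumerable<T>: no temporary byte[] copy is
// allocated; elements are read from the original backing array on demand.
ReadOnlyMemory<byte> memory = new byte[] { 1, 2, 3 };
IEnumerable<byte> lazy = MemoryMarshal.ToEnumerable(memory);

Console.WriteLine(lazy.Sum(b => (int)b)); // 6
```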

### LayeredWindowCache\<TRange, TData, TDomain\>

**File**: `src/SlidingWindowCache/Public/LayeredWindowCache.cs`

**Type**: `sealed class` implementing `IWindowCache<TRange, TData, TDomain>` and `IAsyncDisposable`

A thin wrapper that:
- Delegates `GetDataAsync` to the outermost layer.
- **`WaitForIdleAsync` awaits all layers sequentially, outermost to innermost.** The outer layer is awaited first because its rebalance drives fetch requests into inner layers. This ensures `GetDataAndWaitForIdleAsync` correctly waits for the entire cache stack to converge.
- **Owns** all layer `WindowCache` instances and disposes them in reverse order (outermost first) when disposed.
- Exposes `LayerCount` for inspection.

Typically created via `LayeredWindowCacheBuilder.Build()` rather than directly.

### LayeredWindowCacheBuilder\<TRange, TData, TDomain\>

**File**: `src/SlidingWindowCache/Public/LayeredWindowCacheBuilder.cs`

**Type**: `sealed class` — fluent builder

```csharp
await using var cache = LayeredWindowCacheBuilder<int, byte[], IntegerFixedStepDomain>
.Create(realDataSource, domain)
.AddLayer(deepOptions) // L2: inner layer (CopyOnRead, large buffers)
.AddLayer(userOptions) // L1: outer layer (Snapshot, small buffers)
.Build();
```

- `Create(dataSource, domain)` — factory entry point; validates both `dataSource` and `domain` are not null.
- `AddLayer(options, diagnostics?)` — adds a layer on top; first call = innermost layer, last call = outermost (user-facing).
- `Build()` — constructs all `WindowCache` instances, wires them via `WindowCacheDataSourceAdapter`, and wraps them in `LayeredWindowCache`.
- Throws `InvalidOperationException` from `Build()` if no layers were added.

**See**: `README.md` (Multi-Layer Cache section) and `docs/storage-strategies.md` for recommended layer configuration patterns.

- `docs/boundary-handling.md`
- `docs/diagnostics.md`