Skip to content

Search: Use staging index + alias swap for content date lookup sync#3098

Merged
reakaleek merged 2 commits intomainfrom
chartreuse-kitten
Apr 14, 2026
Merged

Search: Use staging index + alias swap for content date lookup sync#3098
reakaleek merged 2 commits intomainfrom
chartreuse-kitten

Conversation

@reakaleek
Copy link
Copy Markdown
Member

What

Replace the delete-then-reindex flow in ContentDateEnrichment with a staging index + atomic alias swap so there is never a window where the lookup index is empty.

Why

The previous sequence called DeleteLookupContentsAsync (which emptied the only durable lookup) before ReindexToLookupAsync wrote a replacement. If the reindex failed, the lookup was gone.

How

  • Rename _lookupIndex to _lookupAlias — the stable name is now an Elasticsearch alias over timestamped backing indices
  • SyncLookupIndexAsync creates a fresh staging index, reindexes into it, refreshes, then atomically swaps the alias via POST /_aliases
  • The old backing index is deleted only after the swap succeeds
  • EnsureLookupIndexAsync resolves the alias via GET /_alias/{name} and creates a backing index + alias if none exists
  • Remove DeleteLookupContentsAsync (no longer needed)

Test plan

  • Verify the project builds clean (dotnet build — 0 warnings, 0 errors)
  • On a fresh environment (no existing lookup index), confirm InitializeAsync creates a timestamped backing index and alias
  • On subsequent runs, confirm SyncLookupIndexAsync creates a new staging index, swaps the alias, and deletes the old index
  • Verify the enrich policy and pipeline continue to reference the alias transparently

🤖 Generated with Claude Code

Replace the delete-then-reindex flow in ContentDateEnrichment with a
staging index + atomic alias swap to eliminate the window where the
lookup index is empty. If a reindex fails, the previous lookup data
remains intact behind the alias.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@reakaleek reakaleek requested a review from a team as a code owner April 14, 2026 08:29
@reakaleek reakaleek requested a review from technige April 14, 2026 08:29
@reakaleek reakaleek added the fix label Apr 14, 2026
@reakaleek reakaleek requested a review from Mpdreamz April 14, 2026 08:31
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 14, 2026

📝 Walkthrough

Walkthrough

The ContentDateEnrichment class transitions from a direct index approach to an alias-based strategy for zero-downtime index updates. The lookup index reference is replaced with a stable alias that routes through a backing index. The sync flow changes from in-place delete-by-query and reindex to a blue-green pattern: create a timestamped staging backing index, reindex content into it, refresh it, atomically swap the alias pointer, then delete the previous backing index. Supporting helper methods are added to generate staging names, resolve current backing indices, manage alias operations, and handle cleanup.

Sequence Diagram(s)

sequenceDiagram
    participant Enricher as ContentDateEnrichment
    participant ES as Elasticsearch API
    participant Source as Lexical Source Alias

    Enricher->>ES: Resolve backing index for lookup alias (GET /_alias/...)
    ES-->>Enricher: currentBackingIndex
    Enricher->>ES: Create staging backing index (PUT /staging-<ts>)
    ES-->>Enricher: created
    Enricher->>ES: Reindex from Source alias to staging index (POST /_reindex)
    ES-->>Enricher: reindex task complete
    Enricher->>ES: Refresh staging index (POST /staging-<ts>/_refresh)
    ES-->>Enricher: refreshed
    Enricher->>ES: Swap alias to staging index (POST /_aliases add staging, remove old)
    ES-->>Enricher: alias swapped
    Enricher->>ES: Delete old backing index (DELETE /old-backing-index)
    ES-->>Enricher: deleted (or warning on failure)
Loading

Suggested labels

enhancement

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: replacing delete-then-reindex with staging index + alias swap for content date lookup synchronization.
Description check ✅ Passed The description is well-detailed and directly related to the changeset, explaining the what, why, and how of the staging index + alias swap refactoring.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch chartreuse-kitten

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/Elastic.Markdown/Exporters/Elasticsearch/ContentDateEnrichment.cs`:
- Around line 57-58: RefreshIndexAsync currently only logs warnings on non-2xx
responses so SwapAliasAsync runs even if the index refresh failed; change the
flow so a failed refresh prevents the alias swap: modify RefreshIndexAsync to
throw (or alternatively return a boolean success) when the refresh response is
not successful, then update all call sites (the calls that currently do await
RefreshIndexAsync(...); await SwapAliasAsync(...); — and the other occurrences
near the SwapAliasAsync/ExecutePolicyAsync sequence) to either await
RefreshIndexAsync and only call SwapAliasAsync on success or let the thrown
exception propagate to stop execution; reference RefreshIndexAsync,
SwapAliasAsync and ExecutePolicyAsync when making the change.
- Around line 68-69: Resolve the alias and bootstrap logic: in
ResolveBackingIndexAsync treat only a 404 from GET /_alias/... as "alias
missing" and for other HTTP/network errors rethrow or propagate them so
transient failures aren't treated as absent alias; if the alias response yields
multiple backing indices, fail deterministically instead of using
FirstOrDefault() so you don't pick nondeterministically. Make
GenerateStagingName collision-resistant by adding a high-entropy suffix (e.g.,
DateTime.UtcNow.Ticks + a short GUID) so concurrent initializers produce unique
staging names. In SyncLookupIndexAsync, change the RefreshIndexAsync call so
that refresh failures throw (or return a non-recoverable error) instead of only
logging warnings, preventing subsequent calls to SwapAliasAsync and
ExecutePolicyAsync when the refresh did not succeed. Ensure references to
_lookupAlias, GenerateStagingName, ResolveBackingIndexAsync,
SyncLookupIndexAsync, RefreshIndexAsync, SwapAliasAsync, and ExecutePolicyAsync
are updated accordingly.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 4bf33a51-b531-4bd8-9968-799263886172

📥 Commits

Reviewing files that changed from the base of the PR and between 0490b6f and bf671e1.

📒 Files selected for processing (1)
  • src/Elastic.Markdown/Exporters/Elasticsearch/ContentDateEnrichment.cs

Make RefreshIndexAsync throw on failure so SwapAliasAsync never runs
against an unrefreshed staging index. Distinguish 404 from transient
errors in ResolveBackingIndexAsync and fail deterministically when the
alias points at multiple indices. Add a GUID suffix to staging index
names for collision resistance under concurrent runs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai coderabbitai bot added enhancement and removed fix labels Apr 14, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
src/Elastic.Markdown/Exporters/Elasticsearch/ContentDateEnrichment.cs (1)

98-109: ⚠️ Potential issue | 🔴 Critical

Make the alias swap single-owner under concurrent runs.

EnsureLookupIndexAsync() bootstraps with SwapAliasAsync(null, ...), which becomes an add-only alias update. Elastic’s alias docs explicitly allow one alias to point to multiple indices, say that add creates the alias if it does not exist, and note that multi-action /_aliases requests can still return acknowledged: true with per-action failures when remove misses. Two overlapping initialize/sync calls can therefore leave _lookupAlias attached to more than one backing index while this code logs success, after which ResolveBackingIndexAsync() starts throwing. Parse errors / action_results and make the swap resilient to stale or null oldIndex (for example, remove the alias from all current backings in the same request, or re-resolve and retry before adding). (elastic.co)

Also applies to: 142-165

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/Elastic.Markdown/Exporters/Elasticsearch/ContentDateEnrichment.cs` around
lines 98 - 109, EnsureLookupIndexAsync currently calls SwapAliasAsync(null,
indexName, ct) which issues an add-only alias update and can leave _lookupAlias
attached to multiple indices under concurrent runs; update the bootstrap flow to
make the alias swap single-owner by first resolving current backings via
ResolveBackingIndexAsync (or listing all indices with the alias), then perform a
single multi-action aliases request that removes the alias from all current
backing indices and adds it to GenerateStagingName() in the same request (or, if
you prefer, loop: re-resolve and retry the remove+add if action_results/errors
indicate partial failures), and ensure SwapAliasAsync parses the alias API
response (errors/action_results) to detect per-action failures and surface/retry
them instead of assuming success; also apply the same change to the other
occurrence referenced around SwapAliasAsync usage (lines ~142-165) and to
CreateLookupIndexAsync as needed to coordinate the atomic remove+add behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@src/Elastic.Markdown/Exporters/Elasticsearch/ContentDateEnrichment.cs`:
- Around line 98-109: EnsureLookupIndexAsync currently calls
SwapAliasAsync(null, indexName, ct) which issues an add-only alias update and
can leave _lookupAlias attached to multiple indices under concurrent runs;
update the bootstrap flow to make the alias swap single-owner by first resolving
current backings via ResolveBackingIndexAsync (or listing all indices with the
alias), then perform a single multi-action aliases request that removes the
alias from all current backing indices and adds it to GenerateStagingName() in
the same request (or, if you prefer, loop: re-resolve and retry the remove+add
if action_results/errors indicate partial failures), and ensure SwapAliasAsync
parses the alias API response (errors/action_results) to detect per-action
failures and surface/retry them instead of assuming success; also apply the same
change to the other occurrence referenced around SwapAliasAsync usage (lines
~142-165) and to CreateLookupIndexAsync as needed to coordinate the atomic
remove+add behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 8dbe88f0-5ff2-4587-af03-ff4271faf6ac

📥 Commits

Reviewing files that changed from the base of the PR and between bf671e1 and 3083b86.

📒 Files selected for processing (1)
  • src/Elastic.Markdown/Exporters/Elasticsearch/ContentDateEnrichment.cs

@reakaleek reakaleek merged commit 8efe820 into main Apr 14, 2026
30 checks passed
@reakaleek reakaleek deleted the chartreuse-kitten branch April 14, 2026 10:06
reakaleek added a commit that referenced this pull request Apr 14, 2026
…onments

The staging index + alias swap pattern introduced in #3098 fails in
environments that already have a concrete index with the alias name.
This adds a migration path that reindexes data into a timestamped
backing index, removes the legacy index, and creates the alias.

Also makes enrich policy creation idempotent (delete-before-put) and
adds a last_updated fallback in the sitemap reader for documents
indexed before the content date pipeline existed.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants