Skip to content

Search: Include synonym ID in synonym rule values#2999

Merged
reakaleek merged 5 commits intomainfrom
just-zydeco
Apr 14, 2026
Merged

Search: Include synonym ID in synonym rule values#2999
reakaleek merged 5 commits intomainfrom
just-zydeco

Conversation

@reakaleek
Copy link
Copy Markdown
Member

What

Fix synonym rules to include the ID term in the synonym set values sent to Elasticsearch.

Why

The synonym dictionary uses the first term as the key (and rule ID) with remaining terms as values. When publishing, only the values were joined into the synonym string — omitting the key term. For example, a rule with ID ilm would only have index lifecycle management as its synonym, meaning searching for "ilm" wouldn't trigger the synonym expansion.

How

Prepend the synonym ID to the values array when building the comma-separated synonym string in PublishSynonymsAsync.

Test plan

  • Verify synonym rules in Elasticsearch include the ID term (e.g. "ilm, index lifecycle management" instead of just "index lifecycle management")
  • Search for abbreviation terms (like "ilm") and confirm they match their expanded forms

🤖 Generated with Claude Code

@reakaleek reakaleek requested a review from a team as a code owner March 30, 2026 15:27
@reakaleek reakaleek requested a review from Mpdreamz March 30, 2026 15:27
@reakaleek reakaleek added the fix label Mar 30, 2026
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 30, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

The pull request refactors synonym data storage across the codebase from a dictionary-based structure to a list-based structure. The SearchConfiguration.Synonyms property type changes from IReadOnlyDictionary<string, string[]> to IReadOnlyList<string[]>. ElasticsearchMarkdownExporter is updated to handle this new structure when building synonym rules, assuming the first element of each array serves as the rule identifier. Multiple test helper files are updated to initialize Synonyms using empty collection literals. Additionally, .gitignore is updated to exclude the .superset directory.

Suggested labels

fix

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 10.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: including synonym IDs in the synonym rule values sent to Elasticsearch, which is the core fix across multiple files.
Description check ✅ Passed The description clearly explains what the change does, why it was needed, and how it was implemented, and is directly related to the changeset modifications.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch just-zydeco

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@src/Elastic.Markdown/Exporters/Elasticsearch/ElasticsearchMarkdownExporter.cs`:
- Around line 241-242: The publish path (PublishSynonymsAsync) includes the
synonym key when building the synonym string but the constructor logic that
builds indexTimeSynonyms omits it, causing inconsistent synonym sets and hash
inputs; update the constructor/initialization that builds indexTimeSynonyms to
construct synonyms the same way as PublishSynonymsAsync (include the key as the
first token when joining the synonyms) so SynonymRule entries (Id and Synonyms)
are generated identically for both published and index-time paths—refer to
SynonymRule, the id variable and synonym.Value usage and make the join logic
consistent with the existing PublishSynonymsAsync pattern.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: e3fcc140-4fe2-4787-ace3-583446a903c8

📥 Commits

Reviewing files that changed from the base of the PR and between ce8705a and 971921e.

📒 Files selected for processing (1)
  • src/Elastic.Markdown/Exporters/Elasticsearch/ElasticsearchMarkdownExporter.cs

Copy link
Copy Markdown
Contributor

@cotti cotti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think CodeRabbit's suggestion makes sense, otherwise LGTM

@coderabbitai coderabbitai bot added breaking and removed fix labels Mar 30, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/Elastic.Documentation.Configuration/Search/SearchConfiguration.cs`:
- Around line 127-130: The current mapping of searchDto.Synonyms into synonyms
loses the prior uniqueness enforcement on the first term, allowing duplicate
rule IDs (used as s[0] in ElasticsearchMarkdownExporter.Export or similar) to be
produced; update the transformation in SearchConfiguration (where
searchDto.Synonyms is processed) to enforce uniqueness on the first term — e.g.,
group by s.First()/s[0] and either deduplicate (take first occurrence) or throw
a validation exception when duplicates exist — so that the resulting synonyms
array contains only one entry per unique first term and prevents duplicate rule
IDs during export.
🪄 Autofix (Beta)

✅ Autofix completed


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ca3cf4ff-97bd-4ae9-ad57-4c5ad825e155

📥 Commits

Reviewing files that changed from the base of the PR and between 971921e and 72f38ff.

📒 Files selected for processing (9)
  • .superset/config.json
  • src/Elastic.Documentation.Configuration/Search/SearchConfiguration.cs
  • src/Elastic.Markdown/Exporters/Elasticsearch/ElasticsearchMarkdownExporter.cs
  • tests-integration/Elastic.Assembler.IntegrationTests/TestHelpers.cs
  • tests-integration/Mcp.Remote.IntegrationTests/McpToolsIntegrationTestsBase.cs
  • tests/Elastic.ApiExplorer.Tests/TestHelpers.cs
  • tests/Elastic.Changelog.Tests/Changelogs/ChangelogTestBase.cs
  • tests/Elastic.Documentation.Build.Tests/TestHelpers.cs
  • tests/Elastic.Markdown.Tests/TestHelpers.cs
✅ Files skipped from review due to trivial changes (6)
  • tests/Elastic.Documentation.Build.Tests/TestHelpers.cs
  • tests-integration/Elastic.Assembler.IntegrationTests/TestHelpers.cs
  • tests-integration/Mcp.Remote.IntegrationTests/McpToolsIntegrationTestsBase.cs
  • tests/Elastic.Markdown.Tests/TestHelpers.cs
  • tests/Elastic.ApiExplorer.Tests/TestHelpers.cs
  • .superset/config.json
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/Elastic.Markdown/Exporters/Elasticsearch/ElasticsearchMarkdownExporter.cs

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Mar 30, 2026

Note

Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it.

Fixes Applied Successfully

Fixed 1 file(s) based on 1 unresolved review comment.

Files modified:

  • src/Elastic.Documentation.Configuration/Search/SearchConfiguration.cs

Commit: e97ff8d92db8aabb92c96289ce0426fd2c3921be

The changes have been pushed to the just-zydeco branch.

Time taken: 2m 20s

@cla-checker-service
Copy link
Copy Markdown

cla-checker-service bot commented Mar 30, 2026

💚 CLA has been signed

reakaleek and others added 2 commits March 30, 2026 18:25
The synonym dictionary uses the first term as the key and remaining
terms as values. When publishing to Elasticsearch, only the values
were included in the synonym string, omitting the key term itself.
This meant e.g. "ilm" was the rule ID but not part of the synonym
set, so it wouldn't match "index lifecycle management".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The dictionary structure used the first term as the key and Skip(1) for
values, which artificially separated one synonym from its group. This
caused the key term to be omitted from synonym rules sent to
Elasticsearch.

Replace Dictionary<string, string[]> with IReadOnlyList<string[]> so all
terms in a synonym group are treated equally. The first term is still
used as the ES rule ID for readability but is no longer excluded from
the synonym string.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@reakaleek reakaleek requested a review from cotti March 30, 2026 16:26
@coderabbitai coderabbitai bot added breaking and removed fix labels Mar 30, 2026
Copy link
Copy Markdown
Member

@Mpdreamz Mpdreamz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM one nitpick

@@ -0,0 +1,7 @@
{
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to add .superset to gitignore.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@coderabbitai coderabbitai bot added fix and removed breaking labels Apr 14, 2026
reakaleek and others added 2 commits April 14, 2026 10:53
The synonyms type was changed from Dictionary<string, string[]> to
IReadOnlyList<string[]> in 5cea709 but the F# authoring test was
not updated.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@reakaleek reakaleek merged commit aa23769 into main Apr 14, 2026
28 checks passed
@reakaleek reakaleek deleted the just-zydeco branch April 14, 2026 10:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants