Search: Include synonym ID in synonym rule values#2999
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThe pull request refactors synonym data storage across the codebase from a dictionary-based structure to a list-based structure. The Suggested labels
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches✨ Simplify code
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@src/Elastic.Markdown/Exporters/Elasticsearch/ElasticsearchMarkdownExporter.cs`:
- Around line 241-242: The publish path (PublishSynonymsAsync) includes the
synonym key when building the synonym string but the constructor logic that
builds indexTimeSynonyms omits it, causing inconsistent synonym sets and hash
inputs; update the constructor/initialization that builds indexTimeSynonyms to
construct synonyms the same way as PublishSynonymsAsync (include the key as the
first token when joining the synonyms) so SynonymRule entries (Id and Synonyms)
are generated identically for both published and index-time paths—refer to
SynonymRule, the id variable and synonym.Value usage and make the join logic
consistent with the existing PublishSynonymsAsync pattern.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: e3fcc140-4fe2-4787-ace3-583446a903c8
📒 Files selected for processing (1)
src/Elastic.Markdown/Exporters/Elasticsearch/ElasticsearchMarkdownExporter.cs
src/Elastic.Markdown/Exporters/Elasticsearch/ElasticsearchMarkdownExporter.cs
Outdated
Show resolved
Hide resolved
cotti
left a comment
There was a problem hiding this comment.
I think CodeRabbit's suggestion makes sense, otherwise LGTM
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/Elastic.Documentation.Configuration/Search/SearchConfiguration.cs`:
- Around line 127-130: The current mapping of searchDto.Synonyms into synonyms
loses the prior uniqueness enforcement on the first term, allowing duplicate
rule IDs (used as s[0] in ElasticsearchMarkdownExporter.Export or similar) to be
produced; update the transformation in SearchConfiguration (where
searchDto.Synonyms is processed) to enforce uniqueness on the first term — e.g.,
group by s.First()/s[0] and either deduplicate (take first occurrence) or throw
a validation exception when duplicates exist — so that the resulting synonyms
array contains only one entry per unique first term and prevents duplicate rule
IDs during export.
🪄 Autofix (Beta)
✅ Autofix completed
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: ca3cf4ff-97bd-4ae9-ad57-4c5ad825e155
📒 Files selected for processing (9)
.superset/config.jsonsrc/Elastic.Documentation.Configuration/Search/SearchConfiguration.cssrc/Elastic.Markdown/Exporters/Elasticsearch/ElasticsearchMarkdownExporter.cstests-integration/Elastic.Assembler.IntegrationTests/TestHelpers.cstests-integration/Mcp.Remote.IntegrationTests/McpToolsIntegrationTestsBase.cstests/Elastic.ApiExplorer.Tests/TestHelpers.cstests/Elastic.Changelog.Tests/Changelogs/ChangelogTestBase.cstests/Elastic.Documentation.Build.Tests/TestHelpers.cstests/Elastic.Markdown.Tests/TestHelpers.cs
✅ Files skipped from review due to trivial changes (6)
- tests/Elastic.Documentation.Build.Tests/TestHelpers.cs
- tests-integration/Elastic.Assembler.IntegrationTests/TestHelpers.cs
- tests-integration/Mcp.Remote.IntegrationTests/McpToolsIntegrationTestsBase.cs
- tests/Elastic.Markdown.Tests/TestHelpers.cs
- tests/Elastic.ApiExplorer.Tests/TestHelpers.cs
- .superset/config.json
🚧 Files skipped from review as they are similar to previous changes (1)
- src/Elastic.Markdown/Exporters/Elasticsearch/ElasticsearchMarkdownExporter.cs
|
Note Autofix is a beta feature. Expect some limitations and changes as we gather feedback and continue to improve it. Fixes Applied SuccessfullyFixed 1 file(s) based on 1 unresolved review comment. Files modified:
Commit: The changes have been pushed to the Time taken: |
|
💚 CLA has been signed |
The synonym dictionary uses the first term as the key and remaining terms as values. When publishing to Elasticsearch, only the values were included in the synonym string, omitting the key term itself. This meant e.g. "ilm" was the rule ID but not part of the synonym set, so it wouldn't match "index lifecycle management". Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The dictionary structure used the first term as the key and Skip(1) for values, which artificially separated one synonym from its group. This caused the key term to be omitted from synonym rules sent to Elasticsearch. Replace Dictionary<string, string[]> with IReadOnlyList<string[]> so all terms in a synonym group are treated equally. The first term is still used as the ES rule ID for readability but is no longer excluded from the synonym string. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
.superset/config.json
Outdated
| @@ -0,0 +1,7 @@ | |||
| { | |||
There was a problem hiding this comment.
I think we need to add .superset to gitignore.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The synonyms type was changed from Dictionary<string, string[]> to IReadOnlyList<string[]> in 5cea709 but the F# authoring test was not updated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
What
Fix synonym rules to include the ID term in the synonym set values sent to Elasticsearch.
Why
The synonym dictionary uses the first term as the key (and rule ID) with remaining terms as values. When publishing, only the values were joined into the synonym string — omitting the key term. For example, a rule with ID
ilmwould only haveindex lifecycle managementas its synonym, meaning searching for "ilm" wouldn't trigger the synonym expansion.How
Prepend the synonym ID to the values array when building the comma-separated synonym string in
PublishSynonymsAsync.Test plan
"ilm, index lifecycle management"instead of just"index lifecycle management")🤖 Generated with Claude Code