Implement homograph numbers and improve entry search/sort behavior#2220
Draft
myieye wants to merge 7 commits intoclaude/add-lexeme-headwords-TowRXfrom
Draft
Implement homograph numbers and improve entry search/sort behavior#2220myieye wants to merge 7 commits intoclaude/add-lexeme-headwords-TowRXfrom
myieye wants to merge 7 commits intoclaude/add-lexeme-headwords-TowRXfrom
Conversation
Add HomographNumber (int, 0 = unset) to the Entry model with full round-trip support through CRDT, FwData bridge, and sync. Key changes: - Entry model: add HomographNumber property with Copy() support - CreateEntryChange: persist HomographNumber in CRDT changes - CrdtMiniLcmApi: auto-assign homograph numbers on entry creation when HomographNumber is 0, respecting SecondaryOrder scoping. Updates existing lone entries from 0→1 when a second homograph appears. - FwDataMiniLcmApi: read HomographNumber from ILexEntry, set on create - UpdateEntryProxy: bidirectional HomographNumber sync to LibLCM - EntrySync: include HomographNumber in diff/patch operations - Sorting: uncomment HomographNumber in CRDT sort and search queries - Tests: uncomment sorting tests with HomographNumber, add auto- assignment tests, add sync test verifying LibLCM corrects numbers after entry deletion via two sync cycles https://claude.ai/code/session_01FJj2v135u6KdgVxoK4tRp2
…ering Covers two gaps in homograph number test coverage: - CitationForm-based grouping (different LexemeForms, same CitationForm) - Sequential numbering with 3+ entries (verifies max+1 assignment) https://claude.ai/code/session_01FJj2v135u6KdgVxoK4tRp2
…aphNumber Previously, CreateEntry skipped AssignHomographNumber when an explicit HomographNumber was provided. This left existing lone entries at 0 and accepted out-of-range values. Now we always query the homograph scope (matching headword + SecondaryOrder) to enforce invariants: - Lone entries always get HomographNumber 0 - Explicit values in [1, max+1] are preserved (supports sync reordering) - Out-of-range values are clamped to max+1 https://claude.ai/code/session_01FJj2v135u6KdgVxoK4tRp2
…y promotion Remove clamping logic from AssignHomographNumber. The only case we need to handle is auto-assigning for new FW-Lite entries (HomographNumber == 0). Explicit values from sync/import are trusted as-is — FLEx handles getting homograph numbers right. The key fix remains: always query the homograph scope so a lone existing entry at 0 gets promoted to 1. https://claude.ai/code/session_01FJj2v135u6KdgVxoK4tRp2
Replace the two-stage approach (DB query for headword match, then in-memory filter for SecondaryOrder) with a single DB query using correlated subqueries on MorphTypes, matching the pattern in Sorting.cs. Restore the original if(HomographNumber == 0) guard — explicit values get no special handling at all; only auto-assign for new FW-Lite entries. https://claude.ai/code/session_01FJj2v135u6KdgVxoK4tRp2
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
4246a62 to
2e82fa6
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This PR implements homograph number auto-assignment for entries with duplicate headwords and significantly improves entry search and sorting behavior to properly handle morphological tokens and secondary ordering.
Key Changes
Homograph Number Management
HomographNumberproperty toEntrymodel to distinguish entries with identical headwordsCrdtMiniLcmApi.AssignHomographNumber()that:MorphType.SecondaryOrderSearch and Filtering Improvements
MorphType.SecondaryOrderto group related morphological variants togetherSorting Enhancements
ApplyHeadwordOrder()method to apply consistent headword-based sorting with secondary order considerationApplyRoughBestMatchOrder()to include secondary order and homograph number in sort criteriaData Model Changes
Headwordcomputed property toEntry(pre-computed by backend, populated during finalization)HeadwordWithTokens(),SearchHeadwords(),ComputeHeadwords()inEntryQueryHelpersEntrySearchService.FilterAndRank()to use new ranking logic with secondary order supportFiltering.csto acceptIQueryable<MorphType>for proper secondary order filteringTest Coverage
API Updates
EntrySearchService.Filter()to acceptWritingSystemIdparameterHeadword()method toHeadwordText()in various test helpers to distinguish from the newHeadwordpropertyNotable Implementation Details
Headwordproperty is computed by the backend and excluded from strict equality checks in sync testshttps://claude.ai/code/session_01FJj2v135u6KdgVxoK4tRp2