Fix key migrations creating duplicate splocalecontainer records#8039
Fix key migrations creating duplicate splocalecontainer records#8039melton-jason merged 18 commits intov7_12_0_5from
Conversation
Somewhat related to #7988, in the sense that both had the same "underlying casue" of using very specific filters in a QuerySet get_or_create
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Plus Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughSchema-config migration code was refactored: typed defaults added, deterministic locale-container/item selection enforced, field enumeration centralized via Changes
Sequence Diagram(s)sequenceDiagram
participant CMD as RunKeyMigrationCommand
participant USC as update_schema_config
participant DB as Database
CMD->>USC: create_table_schema_config_with_defaults(table)
USC->>DB: filter Splocalecontainer by (name, discipline, schemaType).order_by(id).first()
alt container missing
USC->>DB: insert Splocalecontainer
end
USC->>DB: filter Splocalecontaineritem.order_by(id).first() / insert missing items
USC->>DB: bulk_create_splocaleitemstr_idempotent -> delete duplicates, insert missing strings
CMD->>USC: remove_componentparent_item()
USC->>DB: delete specific ComponentParent item rows
Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 4 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@specifyweb/specify/migration_utils/update_schema_config.py`:
- Around line 465-479: The fetched existing Splocalecontaineritem
(fetched_sp_locale_container_item) may have stale fields and is returned as-is;
instead, normalize it to the expected values before returning: set its type to
field_config.java_type, ishidden to field_hidden, isrequired to field_required,
issystem to table.system, picklistname to picklist_name (and version if
desired), detect changes and save the model (sp_locale_container_item.save()) so
reruns heal metadata; perform this update in the else branch where
fetched_sp_locale_container_item is assigned to sp_locale_container_item.
- Around line 314-315: The code filters Splocalecontainer using container_attrs
and Splocalecontaineritem by (container, name) without DB-level uniqueness,
creating a race where concurrent runners can both insert duplicates; fix by
adding DB-level UniqueConstraint/unique_together to the Splocalecontainer Meta
for (discipline, schematype, name) and to Splocalecontaineritem Meta for
(container, name) so the DB prevents duplicates, and/or replace the two-step
filter-then-create logic around Splocalecontainer (where
fetched_sp_locale_container is obtained) and the block creating
Splocalecontaineritem with atomic get_or_create() (or wrap in a transaction with
select_for_update) to eliminate the race window.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 91c06950-21b5-4792-8813-0b378ace521a
📒 Files selected for processing (1)
specifyweb/specify/migration_utils/update_schema_config.py
Also, define a helper on Table to fetch/filter fields
…alecontaineritem Duplicate items and strings would be created everytime the function is run. The cause of this is the same as #7988.
There's no easy way to differentiate schema config items for which these should be applied. I suppose the BEST method might be to be exact: only updating Containers/Items/Strings that exactly match the previous values expected with the migrations they aim to fix/resolve
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
specifyweb/specify/migration_utils/update_schema_config.py (1)
266-268: Use explicitT | Noneannotation fordescriptionparameter.The static analysis tool flagged an implicit
Optional. PEP 484 recommends explicit type annotations.🛠️ Suggested fix
def update_table_schema_config_with_defaults( table_name, discipline_id: int, - description: str = None, + description: str | None = None, apps = global_apps, defaults: TableDefaults | None = None, pending_itemstr_rows: list[dict] | None = None, ):🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@specifyweb/specify/migration_utils/update_schema_config.py` around lines 266 - 268, Change the implicit Optional for the parameter named "description" to an explicit union type by updating its annotation from "description: str = None" to "description: str | None = None" (no runtime change); locate the parameter in the function signature in update_schema_config.py and update the annotation accordingly so static checkers accept the explicit T | None form.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@specifyweb/specify/migration_utils/update_schema_config.py`:
- Around line 377-382: FieldDefaults currently types ishidden and isrequired as
str but they are used as booleans; update the TypedDict so ishidden and
isrequired are NotRequired[bool] instead of NotRequired[str], then run
type-checks and adjust any callers if they rely on string values; locate the
FieldDefaults definition and change the two type annotations (ishidden,
isrequired) to bool to match usage in functions that assign them to boolean
variables.
---
Nitpick comments:
In `@specifyweb/specify/migration_utils/update_schema_config.py`:
- Around line 266-268: Change the implicit Optional for the parameter named
"description" to an explicit union type by updating its annotation from
"description: str = None" to "description: str | None = None" (no runtime
change); locate the parameter in the function signature in
update_schema_config.py and update the annotation accordingly so static checkers
accept the explicit T | None form.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 74a3eac1-44ca-47c2-afb6-e9f3f54321cd
📒 Files selected for processing (5)
specifyweb/specify/management/commands/run_key_migration_functions.pyspecifyweb/specify/migration_utils/sp7_schemaconfig.pyspecifyweb/specify/migration_utils/update_schema_config.pyspecifyweb/specify/migrations/0039_agent_fields_for_loan_and_gift.pyspecifyweb/specify/models_utils/load_datamodel.py
💤 Files with no reviewable changes (2)
- specifyweb/specify/migrations/0039_agent_fields_for_loan_and_gift.py
- specifyweb/specify/migration_utils/sp7_schemaconfig.py
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@specifyweb/specify/migration_utils/update_schema_config.py`:
- Around line 264-269: Change the implicit optional annotation for the parameter
named description in the function signature (the one that currently reads
"description: str = None" in update_schema_config.py) to an explicit Optional
type (e.g., "description: Optional[str] = None"); add or update the typing
import (from typing import Optional) if not already present so the code
type-checks and RUF013 is satisfied.
- Around line 854-877: The Splocalecontaineritem.objects.get_or_create call can
raise MultipleObjectsReturned for legacy duplicates; replace it with a safe
lookup using Splocalecontaineritem.objects.filter(name=COT_FIELD_NAME,
container=container).order_by("id").first() and if that returns None, create a
new Splocalecontaineritem with the same defaults
(picklistname=COT_PICKLIST_NAME, type='ManyToOne', isrequired=True) so
container_item holds either the first existing record or the newly created
instance; leave the subsequent Splocaleitemstr.get_or_create calls as-is but
ensure they use the resolved container_item instance.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 9b650042-fe04-4561-98a5-e0eda27e9917
📒 Files selected for processing (1)
specifyweb/specify/migration_utils/update_schema_config.py
|
Nice fix for that duplicate CO->COT issue, that's the only issue I found so far. Will finish testing soon 👍 |
|
I've been going through non-schema related The bug was related to detecting and applying the default TectonicUnit Tree structure. Previously, if the user had made any changes to the tree structure, Specify would create duplicate ranks as the "expected" ranks would not be returned by the specify7/specifyweb/specify/migration_utils/tectonic_ranks.py Lines 18 to 60 in b4b1602 My fix here (4a1246f) was to broaden the matching behavior on the root rank (the user could have modified the name, title, and/or whether it was enforced), and skip checking or creating further TectonicUnit ranks if a root rank for TectonicUnit already exists (at that point we can assume these defaults had already been applied in the past): specify7/specifyweb/specify/migration_utils/tectonic_ranks.py Lines 18 to 35 in 4a1246f |
|
Tested on multiple databases. No modifications or new duplicates were created to the schema config table records. Besides the TectonicUnit issue, the only other thing I can think of adding is a fix similar to this commit 8f15b93 for better deduplication, but I think we should probably just have a separate deduplication issue for 7.12.1 |
Fixes #7988
Including @melton-jason's changes, and co-written with @melton-jason
This PR seeks to eliminate bugs in the
fix_schema_configstep of therun_key_migration_functions. Particularly, this PR aims to eliminate duplicate SpLocaleContainer, SpLocaleContainerItem, and SpLocaleItemStr creation, as well as prevent in-place modification of existing SpLocaleContainer, SpLocaleContainerItem, and SpLocaleItemStr records.There were a lot of migration functions which directly updated one or more SpLocaleContainer, SpLocaleContainerItem, or SpLocaleItemStr records. This could cause data destruction in the case where the discipline had already run the migration associated with the function, yet the user(s) had since made changes to the Schema Config.
The following SQL statement can be used to identify duplicate Field Labels:
(In this case, a duplicate field label is one where there's more than one label assigned to a field that has the same language and text. Feel free to modify the query to adjust what is considered a dupllicate)
The following SQL statement can be used to export the overall Schema Configuration structure to a CSV file.
You only need to replace
'output.csv'(within theINTO OUTFILEstatement) with the desired path of the CSV:Columns 1-4, the SpLocaleItemStr information. This is information for the table/field labels or descriptions
Columns 5-12, SpLocaleContainer information for Table Labels.
5. DisciplineID
6. SpLocaleContainerID
7. Name (the name of the table)
8. Formatter
9. Aggregator
10. PickList Name
11. Is Hidden
12. Is System
Columns 13-20, SpLocaleContainer information for Table Descriptions.
13. DisciplineID
14. SpLocaleContainerID
15. Name (the name of the table)
16. Formatter
17. Aggregator
18. PickList Name
19. Is Hidden
20. Is System
Columns 21-28, the associated SpLocaleContainer information for Field Labels
21. DisciplineID
22. SpLocaleContainerID
23. Name (the name of the table)
24. Formatter
25. Aggregator
26. PickList Name
27. Is Hidden
28. Is System
Columns 29-35, SpLocaleContainerItem information for Field Labels
29. SpLocaleContainerItemID
30. Name (name of the field)
31. (Field) Formatter
32. PickList Name
33. WebLink Name
34. Is Hidden
35. Is System
Columns 36-43, the associated SpLocaleContainer information for Field Descriptions
36. DisciplineID
37. SpLocaleContainerID
38. Name (the name of the table)
39. Formatter
40. Aggregator
41. PickList Name
42. Is Hidden
43. Is System
Columns 44-50, SpLocaleContainerItem information for Field Descriptions
44. SpLocaleContainerItemID
45. Name (name of the field)
46. (Field) Formatter
47. PickList Name
48. WebLink Name
49. Is Hidden
50. Is System
Checklist
self-explanatory (or properly documented)
specify7/specifyweb/specify/management/commands/run_key_migration_functions.py
Line 50 in ea04665
Testing instructions
In your testing, preferably use a database that has undergone extensive SchemaConfig changes. These database are the most at risk with the previous bugs.
v7.12update applied (Do not update the database yet!)run_key_migration_functions.Summary by CodeRabbit
Refactor
Bug Fixes