Create indexes in Django models and migration files #7568
acwhite211 wants to merge 17 commits into main
Triggered by 66a7b84 on branch refs/heads/issue-7482
Let me know if anyone thinks of any additional fields that would benefit from indexing.
alesan99
left a comment
- See that the new migration step for adding the indexes completes successfully.
- Use the QB on fields that have been indexed to see that they run correctly and in a timely manner.
- Use the taxon tree viewer on a database with a large taxon tree. Try all of the tree operations to see that they run in a timely manner.
Migration runs correctly on KUBirds and read operations are snappy 👍
Tried deleting, moves, merges, searching, and importing big trees.
I didn't notice any speed drops either when running this locally.
Triggered by f883c1c on branch refs/heads/issue-7482
bhumikaguptaa
left a comment
- See that the new migration step for adding the indexes completes successfully.
- Use the QB on fields that have been indexed to see that they run correctly and in a timely manner.
- Use the taxon tree viewer on a database with a large taxon tree. Try all of the tree operations to see that they run in a timely manner.
When I tried to run a query on the MaterialSample table, I got the following error:
Specify 7 Crash Report - 2026-03-18T18_28_32.334Z.txt
Link to DB: https://ojsmnh20251211-issue-7482.test.specifysystems.org/specify/query
This looks to be caused by duplicate splocalecontainer records in that database. I added a commit to handle that case 👍
bhumikaguptaa
left a comment
- See that the new migration step for adding the indexes completes successfully.
- Use the QB on fields that have been indexed to see that they run correctly and in a timely manner.
- Use the taxon tree viewer on a database with a large taxon tree. Try all of the tree operations to see that they run in a timely manner.
--
It works as expected. I was able to query on indexed fields without any errors, including materialsample.
Triggered by 0f0f17a on branch refs/heads/issue-7482
I'm doing some more thorough testing to make sure that no action or process in Specify is significantly slowed down by re-indexing.
grantfitzsimmons
left a comment
- See that the new migration step for adding the indexes completes successfully.
- Use the QB on fields that have been indexed to see that they run correctly and in a timely manner.
- Use the taxon tree viewer on a database with a large taxon tree. Try all of the tree operations to see that they run in a timely manner.
I did the same as @alesan99: made some really big tree moves and imports. I don't feel a speed difference, but I'm waiting for your testing to be done, @acwhite211.
Fixes #7482
Adds the indexes mentioned in the issue to the models and migration files. Some of those indexes were already present in the existing models; the rest have been added.
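For readers less familiar with what an index changes at the database level, here is a minimal sketch using Python's built-in sqlite3 module. It is not code from this PR: the toy taxon table, the index name taxon_name_idx, and the use of SQLite are all assumptions made for illustration, and the production database engine will report its query plans differently. In Django, an index like this is usually declared with models.Index in the model's Meta.indexes and applied with a migrations.AddIndex operation; whether this PR uses exactly that form is not shown here.

```python
import sqlite3


def lookup_plan(conn: sqlite3.Connection) -> str:
    """Return SQLite's query plan for a lookup by taxon name as one string."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT taxonid, name FROM taxon WHERE name = ?",
        ("taxon-42",),
    ).fetchall()
    # The last column of each plan row is the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


def build_demo() -> tuple[str, str]:
    """Create a toy taxon table and return the plan before and after indexing."""
    conn = sqlite3.connect(":memory:")
    # Toy schema (assumption): the real taxon table has many more columns.
    conn.execute("CREATE TABLE taxon (taxonid INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO taxon (name) VALUES (?)",
        [(f"taxon-{i}",) for i in range(1000)],
    )
    before = lookup_plan(conn)
    # Hypothetical index name; the PR's real index names are not shown here.
    conn.execute("CREATE INDEX taxon_name_idx ON taxon (name)")
    after = lookup_plan(conn)
    conn.close()
    return before, after


if __name__ == "__main__":
    before, after = build_demo()
    print("before:", before)
    print("after: ", after)
```

Before the index exists the plan is a full scan of the table; afterwards the plan names taxon_name_idx. The cost is that every write touching name also has to update the index.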
The main downside I can see with adding the tree field indexes is that writes might become too slow. With an index like taxon.name, write operations like INSERT, UPDATE, and DELETE will take longer, with the upside that read operations are faster. I know that some of our tree operations make bulk edits to the tree record fields, like the 'Move' action in the tree viewer, so we'll want to be careful in our performance evaluation testing. We'll want to test this on large databases with a big taxon tree to make sure the read and write performance is acceptable.
I ran into a problem with the tree viewer timing out after running the index migrations. I solved the issue by rewriting the get_tree_rows() function to avoid the expensive grouped self-join on tree tables. Instead of joining child and synonym rows and collapsing them with GROUP BY, it now computes child counts and synonym lists with correlated subqueries, which preserves the same response shape while producing a faster query for taxon tree requests.
Indexed fields
- agentidentifier: identifier, identifiertype
- agentspecialty: ordernumber, specialtyname
- agentvariant: name
- attachmentmetadata: name
- author: ordernumber
- collectionobject: name, projectnumber
- collectionobjectgroup: guid, name
- collectionobjectgrouptype: name
- collectionobjectproperty: guid
- collectionobjecttype: name
- collectionreltype: name
- exchangein: exchangeinnumber
- exsiccataitem: number
- geography: commonname, guid, highestchildnodenumber, nodenumber
- geographytreedef: name
- geographytreedefitem: name
- geologictimeperiod: highestchildnodenumber, nodenumber
- geologictimeperiodtreedef: name
- geologictimeperiodtreedefitem: name
- institutionnetwork: altname
- latlonpolygon: name
- lithostrat: highestchildnodenumber, nodenumber
- lithostrattreedef: name
- lithostrattreedefitem: name
- locality: guid
- materialsample: guid
- morphbankview: viewname
- otheridentifier: identifier
- picklist: fieldname, filterfieldname, tablename
- preparationproperty: guid
- preptype: name
- referencework: librarynumber
- spauditlogfield: fieldname
- specifyuser: name
- spexportschema: schemaname
- spexportschemaitem: fieldname
- spexportschemaitemmapping: exportedfieldname
- spexportschemamapping: mappingname
- spfieldvaluedefault: fieldname, tablename
- splocalecontainer: picklistname
- splocalecontaineritem: picklistname, weblinkname
- sppermission: name
- spprincipal: name
- spquery: contextname
- spqueryfield: fieldname, formatname
- spviewsetobj: filename
- storage: highestchildnodenumber, nodenumber
- storagetreedef: name
- storagetreedefitem: name
- taxon: cultivarname, groupnumber, highestchildnodenumber, nodenumber
- taxontreedef: name
- taxontreedefitem: name
- tectonicunit: fullname, guid, highestchildnodenumber, name, nodenumber
- tectonicunittreedef: name
- tectonicunittreedefitem: name
- voucherrelationship: vouchernumber
Checklist
self-explanatory (or properly documented)
Testing instructions
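One way to sanity-check the get_tree_rows() rewrite described in the PR description is to confirm, on a toy tree, that a grouped self-join and a correlated-subquery query return the same rows. The sketch below is not the PR's code: the simplified taxon schema (taxonid, parentid, acceptedid, name), the sample names, and the helper names rows_grouped, rows_subquery, and normalize are all hypothetical, and it runs on SQLite rather than the production database.

```python
import sqlite3


def make_tree() -> sqlite3.Connection:
    """Build a tiny in-memory tree with children and synonyms (toy schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE taxon (taxonid INTEGER PRIMARY KEY, parentid INTEGER,"
        " acceptedid INTEGER, name TEXT)"
    )
    conn.executemany(
        "INSERT INTO taxon VALUES (?, ?, ?, ?)",
        [
            (1, None, None, "Root"),
            (2, 1, None, "Aus"),
            (3, 1, None, "Bus"),
            (4, 2, None, "Aus alba"),
            (5, 2, None, "Aus bella"),
            (6, 1, 2, "Aus-old1"),  # synonym of Aus
            (7, 1, 2, "Aus-old2"),  # synonym of Aus
            (8, 3, None, "Bus cana"),
        ],
    )
    return conn


def rows_grouped(conn, parent_id):
    """Old shape: self-join children and synonyms, then collapse with GROUP BY."""
    return conn.execute(
        "SELECT t.name, COUNT(DISTINCT c.taxonid), GROUP_CONCAT(DISTINCT s.name)"
        " FROM taxon t"
        " LEFT JOIN taxon c ON c.parentid = t.taxonid"
        " LEFT JOIN taxon s ON s.acceptedid = t.taxonid"
        " WHERE t.parentid = ? GROUP BY t.taxonid, t.name ORDER BY t.name",
        (parent_id,),
    ).fetchall()


def rows_subquery(conn, parent_id):
    """New shape: one correlated subquery per derived column, no GROUP BY."""
    return conn.execute(
        "SELECT t.name,"
        " (SELECT COUNT(*) FROM taxon c WHERE c.parentid = t.taxonid),"
        " (SELECT GROUP_CONCAT(s.name) FROM taxon s WHERE s.acceptedid = t.taxonid)"
        " FROM taxon t WHERE t.parentid = ? ORDER BY t.name",
        (parent_id,),
    ).fetchall()


def normalize(rows):
    """Turn the synonym string into a sorted tuple so ordering cannot differ."""
    return [
        (name, count, tuple(sorted(syn.split(","))) if syn else ())
        for name, count, syn in rows
    ]


if __name__ == "__main__":
    conn = make_tree()
    print(normalize(rows_grouped(conn, 1)) == normalize(rows_subquery(conn, 1)))
```

The grouped form multiplies child rows by synonym rows before collapsing them, which is why it needs DISTINCT; the subquery form counts and concatenates each set separately. The comparison only holds when synonym names are unique and contain no commas, as in this sample data.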