Improve tree search efficiency #7941

Closed
foozleface wants to merge 1 commit into specify:main from calacademy-research:cas/perf-tree-search-7752

Conversation

@foozleface
Collaborator

Fixes #7752
Contributed by @foozleface

Tree search in QueryComboBox sends a LIKE query on every keystroke. On tree tables with 200K+ rows, "contains" mode generates LIKE '%pattern%', which cannot use a B-tree index because of the leading wildcard and therefore causes a full table scan. This PR changes the default search mode from "contains" to "startsWith", which generates LIKE 'pattern%' and lets the database use a B-tree index where one exists. It also reduces the search result limit from 1000 to 50, since a typeahead dropdown never needs that many results.
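
To make the difference between the two modes concrete, here is a minimal sketch of how a search term could be turned into a LIKE pattern. The names `toLikePattern` and `escapeLike` are hypothetical and not taken from the Specify codebase, and the escaping assumes backslash is the LIKE escape character.

```typescript
// Hypothetical helpers for illustration only; not the code in this PR.
type SearchMode = 'contains' | 'startsWith';

// Escape LIKE wildcards so that user input is matched literally.
// Assumption: backslash is the LIKE escape character.
function escapeLike(term: string): string {
  return term.replace(/[\\%_]/g, (char) => `\\${char}`);
}

// 'contains' adds a leading wildcard, which prevents B-tree index use.
// 'startsWith' keeps a fixed prefix, which an index range scan can use.
function toLikePattern(term: string, mode: SearchMode): string {
  const escaped = escapeLike(term);
  return mode === 'contains' ? `%${escaped}%` : `${escaped}%`;
}
```

With this sketch, `toLikePattern('rosa', 'startsWith')` yields `rosa%`, while `toLikePattern('rosa', 'contains')` yields `%rosa%`.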

Implementation

  • Change the default value of treeSearchAlgorithm user preference from contains to startsWith in UserDefinitions.tsx
  • Export QUERY_COMBO_BOX_SEARCH_LIMIT constant set to 50 (was 1000)
  • Add tests verifying the default search operator and result limit

Testing instructions

  • Open a form with a tree-based QueryComboBox (e.g., Taxon field on a Determination)
  • Type a few characters and verify the typeahead dropdown populates quickly
  • Verify that typing "rosa" matches "Rosaceae" and "Rosales" but not "Pterosaurus" (startsWith behavior)
  • Check user preferences -- the tree search algorithm should now default to "starts with"
  • Run the frontend tests: npx jest --testPathPattern treeSearchEfficiency
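
The expected matching behavior from the "rosa" step above can be sketched in plain TypeScript. This is an in-memory approximation under the assumption that the database comparison is case-insensitive; `matchesName` is a hypothetical helper, not part of the PR.

```typescript
// Hypothetical in-memory stand-in for the database comparison.
// Assumption: matching is case-insensitive, as the "rosa" example implies.
function matchesName(
  name: string,
  term: string,
  mode: 'contains' | 'startsWith'
): boolean {
  const haystack = name.toLowerCase();
  const needle = term.toLowerCase();
  return mode === 'startsWith'
    ? haystack.startsWith(needle)
    : haystack.includes(needle);
}

const names = ['Rosaceae', 'Rosales', 'Pterosaurus'];

// New default: only names beginning with the term.
const startsWithHits = names.filter((name) =>
  matchesName(name, 'rosa', 'startsWith')
); // ['Rosaceae', 'Rosales']

// Previous default: any name containing the term.
const containsHits = names.filter((name) =>
  matchesName(name, 'rosa', 'contains')
); // ['Rosaceae', 'Rosales', 'Pterosaurus']
```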

import { tables } from '../../DataModel/tables';
import { queryFieldFilterSpecs } from '../../QueryBuilder/FieldFilterSpec';
import { makeComboBoxQuery } from '../helpers';
import { QUERY_COMBO_BOX_SEARCH_LIMIT } from '../index';
Member

This isn't included in this PR

Member

@grantfitzsimmons grantfitzsimmons left a comment

Testing instructions

  • Open a form with a tree-based QueryComboBox (e.g., Taxon field on a Determination)
  • Type a few characters and verify the typeahead dropdown populates quickly
  • Check user preferences -- the tree search algorithm should now default to "starts with"
  • Run the frontend tests: npx jest --testPathPattern treeSearchEfficiency

This is pretty simple, just changing the default preference, but it does speed up searches. The automatic test is not passing, however.

@github-project-automation github-project-automation Bot moved this from 📋Back Log to Dev Attention Needed in General Tester Board Apr 10, 2026
@foozleface foozleface force-pushed the cas/perf-tree-search-7752 branch from e15b0f4 to d0e2aa3 Compare April 10, 2026 05:25
@foozleface
Collaborator Author

Fixed — index.tsx was missing from the PR. Added:

  • export const QUERY_COMBO_BOX_SEARCH_LIMIT = 50 (the constant the test imports)
  • limit: QUERY_COMBO_BOX_SEARCH_LIMIT replacing the hardcoded limit: 1000

All 3 tests pass locally:

Test Suites: 1 passed, 1 total
Tests:       3 passed, 3 total


// Typeahead dropdown doesn't need 1000 results — 50 is more than enough.
// Reducing this from 1000 also cuts the DB query cost significantly.
export const QUERY_COMBO_BOX_SEARCH_LIMIT = 50;
Member

I'd prefer that we fetch query combo box results in batches on scroll rather than limiting it to 50

Change default tree search algorithm from 'contains' (LIKE '%x%') to
'startsWith' (LIKE 'x%'), enabling B-tree index usage on 200K+ row
tree tables.

Replace hardcoded limit of 1000 with paginated batch-on-scroll loading:
- Initial fetch returns first 50 results (QUERY_COMBO_BOX_PAGE_SIZE)
- Scrolling near the bottom of the dropdown fetches the next 50
- Loading indicator shown while fetching more results
- Continues until all results are loaded

Changes:
- UserDefinitions.tsx: default treeSearchAlgorithm 'contains' -> 'startsWith'
- QueryComboBox/index.tsx: paginated fetchSource with handleScrollEnd
- AutoComplete.tsx: onScrollEnd, isLoadingMore, extraItems props
- treeSearchEfficiency.test.tsx: verify default operator and page size
@foozleface foozleface force-pushed the cas/perf-tree-search-7752 branch from d0e2aa3 to 57ac51a Compare April 10, 2026 16:25
@foozleface
Collaborator Author

Updated per feedback — replaced the hard limit of 50 with batch-on-scroll pagination:

  • Initial fetch returns first 50 results (QUERY_COMBO_BOX_PAGE_SIZE)
  • Scrolling near the bottom of the dropdown fetches the next 50
  • "Loading..." indicator shown while fetching
  • Continues until all results are loaded
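
The batching steps above can be sketched as a small piece of state logic. Only the constant name QUERY_COMBO_BOX_PAGE_SIZE and its value of 50 come from this PR; `PagedResults`, `emptyResults`, `nextOffset`, and `appendPage` are hypothetical names, and the actual implementation in index.tsx may be organized differently.

```typescript
// Page size named in this PR; all other names here are hypothetical.
const QUERY_COMBO_BOX_PAGE_SIZE = 50;

interface PagedResults<T> {
  readonly items: readonly T[];
  readonly hasMore: boolean;
}

// Starting state before the first fetch.
function emptyResults<T>(): PagedResults<T> {
  return { items: [], hasMore: true };
}

// Offset for the next request: skip everything already loaded.
function nextOffset<T>(state: PagedResults<T>): number {
  return state.items.length;
}

// Merge a newly fetched page. A page shorter than the page size
// is taken to mean that no further rows remain.
function appendPage<T>(
  state: PagedResults<T>,
  page: readonly T[],
  pageSize: number = QUERY_COMBO_BOX_PAGE_SIZE
): PagedResults<T> {
  return {
    items: [...state.items, ...page],
    hasMore: page.length === pageSize,
  };
}
```

One consequence of the short-page check in this sketch: when the total number of results is an exact multiple of the page size, one extra request returning an empty page is needed before `hasMore` becomes false.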

The AutoComplete component gets three new optional props (onScrollEnd, isLoadingMore, extraItems) — backward-compatible, existing callers unaffected.
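
As a rough illustration of when a dropdown might invoke onScrollEnd, here is a sketch of a "near the bottom" check. The threshold value and the helper names `isNearBottom` and `shouldRequestMore` are assumptions; the PR text does not state how AutoComplete.tsx decides this.

```typescript
// Hypothetical threshold; the PR does not state the value it uses.
const SCROLL_END_THRESHOLD_PX = 40;

// Subset of HTMLElement properties, so a real list element can be passed in.
interface ScrollMetrics {
  readonly scrollTop: number;
  readonly clientHeight: number;
  readonly scrollHeight: number;
}

// True when the remaining scrollable distance is within the threshold.
function isNearBottom(
  metrics: ScrollMetrics,
  threshold: number = SCROLL_END_THRESHOLD_PX
): boolean {
  const remaining =
    metrics.scrollHeight - metrics.scrollTop - metrics.clientHeight;
  return remaining <= threshold;
}

// Only request another batch when none is in flight and more may exist.
function shouldRequestMore(
  metrics: ScrollMetrics,
  isLoadingMore: boolean,
  hasMore: boolean
): boolean {
  return hasMore && !isLoadingMore && isNearBottom(metrics);
}
```

A scroll handler on the list element could call `shouldRequestMore(event.currentTarget, isLoadingMore, hasMore)` and invoke `onScrollEnd` when it returns true. Guarding on `isLoadingMore` avoids firing duplicate requests while a batch is still loading.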

All 12 tests across AutoComplete + QueryComboBox suites pass.

@acwhite211
Member

Moved to this PR with additional fixes #8016

@acwhite211 acwhite211 closed this Apr 23, 2026
@github-project-automation github-project-automation Bot moved this from Dev Attention Needed to ✅Done in General Tester Board Apr 23, 2026

Development

Successfully merging this pull request may close these issues.

[Large Databases]: Make tree searches by name more efficient

3 participants