[Tech Request]: Fulltext index BloomFilter pre-filter pushdown optimization #23832

@ck89119

Description

Is there an existing issue for the same tech request?

  • I have checked the existing issues.

Does this tech request not affect user experience?

  • This tech request doesn't affect user experience.

What would you like to be added ?

When a fulltext index query has additional WHERE filters on the source table (e.g. WHERE category='news' AND match(content) against('keyword')), build a BloomFilter from the primary keys (PKs) that pass those filters and push it down to the fulltext index table scan, so that rows whose doc_id cannot match are skipped at the reader level.

Plan structure (when pushdown is enabled):

outerJoin(scanNode, innerJoin(ft_func_chain, secondScanProject))
  • secondScanProject: scans the source table with the non-fulltext filters, outputs only PK
  • BloomFilter runtime filter: secondScan(build) → ft_func(probe) — pushes filtered PKs into fulltext index table scan
  • IN-list runtime filter: innerJoin(build) → scanNode(probe) — pushes fulltext match results back to source table scan
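
The data flow through this plan can be sketched on toy data. This is only an illustration, not the planner's implementation: the table contents and all function names are hypothetical (chosen to mirror the node names above), an exact Python set stands in for the BloomFilter, and the final outerJoin is reduced to the IN-list probe on the source table scan.

```python
# Toy source table and fulltext index table; all data here is hypothetical.
articles = [
    {"id": 1, "category": "news", "content": "database internals"},
    {"id": 2, "category": "blog", "content": "database tuning"},
    {"id": 3, "category": "news", "content": "weather report"},
    {"id": 4, "category": "news", "content": "vector database"},
]
# One (word, doc_id) row per token of each document.
ft_index = [(word, row["id"]) for row in articles for word in row["content"].split()]


def second_scan_project(table, predicate):
    """Scan the source table with the non-fulltext filters; output only the PK."""
    return [row["id"] for row in table if predicate(row)]


def ft_func(index, keyword, pk_filter):
    """Probe side: scan the index table, skipping doc_ids the runtime filter rejects."""
    return [doc_id for word, doc_id in index if pk_filter(doc_id) and word == keyword]


def inner_join(ft_doc_ids, filtered_pks):
    """Exact join: removes any doc_id that the (approximate) filter let through."""
    pk_set = set(filtered_pks)
    return [doc_id for doc_id in ft_doc_ids if doc_id in pk_set]


def scan_node(table, in_list):
    """Source table scan, probed with the IN-list built from the join result."""
    return [row for row in table if row["id"] in in_list]


# Build side of the first runtime filter: PKs that pass category = 'news'.
filtered_pks = second_scan_project(articles, lambda r: r["category"] == "news")
# Assumption: an exact set replaces the BloomFilter in this sketch.
pk_filter = set(filtered_pks).__contains__
# Build side of the second runtime filter: fulltext matches among filtered PKs.
matches = inner_join(ft_func(ft_index, "database", pk_filter), filtered_pks)
result = scan_node(articles, set(matches))
print([row["id"] for row in result])
```

Document 2 contains the keyword but never reaches the fulltext match, because its doc_id is rejected before the index row is evaluated.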

Why is this needed ?

Currently, fulltext index queries scan the entire index table regardless of additional WHERE conditions on the source table. For example:

SELECT * FROM articles
WHERE category = 'news'
AND match(content) against('database' in natural language mode);

The fulltext index scan processes every doc_id entry, and only the subsequent JOIN with the source table applies category='news', which discards most of the rows. This wastes significant I/O when the WHERE condition is highly selective.

With BloomFilter pre-filter pushdown:

  1. First scan the source table with category='news' to collect matching PKs
  2. Build a BloomFilter from these PKs
  3. Push the BloomFilter down to the fulltext index table reader
  4. Skip blocks/rows whose doc_id doesn't pass the BloomFilter check
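
The four steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the engine's code: the `BloomFilter` class, its SHA-256 based hashing, the 1% target false-positive rate, and the toy tables are all hypothetical choices made for the example.

```python
import hashlib
import math


class BloomFilter:
    """Minimal Bloom filter; bit positions come from double hashing one digest."""

    def __init__(self, expected_items, false_positive_rate=0.01):
        n = max(1, expected_items)
        # Standard sizing: m = -n ln(p) / (ln 2)^2 bits, k = (m / n) ln 2 hashes.
        self.num_bits = max(8, math.ceil(-n * math.log(false_positive_rate) / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / n * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, key):
        digest = hashlib.sha256(repr(key).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # odd step for double hashing
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


# Hypothetical source table (pk -> category) and index rows (word, doc_id).
articles = {pk: ("news" if pk % 50 == 0 else "other") for pk in range(1, 10001)}
index_rows = [("database", pk) for pk in range(3, 10001, 3)]

# Step 1: scan the source table with category = 'news' to collect matching PKs.
matching_pks = [pk for pk, category in articles.items() if category == "news"]

# Step 2: build a Bloom filter from these PKs.
bloom = BloomFilter(expected_items=len(matching_pks))
for pk in matching_pks:
    bloom.add(pk)

# Steps 3 and 4: the index reader probes the filter and skips rejected doc_ids.
survivors = [row for row in index_rows if bloom.might_contain(row[1])]

# The exact join afterwards removes any false positives that slipped through.
pk_set = set(matching_pks)
final_doc_ids = {doc_id for _, doc_id in survivors if doc_id in pk_set}

print(f"index rows: {len(index_rows)}, after Bloom filter: {len(survivors)}, "
      f"after join: {len(final_doc_ids)}")
```

The filter is sized from the number of filtered PKs, so a more selective WHERE condition yields both a smaller filter and fewer surviving index rows.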

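This significantly reduces I/O for fulltext queries with selective non-fulltext filters, because the index scan only reads entries whose doc_id may belong to the filtered PK set. A BloomFilter can return false positives but never false negatives, so no matching row is lost, and the subsequent JOIN still removes any non-matching rows that pass the check.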
