feat(query): add HybridQuery for FT.HYBRID server-side fusion#9

Open
banker wants to merge 7 commits into main from feat/hybrid-search

Contributor

banker commented May 11, 2026

Adds HybridQuery and SearchIndex.hybridSearch() that delegate text+vector score fusion entirely to Redis via the FT.HYBRID command (introduced in Redis OSS 8.4.0). Unlike Python redisvl's HybridQuery — which issues two queries client-side and fuses ranks itself — this implementation runs as a single round-trip with server-side RRF or LINEAR fusion.
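For readers unfamiliar with RRF, here is a minimal sketch of what reciprocal rank fusion computes over two ranked lists. It only illustrates the arithmetic that this PR delegates to the server; it is not code from this PR and not Redis's implementation. The function name and the constant 60 are assumptions chosen for the example.

```typescript
// Illustrative only: reciprocal rank fusion over any number of ranked id lists.
// `rrfConstant` = 60 is an assumed value for this sketch.
function reciprocalRankFusion(
  rankings: string[][],
  rrfConstant = 60,
): Array<[string, number]> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (rrfConstant + rank));
    });
  }
  // Highest fused score first.
  return Array.from(scores.entries()).sort((a, b) => b[1] - a[1]);
}

// Hypothetical rankings from a text query and a vector query.
const textRanking = ['doc:2', 'doc:1', 'doc:3'];
const vectorRanking = ['doc:1', 'doc:4', 'doc:2'];
const fused = reciprocalRankFusion([textRanking, vectorRanking]);
```

A document that ranks well in both lists (doc:1 here) ends up ahead of one that tops only a single list.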

The new method is separate from index.search() because FT.HYBRID has its own command, options shape, and reply format. HybridSearchResult<T> extends SearchResult<T> with executionTime and warnings fields.

Notable behaviours:

  • text + textFieldName triggers tokenize + escape + OR-join. Omitting textFieldName passes the body through verbatim so power users can use full Redis Search syntax.
  • vsimFilter is a raw string in the FT.SEARCH filter dialect (e.g. '@brand:{nike}'). postFilter is a raw string in the FT.AGGREGATE expression dialect (e.g. '@price < 200'). The two clauses use different syntaxes server-side.
  • LOAD always includes @__key so doc.id round-trips. Score aliases set via YIELD_SCORE_AS are not added to LOAD — Redis already injects them, and re-loading triggers "score alias already exists" errors.
  • LOAD/SORTBY field references are auto-prefixed with @ when the user passes bare names; explicit @ or $.path prefixes are preserved.
  • Testcontainer image bumped from redis:8.0 to redis:8.4 for FT.HYBRID support.
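To make the first bullet concrete, a rough sketch of the tokenize + escape + OR-join path versus the verbatim path. The helper names, the set of escaped characters, and the exact `@field:(a | b)` output shape are assumptions for illustration, not the code in hybrid.ts.

```typescript
// Hypothetical helper: backslash-escape anything that is not a letter, digit, or underscore.
function escapeToken(token: string): string {
  return token.replace(/[^A-Za-z0-9_]/g, '\\$&');
}

// Hypothetical stand-in for the text-body rendering described above.
function renderTextBodySketch(text: string, textFieldName?: string): string {
  // No field name: pass the body through untouched so callers can use raw query syntax.
  if (textFieldName === undefined) return text;

  const tokens = text
    .split(/\s+/)
    .filter((token) => token.length > 0)
    .map(escapeToken);

  // OR-join the escaped tokens and scope them to the named field.
  return `@${textFieldName}:(${tokens.join(' | ')})`;
}

const tokenised = renderTextBodySketch('trail-running shoes', 'description');
const verbatim = renderTextBodySketch('@title:(red|blue) -@brand:{acme}');
```

With these assumptions, `tokenised` is scoped to the description field with the hyphen escaped, while `verbatim` is returned unchanged.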

Marked @experimental in JSDoc since the underlying client.ft.hybrid() is itself flagged experimental in @redis/search.

Tests: 38 unit tests asserting toCommand() output for representative configs (KNN/RANGE methods, RRF/LINEAR fusion, score aliases, LOAD prefixing, SORTBY, NOSORT, postFilter, TIMEOUT) plus 7 integration tests against a real Redis 8.4 Testcontainer covering each fusion method, each vector method, vsimFilter, postFilter, verbatim text body, and LIMIT.
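As an illustration of the LOAD rules those unit tests pin down (always include @__key, prefix bare names, preserve explicit @ or $.path references, leave score aliases out), a small sketch. `toFieldRef` and `buildLoadSketch` are hypothetical names, not the functions in this PR.

```typescript
// Hypothetical: prefix bare field names with '@'; keep explicit '@' or '$.' references.
function toFieldRef(field: string): string {
  if (field.startsWith('@') || field.startsWith('$.')) return field;
  return `@${field}`;
}

// Hypothetical: build the LOAD list. '@__key' always comes first; score aliases are
// skipped because the description above says Redis already injects them.
function buildLoadSketch(returnFields: string[], scoreAliases: string[] = []): string[] {
  const aliasRefs = new Set(scoreAliases.map(toFieldRef));
  const refs = ['@__key', ...returnFields.map(toFieldRef)];
  // De-duplicate while preserving order, then drop the score aliases.
  return Array.from(new Set(refs)).filter((ref) => !aliasRefs.has(ref));
}

const load = buildLoadSketch(
  ['title', '@price', '$.meta.sku', 'hybrid_score'],
  ['hybrid_score'],
);
```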

This change is intentionally self-contained: a tiny TokenEscaper and a
HybridTextScorer type are inlined into hybrid.ts so this PR can land
independently of the in-flight filter DSL work. A TODO at the top of
hybrid.ts tracks the cleanup commit that should follow once the filter
DSL merges (dedupe the helpers, widen vsimFilter to accept a typed
FilterExpression, drop HybridTextScorer in favour of the shared name).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@banker banker requested a review from booleanhunter May 11, 2026 13:52
@banker banker marked this pull request as draft May 11, 2026 13:56
The previous matrix tested against redis/redis-stack-server tags
('latest' and '7.4.0-v8'), both of which are Redis Stack 7.x and don't
have FT.HYBRID (introduced in Redis 8.4). With the new HybridQuery
integration tests, CI would fail.

Switch to the official redis:* image — Redis 8 absorbed the Redis Stack
modules (search, JSON, time series, probabilistic) into the base image,
so a separate redis-stack-server is no longer needed. Matrix now tests
against '8.4' (minimum for FT.HYBRID, matching the Testcontainer
pinning) and 'latest'.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@banker banker marked this pull request as ready for review May 12, 2026 13:15
banker and others added 4 commits May 12, 2026 09:19
# Conflicts:
#	.github/workflows/test.yml
Covers the new HybridQuery surface: tokenised vs verbatim text body,
KNN vs RANGE vector method, RRF vs LINEAR fusion, the two filter slots
(vsimFilter in FT.SEARCH dialect, postFilter in FT.AGGREGATE dialect),
score aliases, LOAD prefixing, and the HybridSearchResult return shape
(executionTime + warnings).

Calls out the Redis 8.4+ requirement and the @experimental status of
client.ft.hybrid().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
# Conflicts:
#	src/index.ts
#	src/indexes/search-index.ts
#	src/query/index.ts
… main

Closes #14.

Removes the inlined TokenEscaper, the HybridTextScorer alias, and the
string-only vsimFilter that hybrid.ts carried so the feature could land
independently of the filter DSL PR. With filter DSL now on main, swap to
the canonical imports:

- Drop the local TokenEscaper; import { TokenEscaper } from
  '../utils/token-escaper.js'. This also brings hybrid's tokenization in
  line with TextQuery's behaviour — wildcards (* and ?) are now escaped
  to literals by default rather than passed through. No existing test
  relied on the previous wildcard-preserving behaviour.
- Drop type HybridTextScorer; use TextScorer from './text.js' instead.
  Removes the HybridTextScorer name from src/index.ts.
- Widen vsimFilter from string to FilterInput so callers can pass either
  a typed FilterExpression (`Tag('brand').eq('nike')`) or a raw filter
  string. Route through renderFilter() from './base.js'.

postFilter remains string-only — it uses the FT.AGGREGATE expression
dialect, which FilterExpression doesn't render.
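A minimal sketch of the widening described in this commit: one input type that accepts either a raw string or a typed expression, rendered to the same filter string. The `FilterExpressionLike` interface, `tagEquals`, and `renderFilterSketch` are hypothetical stand-ins; the real FilterExpression, Tag, and renderFilter() live in the filter DSL and may look different.

```typescript
// Hypothetical stand-in for a typed filter expression.
interface FilterExpressionLike {
  render(): string;
}

// Hypothetical equivalent of the widened input type: raw string or typed expression.
type FilterInputLike = string | FilterExpressionLike;

// Hypothetical tag-equality builder, loosely mirroring Tag('brand').eq('nike').
function tagEquals(field: string, value: string): FilterExpressionLike {
  return { render: () => `@${field}:{${value}}` };
}

// Normalise either input form to the raw filter string sent to the server.
function renderFilterSketch(filter: FilterInputLike | undefined): string | undefined {
  if (filter === undefined) return undefined;
  return typeof filter === 'string' ? filter : filter.render();
}

const fromTyped = renderFilterSketch(tagEquals('brand', 'nike'));
const fromRaw = renderFilterSketch('@brand:{nike}');
```

Both calls produce the same string, which is why callers with existing raw filters should not need to change anything.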

Adds a unit test exercising FilterExpression as vsimFilter, and updates
the hybrid-search docs to show the new typed form alongside the raw
string form.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor Author

banker commented May 13, 2026

Note: latest commit will close #20

Contributor

nkanu17 commented May 13, 2026

Addressed the hybrid review pass in b7b087e.
Changes:

  • Always emits a default RRF COMBINE with YIELD_SCORE_AS so default HybridQuery results have a stable combined score alias.
  • Keeps result mapping defensive by falling back to Redis default score keys if present.
  • Adds runtime validation for numResults, offset, timeout, textFieldName, returnFields, sortBy, and noSort/sortBy conflicts.
  • Restores the main CI matrix to Redis Stack latest + 7.4.0-v8, and adds a separate Redis 8.4 HybridQuery job for FT.HYBRID coverage.
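A rough sketch of the kind of runtime validation listed above, covering only the numeric options and the noSort/sortBy conflict. The interface, function name, bounds, and error messages are assumptions; the checks in b7b087e may differ.

```typescript
// Hypothetical subset of the query config.
interface HybridPagingSketch {
  numResults?: number;
  offset?: number;
  timeout?: number;
  sortBy?: string;
  noSort?: boolean;
}

function validateHybridPagingSketch(config: HybridPagingSketch): void {
  // Optional fields may be omitted; when present they must be integers >= min.
  const assertInt = (value: number | undefined, name: string, min: number): void => {
    if (value === undefined) return;
    if (!Number.isInteger(value) || value < min) {
      throw new RangeError(`${name} must be an integer >= ${min}`);
    }
  };

  assertInt(config.numResults, 'numResults', 1); // assumed lower bound
  assertInt(config.offset, 'offset', 0);
  assertInt(config.timeout, 'timeout', 0);

  // NOSORT and SORTBY ask for opposite things, so reject the combination up front.
  if (config.noSort === true && config.sortBy !== undefined) {
    throw new Error('noSort and sortBy are mutually exclusive');
  }
}
```

Failing in the constructor keeps a bad option from surfacing later as a server-side error.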

Verification run locally:

  • npm run type-check
  • npm run test:unit
  • npm test -- tests/integration/hybrid-search.test.ts
  • REDISVL_SKIP_HYBRID=true npm test -- tests/integration/hybrid-search.test.ts
  • npm run build

Collaborator

@ymendez-redis ymendez-redis left a comment

Shall I add the search stopwords or is this not in scope yet?

Comment thread src/query/hybrid.ts
    return search;
  }

  private renderTextBody(): string {
Collaborator

Is omitting stopwords intentional? redis-vl-python uses NLTK.

Collaborator

Looks like there is an issue for adding stopwords separately, since TS/Node has no NLTK equivalent:

#16
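For context on what that follow-up might involve, a tiny sketch of stopword removal applied to tokens before they are escaped and OR-joined. The word list and function name are made up for the example; nothing here reflects a decision in the linked issue.

```typescript
// Hypothetical, deliberately tiny stopword list; a real list would be much larger.
const STOPWORDS_SKETCH: ReadonlySet<string> = new Set([
  'a', 'an', 'and', 'the', 'of', 'for', 'in', 'is', 'to',
]);

// Drop tokens whose lower-cased form is a stopword; keep the rest in order.
function removeStopwordsSketch(
  tokens: string[],
  stopwords: ReadonlySet<string> = STOPWORDS_SKETCH,
): string[] {
  return tokens.filter((token) => !stopwords.has(token.toLowerCase()));
}

const kept = removeStopwordsSketch(['The', 'best', 'shoes', 'for', 'running']);
```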

Comment thread src/query/hybrid.ts

if (this.noSort) options.NOSORT = true;

if (this.postFilter !== undefined && this.postFilter !== '') {
Collaborator

Nit: the `!== ''` half of this check is redundant: assertNonEmptyString(config.postFilter, 'postFilter') in the constructor already rejects empty strings. The `!== undefined` half should stay, because the assertion intentionally allows optional fields and we don't want to emit FILTER = undefined.
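A small sketch of the shape this nit is pointing at: validate once in the constructor, then guard only against undefined when building options. The class and the body of the assertion are hypothetical; only the name assertNonEmptyString and the FILTER option come from this thread.

```typescript
// Hypothetical body for the assertion: optional values pass, empty strings throw.
function assertNonEmptyStringSketch(value: string | undefined, name: string): void {
  if (value === undefined) return;
  if (value.length === 0) throw new TypeError(`${name} must be a non-empty string`);
}

// Hypothetical, stripped-down query class showing only the postFilter path.
class PostFilterSketch {
  private readonly postFilter: string | undefined;

  constructor(config: { postFilter?: string }) {
    assertNonEmptyStringSketch(config.postFilter, 'postFilter');
    this.postFilter = config.postFilter;
  }

  toOptions(): { FILTER?: string } {
    const options: { FILTER?: string } = {};
    // An empty string cannot reach this point, so only undefined needs a guard.
    if (this.postFilter !== undefined) options.FILTER = this.postFilter;
    return options;
  }
}

const withFilter = new PostFilterSketch({ postFilter: '@price < 200' }).toOptions();
const withoutFilter = new PostFilterSketch({}).toOptions();
```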

Comment thread src/query/hybrid.ts
    if (m.type === 'KNN') {
      const out: { type: 'KNN'; K: number; EF_RUNTIME?: number } = {
        type: 'KNN',
        K: m.k ?? 10,
Collaborator

Nit: m.type and m.k are always pre-filled in the constructor, and k is 10 when not set. There is no need for m.k ?? 10 here; a second default could drift from the constructor's, which we may not want.

const method = config.vectorMethod ?? { type: 'KNN', k: 10 };
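One way to read the suggestion, sketched under assumptions: keep the default in a single place and let the render step trust it. The types, class, and the RADIUS key are hypothetical; only the `{ type: 'KNN', k: 10 }` default and the K / EF_RUNTIME keys come from the snippets in this thread.

```typescript
// Hypothetical method type where k is required once the constructor has run.
type VectorMethodSketch =
  | { type: 'KNN'; k: number; efRuntime?: number }
  | { type: 'RANGE'; radius: number };

// The only place the default lives.
const DEFAULT_VECTOR_METHOD: VectorMethodSketch = { type: 'KNN', k: 10 };

class VectorClauseSketch {
  private readonly method: VectorMethodSketch;

  constructor(config: { vectorMethod?: VectorMethodSketch }) {
    this.method = config.vectorMethod ?? DEFAULT_VECTOR_METHOD;
  }

  render(): { type: 'KNN'; K: number; EF_RUNTIME?: number } | { type: 'RANGE'; RADIUS: number } {
    const m = this.method;
    if (m.type === 'KNN') {
      // No `?? 10` here: the constructor already guaranteed a value for k.
      const out: { type: 'KNN'; K: number; EF_RUNTIME?: number } = { type: 'KNN', K: m.k };
      if (m.efRuntime !== undefined) out.EF_RUNTIME = m.efRuntime;
      return out;
    }
    return { type: 'RANGE', RADIUS: m.radius };
  }
}

const defaultClause = new VectorClauseSketch({}).render();
const customClause = new VectorClauseSketch({ vectorMethod: { type: 'KNN', k: 5 } }).render();
```

Making k required in the stored type lets the compiler flag any later attempt to re-default it.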

4 participants