feat(query): add AggregationQuery for FT.AGGREGATE #28

Merged. ajGingrich merged 7 commits into main from feat/aggregation-query-v2 on May 20, 2026.

@ajGingrich
Collaborator

Summary

Adds AggregationQuery + SearchIndex.aggregate(), closing the last major gap on the query surface. Mirrors Python redisvl's AggregationQuery (base class only — the hybrid text+vector subclass is out of scope since HybridQuery already covers FT.HYBRID).

  • Fluent builder: groupBy / apply / sortBy / limit / filter / load / params / dialect / timeout / verbatim / addScores. Steps render in the order they're called, mirroring the FT.AGGREGATE pipeline.
  • Reducers namespace exposes factory functions for COUNT, COUNT_DISTINCT, COUNT_DISTINCTISH, SUM, MIN, MAX, AVG, STDDEV, QUANTILE, TOLIST, FIRST_VALUE, RANDOM_SAMPLE — each takes an optional as alias.
  • Constructor accepts FilterInput (string | FilterExpression) for the pre-aggregation filter, matching the rest of the DSL.
  • Result shape: { total, results } where each row is Record<string, string>. Numeric casting is left to the caller — Redis hands back string values over the wire.
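
Since rows arrive as strings, a caller typically converts the numeric columns after the call. The helper below is a minimal sketch of that step; `castNumericFields` and the sample row are hypothetical and not part of this PR.

```typescript
// Hypothetical helper (not part of the PR): cast selected string columns to numbers.
type AggregateRow = Record<string, string>;

function castNumericFields(
  row: AggregateRow,
  fields: readonly string[],
): Record<string, string | number> {
  const out: Record<string, string | number> = { ...row };
  for (const field of fields) {
    const raw = row[field];
    if (raw === undefined) continue; // column absent from this row
    const n = Number(raw);
    if (Number.isNaN(n)) {
      throw new TypeError(`Column "${field}" is not numeric: ${raw}`);
    }
    out[field] = n;
  }
  return out;
}

// A row shaped like one entry of `results`; the values are invented for illustration.
const sampleRow: AggregateRow = { brand: "acme", revenue: "1625", units: "17" };
const typedRow = castNumericFields(sampleRow, ["revenue", "units"]);
```

Columns not named in `fields` stay as strings, so group keys such as `brand` are untouched.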

Why

Issue #15. After the filter DSL PR landed FilterQuery/CountQuery/VectorRangeQuery/TextQuery (all on FT.SEARCH), FT.AGGREGATE was the next missing primitive. Python redisvl exposes it as a first-class peer; TS users currently have to drop down to the raw client.ft.aggregate() API.

Notable design choices

  • Fluent vs config-object. Used a chainable builder because FT.AGGREGATE is inherently pipeline-shaped (each step operates on the previous step's output). This matches Python redisvl and feels natural for analytics-style queries. (HybridQuery is config-object because it's flat configuration of a single command.)
  • Hybrid aggregation out of scope. Python redisvl has AggregateHybridQuery subclassing AggregationQuery for text+vector via FT.AGGREGATE. Our hybrid story is FT.HYBRID via HybridQuery, so the analogue isn't needed.
  • Returns raw rows, not SearchDocuments. Aggregation rows aren't documents — they're computed groupings. The reducer aliases are the column names, and values come back as strings (FT.AGGREGATE wire format). Numeric casting is the caller's job.
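
To make the "pipeline-shaped" argument concrete, here is a toy builder that records steps in call order and renders them as FT.AGGREGATE arguments. It is only a sketch: `ToyPipeline`, `Step`, and `Reducer` are hypothetical names, and the real `AggregationQuery` in this PR may be structured differently.

```typescript
// Illustrative only: a toy builder, not the AggregationQuery class from this PR.
type Reducer = { fn: string; args: string[]; as?: string };

type Step =
  | { kind: "GROUPBY"; fields: string[]; reducers: Reducer[] }
  | { kind: "APPLY"; expr: string; as: string }
  | { kind: "SORTBY"; field: string; direction: "ASC" | "DESC" }
  | { kind: "LIMIT"; offset: number; count: number };

class ToyPipeline {
  // Steps are appended as called, so rendering preserves the caller's order.
  private readonly steps: Step[] = [];

  groupBy(fields: string[], reducers: Reducer[]): this {
    this.steps.push({ kind: "GROUPBY", fields, reducers });
    return this;
  }

  apply(expr: string, as: string): this {
    this.steps.push({ kind: "APPLY", expr, as });
    return this;
  }

  sortBy(field: string, direction: "ASC" | "DESC" = "ASC"): this {
    this.steps.push({ kind: "SORTBY", field, direction });
    return this;
  }

  limit(offset: number, count: number): this {
    this.steps.push({ kind: "LIMIT", offset, count });
    return this;
  }

  toArgs(): string[] {
    const args: string[] = [];
    for (const step of this.steps) {
      switch (step.kind) {
        case "GROUPBY":
          args.push("GROUPBY", String(step.fields.length), ...step.fields);
          for (const r of step.reducers) {
            args.push("REDUCE", r.fn, String(r.args.length), ...r.args);
            if (r.as !== undefined) args.push("AS", r.as);
          }
          break;
        case "APPLY":
          args.push("APPLY", step.expr, "AS", step.as);
          break;
        case "SORTBY":
          args.push("SORTBY", "2", step.field, step.direction);
          break;
        case "LIMIT":
          args.push("LIMIT", String(step.offset), String(step.count));
          break;
      }
    }
    return args;
  }
}

const pipelineArgs = new ToyPipeline()
  .groupBy(["@brand"], [{ fn: "SUM", args: ["@price"], as: "revenue" }])
  .sortBy("@revenue", "DESC")
  .limit(0, 10)
  .toArgs();
```

Swapping the `sortBy` and `limit` calls would swap the rendered clauses as well, which is the property a flat config object cannot express without an explicit ordering field.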

Test plan

  • Unit tests (25) — option-shape assertions across every step kind + validation edge cases. npx vitest run --config vitest.unit.config.ts tests/unit/query/aggregation.test.ts
  • Type check — npm run type-check and npm run type-check:tests both clean.
  • Prettier formatted.
  • Integration tests (5) — against a real Redis 8.x. Run with npm test. (Couldn't exercise the Testcontainer locally; relying on CI.)

Docs

New user guide at website/docs/user-guide/aggregation.md, added to the User Guide sidebar between advanced-vector-search and vectorizers.

Closes #15

🤖 Generated with Claude Code

ajGingrich and others added 3 commits May 15, 2026 15:13
Closes the last major gap on the query surface: a general-purpose
AggregationQuery + SearchIndex.aggregate() pair that wraps FT.AGGREGATE.
Mirrors Python redisvl's AggregationQuery (base class only — the hybrid
text+vector subclass stays out of scope since HybridQuery already covers
FT.HYBRID).

API shape:

- Fluent builder: groupBy / apply / sortBy / limit / filter / load /
  params / dialect / timeout / verbatim / addScores. Steps render in
  the order they're called, mirroring the FT.AGGREGATE pipeline.
- Reducers namespace: count, countDistinct, countDistinctish, sum, min,
  max, avg, stddev, quantile, toList, firstValue, randomSample — each
  takes an optional `as` alias.
- Constructor accepts FilterInput (string | FilterExpression) for the
  pre-aggregation filter, matching the rest of the DSL.
- Bare field names are auto-prefixed with `@`; explicit `@`/`$.` refs
  pass through unchanged.

Result shape returned by index.aggregate() is { total, results } where
each row is Record<string, string>. Numeric casting is left to the
caller — Redis hands back string values over the wire.

Tests: 25 unit tests covering option shape across every step kind +
validation edge cases, plus 5 integration tests against a real Redis
exercising GROUPBY/REDUCE, pre-aggregation filtering, APPLY+SORTBY+LIMIT,
post-aggregation FILTER, and PARAMS binding.

Docs: new website/docs/user-guide/aggregation.md walkthrough.

Closes #15

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

ergo has the highest avg unit price (800/4 = 200), not acme
(1625/17 ≈ 95.6). The reducer math was right; the test fixture
arithmetic was wrong.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- Re-export AggregationQuery/Reducers (and supporting types) from
  src/index.ts so they're reachable from the package root — without this
  the docs' `import { AggregationQuery } from 'redisvl'` would fail.
- aggregate(): handle rows returned as Map instances (RESP3 / MAP
  type-mapping), which Object.entries() silently turns into {}.
- AggregationQuery.toCommand(): skip PARAMS when the map is empty,
  matching the FT.SEARCH path — Redis rejects `PARAMS 0`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@ajGingrich ajGingrich marked this pull request as ready for review May 15, 2026 22:12
Two surface-bug fixes raised in review:

- `SearchIndex.aggregate()` widens the row value type to
  `string | string[]` and preserves arrays from list reducers like
  `TOLIST` verbatim instead of stringifying them through `String(v)`
  (which silently flattened `['a','b']` into `'a,b'` and made the
  result ambiguous).

- `AggregationQuery.groupBy([])` now renders `GROUPBY 0` for whole-
  result reducers (e.g. average price across the entire match set).
  node-redis already supports this by omitting the `properties` key,
  so the validation just needed to stop rejecting the empty-array
  shape — and we still reject `groupBy([])` with zero reducers since
  that's never meaningful.

Adds unit + integration coverage for both paths.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
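
The two row-shape pitfalls mentioned in these commits (Map rows collapsing to `{}` through `Object.entries()`, and `String(v)` flattening arrays) can be reproduced and avoided with a small normalizer. This is a sketch under assumptions: `normalizeRow` is a hypothetical name, and the actual logic inside `SearchIndex.aggregate()` may differ.

```typescript
// Hypothetical normalizer (not the PR's code): accepts Map or plain-object rows
// and keeps array values as arrays instead of stringifying them.
type RowValue = string | string[];
type NormalizedRow = Record<string, RowValue>;

function normalizeRow(
  row: Map<string, unknown> | Record<string, unknown>,
): NormalizedRow {
  // Object.entries() on a Map yields no entries, so Maps need their own branch.
  const entries: [string, unknown][] =
    row instanceof Map ? Array.from(row.entries()) : Object.entries(row);

  const out: NormalizedRow = {};
  for (const [key, value] of entries) {
    // String(['a', 'b']) would give 'a,b'; map element-wise to preserve the list.
    out[key] = Array.isArray(value)
      ? value.map((v) => String(v))
      : String(value);
  }
  return out;
}

// Sample rows with invented values.
const fromMap = normalizeRow(
  new Map<string, unknown>([
    ["brand", "acme"],
    ["tags", ["a", "b"]],
  ]),
);
const fromObject = normalizeRow({ brand: "ergo", avg_price: 200 });
```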
Copilot AI review requested due to automatic review settings May 18, 2026 20:45

Copilot AI left a comment

Pull request overview

Adds a fluent AggregationQuery builder plus a SearchIndex.aggregate() method to expose FT.AGGREGATE as a first-class query type, closing the last major gap on the query surface. The builder records pipeline steps (GROUPBY / APPLY / SORTBY / LIMIT / FILTER) in call order and ships a Reducers namespace of factory functions. Unit + integration tests and a docs page round it out.

Changes:

  • New AggregationQuery class + Reducers factory namespace in src/query/aggregation.ts, re-exported from the package root.
  • New SearchIndex.aggregate() that runs the command and normalizes RESP2/RESP3 row shapes.
  • Unit tests, an integration test, a user-guide doc, and a sidebar entry.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 9 comments.

| File | Description |
| --- | --- |
| src/query/aggregation.ts | New builder, reducer factories, step rendering, validation. |
| src/indexes/search-index.ts | New aggregate() method on the index; row normalization. |
| src/index.ts | Re-exports of the new public surface. |
| src/query/index.ts | Barrel export for aggregation.ts. |
| tests/unit/query/aggregation.test.ts | 25 unit tests for option-shape assertions and validation. |
| tests/unit/indexes/search-index.test.ts | Mock-based aggregate() tests (TOLIST, Map rows, GROUPBY 0). |
| tests/integration/aggregation.test.ts | End-to-end tests against a real Redis. |
| website/docs/user-guide/aggregation.md | New user-guide page with examples and API table. |
| website/sidebars.ts | Inserts the new doc into the user-guide sidebar. |

Comments suppressed due to low confidence (1)

website/docs/user-guide/aggregation.md:166

  • Same broken ./hybrid-search.md link as on line 9 — the file doesn't exist in the docs tree.
- [Hybrid search](./hybrid-search.md) — text + vector fusion via `FT.HYBRID`.

RESP=3 is rejected at SearchIndex construction (see PR #29), so the
Map-row branch in aggregate() is purely about the RESP=2 Map type-
mapping opt-in, not about RESP3.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Drop dead HybridQuery references from the AggregationQuery module
  doc and the aggregation docs page so the surface no longer points
  at code or pages that don't exist.
- Replace the bogus `AggregateValue` placeholder in `aggregate()`
  JSDoc with the actual `Record<string, string | string[]>` shape.
- Reject `groupBy(prop, [])`: a property list without reducers is
  invalid in FT.AGGREGATE, mirroring the existing GROUPBY-0 check.
- Switch `Reducers.firstValue` to the options-object form
  (`{ by?, as? }`) for readability; update tests and docs.
- Default `DIALECT` to 2 in `toCommand()` so `$param` substitution
  works without callers having to remember `.dialect(2)`; explicit
  overrides still win.
- Allow `limit(0, 0)` for count-only queries; the previous
  positive-integer guard rejected a valid Redis form.
- Fix the package name in JSDoc and docs imports: `redisvl`
  -> `redis-vl` so copy/paste examples actually resolve.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
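
Three of the rules in this commit and the earlier PARAMS fix are easy to state in code: omit PARAMS when the map is empty, default DIALECT to 2 unless overridden, and accept `limit(0, 0)`. The functions below are a hedged sketch; `renderTail`, `assertLimit`, and `TailOptions` are hypothetical names and do not reflect the actual `toCommand()` implementation.

```typescript
// Hypothetical sketch of the trailing-argument rules described above.
interface TailOptions {
  params?: Record<string, string | number>;
  dialect?: number;
}

function renderTail(opts: TailOptions = {}): string[] {
  const args: string[] = [];
  const entries = Object.entries(opts.params ?? {});

  // Skip PARAMS entirely when there is nothing to bind.
  if (entries.length > 0) {
    // The count covers names and values, hence entries.length * 2.
    args.push("PARAMS", String(entries.length * 2));
    for (const [name, value] of entries) {
      args.push(name, String(value));
    }
  }

  // Default to dialect 2; an explicit value wins.
  args.push("DIALECT", String(opts.dialect ?? 2));
  return args;
}

function assertLimit(offset: number, count: number): void {
  // Non-negative integers only; 0 is allowed for both so LIMIT 0 0 passes.
  for (const n of [offset, count]) {
    if (!Number.isInteger(n) || n < 0) {
      throw new RangeError(`LIMIT arguments must be non-negative integers, got ${n}`);
    }
  }
}

const defaultTail = renderTail();
const boundTail = renderTail({ params: { min_price: 10 } });
const overriddenTail = renderTail({ dialect: 3 });
```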
Copilot AI review requested due to automatic review settings May 18, 2026 22:39

Copilot AI left a comment

Pull request overview

Copilot reviewed 14 out of 14 changed files in this pull request and generated 6 comments.

Comments suppressed due to low confidence (1)

website/docs/user-guide/aggregation.md:164

  • Same documentation mismatch as above: the row type is described as Array<Record<string, string>> but aggregate() actually returns Array<Record<string, string | string[]>>. The advice "If you need numeric types, cast at the call site" is also misleading for TOLIST/array reducers, since those values are arrays rather than strings to be cast.
```typescript
const { total, results } = await index.aggregate(q);
// total:   number — the row count Redis reports after aggregation
// results: Array<Record<string, string>> — one entry per emitted row
```

If you need numeric types, cast at the call site (Number(row.revenue)). Aggregation reducers preserve numeric precision on the server side; the wire format simply hands them back as strings.

- prefixFieldRef: reject bare `$name` references as likely typos; only
  `$.path` (JSONPath) is accepted as already-prefixed.
- sortBy(max): allow `0` for symmetry with `limit(0, 0)`.
- Update user-guide row-shape docs to `Record<string, string | string[]>`
  to reflect that `Reducers.toList` returns `string[]`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
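
The field-reference rule after this change (bare names get `@`, explicit `@` and `$.` references pass through, bare `$name` is rejected) can be sketched as follows. `prefixFieldRefSketch` is a hypothetical stand-in; the real `prefixFieldRef` may handle more cases or use a different error type.

```typescript
// Hypothetical stand-in for the PR's prefixFieldRef, written from the commit message.
function prefixFieldRefSketch(name: string): string {
  // Already-prefixed field refs and JSONPath refs pass through unchanged.
  if (name.startsWith("@") || name.startsWith("$.")) {
    return name;
  }
  // A bare `$name` is neither a field ref nor a JSONPath, so treat it as a likely typo.
  if (name.startsWith("$")) {
    throw new Error(`Ambiguous field reference "${name}": use "@name" or "$.path"`);
  }
  // Bare field names are auto-prefixed.
  return `@${name}`;
}

const bare = prefixFieldRefSketch("price");
const explicitField = prefixFieldRefSketch("@price");
const jsonPath = prefixFieldRefSketch("$.specs.weight");
```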
@ajGingrich ajGingrich merged commit 3d6a588 into main May 20, 2026
9 checks passed

Development

Successfully merging this pull request may close these issues.

Add AggregationQuery (FT.AGGREGATE general query type)

4 participants