Skip to content

feat: SEARCH parity — OFFSET, SCORE THRESHOLD, LOOKUP FROM + RECOMMEND integer literal fix#37

Merged
pavanjava merged 1 commit into
pavanjava:mainfrom
srimon12:feat/search-parity-and-recommend-fix
May 20, 2026
Merged

feat: SEARCH parity — OFFSET, SCORE THRESHOLD, LOOKUP FROM + RECOMMEND integer literal fix#37
pavanjava merged 1 commit into
pavanjava:mainfrom
srimon12:feat/search-parity-and-recommend-fix

Conversation

@srimon12
Copy link
Copy Markdown
Collaborator

@srimon12 srimon12 commented May 19, 2026

Summary

Closes ROADMAP items #2 and #3:

  • SEARCH Parity: Adds OFFSET, SCORE THRESHOLD, and LOOKUP FROM to SEARCH statements, with full support across dense, sparse, hybrid, and GROUP BY execution paths.
  • RECOMMEND Fix: SCORE THRESHOLD now accepts integer literals (e.g., SCORE THRESHOLD 1) instead of requiring a float token.

What's New

Parser

  • New optional clauses in SEARCH ... LIMIT <n>:
    • OFFSET <n> — skip first N results
    • SCORE THRESHOLD <f> — drop results below this score
    • LOOKUP FROM <collection> [VECTOR '<name>'] — cross-collection vector lookup
  • Hardening:
    • Negative OFFSET raises QQLSyntaxError (guarded in both SEARCH and RECOMMEND)
    • OFFSET > 0 combined with GROUP BY raises QQLSyntaxError (Qdrant groups use cursors, not offset)
    • RECOMMEND SCORE THRESHOLD now uses _parse_number() to accept both int and float literals

Executor

  • Forwards offset, score_threshold, and lookup_from to Qdrant SDK for:
    • Dense SEARCH
    • Sparse-only SEARCH
    • Hybrid SEARCH
    • SEARCH ... GROUP BY (offset is intentionally omitted; score_threshold and lookup_from are passed)
  • offset=0 is translated to None when calling the SDK (clean default behavior)

Tests (604 passing)

  • test_search_forwards_offset_score_lookup — dense path
  • test_search_forwards_offset_score_lookup_hybrid — hybrid path
  • test_search_forwards_offset_score_lookup_sparse — sparse path
  • test_search_groups_forwards_score_lookup — grouped path
  • test_search_forwards_offset_0_as_none — verifies 0 → None
  • test_recommend_forwards_offset_0_as_none — verifies RECOMMEND 0 → None
  • Parser tests for:
    • integer SCORE THRESHOLD on SEARCH
    • negative OFFSET rejection
    • GROUP BY + OFFSET rejection
    • LOOKUP FROM with and without VECTOR

Docs & CLI Help

  • docs/search.md — new examples, clause order block, GROUP BY incompatibility warning
  • docs/reference.md — pagination/score filtering examples, updated test count
  • docs/programmatic.md — updated SEARCH and RECOMMEND examples
  • src/qql/cli.py — SEARCH and RECOMMEND help now list all optional clauses

Bug Fix

  • Restored original EXACT merging logic in _parse_search so WITH { ... } EXACT correctly preserves prior with_clause fields while setting exact=True

Files Changed

  • src/qql/ast_nodes.py — document new fields on SearchStmt and RecommendStmt
  • src/qql/parser.py — parse new SEARCH clauses, guard invalid combos, fix RECOMMEND number parsing
  • src/qql/executor.py — wire new params through all search paths
  • src/qql/cli.py — sync help text with parser capabilities
  • docs/search.md — new examples, clause order, GROUP BY warning
  • docs/reference.md — new examples, updated test count
  • docs/programmatic.md — updated code examples
  • tests/test_parser.py — parser coverage for new syntax and edge cases
  • tests/test_executor.py — executor coverage for all search paths and RECOMMEND

Checklist

  • 604 tests passing
  • No breaking changes to existing syntax
  • CLI help synced
  • Docs updated

Summary by CodeRabbit

Release Notes

  • New Features

    • Added pagination support via OFFSET clause for search queries.
    • Added quality filtering via SCORE THRESHOLD clause for both search and recommendation queries.
    • Added cross-collection vector lookup via LOOKUP FROM clause.
  • Documentation

    • Expanded query syntax documentation with examples for pagination, score filtering, and cross-collection retrieval.
    • Added clause ordering requirements and compatibility notes (e.g., OFFSET cannot be used with GROUP BY).

Review Change Stack

…hreshold, and cross-collection lookup support
@srimon12 srimon12 requested a review from pavanjava May 19, 2026 19:56
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 19, 2026

📝 Walkthrough

Walkthrough

This PR extends QQL's SEARCH statement with three new optional clauses: OFFSET for pagination, SCORE THRESHOLD for score-based filtering, and LOOKUP FROM for cross-collection vector retrieval. The feature is implemented across the AST, parser, executor, and documented with examples and syntax reference.

Changes

OFFSET, SCORE THRESHOLD, and LOOKUP FROM Feature

Layer / File(s) Summary
AST Contract and CLI Documentation
src/qql/ast_nodes.py, src/qql/cli.py
SearchStmt gains offset (int, default 0), score_threshold (float|None), and lookup_from (tuple[str, str|None]|None) fields. RecommendStmt field documentation is updated. CLI help text documents all new optional clauses and constraint that OFFSET is unsupported with GROUP BY.
Parser Implementation
src/qql/parser.py
Parser recognizes optional OFFSET <n> (non-negative integer), SCORE THRESHOLD <number>, and LOOKUP FROM <collection> [VECTOR '<name>'] in SEARCH statements. Validates non-negative OFFSET and rejects OFFSET > 0 with GROUP BY. Parsed values are threaded into SearchStmt construction. RECOMMEND parser updated to validate non-negative OFFSET and parse SCORE THRESHOLD as general number (not just FLOAT token).
Executor Implementation
src/qql/executor.py
_execute_search and _execute_search_groups convert node.lookup_from into Qdrant LookupLocation and forward it to query_points/query_points_groups calls across hybrid, sparse-only, and dense-only search variants alongside existing offset and score_threshold parameters.
Parser Test Coverage
tests/test_parser.py
New TestSearch cases validate OFFSET + SCORE THRESHOLD + LOOKUP FROM parsing including vector lookup, float/int score thresholds, and error cases for negative OFFSET and OFFSET with GROUP BY. TestRecommend updated to validate negative OFFSET error and integer SCORE THRESHOLD parsing.
Executor Test Coverage
tests/test_executor.py
New TestSearch cases verify Executor.execute(SearchStmt) forwards offset and score_threshold to query_points/query_points_groups and maps lookup_from.collection/.vector correctly across hybrid, sparse, and grouped variants. TestRecommend verifies offset=0 is translated to None in query_points kwargs.
User Documentation
docs/programmatic.md, docs/reference.md, docs/search.md
Example SEARCH queries updated with SCORE THRESHOLD clauses. New examples demonstrate OFFSET pagination, SCORE THRESHOLD filtering, and LOOKUP FROM cross-collection retrieval. New "Clause Order" section documents strict clause ordering. Added GROUP BY incompatibility notes for OFFSET and RERANK. Test count baseline updated from 549 to 604.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Hops through queries with offset and grace,
Score thresholds filter at a measured pace,
Cross collections leap with lookup's light,
QQL search bounds grow clearer, bright!

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 3.85% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main changes: adding OFFSET, SCORE THRESHOLD, and LOOKUP FROM support to SEARCH, and fixing RECOMMEND to accept integer score threshold literals.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
tests/test_parser.py (1)

286-289: ⚡ Quick win

Add a boundary test for OFFSET 0 with GROUP BY.

Given the parser only rejects OFFSET > 0, add one test that OFFSET 0 GROUP BY ... parses successfully to lock that contract.

Proposed test
+    def test_search_group_by_with_zero_offset_allowed(self):
+        node = parse("SEARCH notes SIMILAR TO 'hi' LIMIT 5 OFFSET 0 GROUP BY author")
+        assert node.group_by == "author"
+        assert node.offset == 0
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@tests/test_parser.py` around lines 286 - 289, Add a new test function (e.g.,
test_search_group_by_with_offset_zero_parses) in tests/test_parser.py that calls
parse("SEARCH notes SIMILAR TO 'hi' LIMIT 5 OFFSET 0 GROUP BY author") and
asserts that it parses successfully (no QQLSyntaxError is raised) — you can
simply call parse and assert a truthy result or not wrap it in pytest.raises;
this complements the existing test_search_group_by_with_offset_raises which
checks OFFSET > 0 triggers QQLSyntaxError.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/qql/cli.py`:
- Line 105: Update the CLI help text so it matches parser behavior: replace the
current message "OFFSET is not supported with GROUP BY." with a precise
statement that only OFFSET > 0 is disallowed with GROUP BY (e.g., "OFFSET > 0 is
not supported with GROUP BY; OFFSET 0 is allowed"). Locate and edit the help
string used when constructing the OFFSET/limit argument in src/qql/cli.py (the
parser.add_argument help text for the OFFSET option) and run the CLI tests or
smoke commands to confirm the wording now aligns with the parser behavior.

---

Nitpick comments:
In `@tests/test_parser.py`:
- Around line 286-289: Add a new test function (e.g.,
test_search_group_by_with_offset_zero_parses) in tests/test_parser.py that calls
parse("SEARCH notes SIMILAR TO 'hi' LIMIT 5 OFFSET 0 GROUP BY author") and
asserts that it parses successfully (no QQLSyntaxError is raised) — you can
simply call parse and assert a truthy result or not wrap it in pytest.raises;
this complements the existing test_search_group_by_with_offset_raises which
checks OFFSET > 0 triggers QQLSyntaxError.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro Plus

Run ID: ebd297f8-dedf-448b-82bd-48295e5a2a36

📥 Commits

Reviewing files that changed from the base of the PR and between 05be76d and 28dc287.

📒 Files selected for processing (9)
  • docs/programmatic.md
  • docs/reference.md
  • docs/search.md
  • src/qql/ast_nodes.py
  • src/qql/cli.py
  • src/qql/executor.py
  • src/qql/parser.py
  • tests/test_executor.py
  • tests/test_parser.py

Comment thread src/qql/cli.py
Optional: [yellow]GROUP BY[/yellow] <field> [[yellow]GROUP_SIZE[/yellow] <n>]
Group results by a payload field value (default GROUP_SIZE: 3).
Field must be keyword or integer type. RERANK and GROUP BY cannot be combined.
OFFSET is not supported with GROUP BY.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Align GROUP BY/OFFSET help text with actual parser behavior.

Line 105 currently states OFFSET is unsupported with GROUP BY, but parser only rejects OFFSET > 0 (so OFFSET 0 is accepted). Please make docs and behavior consistent.

Proposed doc fix
-                  OFFSET is not supported with GROUP BY.
+                  OFFSET > 0 is not supported with GROUP BY (OFFSET 0 is treated as no offset).
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
OFFSET is not supported with GROUP BY.
OFFSET > 0 is not supported with GROUP BY (OFFSET 0 is treated as no offset).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/qql/cli.py` at line 105, Update the CLI help text so it matches parser
behavior: replace the current message "OFFSET is not supported with GROUP BY."
with a precise statement that only OFFSET > 0 is disallowed with GROUP BY (e.g.,
"OFFSET > 0 is not supported with GROUP BY; OFFSET 0 is allowed"). Locate and
edit the help string used when constructing the OFFSET/limit argument in
src/qql/cli.py (the parser.add_argument help text for the OFFSET option) and run
the CLI tests or smoke commands to confirm the wording now aligns with the
parser behavior.

@pavanjava pavanjava merged commit 4c2e2ea into pavanjava:main May 20, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants