diff --git a/README.md b/README.md index 28de472..2b65dff 100644 --- a/README.md +++ b/README.md @@ -1,6 +1,6 @@ # QQL — Qdrant Query Language -A SQL-like CLI for [Qdrant](https://qdrant.tech), a high-performance vector database. Instead of writing Python SDK calls, you write natural query statements to insert, search, manage, and delete vector data — including rich SQL-style `WHERE` filters. +A SQL-like CLI for [Qdrant](https://qdrant.tech), a high-performance vector database. Instead of writing Python SDK calls, you write natural query statements to insert, search, manage, and delete vector data — including rich SQL-style `WHERE` filters and hybrid dense+sparse vector search. ``` qql> INSERT INTO COLLECTION notes VALUES {'text': 'Qdrant is a vector database', 'author': 'alice', 'year': 2024} @@ -11,6 +11,12 @@ qql> SEARCH notes SIMILAR TO 'vector storage engines' LIMIT 3 WHERE year >= 2023 Score │ ID │ Payload ────────┼──────────────────────────────────────┼────────────────────────────────────── 0.8931 │ 3f2e1a4b-8c91-4d0e-b123-abc123def456 │ {'text': 'Qdrant is a ...', 'author': 'alice', 'year': 2024} + +qql> SEARCH notes SIMILAR TO 'vector databases' LIMIT 5 USING HYBRID +✓ Found 1 result(s) (hybrid) + Score │ ID │ Payload +────────┼──────────────────────────────────────┼────────────────────────────────────── + 0.9102 │ 3f2e1a4b-8c91-4d0e-b123-abc123def456 │ {'text': 'Qdrant is a ...', 'author': 'alice', 'year': 2024} ``` --- @@ -25,6 +31,7 @@ qql> SEARCH notes SIMILAR TO 'vector storage engines' LIMIT 3 WHERE year >= 2023 - [INSERT — add a point](#insert--add-a-point) - [SEARCH — find similar points](#search--find-similar-points) - [WHERE Clause Filters](#where-clause-filters) + - [Hybrid Search (USING HYBRID)](#hybrid-search-using-hybrid) - [SHOW COLLECTIONS — list collections](#show-collections--list-collections) - [CREATE COLLECTION — create a collection](#create-collection--create-a-collection) - [DROP COLLECTION — delete a 
collection](#drop-collection--delete-a-collection) @@ -59,9 +66,7 @@ Your query string Qdrant instance ``` -When you run `INSERT`, the `text` field in your dictionary is automatically converted into a dense vector using [Fastembed](https://github.com/qdrant/fastembed). The vector and the rest of your fields (stored as payload) are then upserted into Qdrant together. You never have to manage vectors manually. - -`SEARCH` also embeds your query text and finds the nearest vectors by cosine similarity. An optional `WHERE` clause lets you pre-filter the candidate set using any payload field before similarity ranking — exactly like a SQL `WHERE` on top of a vector search. +When you run `INSERT`, the `text` field is automatically converted into a dense vector using [Fastembed](https://github.com/qdrant/fastembed). In **hybrid mode** (`USING HYBRID`), a sparse BM25 vector is also generated alongside the dense vector, and searches use Qdrant's Reciprocal Rank Fusion (RRF) to merge the results of both retrieval methods. 
--- @@ -173,6 +178,8 @@ If the collection does not exist yet, it is **created automatically** with the c ``` INSERT INTO COLLECTION VALUES {} INSERT INTO COLLECTION VALUES {} USING MODEL '' +INSERT INTO COLLECTION VALUES {} USING HYBRID +INSERT INTO COLLECTION VALUES {} USING HYBRID DENSE MODEL '' SPARSE MODEL '' ``` **Examples:** @@ -198,25 +205,29 @@ Insert with a specific embedding model: INSERT INTO COLLECTION articles VALUES {'text': 'hello world'} USING MODEL 'BAAI/bge-small-en-v1.5' ``` -Insert with nested metadata and tags: +Insert into a hybrid collection (dense + sparse BM25 vectors): ```sql -INSERT INTO COLLECTION articles VALUES { - 'text': 'Attention is all you need', - 'meta': {'venue': 'NeurIPS', 'citations': 50000}, - 'tags': ['transformers', 'attention', 'nlp'] -} +INSERT INTO COLLECTION articles VALUES {'text': 'Attention is all you need'} USING HYBRID +``` + +Insert with custom models for both dense and sparse: +```sql +INSERT INTO COLLECTION articles VALUES {'text': 'hello world'} + USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5' SPARSE MODEL 'prithivida/Splade_PP_en_v1' ``` **What happens internally:** 1. The `text` value is embedded into a dense vector using the configured model. -2. A UUID is auto-generated as the point ID. -3. All fields (including `text`) are stored in the payload. -4. The point is upserted into Qdrant. +2. In hybrid mode, a sparse BM25 vector is also generated. +3. A UUID is auto-generated as the point ID. +4. All fields (including `text`) are stored in the payload. +5. The point is upserted into Qdrant. **Rules:** - `text` is always required. Omitting it raises an error. - A point ID (UUID) is generated automatically — you do not provide one. - If the collection already exists with a different vector size (from a different model), an error is raised with a clear message. +- Hybrid inserts require a hybrid collection (created with `CREATE COLLECTION ... HYBRID` or auto-created on first `USING HYBRID` insert). 
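The internal steps and rules listed above for a hybrid insert can be sketched in plain Python. This is only an illustration under stated assumptions: `build_hybrid_point` is a hypothetical helper, plain dictionaries stand in for the qdrant-client point and sparse-vector objects, and the vectors are made-up numbers rather than real Fastembed output.

```python
import uuid


def build_hybrid_point(values, dense_vector, sparse_indices, sparse_values):
    """Assemble one hybrid point as a plain dict (hypothetical helper)."""
    # Rule: 'text' is always required.
    if "text" not in values:
        raise ValueError("INSERT requires a 'text' field in VALUES")
    return {
        # Rule: the point ID is an auto-generated UUID.
        "id": str(uuid.uuid4()),
        # Hybrid collections use two named vectors: "dense" and "sparse".
        "vector": {
            "dense": list(dense_vector),
            "sparse": {
                "indices": list(sparse_indices),
                "values": list(sparse_values),
            },
        },
        # All fields, including 'text', are stored in the payload.
        "payload": dict(values),
    }


# Toy vectors for illustration only.
point = build_hybrid_point(
    {"text": "Attention is all you need", "year": 2017},
    dense_vector=[0.1, 0.2, 0.3],
    sparse_indices=[7, 42],
    sparse_values=[0.5, 1.2],
)
```

In the executor changes further down in this diff, the same shape is passed to Qdrant as a `PointStruct` whose `vector` maps `"dense"` to the dense list and `"sparse"` to a `SparseVector`.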
--- @@ -230,7 +241,9 @@ An optional `WHERE` clause filters the candidate set **before** similarity ranki ``` SEARCH SIMILAR TO '' LIMIT SEARCH SIMILAR TO '' LIMIT USING MODEL '' -SEARCH SIMILAR TO '' LIMIT [USING MODEL ''] WHERE +SEARCH SIMILAR TO '' LIMIT [USING MODEL ''] WHERE +SEARCH SIMILAR TO '' LIMIT USING HYBRID +SEARCH SIMILAR TO '' LIMIT USING HYBRID [DENSE MODEL ''] [SPARSE MODEL ''] [WHERE ] ``` **Examples:** @@ -250,9 +263,14 @@ Search within a specific category, excluding drafts: SEARCH articles SIMILAR TO 'neural networks' LIMIT 5 WHERE category = 'ml' AND status != 'draft' ``` -Search with a model override and a filter: +Hybrid search (combines dense semantic + sparse BM25 keyword retrieval via RRF): +```sql +SEARCH articles SIMILAR TO 'attention mechanism' LIMIT 10 USING HYBRID +``` + +Hybrid search with a WHERE filter: ```sql -SEARCH articles SIMILAR TO 'embeddings' LIMIT 10 USING MODEL 'BAAI/bge-small-en-v1.5' WHERE year >= 2022 +SEARCH articles SIMILAR TO 'transformers' LIMIT 10 USING HYBRID WHERE year >= 2020 ``` **Output:** @@ -266,7 +284,7 @@ Results are displayed as a table with three columns: 0.8817 │ 7a1b2c3d-... │ {'text': 'Attention is all...', 'tags': [...]} ``` -- **Score** — cosine similarity score between 0 and 1. Higher is more similar. +- **Score** — similarity score. Higher is more relevant. - **ID** — the UUID of the matching point. - **Payload** — all fields stored alongside the vector. 
@@ -291,11 +309,8 @@ SEARCH articles SIMILAR TO 'ml' LIMIT 10 WHERE status != 'draft' #### Range comparisons ```sql --- Greater than / less than SEARCH articles SIMILAR TO 'ai' LIMIT 5 WHERE score > 0.8 SEARCH articles SIMILAR TO 'ai' LIMIT 5 WHERE year < 2024 - --- Greater than or equal / less than or equal SEARCH articles SIMILAR TO 'ai' LIMIT 5 WHERE score >= 0.75 SEARCH articles SIMILAR TO 'ai' LIMIT 5 WHERE year <= 2023 ``` @@ -310,51 +325,37 @@ SEARCH articles SIMILAR TO 'history of ai' LIMIT 10 WHERE year BETWEEN 2018 AND #### IN and NOT IN ```sql --- Field value must be one of the listed values SEARCH articles SIMILAR TO 'retrieval' LIMIT 10 WHERE status IN ('published', 'reviewed') - --- Field value must not be any of the listed values SEARCH articles SIMILAR TO 'retrieval' LIMIT 10 WHERE status NOT IN ('deleted', 'archived') - --- Trailing commas are allowed -SEARCH articles SIMILAR TO 'x' LIMIT 5 WHERE status IN ('a', 'b',) ``` #### IS NULL and IS NOT NULL ```sql --- Points where the reviewer field is absent or explicitly null SEARCH articles SIMILAR TO 'peer review' LIMIT 5 WHERE reviewer IS NULL - --- Points where reviewer is set to any non-null value SEARCH articles SIMILAR TO 'peer review' LIMIT 5 WHERE reviewer IS NOT NULL ``` #### IS EMPTY and IS NOT EMPTY ```sql --- Points where the tags list is empty SEARCH articles SIMILAR TO 'untagged' LIMIT 5 WHERE tags IS EMPTY - --- Points where the tags list has at least one element SEARCH articles SIMILAR TO 'categorized' LIMIT 5 WHERE tags IS NOT EMPTY ``` #### Full-text MATCH ```sql --- All terms in the string must appear in the field (full-text index required) +-- All terms must appear in the field (requires a Qdrant full-text index) SEARCH articles SIMILAR TO 'search' LIMIT 10 WHERE title MATCH 'vector database' --- Any term in the string can match +-- Any term can match SEARCH articles SIMILAR TO 'search' LIMIT 10 WHERE title MATCH ANY 'embedding retrieval' --- The exact phrase must appear +-- Exact 
phrase must appear SEARCH articles SIMILAR TO 'search' LIMIT 10 WHERE title MATCH PHRASE 'semantic search' ``` -> Full-text MATCH requires a Qdrant full-text index on the field. Create one in the Qdrant dashboard or via the SDK before using MATCH filters. - #### AND, OR, NOT — logical operators Operator precedence: `NOT` (highest) > `AND` > `OR` (lowest). Use parentheses to override. @@ -369,10 +370,6 @@ SEARCH articles SIMILAR TO 'llm' LIMIT 10 WHERE source = 'arxiv' OR source = 'pu -- NOT: negate a condition SEARCH articles SIMILAR TO 'benchmark' LIMIT 10 WHERE NOT status = 'draft' --- Chained AND (three conditions, all must hold) -SEARCH articles SIMILAR TO 'deep learning' LIMIT 20 - WHERE year >= 2019 AND category = 'cv' AND status != 'retracted' - -- Parentheses to group OR inside AND SEARCH articles SIMILAR TO 'conference paper' LIMIT 10 WHERE (source = 'arxiv' OR source = 'ieee') AND year >= 2022 @@ -383,26 +380,16 @@ SEARCH articles SIMILAR TO 'x' LIMIT 5 WHERE NOT (status = 'draft' OR status = ' #### Dot-notation for nested fields -Qdrant supports nested payload fields accessed with dot notation. 
Use the same path syntax in `WHERE`: - ```sql --- Filter on meta.source nested field SEARCH articles SIMILAR TO 'wikipedia' LIMIT 5 WHERE meta.source = 'web' - --- Filter on a deeply nested array field SEARCH cities SIMILAR TO 'large city' LIMIT 5 WHERE country.cities[].population > 1000000 ``` -#### Combined example +#### WHERE also works in hybrid mode ```sql --- Semantic search over research papers: --- must be from arxiv or IEEE, published 2020–2023, not retracted, with a reviewer assigned -SEARCH papers SIMILAR TO 'attention mechanism transformers' LIMIT 20 - WHERE (source = 'arxiv' OR source = 'ieee') - AND year BETWEEN 2020 AND 2023 - AND status != 'retracted' - AND reviewer IS NOT NULL +SEARCH articles SIMILAR TO 'deep learning' LIMIT 10 + USING HYBRID WHERE year BETWEEN 2020 AND 2024 AND status = 'published' ``` #### Full filter reference @@ -433,6 +420,94 @@ SEARCH papers SIMILAR TO 'attention mechanism transformers' LIMIT 20 --- +### Hybrid Search (USING HYBRID) + +Hybrid search combines **dense semantic vectors** and **sparse BM25 keyword vectors** in a single query and merges the results with Qdrant's **Reciprocal Rank Fusion (RRF)** algorithm. This typically outperforms either method alone — semantic search handles paraphrases and synonyms, while BM25 handles exact keyword matches. + +#### How it works internally + +1. Both a dense vector (`TextEmbedding`) and a sparse BM25 vector (`SparseTextEmbedding`) are generated from your query text. +2. Qdrant fetches the top candidates from each index independently (`prefetch limit = LIMIT × 4`). +3. The two result lists are merged using RRF — a rank-based fusion that does not require score normalization. +4. The final top-N results are returned. 
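The fusion step in the list above can be sketched in plain Python. This is a minimal illustration of Reciprocal Rank Fusion, not Qdrant's implementation: the function name `rrf_fuse`, the constant `k = 60` (the value commonly used for RRF), and the sample point IDs are all assumptions made for the example.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60, limit=10):
    """Merge several ranked ID lists with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) per ID, with rank starting at 1.
    Only ranks are used, so the raw dense and sparse scores never need
    to be normalized against each other.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, point_id in enumerate(ranking, start=1):
            scores[point_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:limit]


limit = 2
prefetch_limit = limit * 4  # each index is asked for LIMIT x 4 candidates

# Hypothetical candidate IDs returned by each index, best match first.
dense_hits = ["a", "b", "c"][:prefetch_limit]
sparse_hits = ["b", "d", "a"][:prefetch_limit]

top = rrf_fuse([dense_hits, sparse_hits], limit=limit)
```

Here `b` ends up ahead of `a`: both appear in both lists, but `b` holds ranks 2 and 1 while `a` holds ranks 1 and 3, so `b` accumulates the larger sum of reciprocal ranks.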
+ +#### Step 1: Create a hybrid collection + +A hybrid collection stores both a named dense vector (`"dense"`) and a named sparse vector (`"sparse"`): + +```sql +CREATE COLLECTION articles HYBRID +``` + +This is equivalent to calling Qdrant with: +```python +vectors_config={"dense": VectorParams(size=384, distance=COSINE)}, +sparse_vectors_config={"sparse": SparseVectorParams(modifier=IDF)} +``` + +#### Step 2: Insert with hybrid vectors + +```sql +-- Uses default dense model + Qdrant/bm25 sparse model +INSERT INTO COLLECTION articles VALUES { + 'text': 'Attention is all you need', + 'author': 'Vaswani et al.', + 'year': 2017 +} USING HYBRID +``` + +If the collection does not exist yet, it is created automatically as a hybrid collection on the first `USING HYBRID` insert. + +#### Step 3: Search with hybrid retrieval + +```sql +-- Basic hybrid search +SEARCH articles SIMILAR TO 'transformer architecture' LIMIT 10 USING HYBRID + +-- Hybrid search with a WHERE filter +SEARCH articles SIMILAR TO 'attention' LIMIT 10 USING HYBRID WHERE year >= 2017 + +-- Hybrid with custom dense model +SEARCH articles SIMILAR TO 'embeddings' LIMIT 5 + USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5' + +-- Hybrid with both custom models +SEARCH articles SIMILAR TO 'sparse retrieval' LIMIT 5 + USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5' SPARSE MODEL 'prithivida/Splade_PP_en_v1' + +-- Order of DENSE MODEL / SPARSE MODEL doesn't matter +SEARCH articles SIMILAR TO 'sparse retrieval' LIMIT 5 + USING HYBRID SPARSE MODEL 'prithivida/Splade_PP_en_v1' DENSE MODEL 'BAAI/bge-base-en-v1.5' +``` + +#### Model defaults in hybrid mode + +| Argument | Default | +|---|---| +| Dense model | `self._config.default_model` (same as non-hybrid) | +| Sparse model | `Qdrant/bm25` | + +Both can be overridden independently with `DENSE MODEL` and `SPARSE MODEL`. + +#### Dense vs. 
hybrid — when to use which + +| Situation | Recommendation | +|---|---| +| Semantic similarity (paraphrasing, synonyms) | Dense only | +| Exact keyword matching (product codes, names) | Hybrid or BM25-only | +| General-purpose retrieval (unknown query distribution) | Hybrid | +| Low latency / small collection | Dense only | + +#### Supported sparse models (Fastembed) + +| Model | Notes | +|---|---| +| `Qdrant/bm25` | Default. Classic BM25 with IDF weighting | +| `prithivida/Splade_PP_en_v1` | SPLADE++ English, strong keyword + semantic overlap | +| `Qdrant/Unicoil` | UniCOIL sparse encoder | + +--- + ### SHOW COLLECTIONS — list collections Lists all collections in the connected Qdrant instance. @@ -468,13 +543,21 @@ Explicitly creates a new empty collection. Collections are also created automati **Syntax:** ``` CREATE COLLECTION +CREATE COLLECTION HYBRID ``` -**Example:** +**Examples:** + +Dense-only collection (standard): ```sql CREATE COLLECTION research_papers ``` +Hybrid collection (dense + sparse BM25): +```sql +CREATE COLLECTION research_papers HYBRID +``` + The collection is created using the **default embedding model's dimensions** (384 for `all-MiniLM-L6-v2`) with **cosine distance**. If the collection already exists, the command succeeds with a message and does nothing. @@ -529,7 +612,7 @@ To find a point's ID, run a SEARCH first and copy the ID from the results table. QQL uses [Fastembed](https://github.com/qdrant/fastembed) to convert text into vectors locally — no external API call is needed. 
-### Default model +### Dense embedding (default) ``` sentence-transformers/all-MiniLM-L6-v2 @@ -539,22 +622,37 @@ sentence-transformers/all-MiniLM-L6-v2 - Size: ~90 MB (downloaded on first use, cached locally) - Good balance of speed and quality for English text -### Specifying a different model +### Sparse embedding (hybrid mode default) + +``` +Qdrant/bm25 +``` -Add `USING MODEL ''` to INSERT or SEARCH: +- Classic BM25 with IDF weighting +- Indices and values are generated as a sparse vector; no fixed dimensions +- Uses asymmetric encoding: `embed()` for documents, `query_embed()` for queries + +### Specifying models + +Add `USING MODEL ''` for dense-only mode, or `DENSE MODEL` / `SPARSE MODEL` after `USING HYBRID`: ```sql +-- Dense only with custom model INSERT INTO COLLECTION docs VALUES {'text': 'hello'} USING MODEL 'BAAI/bge-small-en-v1.5' SEARCH docs SIMILAR TO 'hello' LIMIT 5 USING MODEL 'BAAI/bge-small-en-v1.5' -``` -`USING MODEL` and `WHERE` can be combined: +-- Hybrid with custom dense model +SEARCH docs SIMILAR TO 'hello' LIMIT 5 USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5' -```sql -SEARCH docs SIMILAR TO 'hello' LIMIT 5 USING MODEL 'BAAI/bge-small-en-v1.5' WHERE year >= 2022 + +-- Hybrid with custom sparse model +SEARCH docs SIMILAR TO 'hello' LIMIT 5 USING HYBRID SPARSE MODEL 'prithivida/Splade_PP_en_v1' + +-- Hybrid with both custom +SEARCH docs SIMILAR TO 'hello' LIMIT 5 + USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5' SPARSE MODEL 'prithivida/Splade_PP_en_v1' ``` -### Commonly available Fastembed models +### Commonly available dense models (Fastembed) | Model | Dimensions | Notes | |---|---|---| @@ -564,6 +662,14 @@ SEARCH docs SIMILAR TO 'hello' LIMIT 5 USING MODEL 'BAAI/bge-small-en-v1.5' WHER | `BAAI/bge-large-en-v1.5` | 1024 | Best quality, slowest | | `sentence-transformers/all-mpnet-base-v2` | 768 | Strong semantic similarity | +### Commonly available sparse models (Fastembed) + +| Model | Notes | +|---|---| +| `Qdrant/bm25` | Default sparse model. 
Classic BM25 + IDF | +| `prithivida/Splade_PP_en_v1` | SPLADE++ — strong keyword + semantic overlap | +| `Qdrant/Unicoil` | UniCOIL sparse encoder | + > Models are downloaded automatically on first use and cached by Fastembed. Loading a new model for the first time takes a few seconds. ### Model consistency rule @@ -628,7 +734,7 @@ The connection config is stored at `~/.qql/config.json`: |---|---| | `url` | Qdrant instance URL | | `secret` | API key (null if not required) | -| `default_model` | Embedding model used when no `USING MODEL` clause is given | +| `default_model` | Dense embedding model used when no `USING MODEL` clause is given | You can edit this file directly to change the default model without reconnecting: @@ -649,7 +755,7 @@ QQL can also be used as a Python library without the CLI: ```python from qql import run_query -# Insert a document +# Insert a document (dense-only) result = run_query( "INSERT INTO COLLECTION notes VALUES {'text': 'hello world', 'author': 'alice', 'year': 2024}", url="http://localhost:6333", @@ -657,19 +763,26 @@ result = run_query( print(result.message) # "Inserted 1 point []" print(result.data) # {"id": "...", "collection": "notes"} -# Basic search +# Insert with hybrid vectors result = run_query( - "SEARCH notes SIMILAR TO 'hello' LIMIT 5", + "INSERT INTO COLLECTION notes VALUES {'text': 'hello world'} USING HYBRID", url="http://localhost:6333", ) -for hit in result.data: - print(hit["score"], hit["id"], hit["payload"]) +print(result.message) # "Inserted 1 point [] (hybrid)" -# Search with a WHERE filter +# Dense search with WHERE filter result = run_query( "SEARCH notes SIMILAR TO 'hello' LIMIT 5 WHERE year >= 2023 AND author != 'bot'", url="http://localhost:6333", ) +for hit in result.data: + print(hit["score"], hit["payload"]) + +# Hybrid search with WHERE filter +result = run_query( + "SEARCH notes SIMILAR TO 'hello' LIMIT 5 USING HYBRID WHERE year >= 2023", + url="http://localhost:6333", +) for hit in result.data: 
print(hit["score"], hit["payload"]) ``` @@ -687,7 +800,7 @@ client = QdrantClient(url="http://localhost:6333") config = QQLConfig(url="http://localhost:6333") executor = Executor(client, config) -query = "SEARCH articles SIMILAR TO 'deep learning' LIMIT 10 WHERE category = 'cv'" +query = "SEARCH articles SIMILAR TO 'deep learning' LIMIT 10 USING HYBRID WHERE category = 'cv'" tokens = Lexer().tokenize(query) node = Parser(tokens).parse() result = executor.execute(node) @@ -710,7 +823,8 @@ class ExecutionResult: | Operation | `result.data` type | |---|---| -| INSERT | `{"id": "", "collection": ""}` | +| INSERT (dense) | `{"id": "", "collection": ""}` | +| INSERT (hybrid) | `{"id": "", "collection": ""}` | | SEARCH | `[{"id": str, "score": float, "payload": dict}, ...]` | | SHOW COLLECTIONS | `["name1", "name2", ...]` | | CREATE COLLECTION | `None` | @@ -733,12 +847,12 @@ qql/ │ ├── lexer.py # Tokenizer: string → List[Token] │ ├── ast_nodes.py # Frozen dataclasses for each statement and filter type │ ├── parser.py # Recursive descent parser: tokens → AST node -│ ├── embedder.py # Fastembed wrapper with per-model cache -│ └── executor.py # AST node → Qdrant client call + filter conversion +│ ├── embedder.py # Embedder (dense) + SparseEmbedder (BM25) with per-model cache +│ └── executor.py # AST node → Qdrant client call + filter + hybrid search └── tests/ - ├── test_lexer.py # Tokenizer unit tests (keywords, operators, dot-paths) - ├── test_parser.py # Parser unit tests (all statements + WHERE filters) - └── test_executor.py # Executor unit tests (mocked Qdrant client + filter builders) + ├── test_lexer.py # Tokenizer unit tests (keywords, operators, dot-paths, hybrid tokens) + ├── test_parser.py # Parser unit tests (all statements + WHERE filters + hybrid clauses) + └── test_executor.py # Executor unit tests (mocked Qdrant client, filter builders, hybrid ops) ``` --- @@ -751,7 +865,7 @@ Tests do not require a running Qdrant instance — the Qdrant client is mocked. 
pytest tests/ -v ``` -Expected output: **118 tests passing**. +Expected output: **169 tests passing**. --- @@ -769,3 +883,4 @@ Expected output: **118 tests passing**. | `Unexpected character '@' (at position N)` | A character not part of QQL syntax | Remove or quote the offending character | | `Expected a filter operator after field '...'` | Unknown operator in WHERE clause | Use one of: `=`, `!=`, `>`, `>=`, `<`, `<=`, `IN`, `NOT IN`, `BETWEEN`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `MATCH` | | `Expected ')' ...` | Unclosed parenthesis in WHERE clause | Add the missing `)` to close the group | +| `Qdrant error during SEARCH: ...` | Hybrid search on a non-hybrid collection, or wrong vector names | Ensure the collection was created with `HYBRID` before using `USING HYBRID` in INSERT/SEARCH | diff --git a/pyproject.toml b/pyproject.toml index d536641..8e537ad 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -1,6 +1,6 @@ [project] name = "qql-cli" -version = "0.2.0" +version = "1.0.0" description = "A SQL-like query language CLI wrapper for Qdrant vector database" readme = "README.md" license = { file = "LICENSE" } @@ -32,6 +32,7 @@ dependencies = [ "qdrant-client[fastembed]>=1.13.0", "click>=8.1.0", "rich>=13.0.0", + "prompt_toolkit>=3.0.0", ] [project.urls] diff --git a/src/qql/ast_nodes.py b/src/qql/ast_nodes.py index b8cd943..6a9e81f 100644 --- a/src/qql/ast_nodes.py +++ b/src/qql/ast_nodes.py @@ -116,12 +116,15 @@ class NotExpr: class InsertStmt: collection: str values: dict[str, Any] # must contain "text" key - model: str | None # None → use default + model: str | None # dense model; None → use config default + hybrid: bool = False # if True, also embed + store sparse BM25 vector + sparse_model: str | None = None # sparse model; None → SparseEmbedder.DEFAULT_MODEL @dataclass(frozen=True) class CreateCollectionStmt: collection: str + hybrid: bool = False # if True, create with dense + sparse named vectors @dataclass(frozen=True) @@ -139,7 
+142,9 @@ class SearchStmt: collection: str query_text: str limit: int - model: str | None + model: str | None # dense model; None → use config default + hybrid: bool = False # if True, use prefetch+RRF hybrid search + sparse_model: str | None = None # sparse model for hybrid; None → SparseEmbedder.DEFAULT_MODEL query_filter: FilterExpr | None = None # optional WHERE clause; default keeps existing tests valid diff --git a/src/qql/cli.py b/src/qql/cli.py index f0c2bed..b7ab731 100644 --- a/src/qql/cli.py +++ b/src/qql/cli.py @@ -3,8 +3,10 @@ import sys import click +from prompt_toolkit import PromptSession +from prompt_toolkit.formatted_text import HTML +from prompt_toolkit.history import InMemoryHistory from rich.console import Console -from rich.prompt import Prompt from rich.table import Table from .config import delete_config, load_config, save_config, QQLConfig @@ -24,9 +26,10 @@ [yellow]INSERT INTO COLLECTION[/yellow] [yellow]VALUES[/yellow] {[yellow]'text'[/yellow]: '...', ...} Insert a point. 'text' is required and auto-vectorized. Optional: [yellow]USING MODEL[/yellow] '' + Optional: [yellow]USING HYBRID[/yellow] [DENSE MODEL ''] [SPARSE MODEL ''] - [yellow]CREATE COLLECTION[/yellow] - Create a new collection (uses default model dimensions). + [yellow]CREATE COLLECTION[/yellow] [[yellow]HYBRID[/yellow]] + Create a new collection. Add HYBRID for dense+sparse BM25 vectors. [yellow]DROP COLLECTION[/yellow] Delete a collection and all its points. @@ -37,10 +40,19 @@ [yellow]SEARCH[/yellow] [yellow]SIMILAR TO[/yellow] '' [yellow]LIMIT[/yellow] Semantic search by vector similarity. Optional: [yellow]USING MODEL[/yellow] '' + Optional: [yellow]USING HYBRID[/yellow] [DENSE MODEL ''] [SPARSE MODEL ''] + Optional: [yellow]WHERE[/yellow] (e.g. WHERE year > 2020 AND status = 'ok') [yellow]DELETE FROM[/yellow] [yellow]WHERE id =[/yellow] '' Delete a point by its ID. 
+Keyboard shortcuts: + ← → arrows move cursor within the current line + ↑ ↓ arrows scroll through command history + Ctrl-A / Ctrl-E jump to beginning / end of line + Ctrl-C cancel current input + Ctrl-D exit shell + Type [bold]exit[/bold] or [bold]quit[/bold] to leave the shell. """ @@ -115,10 +127,16 @@ def _launch_repl(cfg: QQLConfig) -> None: console.print(f"[bold cyan]QQL Interactive Shell[/bold cyan] • {cfg.url}") console.print("Type [bold]help[/bold] for available commands or [bold]exit[/bold] to quit.\n") + session: PromptSession[str] = PromptSession(history=InMemoryHistory()) + while True: try: - query = Prompt.ask("[bold green]qql>[/bold green]").strip() - except (EOFError, KeyboardInterrupt): + query = session.prompt(HTML("qql> ")).strip() + except KeyboardInterrupt: + # Ctrl-C clears the current line; continue the loop + continue + except EOFError: + # Ctrl-D exits console.print("\nBye.") break diff --git a/src/qql/embedder.py b/src/qql/embedder.py index dfd5d8d..41243b5 100644 --- a/src/qql/embedder.py +++ b/src/qql/embedder.py @@ -31,3 +31,37 @@ def embed_batch(self, texts: list[str]) -> list[list[float]]: def dimensions(self) -> int: """Return the vector dimensionality by embedding a dummy string.""" return len(self.embed("probe")) + + +class SparseEmbedder: + """Sparse BM25-style embedder using fastembed.SparseTextEmbedding. + + Returns dicts with "indices" and "values" lists (not numpy arrays), + ready for direct construction of qdrant_client SparseVector objects. + + Uses asymmetric embedding: embed() for document indexing, query_embed() + for query-time encoding (BM25 IDF weighting differs at query vs. index time). 
+ """ + + DEFAULT_MODEL = "Qdrant/bm25" + + # Class-level cache mirrors Embedder's pattern + _cache: dict[str, object] = {} + + def __init__(self, model_name: str = DEFAULT_MODEL) -> None: + self._model_name = model_name + if model_name not in SparseEmbedder._cache: + from fastembed import SparseTextEmbedding + + SparseEmbedder._cache[model_name] = SparseTextEmbedding(model_name) + self._model = SparseEmbedder._cache[model_name] + + def embed(self, text: str) -> dict[str, list]: + """Embed a document string. Returns {"indices": [...], "values": [...]}.""" + result = next(iter(self._model.embed([text]))) # type: ignore[attr-defined] + return {"indices": result.indices.tolist(), "values": result.values.tolist()} + + def query_embed(self, text: str) -> dict[str, list]: + """Embed a query string (BM25 applies different IDF weighting at query time).""" + result = next(iter(self._model.query_embed(text))) # type: ignore[attr-defined] + return {"indices": result.indices.tolist(), "values": result.values.tolist()} diff --git a/src/qql/executor.py b/src/qql/executor.py index e1936b0..9408e7d 100644 --- a/src/qql/executor.py +++ b/src/qql/executor.py @@ -10,6 +10,8 @@ Distance, FieldCondition, Filter, + Fusion, + FusionQuery, IsEmptyCondition, IsNullCondition, MatchAny, @@ -18,9 +20,13 @@ MatchText, MatchTextAny, MatchValue, + Modifier, PayloadField, PointStruct, + Prefetch, Range, + SparseVector, + SparseVectorParams, VectorParams, ) @@ -49,7 +55,7 @@ ShowCollectionsStmt, ) from .config import QQLConfig -from .embedder import Embedder +from .embedder import Embedder, SparseEmbedder from .exceptions import QQLRuntimeError @@ -86,6 +92,56 @@ def _execute_insert(self, node: InsertStmt) -> ExecutionResult: if "text" not in node.values: raise QQLRuntimeError("INSERT requires a 'text' field in VALUES") + # ── Hybrid INSERT: dense + sparse vectors ────────────────────────── + if node.hybrid: + dense_model = node.model or self._config.default_model + sparse_model_name = 
node.sparse_model or SparseEmbedder.DEFAULT_MODEL + dense_embedder = Embedder(dense_model) + sparse_embedder = SparseEmbedder(sparse_model_name) + + dense_vector = dense_embedder.embed(node.values["text"]) + sparse_obj = sparse_embedder.embed(node.values["text"]) + sparse_vector = SparseVector( + indices=sparse_obj["indices"], + values=sparse_obj["values"], + ) + + # Auto-create hybrid collection if it doesn't exist yet + if not self._client.collection_exists(node.collection): + self._client.create_collection( + collection_name=node.collection, + vectors_config={ + "dense": VectorParams( + size=len(dense_vector), distance=Distance.COSINE + ) + }, + sparse_vectors_config={ + "sparse": SparseVectorParams(modifier=Modifier.IDF) + }, + ) + + point_id = str(uuid.uuid4()) + try: + self._client.upsert( + collection_name=node.collection, + points=[ + PointStruct( + id=point_id, + vector={"dense": dense_vector, "sparse": sparse_vector}, + payload=dict(node.values), + ) + ], + ) + except UnexpectedResponse as e: + raise QQLRuntimeError(f"Qdrant error during INSERT: {e}") from e + + return ExecutionResult( + success=True, + message=f"Inserted 1 point [{point_id}] (hybrid)", + data={"id": point_id, "collection": node.collection}, + ) + + # ── Standard dense-only INSERT ───────────────────────────────────── model_name = node.model or self._config.default_model embedder = Embedder(model_name) vector = embedder.embed(node.values["text"]) @@ -115,6 +171,29 @@ def _execute_create(self, node: CreateCollectionStmt) -> ExecutionResult: success=True, message=f"Collection '{node.collection}' already exists", ) + + # ── Hybrid collection: named dense + sparse vectors ──────────────── + if node.hybrid: + embedder = Embedder(self._config.default_model) + dims = embedder.dimensions + self._client.create_collection( + collection_name=node.collection, + vectors_config={ + "dense": VectorParams(size=dims, distance=Distance.COSINE) + }, + sparse_vectors_config={ + "sparse": 
SparseVectorParams(modifier=Modifier.IDF) + }, + ) + return ExecutionResult( + success=True, + message=( + f"Collection '{node.collection}' created " + f"(hybrid: {dims}-dim dense + BM25 sparse, cosine distance)" + ), + ) + + # ── Standard dense-only collection ───────────────────────────────── embedder = Embedder(self._config.default_model) dims = embedder.dimensions self._client.create_collection( @@ -148,16 +227,64 @@ def _execute_search(self, node: SearchStmt) -> ExecutionResult: if not self._client.collection_exists(node.collection): raise QQLRuntimeError(f"Collection '{node.collection}' does not exist") - model_name = node.model or self._config.default_model - embedder = Embedder(model_name) - vector = embedder.embed(node.query_text) - + # Build WHERE filter (shared by both hybrid and dense-only paths) qdrant_filter: Filter | None = None if node.query_filter is not None: qdrant_filter = self._wrap_as_filter( self._build_qdrant_filter(node.query_filter) ) + # ── Hybrid SEARCH: prefetch dense+sparse, fuse with RRF ─────────── + if node.hybrid: + dense_model = node.model or self._config.default_model + sparse_model_name = node.sparse_model or SparseEmbedder.DEFAULT_MODEL + dense_embedder = Embedder(dense_model) + sparse_embedder = SparseEmbedder(sparse_model_name) + + dense_vector = dense_embedder.embed(node.query_text) + sparse_obj = sparse_embedder.query_embed(node.query_text) + sparse_vector = SparseVector( + indices=sparse_obj["indices"], + values=sparse_obj["values"], + ) + + try: + response = self._client.query_points( + collection_name=node.collection, + prefetch=[ + Prefetch( + query=dense_vector, + using="dense", + limit=node.limit * 4, + ), + Prefetch( + query=sparse_vector, + using="sparse", + limit=node.limit * 4, + ), + ], + query=FusionQuery(fusion=Fusion.RRF), + limit=node.limit, + query_filter=qdrant_filter, + ) + except UnexpectedResponse as e: + raise QQLRuntimeError(f"Qdrant error during SEARCH: {e}") from e + + results = [ + {"id": str(h.id), 
"score": round(h.score, 4), "payload": h.payload} + for h in response.points + ] + return ExecutionResult( + success=True, + message=f"Found {len(results)} result(s) (hybrid)", + data=results, + ) + + # ── Standard dense-only SEARCH ───────────────────────────────────── + model_name = node.model or self._config.default_model + embedder = Embedder(model_name) + vector = embedder.embed(node.query_text) + try: response = self._client.query_points( collection_name=node.collection, @@ -293,16 +420,26 @@ def _wrap_as_filter(self, qdrant_expr: Any) -> Filter: # ── Collection helpers ──────────────────────────────────────────────── def _ensure_collection(self, name: str, vector_size: int) -> None: - """Create the collection if it doesn't exist. Raises on dimension mismatch.""" + """Create the collection if it doesn't exist. Raises on dimension mismatch. + + For named-vector (hybrid) collections the validation is skipped — those + collections are managed directly by the hybrid insert/create paths. + """ if self._client.collection_exists(name): info = self._client.get_collection(name) - existing_size = info.config.params.vectors.size # type: ignore[union-attr] - if existing_size != vector_size: - raise QQLRuntimeError( - f"Vector dimension mismatch: collection '{name}' expects " - f"{existing_size} dims, but model produces {vector_size} dims. " - f"Specify a compatible model with USING MODEL ''." - ) + vectors = info.config.params.vectors # type: ignore[union-attr] + if isinstance(vectors, dict): + # Named-vector (hybrid) collection — skip validation here; + # the hybrid insert path manages its own collection creation. + pass + else: + # Unnamed single-vector collection: validate dimensions + if vectors.size != vector_size: + raise QQLRuntimeError( + f"Vector dimension mismatch: collection '{name}' expects " + f"{vectors.size} dims, but model produces {vector_size} dims. " + f"Specify a compatible model with USING MODEL ''." 
+ ) else: self._client.create_collection( collection_name=name, diff --git a/src/qql/lexer.py b/src/qql/lexer.py index e3bf8e3..029a8a3 100644 --- a/src/qql/lexer.py +++ b/src/qql/lexer.py @@ -12,6 +12,9 @@ class TokenKind(Enum): VALUES = auto() USING = auto() MODEL = auto() + HYBRID = auto() + DENSE = auto() + SPARSE = auto() CREATE = auto() DROP = auto() SHOW = auto() @@ -69,6 +72,9 @@ class TokenKind(Enum): "VALUES": TokenKind.VALUES, "USING": TokenKind.USING, "MODEL": TokenKind.MODEL, + "HYBRID": TokenKind.HYBRID, + "DENSE": TokenKind.DENSE, + "SPARSE": TokenKind.SPARSE, "CREATE": TokenKind.CREATE, "DROP": TokenKind.DROP, "SHOW": TokenKind.SHOW, diff --git a/src/qql/parser.py b/src/qql/parser.py index 17fe0f9..f141745 100644 --- a/src/qql/parser.py +++ b/src/qql/parser.py @@ -77,17 +77,39 @@ def _parse_insert(self) -> InsertStmt: self._expect(TokenKind.VALUES) values = self._parse_dict() model: str | None = None + hybrid: bool = False + sparse_model: str | None = None if self._peek().kind == TokenKind.USING: self._advance() # consume USING - self._expect(TokenKind.MODEL) - model = self._expect(TokenKind.STRING).value - return InsertStmt(collection=collection, values=values, model=model) + if self._peek().kind == TokenKind.HYBRID: + self._advance() # consume HYBRID + hybrid = True + # Optional DENSE MODEL and/or SPARSE MODEL sub-clauses, any order + while self._peek().kind in (TokenKind.DENSE, TokenKind.SPARSE): + sub = self._advance() + self._expect(TokenKind.MODEL) + m = self._expect(TokenKind.STRING).value + if sub.kind == TokenKind.DENSE: + model = m + else: + sparse_model = m + else: + self._expect(TokenKind.MODEL) + model = self._expect(TokenKind.STRING).value + return InsertStmt( + collection=collection, values=values, model=model, + hybrid=hybrid, sparse_model=sparse_model, + ) def _parse_create(self) -> CreateCollectionStmt: self._expect(TokenKind.CREATE) self._expect(TokenKind.COLLECTION) collection = self._parse_identifier() - return 
CreateCollectionStmt(collection=collection) + hybrid: bool = False + if self._peek().kind == TokenKind.HYBRID: + self._advance() + hybrid = True + return CreateCollectionStmt(collection=collection, hybrid=hybrid) def _parse_drop(self) -> DropCollectionStmt: self._expect(TokenKind.DROP) @@ -109,10 +131,25 @@ def _parse_search(self) -> SearchStmt: self._expect(TokenKind.LIMIT) limit = int(self._expect(TokenKind.INTEGER).value) model: str | None = None + hybrid: bool = False + sparse_model: str | None = None if self._peek().kind == TokenKind.USING: - self._advance() - self._expect(TokenKind.MODEL) - model = self._expect(TokenKind.STRING).value + self._advance() # consume USING + if self._peek().kind == TokenKind.HYBRID: + self._advance() # consume HYBRID + hybrid = True + # Optional DENSE MODEL and/or SPARSE MODEL sub-clauses, any order + while self._peek().kind in (TokenKind.DENSE, TokenKind.SPARSE): + sub = self._advance() + self._expect(TokenKind.MODEL) + m = self._expect(TokenKind.STRING).value + if sub.kind == TokenKind.DENSE: + model = m + else: + sparse_model = m + else: + self._expect(TokenKind.MODEL) + model = self._expect(TokenKind.STRING).value query_filter: FilterExpr | None = None if self._peek().kind == TokenKind.WHERE: self._advance() # consume WHERE @@ -122,6 +159,8 @@ def _parse_search(self) -> SearchStmt: query_text=query_text, limit=limit, model=model, + hybrid=hybrid, + sparse_model=sparse_model, query_filter=query_filter, ) diff --git a/tests/test_executor.py b/tests/test_executor.py index b3e45af..fd70b8a 100644 --- a/tests/test_executor.py +++ b/tests/test_executor.py @@ -381,3 +381,300 @@ def test_wrap_as_filter_wraps_field_condition(self, executor): result = executor._wrap_as_filter(fc) assert isinstance(result, Filter) assert result.must[0] is fc + + +# ── Hybrid vector executor tests ────────────────────────────────────────────── + +FAKE_SPARSE = {"indices": [1, 42, 100], "values": [0.22, 0.8, 0.3]} + + +@pytest.fixture +def 
mock_sparse_embedder(mocker): + mock = mocker.MagicMock() + mock.embed.return_value = FAKE_SPARSE + mock.query_embed.return_value = FAKE_SPARSE + mocker.patch("qql.executor.SparseEmbedder", return_value=mock) + return mock + + +class TestHybridCreate: + def test_create_hybrid_uses_named_vector_config(self, executor, mock_client): + node = CreateCollectionStmt(collection="articles", hybrid=True) + result = executor.execute(node) + mock_client.create_collection.assert_called_once() + kw = mock_client.create_collection.call_args.kwargs + assert "sparse_vectors_config" in kw + assert "dense" in kw["vectors_config"] + assert "sparse" in kw["sparse_vectors_config"] + assert result.success is True + assert "hybrid" in result.message + + def test_create_hybrid_existing_collection_is_noop(self, executor, mock_client): + mock_client.collection_exists.return_value = True + node = CreateCollectionStmt(collection="existing", hybrid=True) + result = executor.execute(node) + mock_client.create_collection.assert_not_called() + assert "already exists" in result.message + + def test_create_non_hybrid_unchanged(self, executor, mock_client): + from qdrant_client.models import VectorParams + + node = CreateCollectionStmt(collection="col", hybrid=False) + executor.execute(node) + kw = mock_client.create_collection.call_args.kwargs + assert isinstance(kw["vectors_config"], VectorParams) + assert "sparse_vectors_config" not in kw + + +class TestHybridInsert: + def test_hybrid_insert_upsert_has_named_vectors( + self, executor, mock_client, mock_sparse_embedder + ): + mock_client.collection_exists.return_value = True + node = InsertStmt( + collection="col", values={"text": "hello"}, model=None, hybrid=True + ) + result = executor.execute(node) + mock_client.upsert.assert_called_once() + point = mock_client.upsert.call_args.kwargs["points"][0] + assert "dense" in point.vector + assert "sparse" in point.vector + assert result.success is True + assert "hybrid" in result.message + + def 
test_hybrid_insert_sparse_is_SparseVector( + self, executor, mock_client, mock_sparse_embedder + ): + from qdrant_client.models import SparseVector + + mock_client.collection_exists.return_value = True + node = InsertStmt( + collection="col", values={"text": "hello"}, model=None, hybrid=True + ) + executor.execute(node) + point = mock_client.upsert.call_args.kwargs["points"][0] + assert isinstance(point.vector["sparse"], SparseVector) + + def test_hybrid_insert_auto_creates_hybrid_collection( + self, executor, mock_client, mock_sparse_embedder + ): + mock_client.collection_exists.return_value = False + node = InsertStmt( + collection="col", values={"text": "hello"}, model=None, hybrid=True + ) + executor.execute(node) + kw = mock_client.create_collection.call_args.kwargs + assert "sparse_vectors_config" in kw + assert "dense" in kw["vectors_config"] + + def test_hybrid_insert_skips_create_when_exists( + self, executor, mock_client, mock_sparse_embedder + ): + mock_client.collection_exists.return_value = True + node = InsertStmt( + collection="col", values={"text": "hello"}, model=None, hybrid=True + ) + executor.execute(node) + mock_client.create_collection.assert_not_called() + + def test_hybrid_insert_uses_custom_dense_model( + self, executor, mock_client, mock_sparse_embedder, mocker + ): + mock_client.collection_exists.return_value = True + node = InsertStmt( + collection="col", values={"text": "hi"}, model="BAAI/bge-small-en-v1.5", + hybrid=True, + ) + executor.execute(node) + # The Embedder class is already patched by the mock_embedder fixture, so the + # custom dense model name is not asserted directly here. Instead, verify + # that the upsert call carries a dense vector. + point = mock_client.upsert.call_args.kwargs["points"][0] + assert "dense" in point.vector + + def test_hybrid_insert_uses_custom_sparse_model( + self, executor, mock_client, mocker + ): + mock_client.collection_exists.return_value = True + mock_sparse = mocker.MagicMock() +
mock_sparse.embed.return_value = FAKE_SPARSE + sparse_cls = mocker.patch("qql.executor.SparseEmbedder", return_value=mock_sparse) + node = InsertStmt( + collection="col", values={"text": "hi"}, model=None, + hybrid=True, sparse_model="prithivida/Splade_PP_en_v1", + ) + executor.execute(node) + sparse_cls.assert_called_once_with("prithivida/Splade_PP_en_v1") + + def test_non_hybrid_insert_uses_flat_vector(self, executor, mock_client): + node = InsertStmt( + collection="col", values={"text": "hello"}, model=None, hybrid=False + ) + executor.execute(node) + point = mock_client.upsert.call_args.kwargs["points"][0] + assert isinstance(point.vector, list) + + def test_hybrid_insert_missing_text_raises( + self, executor, mock_client, mock_sparse_embedder + ): + node = InsertStmt( + collection="col", values={"author": "alice"}, model=None, hybrid=True + ) + with pytest.raises(QQLRuntimeError, match="'text' field"): + executor.execute(node) + + +class TestHybridSearch: + def test_hybrid_search_uses_prefetch( + self, executor, mock_client, mock_sparse_embedder, mocker + ): + mock_client.collection_exists.return_value = True + mock_resp = mocker.MagicMock() + mock_resp.points = [] + mock_client.query_points.return_value = mock_resp + + node = SearchStmt( + collection="col", query_text="ml", limit=10, model=None, hybrid=True + ) + result = executor.execute(node) + mock_client.query_points.assert_called_once() + kw = mock_client.query_points.call_args.kwargs + assert "prefetch" in kw + assert len(kw["prefetch"]) == 2 + assert result.success is True + assert "hybrid" in result.message + + def test_hybrid_search_uses_rrf_fusion( + self, executor, mock_client, mock_sparse_embedder, mocker + ): + from qdrant_client.models import Fusion, FusionQuery + + mock_client.collection_exists.return_value = True + mock_resp = mocker.MagicMock() + mock_resp.points = [] + mock_client.query_points.return_value = mock_resp + + node = SearchStmt( + collection="col", query_text="q", limit=5, 
model=None, hybrid=True + ) + executor.execute(node) + kw = mock_client.query_points.call_args.kwargs + assert isinstance(kw["query"], FusionQuery) + assert kw["query"].fusion == Fusion.RRF + + def test_hybrid_search_prefetch_limit_is_4x( + self, executor, mock_client, mock_sparse_embedder, mocker + ): + mock_client.collection_exists.return_value = True + mock_resp = mocker.MagicMock() + mock_resp.points = [] + mock_client.query_points.return_value = mock_resp + + node = SearchStmt( + collection="col", query_text="q", limit=5, model=None, hybrid=True + ) + executor.execute(node) + prefetches = mock_client.query_points.call_args.kwargs["prefetch"] + assert all(p.limit == 20 for p in prefetches) + + def test_hybrid_search_prefetch_using_fields( + self, executor, mock_client, mock_sparse_embedder, mocker + ): + mock_client.collection_exists.return_value = True + mock_resp = mocker.MagicMock() + mock_resp.points = [] + mock_client.query_points.return_value = mock_resp + + node = SearchStmt( + collection="col", query_text="q", limit=5, model=None, hybrid=True + ) + executor.execute(node) + prefetches = mock_client.query_points.call_args.kwargs["prefetch"] + usings = {p.using for p in prefetches} + assert usings == {"dense", "sparse"} + + def test_hybrid_search_with_where_filter( + self, executor, mock_client, mock_sparse_embedder, mocker + ): + from qql.ast_nodes import CompareExpr + from qdrant_client.models import Filter + + mock_client.collection_exists.return_value = True + mock_resp = mocker.MagicMock() + mock_resp.points = [] + mock_client.query_points.return_value = mock_resp + + node = SearchStmt( + collection="col", query_text="q", limit=5, model=None, hybrid=True, + query_filter=CompareExpr(field="year", op=">", value=2020), + ) + executor.execute(node) + kw = mock_client.query_points.call_args.kwargs + assert kw.get("query_filter") is not None + assert isinstance(kw["query_filter"], Filter) + + def test_hybrid_search_nonexistent_collection_raises( + self, 
executor, mock_client, mock_sparse_embedder + ): + mock_client.collection_exists.return_value = False + node = SearchStmt( + collection="ghost", query_text="q", limit=5, model=None, hybrid=True + ) + with pytest.raises(QQLRuntimeError, match="does not exist"): + executor.execute(node) + + def test_non_hybrid_search_unchanged(self, executor, mock_client, mocker): + mock_client.collection_exists.return_value = True + mock_resp = mocker.MagicMock() + mock_resp.points = [] + mock_client.query_points.return_value = mock_resp + + node = SearchStmt( + collection="col", query_text="q", limit=5, model=None, hybrid=False + ) + executor.execute(node) + kw = mock_client.query_points.call_args.kwargs + assert "prefetch" not in kw or kw.get("prefetch") is None + + def test_hybrid_search_uses_custom_sparse_model( + self, executor, mock_client, mocker + ): + mock_client.collection_exists.return_value = True + mock_resp = mocker.MagicMock() + mock_resp.points = [] + mock_client.query_points.return_value = mock_resp + + mock_sparse = mocker.MagicMock() + mock_sparse.query_embed.return_value = FAKE_SPARSE + sparse_cls = mocker.patch("qql.executor.SparseEmbedder", return_value=mock_sparse) + + node = SearchStmt( + collection="col", query_text="q", limit=5, model=None, + hybrid=True, sparse_model="prithivida/Splade_PP_en_v1", + ) + executor.execute(node) + sparse_cls.assert_called_once_with("prithivida/Splade_PP_en_v1") + + +class TestEnsureCollectionHybridCompat: + def test_named_vector_collection_skips_validation(self, executor, mock_client): + from qdrant_client.models import VectorParams + + mock_client.collection_exists.return_value = True + # Simulate a named-vector (hybrid) collection: vectors is a dict + mock_client.get_collection.return_value.config.params.vectors = { + "dense": VectorParams(size=384, distance="Cosine") + } + # Should not raise even with a different size argument + executor._ensure_collection("hybrid_col", 768) +
mock_client.create_collection.assert_not_called() + + def test_unnamed_vector_mismatch_still_raises(self, executor, mock_client): + from qdrant_client.models import VectorParams + + mock_client.collection_exists.return_value = True + mock_client.get_collection.return_value.config.params.vectors = VectorParams( + size=768, distance="Cosine" + ) + with pytest.raises(QQLRuntimeError, match="dimension mismatch"): + executor._ensure_collection("col", 384) diff --git a/tests/test_lexer.py b/tests/test_lexer.py index 93a00ea..df41b59 100644 --- a/tests/test_lexer.py +++ b/tests/test_lexer.py @@ -185,3 +185,47 @@ class TestEOF: def test_ends_with_eof(self): tokens = tokenize("hello") assert tokens[-1].kind == TokenKind.EOF + + +class TestHybridKeyword: + def test_hybrid_keyword_uppercase(self): + ks = kinds("HYBRID") + assert ks[0] == TokenKind.HYBRID + + def test_hybrid_keyword_lowercase(self): + ks = kinds("hybrid") + assert ks[0] == TokenKind.HYBRID + + def test_dense_keyword(self): + ks = kinds("DENSE") + assert ks[0] == TokenKind.DENSE + + def test_dense_keyword_lowercase(self): + ks = kinds("dense") + assert ks[0] == TokenKind.DENSE + + def test_sparse_keyword(self): + ks = kinds("SPARSE") + assert ks[0] == TokenKind.SPARSE + + def test_sparse_keyword_lowercase(self): + ks = kinds("sparse") + assert ks[0] == TokenKind.SPARSE + + def test_hybrid_in_create_statement(self): + ks = kinds("CREATE COLLECTION articles HYBRID") + assert ks[3] == TokenKind.HYBRID + + def test_hybrid_in_search_statement(self): + ks = kinds("SEARCH col SIMILAR TO 'q' LIMIT 5 USING HYBRID") + assert TokenKind.HYBRID in ks + + def test_dense_as_identifier_in_dotted_path(self): + tokens = tokenize("dense.field") + assert tokens[0].kind == TokenKind.IDENTIFIER + assert tokens[0].value == "dense.field" + + def test_sparse_as_identifier_in_dotted_path(self): + tokens = tokenize("sparse.value") + assert tokens[0].kind == TokenKind.IDENTIFIER + assert tokens[0].value == "sparse.value" diff --git 
a/tests/test_parser.py b/tests/test_parser.py index adcb2ed..19b2650 100644 --- a/tests/test_parser.py +++ b/tests/test_parser.py @@ -326,3 +326,154 @@ def test_not_negates_parenthesized_group(self): def test_missing_rparen_raises(self): with pytest.raises(QQLSyntaxError): parse("SEARCH docs SIMILAR TO 'x' LIMIT 5 WHERE (a = '1'") + + +# ── Hybrid vector tests ─────────────────────────────────────────────────────── + +class TestHybridCreate: + def test_create_hybrid_sets_flag(self): + node = parse("CREATE COLLECTION articles HYBRID") + assert isinstance(node, CreateCollectionStmt) + assert node.collection == "articles" + assert node.hybrid is True + + def test_create_non_hybrid_default_false(self): + node = parse("CREATE COLLECTION articles") + assert node.hybrid is False + + def test_create_hybrid_case_insensitive(self): + node = parse("create collection col hybrid") + assert node.hybrid is True + + +class TestHybridInsert: + def test_insert_using_hybrid_sets_flag(self): + node = parse("INSERT INTO COLLECTION col VALUES {'text': 'hi'} USING HYBRID") + assert isinstance(node, InsertStmt) + assert node.hybrid is True + assert node.model is None + assert node.sparse_model is None + + def test_insert_non_hybrid_default(self): + node = parse("INSERT INTO COLLECTION col VALUES {'text': 'hi'}") + assert node.hybrid is False + assert node.sparse_model is None + + def test_insert_using_model_still_works(self): + node = parse("INSERT INTO COLLECTION col VALUES {'text': 'hi'} USING MODEL 'my-model'") + assert node.hybrid is False + assert node.model == "my-model" + assert node.sparse_model is None + + def test_insert_hybrid_dense_model(self): + node = parse( + "INSERT INTO COLLECTION col VALUES {'text': 'hi'} " + "USING HYBRID DENSE MODEL 'BAAI/bge-small-en-v1.5'" + ) + assert node.hybrid is True + assert node.model == "BAAI/bge-small-en-v1.5" + assert node.sparse_model is None + + def test_insert_hybrid_sparse_model(self): + node = parse( + "INSERT INTO COLLECTION col 
VALUES {'text': 'hi'} " + "USING HYBRID SPARSE MODEL 'Qdrant/bm25'" + ) + assert node.hybrid is True + assert node.model is None + assert node.sparse_model == "Qdrant/bm25" + + def test_insert_hybrid_both_models(self): + node = parse( + "INSERT INTO COLLECTION col VALUES {'text': 'hi'} " + "USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5' SPARSE MODEL 'Qdrant/bm25'" + ) + assert node.hybrid is True + assert node.model == "BAAI/bge-base-en-v1.5" + assert node.sparse_model == "Qdrant/bm25" + + def test_insert_hybrid_both_models_reversed_order(self): + node = parse( + "INSERT INTO COLLECTION col VALUES {'text': 'hi'} " + "USING HYBRID SPARSE MODEL 'Qdrant/bm25' DENSE MODEL 'BAAI/bge-base-en-v1.5'" + ) + assert node.hybrid is True + assert node.model == "BAAI/bge-base-en-v1.5" + assert node.sparse_model == "Qdrant/bm25" + + +class TestHybridSearch: + def test_search_using_hybrid_sets_flag(self): + node = parse("SEARCH articles SIMILAR TO 'ml' LIMIT 10 USING HYBRID") + assert isinstance(node, SearchStmt) + assert node.hybrid is True + assert node.model is None + assert node.sparse_model is None + + def test_search_non_hybrid_default(self): + node = parse("SEARCH articles SIMILAR TO 'ml' LIMIT 10") + assert node.hybrid is False + assert node.sparse_model is None + + def test_search_using_model_still_works(self): + node = parse("SEARCH articles SIMILAR TO 'ml' LIMIT 5 USING MODEL 'my-model'") + assert node.hybrid is False + assert node.model == "my-model" + assert node.sparse_model is None + + def test_search_hybrid_dense_model(self): + node = parse( + "SEARCH articles SIMILAR TO 'ml' LIMIT 10 " + "USING HYBRID DENSE MODEL 'BAAI/bge-small-en-v1.5'" + ) + assert node.hybrid is True + assert node.model == "BAAI/bge-small-en-v1.5" + assert node.sparse_model is None + + def test_search_hybrid_sparse_model(self): + node = parse( + "SEARCH articles SIMILAR TO 'ml' LIMIT 10 " + "USING HYBRID SPARSE MODEL 'prithivida/Splade_PP_en_v1'" + ) + assert node.hybrid is True + assert 
node.model is None + assert node.sparse_model == "prithivida/Splade_PP_en_v1" + + def test_search_hybrid_both_models(self): + node = parse( + "SEARCH articles SIMILAR TO 'ml' LIMIT 10 " + "USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5' SPARSE MODEL 'Qdrant/bm25'" + ) + assert node.hybrid is True + assert node.model == "BAAI/bge-base-en-v1.5" + assert node.sparse_model == "Qdrant/bm25" + + def test_search_hybrid_both_models_reversed_order(self): + node = parse( + "SEARCH articles SIMILAR TO 'ml' LIMIT 10 " + "USING HYBRID SPARSE MODEL 'Qdrant/bm25' DENSE MODEL 'BAAI/bge-base-en-v1.5'" + ) + assert node.hybrid is True + assert node.model == "BAAI/bge-base-en-v1.5" + assert node.sparse_model == "Qdrant/bm25" + + def test_search_hybrid_with_where(self): + node = parse( + "SEARCH articles SIMILAR TO 'ml' LIMIT 10 USING HYBRID WHERE year > 2020" + ) + assert node.hybrid is True + assert isinstance(node.query_filter, CompareExpr) + assert node.query_filter.field == "year" + + def test_search_hybrid_dense_model_and_where(self): + node = parse( + "SEARCH articles SIMILAR TO 'ml' LIMIT 10 " + "USING HYBRID DENSE MODEL 'BAAI/bge-small-en-v1.5' WHERE year > 2020" + ) + assert node.hybrid is True + assert node.model == "BAAI/bge-small-en-v1.5" + assert isinstance(node.query_filter, CompareExpr) + + def test_search_hybrid_limit_preserved(self): + node = parse("SEARCH col SIMILAR TO 'q' LIMIT 7 USING HYBRID") + assert node.limit == 7
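Note on the hybrid SEARCH path covered by these tests: the executor prefetches `limit * 4` candidates from each of the `dense` and `sparse` named vectors and asks Qdrant to fuse them with `Fusion.RRF`. The fusion itself runs server-side and is not part of this diff. The following is only a minimal client-side sketch of Reciprocal Rank Fusion scoring, to illustrate how two rankings might be merged. The helper name `rrf_fuse` and the constant `k=60` (the value commonly used in the RRF literature) are assumptions for illustration; Qdrant's own constant and tie-breaking may differ.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, limit=10):
    """Fuse ranked ID lists with Reciprocal Rank Fusion (illustrative sketch).

    rankings: list of lists of point IDs, each ordered best-first.
    k:        smoothing constant (assumed value, not taken from Qdrant).
    limit:    number of fused results to return.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        # Ranks are 1-based: the best hit in each list contributes 1 / (k + 1).
        for rank, point_id in enumerate(ranking, start=1):
            scores[point_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    fused = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return fused[:limit]


# Hypothetical candidate lists, standing in for the two Prefetch results.
dense_hits = ["a", "b", "c"]
sparse_hits = ["c", "a", "d"]
fused = rrf_fuse([dense_hits, sparse_hits], limit=3)
for point_id, score in fused:
    print(point_id, round(score, 4))
```

Points that appear in both candidate lists accumulate two contributions, which is why over-fetching each prefetch (here 4x the final limit) gives the fusion step more overlap to work with.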