Merged
7 changes: 4 additions & 3 deletions README.md
@@ -5,7 +5,7 @@
[![PyPI version](https://img.shields.io/pypi/v/qql-cli?color=blue&label=PyPI)](https://pypi.org/project/qql-cli/)
[![Python 3.12+](https://img.shields.io/pypi/pyversions/qql-cli)](https://pypi.org/project/qql-cli/)
[![MIT License](https://img.shields.io/badge/license-MIT-green)](LICENSE)
[![Tests](https://img.shields.io/badge/tests-375%20passing-brightgreen)](tests/)
[![Tests](https://img.shields.io/badge/tests-405%20passing-brightgreen)](tests/)

Write `INSERT`, `SEARCH`, `RECOMMEND`, `DELETE`, and `CREATE COLLECTION` statements instead of Python SDK calls. Supports hybrid dense+sparse vector search, cross-encoder reranking, quantization (scalar, turbo, binary, product), SQL-style `WHERE` filters, script execution, and collection dump/restore.

@@ -48,7 +48,7 @@ Your query string
Qdrant instance
```

When you run `INSERT`, the `text` field is automatically converted into a dense vector using [Fastembed](https://github.com/qdrant/fastembed). In **hybrid mode** (`USING HYBRID`), a sparse BM25 vector is also generated alongside the dense vector, and searches use Qdrant's Reciprocal Rank Fusion (RRF) to merge the results of both retrieval methods.
When you run `INSERT`, the `text` field is automatically converted into a dense vector using [Fastembed](https://github.com/qdrant/fastembed). In **hybrid mode** (`USING HYBRID`), a sparse BM25 vector is also generated alongside the dense vector, and searches use Qdrant's Reciprocal Rank Fusion (RRF) by default to merge the results of both retrieval methods. You can switch hybrid search to DBSF with `FUSION 'dbsf'`.

---

@@ -102,6 +102,7 @@ INSERT BULK INTO COLLECTION articles VALUES [{'text': '...'}, {'text': '...'}]
SEARCH articles SIMILAR TO 'query' LIMIT 10
SEARCH articles SIMILAR TO 'query' LIMIT 10 WHERE year >= 2020
SEARCH articles SIMILAR TO 'query' LIMIT 10 USING HYBRID
SEARCH articles SIMILAR TO 'query' LIMIT 10 USING HYBRID FUSION 'dbsf'
SEARCH articles SIMILAR TO 'query' LIMIT 10 USING HYBRID RERANK

-- Recommend
@@ -137,7 +138,7 @@ Tests do not require a running Qdrant instance — the Qdrant client is mocked.
pytest tests/ -v
```

Expected: **375 tests passing**.
Expected: **405 tests passing**.

---

2 changes: 1 addition & 1 deletion docs/getting-started.md
@@ -24,7 +24,7 @@ Your query string
Qdrant instance
```

When you run `INSERT`, the `text` field is automatically converted into a dense vector using [Fastembed](https://github.com/qdrant/fastembed). In **hybrid mode** (`USING HYBRID`), a sparse BM25 vector is also generated alongside the dense vector, and searches use Qdrant's Reciprocal Rank Fusion (RRF) to merge the results of both retrieval methods.
When you run `INSERT`, the `text` field is automatically converted into a dense vector using [Fastembed](https://github.com/qdrant/fastembed). In **hybrid mode** (`USING HYBRID`), a sparse BM25 vector is also generated alongside the dense vector, and searches use Qdrant's Reciprocal Rank Fusion (RRF) by default to merge the results of both retrieval methods. You can override that with `FUSION 'dbsf'` on hybrid searches.

---

2 changes: 1 addition & 1 deletion docs/index.html
@@ -114,7 +114,7 @@ <h1>QQL</h1>
<a href="https://pypi.org/project/qql-cli/"><img src="https://img.shields.io/pypi/v/qql-cli?color=blue&label=PyPI" alt="PyPI version" /></a>
<a href="https://pypi.org/project/qql-cli/"><img src="https://img.shields.io/pypi/pyversions/qql-cli" alt="Python versions" /></a>
<a href="https://github.com/pavanjava/qql/blob/main/LICENSE"><img src="https://img.shields.io/badge/license-MIT-green" alt="MIT License" /></a>
<a href="https://github.com/pavanjava/qql/actions"><img src="https://img.shields.io/badge/tests-375%20passing-brightgreen" alt="375 tests" /></a>
<a href="https://github.com/pavanjava/qql/actions"><img src="https://img.shields.io/badge/tests-405%20passing-brightgreen" alt="405 tests" /></a>
</div>

<pre><span class="cmt"># Install</span>
5 changes: 4 additions & 1 deletion docs/reference.md
@@ -36,6 +36,9 @@ SEARCH docs SIMILAR TO 'hello' LIMIT 5 USING MODEL 'BAAI/bge-small-en-v1.5'
-- Hybrid with custom dense model
SEARCH docs SIMILAR TO 'hello' LIMIT 5 USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5'

-- Hybrid with explicit fusion strategy
SEARCH docs SIMILAR TO 'hello' LIMIT 5 USING HYBRID FUSION 'dbsf'

-- Hybrid with both custom
SEARCH docs SIMILAR TO 'hello' LIMIT 5
USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5' SPARSE MODEL 'prithivida/Splade_PP_en_v1'
@@ -159,7 +162,7 @@ Tests do not require a running Qdrant instance — the Qdrant client is mocked.
pytest tests/ -v
```

Expected output: **375 tests passing**.
Expected output: **405 tests passing**.

---

12 changes: 8 additions & 4 deletions docs/search.md
@@ -14,7 +14,7 @@ SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n>
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> USING MODEL '<model_name>'
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> [USING MODEL '<model>'] WHERE <filter>
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> USING HYBRID
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> USING HYBRID [DENSE MODEL '<model>'] [SPARSE MODEL '<model>'] [WHERE <filter>]
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> USING HYBRID [FUSION 'rrf|dbsf'] [DENSE MODEL '<model>'] [SPARSE MODEL '<model>'] [WHERE <filter>]
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> USING SPARSE [MODEL '<sparse_model>']
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> EXACT
SEARCH <collection_name> SIMILAR TO '<query_text>' LIMIT <n> [USING ...] [WHERE <filter>] [RERANK] WITH { hnsw_ef: <n>, exact: true|false, acorn: true|false }
@@ -33,7 +33,7 @@ Search only papers published after 2020:
SEARCH articles SIMILAR TO 'deep learning' LIMIT 10 WHERE year > 2020
```

Hybrid search (combines dense semantic + sparse BM25 keyword retrieval via RRF):
Hybrid search (combines dense semantic + sparse BM25 keyword retrieval via RRF by default):
```sql
SEARCH articles SIMILAR TO 'attention mechanism' LIMIT 10 USING HYBRID
```
@@ -100,13 +100,13 @@ SEARCH articles SIMILAR TO 'RAG' LIMIT 10 WHERE tag = 'li' WITH { acorn: true }

## Hybrid Search (USING HYBRID)

Hybrid search combines **dense semantic vectors** and **sparse BM25 keyword vectors** in a single query and merges the results with Qdrant's **Reciprocal Rank Fusion (RRF)** algorithm. This typically outperforms either method alone.
Hybrid search combines **dense semantic vectors** and **sparse BM25 keyword vectors** in a single query. By default QQL merges the two result sets with Qdrant's **Reciprocal Rank Fusion (RRF)** algorithm, and you can optionally switch to **DBSF** with a `FUSION` clause.

### How it works internally

1. Both a dense vector (`TextEmbedding`) and a sparse BM25 vector (`SparseTextEmbedding`) are generated from your query text.
2. Qdrant fetches the top candidates from each index independently (`prefetch limit = LIMIT × 4`).
3. The two result lists are merged using RRF — a rank-based fusion that does not require score normalization.
3. The two result lists are merged using the selected fusion strategy (`RRF` by default, or `DBSF` when requested).
4. The final top-N results are returned.
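
The fusion step (step 3) can be illustrated with a small standalone sketch. This is not Qdrant's implementation: the function names `rrf` and `dbsf`, the constant `k=60`, and the sample scores are assumptions chosen for illustration. RRF only looks at ranks, while DBSF normalizes each list's raw scores (here using mean ± 3 standard deviations as the bounds) and sums them per point.

```python
from statistics import mean, pstdev


def rrf(rankings, k=60):
    """Reciprocal Rank Fusion over ranked lists of point ids.

    `k` is an illustrative smoothing constant, not necessarily Qdrant's value.
    """
    scores = {}
    for ranking in rankings:
        for rank, point_id in enumerate(ranking, start=1):
            # Each list contributes 1 / (k + rank); raw scores are ignored.
            scores[point_id] = scores.get(point_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


def dbsf(score_lists):
    """Distribution-based score fusion over {point_id: raw_score} dicts."""
    fused = {}
    for scores in score_lists:
        values = list(scores.values())
        mu, sigma = mean(values), pstdev(values)
        low, high = mu - 3 * sigma, mu + 3 * sigma
        for point_id, raw in scores.items():
            # Map each raw score into [0, 1] relative to this list's spread.
            norm = 0.5 if high == low else (raw - low) / (high - low)
            norm = min(1.0, max(0.0, norm))
            fused[point_id] = fused.get(point_id, 0.0) + norm
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical candidates from the dense and sparse prefetches.
dense = {"a": 0.9, "b": 0.8, "c": 0.1}
sparse = {"b": 12.0, "d": 7.0, "a": 2.0}

rrf_order = rrf([list(dense), list(sparse)])
dbsf_order = dbsf([dense, sparse])
print(rrf_order, dbsf_order)
```

Because RRF discards score magnitudes, it is insensitive to the different scales of dense and sparse scores; DBSF keeps some of that magnitude information after normalization, which is why the two strategies can order the same candidates differently on real data.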

### Step 1: Create a hybrid collection
@@ -139,6 +139,9 @@ SEARCH articles SIMILAR TO 'transformer architecture' LIMIT 10 USING HYBRID
-- Hybrid search with a WHERE filter
SEARCH articles SIMILAR TO 'attention' LIMIT 10 USING HYBRID WHERE year >= 2017

-- Hybrid with DBSF fusion
SEARCH articles SIMILAR TO 'hybrid retrieval' LIMIT 10 USING HYBRID FUSION 'dbsf'

-- Hybrid with custom dense model
SEARCH articles SIMILAR TO 'embeddings' LIMIT 5
USING HYBRID DENSE MODEL 'BAAI/bge-base-en-v1.5'
@@ -154,6 +157,7 @@ SEARCH articles SIMILAR TO 'sparse retrieval' LIMIT 5
|---|---|
| Dense model | configured default (`sentence-transformers/all-MiniLM-L6-v2`) |
| Sparse model | `Qdrant/bm25` |
| Fusion | `rrf` |

### Dense vs. hybrid — when to use which

1 change: 1 addition & 0 deletions src/qql/ast_nodes.py
@@ -187,6 +187,7 @@ class SearchStmt:
    limit: int
    model: str | None  # dense model; None → use config default
    hybrid: bool = False  # if True, use prefetch+RRF hybrid search
    fusion: str | None = None  # hybrid fusion strategy; None → default rrf
    sparse_only: bool = False  # if True, query only the sparse vector (no dense)
    sparse_model: str | None = None  # sparse model for hybrid/sparse-only; None → SparseEmbedder.DEFAULT_MODEL
    query_filter: FilterExpr | None = None  # optional WHERE clause; default keeps existing tests valid
2 changes: 1 addition & 1 deletion src/qql/cli.py
@@ -52,7 +52,7 @@
[yellow]SEARCH[/yellow] <name> [yellow]SIMILAR TO[/yellow] '<text>' [yellow]LIMIT[/yellow] <n>
Semantic search by vector similarity.
Optional: [yellow]USING MODEL[/yellow] '<model>'
Optional: [yellow]USING HYBRID[/yellow] [DENSE MODEL '<model>'] [SPARSE MODEL '<model>']
Optional: [yellow]USING HYBRID[/yellow] [FUSION 'rrf|dbsf'] [DENSE MODEL '<model>'] [SPARSE MODEL '<model>']
Optional: [yellow]USING SPARSE[/yellow] [MODEL '<model>'] sparse-vector-only search
Optional: [yellow]WHERE[/yellow] <filter> (e.g. WHERE year > 2020 AND status = 'ok')
Optional: [yellow]RERANK[/yellow] [MODEL '<model>'] rerank results with a cross-encoder
13 changes: 11 additions & 2 deletions src/qql/executor.py
@@ -429,7 +429,7 @@ def _execute_search(self, node: SearchStmt) -> ExecutionResult:
        # enough material to reorder; only `node.limit` results are returned.
        fetch_limit = node.limit * _RERANK_FETCH_MULTIPLIER if node.rerank else node.limit

        # ── Hybrid SEARCH: prefetch dense+sparse, fuse with RRF ───────────
        # ── Hybrid SEARCH: prefetch dense+sparse, fuse with the requested strategy ──
        if node.hybrid:
            dense_model = node.model or self._config.default_model
            sparse_model_name = node.sparse_model or SparseEmbedder.DEFAULT_MODEL
@@ -460,7 +460,7 @@ def _execute_search(self, node: SearchStmt) -> ExecutionResult:
                        params=search_params,
                    ),
                ],
                query=FusionQuery(fusion=Fusion.RRF),
                query=FusionQuery(fusion=self._resolve_hybrid_fusion(node.fusion)),
                limit=fetch_limit,
                query_filter=qdrant_filter,
            )
@@ -563,6 +563,15 @@ def _execute_search(self, node: SearchStmt) -> ExecutionResult:
            data=results,
        )

    def _resolve_hybrid_fusion(self, fusion: str | None) -> Fusion:
        if fusion is None or fusion == "rrf":
            return Fusion.RRF
        if fusion == "dbsf":
            return Fusion.DBSF
        raise QQLRuntimeError(
            f"Unsupported hybrid fusion '{fusion}'; expected 'rrf' or 'dbsf'"
        )

    def _execute_recommend(self, node: RecommendStmt) -> ExecutionResult:
        if not self._client.collection_exists(node.collection):
            raise QQLRuntimeError(f"Collection '{node.collection}' does not exist")
2 changes: 2 additions & 0 deletions src/qql/lexer.py
@@ -14,6 +14,7 @@ class TokenKind(Enum):
    USING = auto()
    MODEL = auto()
    HYBRID = auto()
    FUSION = auto()
    DENSE = auto()
    SPARSE = auto()
    RERANK = auto()
@@ -102,6 +103,7 @@ class TokenKind(Enum):
    "USING": TokenKind.USING,
    "MODEL": TokenKind.MODEL,
    "HYBRID": TokenKind.HYBRID,
    "FUSION": TokenKind.FUSION,
    "DENSE": TokenKind.DENSE,
    "SPARSE": TokenKind.SPARSE,
    "RERANK": TokenKind.RERANK,
17 changes: 15 additions & 2 deletions src/qql/parser.py
@@ -43,6 +43,8 @@
    TokenKind.LTE: "<=",
}

_HYBRID_FUSION_VALUES = {"rrf", "dbsf"}


class Parser:
    def __init__(self, tokens: list[Token]) -> None:
@@ -304,16 +306,26 @@ def _parse_search(self) -> SearchStmt:

        model: str | None = None
        hybrid: bool = False
        fusion: str | None = None
        sparse_only: bool = False
        sparse_model: str | None = None
        if self._peek().kind == TokenKind.USING:
            self._advance()  # consume USING
            if self._peek().kind == TokenKind.HYBRID:
                self._advance()  # consume HYBRID
                hybrid = True
                # Optional DENSE MODEL and/or SPARSE MODEL sub-clauses, any order
                while self._peek().kind in (TokenKind.DENSE, TokenKind.SPARSE):
                # Optional FUSION / DENSE MODEL / SPARSE MODEL sub-clauses, any order.
                while self._peek().kind in (TokenKind.FUSION, TokenKind.DENSE, TokenKind.SPARSE):
                    sub = self._advance()
                    if sub.kind == TokenKind.FUSION:
                        value_tok = self._expect(TokenKind.STRING)
                        fusion = value_tok.value.lower()
                        if fusion not in _HYBRID_FUSION_VALUES:
                            raise QQLSyntaxError(
                                f"Unsupported hybrid fusion '{value_tok.value}'; expected 'rrf' or 'dbsf'",
                                value_tok.pos,
                            )
                        continue
                    self._expect(TokenKind.MODEL)
                    m = self._expect(TokenKind.STRING).value
                    if sub.kind == TokenKind.DENSE:
@@ -368,6 +380,7 @@ def _parse_search(self) -> SearchStmt:
            limit=limit,
            model=model,
            hybrid=hybrid,
            fusion=fusion,
            sparse_only=sparse_only,
            sparse_model=sparse_model,
            query_filter=query_filter,
23 changes: 23 additions & 0 deletions tests/test_executor.py
@@ -1063,6 +1063,29 @@ def test_hybrid_search_uses_rrf_fusion(
        assert isinstance(kw["query"], FusionQuery)
        assert kw["query"].fusion == Fusion.RRF

    def test_hybrid_search_uses_dbsf_fusion(
        self, executor, mock_client, mock_sparse_embedder, mocker
    ):
        from qdrant_client.models import Fusion, FusionQuery

        mock_client.collection_exists.return_value = True
        mock_resp = mocker.MagicMock()
        mock_resp.points = []
        mock_client.query_points.return_value = mock_resp

        node = SearchStmt(
            collection="col",
            query_text="q",
            limit=5,
            model=None,
            hybrid=True,
            fusion="dbsf",
        )
        executor.execute(node)
        kw = mock_client.query_points.call_args.kwargs
        assert isinstance(kw["query"], FusionQuery)
        assert kw["query"].fusion == Fusion.DBSF

    def test_hybrid_search_prefetch_limit_is_4x(
        self, executor, mock_client, mock_sparse_embedder, mocker
    ):
4 changes: 4 additions & 0 deletions tests/test_lexer.py
@@ -212,6 +212,10 @@ def test_sparse_keyword_lowercase(self):
        ks = kinds("sparse")
        assert ks[0] == TokenKind.SPARSE

    def test_fusion_keyword(self):
        ks = kinds("FUSION")
        assert ks[0] == TokenKind.FUSION

    def test_hybrid_in_create_statement(self):
        ks = kinds("CREATE COLLECTION articles HYBRID")
        assert ks[3] == TokenKind.HYBRID
22 changes: 22 additions & 0 deletions tests/test_parser.py
@@ -704,6 +704,24 @@ def test_search_hybrid_with_where(self):
        assert isinstance(node.query_filter, CompareExpr)
        assert node.query_filter.field == "year"

    def test_search_hybrid_with_dbsf_fusion(self):
        node = parse(
            "SEARCH docs SIMILAR TO 'q' LIMIT 10 USING HYBRID FUSION 'dbsf'"
        )
        assert node.hybrid is True
        assert node.fusion == "dbsf"

    def test_search_hybrid_with_fusion_and_models(self):
        node = parse(
            "SEARCH docs SIMILAR TO 'q' LIMIT 10 "
            "USING HYBRID FUSION 'rrf' SPARSE MODEL 'Qdrant/bm25' "
            "DENSE MODEL 'BAAI/bge-base-en-v1.5'"
        )
        assert node.hybrid is True
        assert node.fusion == "rrf"
        assert node.sparse_model == "Qdrant/bm25"
        assert node.model == "BAAI/bge-base-en-v1.5"

    def test_search_hybrid_dense_model_and_where(self):
        node = parse(
            "SEARCH articles SIMILAR TO 'ml' LIMIT 10 "
@@ -713,6 +731,10 @@ def test_search_hybrid_dense_model_and_where(self):
        assert node.model == "BAAI/bge-small-en-v1.5"
        assert isinstance(node.query_filter, CompareExpr)

    def test_search_hybrid_rejects_unknown_fusion(self):
        with pytest.raises(QQLSyntaxError, match="Unsupported hybrid fusion"):
            parse("SEARCH docs SIMILAR TO 'q' LIMIT 10 USING HYBRID FUSION 'x'")

    def test_search_hybrid_limit_preserved(self):
        node = parse("SEARCH col SIMILAR TO 'q' LIMIT 7 USING HYBRID")
        assert node.limit == 7