Skip to content

Commit 301bad5

Browse files
committed
Fixed all issues
1 parent 573f850 commit 301bad5

8 files changed

Lines changed: 280 additions & 5 deletions

File tree

README.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@
55
[![PyPI version](https://img.shields.io/pypi/v/qql-cli?color=blue&label=PyPI)](https://pypi.org/project/qql-cli/)
66
[![Python 3.12+](https://img.shields.io/pypi/pyversions/qql-cli)](https://pypi.org/project/qql-cli/)
77
[![MIT License](https://img.shields.io/badge/license-MIT-green)](LICENSE)
8-
[![Tests](https://img.shields.io/badge/tests-485%20passing-brightgreen)](tests/)
8+
[![Tests](https://img.shields.io/badge/tests-500%20passing-brightgreen)](tests/)
99

1010
Write `INSERT`, `SELECT`, `SEARCH`, `SCROLL`, `RECOMMEND`, `UPDATE`, `DELETE`, and `CREATE COLLECTION` statements instead of Python SDK calls. Supports hybrid dense+sparse vector search, grouped search (GROUP BY), cross-encoder reranking, quantization (scalar, turbo, binary, product), SQL-style `WHERE` filters, script execution, and collection dump/restore.
1111

@@ -158,7 +158,7 @@ Tests do not require a running Qdrant instance — the Qdrant client is mocked.
158158
pytest tests/ -v
159159
```
160160

161-
Expected: **485 tests passing**.
161+
Expected: **500 tests passing**.
162162

163163
---
164164

docs/reference.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -162,7 +162,7 @@ Tests do not require a running Qdrant instance — the Qdrant client is mocked.
162162
pytest tests/ -v
163163
```
164164

165-
Expected output: **485 tests passing**.
165+
Expected output: **500 tests passing**.
166166

167167
---
168168

@@ -188,6 +188,8 @@ Expected output: **485 tests passing**.
188188
| `Expected a vector list [...] after point ID in UPDATE SET VECTOR` | UPDATE SET VECTOR missing the `[...]` float array | Add the vector array: `UPDATE ... SET VECTOR WHERE id = '...' [0.1, 0.2, ...]` |
189189
| `Qdrant error during UPDATE VECTOR: ...` | Point does not exist, or vector dimensions mismatch | Verify the point ID exists and the vector length matches the collection's dimensions |
190190
| `Qdrant error during UPDATE PAYLOAD: ...` | Qdrant rejected the payload update | Check field values and collection state |
191+
| `Vector elements must be numeric; got invalid value: ...` | A non-numeric value (string, boolean, or null) was present in the vector array for `UPDATE SET VECTOR` | Ensure all vector elements are floats: `UPDATE … [0.1, 0.2, …, 0.N]` |
192+
| `GROUP_SIZE must be a positive integer, got N` | `GROUP_SIZE 0` or a negative value was specified | Use a positive integer: `GROUP_SIZE 3` |
191193
| `Qdrant error during SCROLL: ...` | Qdrant rejected scroll request | Verify collection state, filter, and cursor (`AFTER`) value |
192194
| `Unknown index type '...'` | Invalid schema type in CREATE INDEX | Use one of: `keyword`, `integer`, `float`, `bool`, `text`, `geo`, `datetime` |
193195
| `Qdrant error during CREATE INDEX: ...` | Qdrant rejected the index creation | Check field name and collection state |

docs/search.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -362,7 +362,7 @@ SEARCH <collection> SIMILAR TO '<query>' LIMIT <n> USING HYBRID GROUP BY <field>
362362

363363
- **`LIMIT <n>`** — maximum number of **groups** to return.
364364
- **`GROUP_SIZE <m>`** — maximum number of points per group (default: **3**).
365-
- **`GROUP BY <field>`** — the payload field whose values define the groups. Dot-notation is supported (e.g. `meta.author`). The field should be indexed as `keyword` or `integer` for best performance.
365+
- **`GROUP BY <field>`** — the payload field whose values define the groups. **Must be a string (keyword) or number (integer) field** — this is enforced by Qdrant. Dot-notation is supported (e.g. `meta.author`). Array-valued fields are allowed: a point with multiple values for the field can appear in multiple groups. The field should be indexed as `keyword` or `integer` for best performance (see [CREATE INDEX](collections.md)).
366366
- `WHERE` filters, `USING HYBRID`, and `USING MODEL` are all compatible with GROUP BY.
367367
- **`GROUP BY` and `RERANK` cannot be combined** in the same statement — this raises a syntax error.
368368

src/qql/cli.py

Lines changed: 38 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -71,6 +71,9 @@
7171
Optional: [yellow]RERANK[/yellow] [MODEL '<model>'] rerank results with a cross-encoder
7272
Optional: [yellow]EXACT[/yellow] bypass HNSW and perform exact search
7373
Optional: [yellow]WITH[/yellow] { hnsw_ef: <int>, exact: <bool>, acorn: <bool> } search parameters
74+
Optional: [yellow]GROUP BY[/yellow] <field> [[yellow]GROUP_SIZE[/yellow] <n>]
75+
Group results by a payload field value (default GROUP_SIZE: 3).
76+
Field must be keyword or integer type. RERANK and GROUP BY cannot be combined.
7477
7578
[yellow]RECOMMEND FROM[/yellow] <name> [yellow]POSITIVE IDS[/yellow] (<id>, ...)
7679
Find points similar to known examples.
@@ -82,6 +85,15 @@
8285
[yellow]DELETE FROM[/yellow] <name> [yellow]WHERE id =[/yellow] '<id>'
8386
Delete a point by its ID.
8487
88+
[yellow]UPDATE[/yellow] <name> [yellow]SET VECTOR WHERE id =[/yellow] '<id>'|<int> [<vector>]
89+
Replace the dense vector for a single point by ID.
90+
The point must already exist. Vector is a float array: [0.1, 0.2, ..., 0.N]
91+
92+
[yellow]UPDATE[/yellow] <name> [yellow]SET PAYLOAD WHERE id =[/yellow] '<id>'|<int> {<payload>}
93+
[yellow]UPDATE[/yellow] <name> [yellow]SET PAYLOAD WHERE[/yellow] <filter> {<payload>}
94+
Merge new key/value pairs into a point's payload (additive; existing fields preserved).
95+
Supports all WHERE filter operators. Filter-based updates affect all matching points.
96+
8597
Script files (in-shell):
8698
[yellow]EXECUTE[/yellow] <path> or [yellow]\\e[/yellow] <path>
8799
Run a .qql script file. Statements are executed in order.
@@ -458,6 +470,32 @@ def _run_and_print(executor: Executor, query: str) -> None:
458470
console.print(_format_collection_diagnostics(result.data))
459471
return
460472

473+
# Pretty-print grouped search results (GROUP BY)
474+
if (
475+
isinstance(result.data, list)
476+
and result.data
477+
and isinstance(result.data[0], dict)
478+
and "group_id" in result.data[0]
479+
):
480+
for group in result.data:
481+
console.print(f"\n[bold cyan]Group: {group['group_id']}[/bold cyan]")
482+
hits = group.get("hits", [])
483+
if hits:
484+
tbl = Table(show_header=True, header_style="bold")
485+
tbl.add_column("Score", style="green", no_wrap=True, justify="right")
486+
tbl.add_column("ID")
487+
tbl.add_column("Payload")
488+
for hit in hits:
489+
tbl.add_row(
490+
str(hit["score"]),
491+
str(hit["id"]),
492+
str(hit.get("payload", {})),
493+
)
494+
console.print(tbl)
495+
else:
496+
console.print(" (no hits)")
497+
return
498+
461499
# Pretty-print search results
462500
if isinstance(result.data, list) and result.data and isinstance(result.data[0], dict) and "score" in result.data[0]:
463501
table = Table(show_header=True, header_style="bold cyan")

src/qql/executor.py

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -983,6 +983,23 @@ def _execute_search_groups(
983983
query_filter=qdrant_filter,
984984
)
985985
label = "hybrid, grouped"
986+
elif node.sparse_only:
987+
sparse_model_name = node.sparse_model or SparseEmbedder.DEFAULT_MODEL
988+
sparse_obj = SparseEmbedder(sparse_model_name).query_embed(node.query_text)
989+
sparse_vector = SparseVector(
990+
indices=sparse_obj["indices"],
991+
values=sparse_obj["values"],
992+
)
993+
response = self._client.query_points_groups(
994+
collection_name=node.collection,
995+
group_by=node.group_by,
996+
query=sparse_vector,
997+
using="sparse",
998+
limit=node.limit,
999+
group_size=node.group_size,
1000+
query_filter=qdrant_filter,
1001+
)
1002+
label = "sparse, grouped"
9861003
else:
9871004
model_name = node.model or self._config.default_model
9881005
vector = Embedder(model_name).embed(node.query_text)

src/qql/parser.py

Lines changed: 14 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -439,7 +439,13 @@ def _parse_search(self) -> SearchStmt:
439439
)
440440
if self._peek().kind == TokenKind.GROUP_SIZE:
441441
self._advance() # consume GROUP_SIZE
442+
gs_tok = self._peek()
442443
group_size = int(self._expect(TokenKind.INTEGER).value)
444+
if group_size <= 0:
445+
raise QQLSyntaxError(
446+
f"GROUP_SIZE must be a positive integer, got {group_size}",
447+
gs_tok.pos,
448+
)
443449
return SearchStmt(
444450
collection=collection,
445451
query_text=query_text,
@@ -566,10 +572,17 @@ def _parse_update(self) -> UpdateVectorStmt | UpdatePayloadStmt:
566572
"Expected a vector list [...] after point ID in UPDATE SET VECTOR",
567573
self._peek().pos,
568574
)
575+
try:
576+
coerced = tuple(float(v) for v in vector_val)
577+
except (ValueError, TypeError) as exc:
578+
raise QQLSyntaxError(
579+
f"Vector elements must be numeric; got invalid value: {exc}",
580+
self._peek().pos,
581+
) from exc
569582
return UpdateVectorStmt(
570583
collection=collection,
571584
point_id=point_id,
572-
vector=tuple(float(v) for v in vector_val),
585+
vector=coerced,
573586
)
574587

575588
if self._peek().kind == TokenKind.PAYLOAD:

tests/test_executor.py

Lines changed: 170 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2299,3 +2299,173 @@ def test_update_payload_passes_wait_true(self, executor, mock_client):
22992299
executor.execute(node)
23002300
kwargs = mock_client.set_payload.call_args.kwargs
23012301
assert kwargs.get("wait") is True
2302+
2303+
2304+
# ── PR #28 review gap fixes ───────────────────────────────────────────────────
2305+
2306+
class TestSearchGroupBySparse:
2307+
"""Gap 1 & 6 — sparse-only grouped search must use the sparse path."""
2308+
2309+
def test_sparse_only_grouped_calls_query_points_groups(self, executor, mock_client, mocker):
2310+
mock_client.collection_exists.return_value = True
2311+
mock_response = mocker.MagicMock()
2312+
mock_response.groups = []
2313+
mock_client.query_points_groups.return_value = mock_response
2314+
2315+
mock_sparse = mocker.MagicMock()
2316+
mock_sparse.query_embed.return_value = {"indices": [0, 1], "values": [0.5, 0.5]}
2317+
mocker.patch("qql.executor.SparseEmbedder", return_value=mock_sparse)
2318+
2319+
node = SearchStmt(
2320+
collection="articles", query_text="q", limit=5, model=None,
2321+
sparse_only=True, group_by="category", group_size=3,
2322+
)
2323+
executor.execute(node)
2324+
mock_client.query_points_groups.assert_called_once()
2325+
kwargs = mock_client.query_points_groups.call_args.kwargs
2326+
assert kwargs.get("using") == "sparse"
2327+
# Must NOT have called dense Embedder
2328+
from qql.embedder import Embedder as _Embedder # noqa: F401
2329+
# mock_embedder fixture patches Embedder; query_points not called confirms no dense path
2330+
mock_client.query_points.assert_not_called()
2331+
2332+
def test_sparse_only_grouped_label_in_message(self, executor, mock_client, mocker):
2333+
mock_client.collection_exists.return_value = True
2334+
mock_response = mocker.MagicMock()
2335+
mock_response.groups = []
2336+
mock_client.query_points_groups.return_value = mock_response
2337+
2338+
mock_sparse = mocker.MagicMock()
2339+
mock_sparse.query_embed.return_value = {"indices": [0], "values": [1.0]}
2340+
mocker.patch("qql.executor.SparseEmbedder", return_value=mock_sparse)
2341+
2342+
node = SearchStmt(
2343+
collection="articles", query_text="q", limit=5, model=None,
2344+
sparse_only=True, group_by="tag", group_size=2,
2345+
)
2346+
result = executor.execute(node)
2347+
assert "sparse" in result.message
2348+
assert "grouped" in result.message
2349+
2350+
2351+
class TestSearchGroupByAdvanced:
2352+
"""Gaps 7 & 8 — fusion and search params forwarding in grouped search."""
2353+
2354+
def test_grouped_hybrid_fusion_dbsf(self, executor, mock_client, mocker):
2355+
from qdrant_client.models import Fusion
2356+
mock_client.collection_exists.return_value = True
2357+
mock_response = mocker.MagicMock()
2358+
mock_response.groups = []
2359+
mock_client.query_points_groups.return_value = mock_response
2360+
2361+
mock_sparse = mocker.MagicMock()
2362+
mock_sparse.query_embed.return_value = {"indices": [0], "values": [1.0]}
2363+
mocker.patch("qql.executor.SparseEmbedder", return_value=mock_sparse)
2364+
2365+
node = SearchStmt(
2366+
collection="articles", query_text="q", limit=3, model=None,
2367+
hybrid=True, fusion="dbsf", group_by="category", group_size=2,
2368+
)
2369+
executor.execute(node)
2370+
kwargs = mock_client.query_points_groups.call_args.kwargs
2371+
fusion_query = kwargs.get("query")
2372+
assert fusion_query is not None
2373+
assert fusion_query.fusion == Fusion.DBSF
2374+
2375+
def test_grouped_search_params_with_clause_forwarded(self, executor, mock_client, mocker):
2376+
mock_client.collection_exists.return_value = True
2377+
mock_response = mocker.MagicMock()
2378+
mock_response.groups = []
2379+
mock_client.query_points_groups.return_value = mock_response
2380+
2381+
node = SearchStmt(
2382+
collection="articles", query_text="q", limit=5, model=None,
2383+
with_clause=SearchWith(exact=True), group_by="category",
2384+
)
2385+
executor.execute(node)
2386+
kwargs = mock_client.query_points_groups.call_args.kwargs
2387+
assert kwargs.get("search_params") is not None
2388+
2389+
2390+
class TestUpdateVectorVectorShape:
2391+
"""Gaps 12 & 13 — verify exact vector shape sent to Qdrant for named/unnamed collections."""
2392+
2393+
def test_update_vector_unnamed_collection_sends_plain_list(self, executor, mock_client):
2394+
from qql.ast_nodes import UpdateVectorStmt
2395+
mock_client.collection_exists.return_value = True
2396+
# Unnamed collection: get_collection returns non-dict vectors
2397+
mock_vectors = mocker.MagicMock() if False else type("V", (), {})()
2398+
info = mock_client.get_collection.return_value
2399+
info.config.params.vectors = [None] # list → not a dict → unnamed
2400+
2401+
node = UpdateVectorStmt(collection="articles", point_id=1, vector=(0.1, 0.2, 0.3))
2402+
executor.execute(node)
2403+
kwargs = mock_client.update_vectors.call_args.kwargs
2404+
pv = kwargs["points"][0]
2405+
assert isinstance(pv.vector, list)
2406+
assert pv.vector == [0.1, 0.2, 0.3]
2407+
2408+
def test_update_vector_named_collection_sends_dict(self, executor, mock_client):
2409+
from qql.ast_nodes import UpdateVectorStmt
2410+
mock_client.collection_exists.return_value = True
2411+
# Named collection: get_collection returns dict vectors
2412+
info = mock_client.get_collection.return_value
2413+
info.config.params.vectors = {"dense": object(), "sparse": object()} # dict → named
2414+
2415+
node = UpdateVectorStmt(collection="articles", point_id="id-1", vector=(0.5, 0.6))
2416+
executor.execute(node)
2417+
kwargs = mock_client.update_vectors.call_args.kwargs
2418+
pv = kwargs["points"][0]
2419+
assert isinstance(pv.vector, dict)
2420+
assert "dense" in pv.vector
2421+
assert pv.vector["dense"] == [0.5, 0.6]
2422+
2423+
def test_update_vector_exact_values_preserved(self, executor, mock_client):
2424+
from qql.ast_nodes import UpdateVectorStmt
2425+
mock_client.collection_exists.return_value = True
2426+
info = mock_client.get_collection.return_value
2427+
info.config.params.vectors = [None] # unnamed
2428+
2429+
vec = (0.11, 0.22, 0.33, 0.44)
2430+
node = UpdateVectorStmt(collection="articles", point_id=99, vector=vec)
2431+
executor.execute(node)
2432+
kwargs = mock_client.update_vectors.call_args.kwargs
2433+
assert kwargs["points"][0].vector == list(vec)
2434+
2435+
2436+
class TestUpdatePayloadMessages:
2437+
"""Gaps 17 — assert specific message text for both update-payload branches."""
2438+
2439+
def test_filter_based_update_message_contains_filter_based(self, executor, mock_client):
2440+
from qql.ast_nodes import UpdatePayloadStmt, CompareExpr
2441+
mock_client.collection_exists.return_value = True
2442+
node = UpdatePayloadStmt(
2443+
collection="articles",
2444+
payload={"status": "done"},
2445+
query_filter=CompareExpr(field="year", op="<", value=2020),
2446+
)
2447+
result = executor.execute(node)
2448+
assert "filter-based" in result.message
2449+
2450+
def test_id_based_update_message_contains_point_id(self, executor, mock_client):
2451+
from qql.ast_nodes import UpdatePayloadStmt
2452+
mock_client.collection_exists.return_value = True
2453+
node = UpdatePayloadStmt(
2454+
collection="articles", point_id="abc-999", payload={"tag": "ai"}
2455+
)
2456+
result = executor.execute(node)
2457+
assert "abc-999" in result.message
2458+
2459+
def test_filter_based_set_payload_receives_filter_object(self, executor, mock_client):
2460+
from qql.ast_nodes import UpdatePayloadStmt, CompareExpr
2461+
from qdrant_client.models import Filter
2462+
mock_client.collection_exists.return_value = True
2463+
node = UpdatePayloadStmt(
2464+
collection="articles",
2465+
payload={"x": 1},
2466+
query_filter=CompareExpr(field="cat", op="=", value="tech"),
2467+
)
2468+
executor.execute(node)
2469+
kwargs = mock_client.set_payload.call_args.kwargs
2470+
# SDK verified: PointsSelector accepts rest.Filter — must receive Filter, not a list
2471+
assert isinstance(kwargs["points"], Filter)

tests/test_parser.py

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1368,3 +1368,38 @@ def test_update_payload_dotted_filter_field(self):
13681368
assert isinstance(node, UpdatePayloadStmt)
13691369
assert node.query_filter is not None
13701370
assert node.payload == {"reviewed": True}
1371+
1372+
1373+
# ── PR #28 review gap fixes ───────────────────────────────────────────────────
1374+
1375+
class TestSearchGroupByValidation:
1376+
"""Parser-level validation added for PR #28 gaps 2 and 2."""
1377+
1378+
def test_group_size_zero_raises(self):
1379+
with pytest.raises(QQLSyntaxError, match="GROUP_SIZE must be a positive integer"):
1380+
parse("SEARCH articles SIMILAR TO 'q' LIMIT 5 GROUP BY category GROUP_SIZE 0")
1381+
1382+
def test_group_size_negative_raises(self):
1383+
with pytest.raises(QQLSyntaxError, match="GROUP_SIZE must be a positive integer"):
1384+
parse("SEARCH articles SIMILAR TO 'q' LIMIT 5 GROUP BY category GROUP_SIZE -1")
1385+
1386+
1387+
class TestUpdateVectorValidation:
1388+
"""PR #28 gap 11 — non-numeric vector elements should raise QQLSyntaxError."""
1389+
1390+
def test_non_numeric_string_element_raises(self):
1391+
with pytest.raises(QQLSyntaxError, match="Vector elements must be numeric"):
1392+
parse("UPDATE articles SET VECTOR WHERE id = 1 ['abc', 0.2, 0.3]")
1393+
1394+
def test_none_element_raises(self):
1395+
# null parsed as Python None → TypeError → QQLSyntaxError
1396+
with pytest.raises(QQLSyntaxError, match="Vector elements must be numeric"):
1397+
parse("UPDATE articles SET VECTOR WHERE id = 1 [null, 0.2]")
1398+
1399+
1400+
class TestUpdateSetInvalidTargetMessage:
1401+
"""PR #28 gap 16 — explicit error message for bad SET target."""
1402+
1403+
def test_invalid_set_target_message(self):
1404+
with pytest.raises(QQLSyntaxError, match="Expected VECTOR or PAYLOAD after SET"):
1405+
parse("UPDATE articles SET FOOBAR WHERE id = 1 [0.1]")

0 commit comments

Comments
 (0)