pavanjava · srimon12 · May 15, 2026
diff --git a/README.md b/README.md
@@ -5,9 +5,9 @@
 [![PyPI version](https://img.shields.io/pypi/v/qql-cli?color=blue&label=PyPI)](https://pypi.org/project/qql-cli/)
 [![Python 3.12+](https://img.shields.io/pypi/pyversions/qql-cli)](https://pypi.org/project/qql-cli/)
 [![MIT License](https://img.shields.io/badge/license-MIT-green)](LICENSE)
-[![Tests](https://img.shields.io/badge/tests-405%20passing-brightgreen)](tests/)
+[![Tests](https://img.shields.io/badge/tests-500%20passing-brightgreen)](tests/)
 
-Write `INSERT`, `SELECT`, `SEARCH`, `SCROLL`, `RECOMMEND`, `DELETE`, and `CREATE COLLECTION` statements instead of Python SDK calls. Supports hybrid dense+sparse vector search, cross-encoder reranking, quantization (scalar, turbo, binary, product), SQL-style `WHERE` filters, script execution, and collection dump/restore.
+Write `INSERT`, `SELECT`, `SEARCH`, `SCROLL`, `RECOMMEND`, `UPDATE`, `DELETE`, and `CREATE COLLECTION` statements instead of Python SDK calls. Supports hybrid dense+sparse vector search, grouped search (GROUP BY), cross-encoder reranking, quantization (scalar, turbo, binary, product), SQL-style `WHERE` filters, script execution, and collection dump/restore.
 
 ```
 qql> INSERT INTO COLLECTION notes VALUES {'text': 'Qdrant is a vector database', 'author': 'alice', 'year': 2024}
@@ -82,9 +82,9 @@ Full documentation lives in the [`docs/`](docs/) folder and at **[pavanjava.gith
 |---|---|
 | [Getting Started](docs/getting-started.md) | Installation, connecting, first queries |
 | [INSERT / INSERT BULK](docs/insert.md) | Adding documents, batch inserts, payload types |
-| [SEARCH / SELECT / SCROLL / RECOMMEND / Hybrid / RERANK](docs/search.md) | Semantic search, point retrieval, pagination, hybrid, reranking, recommendations |
+| [SEARCH / SELECT / SCROLL / RECOMMEND / Hybrid / GROUP BY / RERANK](docs/search.md) | Semantic search, grouped search, point retrieval, pagination, hybrid, reranking, recommendations |
 | [WHERE Filters](docs/filters.md) | Full SQL-style filter operators |
-| [Collections & Quantization](docs/collections.md) | SHOW, CREATE, DROP, QUANTIZE (scalar/turbo/binary/product), CREATE INDEX |
+| [Collections & Quantization](docs/collections.md) | SHOW, CREATE, DROP, QUANTIZE (scalar/turbo/binary/product), CREATE INDEX, UPDATE VECTOR, UPDATE PAYLOAD |
 | [Scripts: EXECUTE / DUMP](docs/scripts.md) | Script files, collection backup/restore |
 | [Programmatic Usage](docs/programmatic.md) | Use QQL as a Python library |
 | [Reference: Models / Config / Errors](docs/reference.md) | Embedding models, config file, error reference |
@@ -128,6 +128,17 @@ SHOW COLLECTIONS
 SHOW COLLECTION articles
 DROP COLLECTION articles
 
+-- Search with grouping
+SEARCH articles SIMILAR TO 'query' LIMIT 5 GROUP BY category
+SEARCH articles SIMILAR TO 'query' LIMIT 5 GROUP BY category GROUP_SIZE 3
+SEARCH articles SIMILAR TO 'query' LIMIT 5 WHERE year >= 2020 GROUP BY category GROUP_SIZE 2
+SEARCH articles SIMILAR TO 'query' LIMIT 5 USING HYBRID GROUP BY category
+
+-- Update
+UPDATE articles SET VECTOR WHERE id = '3f2e1a4b-...' [0.1, 0.2, 0.3, 0.4]
+UPDATE articles SET PAYLOAD WHERE id = '3f2e1a4b-...' {'year': 2025, 'status': 'active'}
+UPDATE articles SET PAYLOAD WHERE category = 'draft' {'status': 'published'}
+
 -- Delete
 DELETE FROM articles WHERE id = '3f2e1a4b-...'
 DELETE FROM articles WHERE year < 2020
@@ -147,7 +158,7 @@ Tests do not require a running Qdrant instance — the Qdrant client is mocked.
 pytest tests/ -v
 ```
 
-Expected: **405 tests passing**.
+Expected: **500 tests passing**.
 
 ---
 

diff --git a/docs/collections.md b/docs/collections.md
@@ -310,3 +310,68 @@ DELETE FROM articles WHERE year < 2020 AND status = 'draft'
 **Notes:**
 - If no points match the filter or ID, the operation succeeds silently with a count of 0.
 - The collection itself must exist; deleting from a non-existent collection raises an error.
+
+---
+
+## UPDATE SET VECTOR — replace a point's dense vector
+
+Replaces the stored dense vector for a **single point** identified by its ID. The point must already exist in the collection. Use this when you want to refresh an embedding without changing the payload.
+
+**Syntax:**
+```
+UPDATE <collection> SET VECTOR WHERE id = '<point_id>' [<vector>]
+UPDATE <collection> SET VECTOR WHERE id = <integer_id>  [<vector>]
+```
+
+The vector is provided as a JSON-style float array `[v1, v2, ..., vN]`. The array length must match the collection's configured vector dimensions.
+
+**Examples:**
+
+```sql
+-- Replace vector by UUID
+UPDATE articles SET VECTOR WHERE id = '3f2e1a4b-8c91-4d0e-b123-abc123def456' [0.1, 0.2, 0.3, 0.4]
+
+-- Replace vector by integer ID
+UPDATE articles SET VECTOR WHERE id = 42 [0.1, 0.2, 0.3, 0.4]
+```
+
+**Notes:**
+- Only single-point updates are supported (by ID). Bulk or filter-based vector updates are not supported.
+- The point must already exist; this operation does not create new points.
+- The collection must exist; updating a non-existent collection raises an error.
+- For hybrid collections, the dense vector named `"dense"` is updated. Sparse vectors are managed separately.
+
+---
+
+## UPDATE SET PAYLOAD — merge fields into a point's payload
+
+Merges new key/value pairs into the payload of one or more points. **Existing fields not mentioned in the update are preserved** (additive merge, not a full replace). Use a `WHERE` filter to update multiple points at once.
+
+**Syntax:**
+```
+UPDATE <collection> SET PAYLOAD WHERE id = '<point_id>' {<payload>}
+UPDATE <collection> SET PAYLOAD WHERE id = <integer_id>  {<payload>}
+UPDATE <collection> SET PAYLOAD WHERE <filter>            {<payload>}
+```
+
+**Examples:**
+
+```sql
+-- Update a single point by UUID
+UPDATE articles SET PAYLOAD WHERE id = '3f2e1a4b-8c91-4d0e-b123-abc123def456' {'year': 2025, 'status': 'active'}
+
+-- Update a single point by integer ID
+UPDATE articles SET PAYLOAD WHERE id = 42 {'category': 'tech'}
+
+-- Update all points matching a filter
+UPDATE articles SET PAYLOAD WHERE category = 'draft' {'status': 'published'}
+
+-- Compound filter update
+UPDATE articles SET PAYLOAD WHERE year < 2020 AND status = 'draft' {'archived': true}
+```
+
+**Notes:**
+- **Merge semantics:** only the fields in `{…}` are written; all other existing payload fields are preserved.
+- If no points match the filter, the operation succeeds silently with no changes.
+- The collection must exist; updating a non-existent collection raises an error.
+- All `WHERE` filter operators supported by `DELETE` are also supported here (see [WHERE Filters](filters.md)).
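Note on the merge semantics documented above: the additive behaviour of `UPDATE ... SET PAYLOAD` can be pictured as a dictionary merge. The sketch below is an illustration only, not the executor's code. `merge_payload` is a hypothetical helper, and it models top-level keys only, since the docs above do not say how nested objects are handled.

```python
from typing import Any


def merge_payload(existing: dict[str, Any], update: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical model of SET PAYLOAD: keys in `update` are written,
    every other key in `existing` is preserved. Returns a new dict."""
    return {**existing, **update}


# Payload as it might look before the update (example data, not from the PR).
existing = {"text": "Qdrant is a vector database", "status": "draft", "year": 2024}

# Models: UPDATE articles SET PAYLOAD WHERE id = 42 {'status': 'published'}
merged = merge_payload(existing, {"status": "published"})
```

Here `merged` keeps `text` and `year` untouched and only `status` changes, which is the behaviour the **Merge semantics** note describes.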
diff --git a/docs/reference.md b/docs/reference.md
@@ -162,7 +162,7 @@ Tests do not require a running Qdrant instance — the Qdrant client is mocked.
 pytest tests/ -v
 ```
 
-Expected output: **405 tests passing**.
+Expected output: **500 tests passing**.
 
 ---
 
@@ -174,14 +174,23 @@ Expected output: **405 tests passing**.
 | `Connection failed: ...` | Qdrant unreachable at given URL | Check that Qdrant is running and the URL is correct |
 | `INSERT requires a 'text' field in VALUES` | `text` key missing from the VALUES dict | Add `'text': '...'` to your dict |
 | `Vector dimension mismatch: collection '...' expects X dims, but model produces Y dims` | Model used in INSERT differs from the one used to create the collection | Use `USING MODEL` to specify the same model as the collection was created with |
-| `Collection '...' does not exist` | SEARCH / SCROLL / SELECT / DROP / DELETE on a non-existent collection | Check name spelling or run `SHOW COLLECTIONS` |
+| `Collection '...' does not exist` | SEARCH / SCROLL / SELECT / DROP / DELETE / UPDATE on a non-existent collection | Check name spelling or run `SHOW COLLECTIONS` |
 | `Unexpected token '...'; expected a QQL statement keyword` | Unrecognized statement | Check the query syntax and supported statement list |
 | `SELECT requires a string or integer point id, got '...'` | `SELECT` used with a non-ID filter value | Use `SELECT * FROM <collection> WHERE id = '<id>'` or an integer ID |
 | `Unterminated string literal (at position N)` | A string is missing its closing quote | Close the string with a matching `'` or `"` |
 | `Unexpected character '@' (at position N)` | A character not part of QQL syntax | Remove or quote the offending character |
 | `Expected a filter operator after field '...'` | Unknown operator in WHERE clause | Use one of: `=`, `!=`, `>`, `>=`, `<`, `<=`, `IN`, `NOT IN`, `BETWEEN`, `IS NULL`, `IS NOT NULL`, `IS EMPTY`, `IS NOT EMPTY`, `MATCH` |
 | `Expected ')' ...` | Unclosed parenthesis in WHERE clause | Add the missing `)` to close the group |
 | `Qdrant error during SEARCH: ...` | Hybrid search on a non-hybrid collection, or wrong vector names | Ensure the collection was created with `HYBRID` before using `USING HYBRID` in INSERT/SEARCH |
+| `Qdrant error during GROUP BY SEARCH: ...` | GROUP BY on an unindexed field, or unsupported field type | Ensure the group-by field is indexed as `keyword` or `integer` via `CREATE INDEX` |
+| `GROUP BY and RERANK cannot be combined ...` | Both GROUP BY and RERANK specified in the same SEARCH | Remove one of the two clauses |
+| `Expected VECTOR or PAYLOAD after SET, got '...'` | Unknown keyword after SET in UPDATE | Use `UPDATE ... SET VECTOR ...` or `UPDATE ... SET PAYLOAD ...` |
+| `Expected a vector list [...] after point ID in UPDATE SET VECTOR` | UPDATE SET VECTOR missing the `[...]` float array | Add the vector array: `UPDATE ... SET VECTOR WHERE id = '...' [0.1, 0.2, ...]` |
+| `Qdrant error during UPDATE VECTOR: ...` | Point does not exist, or vector dimensions mismatch | Verify the point ID exists and the vector length matches the collection's dimensions |
+| `Qdrant error during UPDATE PAYLOAD: ...` | Qdrant rejected the payload update | Check field values and collection state |
+| `Vector elements must be numeric floats; boolean values are not allowed` | A boolean (`true` or `false`) was present in the vector array for `UPDATE SET VECTOR` — `float(True)` silently equals `1.0` in Python, so this is caught explicitly | Replace booleans with numeric floats: `UPDATE … [0.1, 0.2, …, 0.N]` |
+| `Vector elements must be numeric; got invalid value: ...` | A non-numeric value (string or null) was present in the vector array for `UPDATE SET VECTOR` | Ensure all vector elements are floats: `UPDATE … [0.1, 0.2, …, 0.N]` |
+| `GROUP_SIZE must be a positive integer, got N` | `GROUP_SIZE 0` or a negative value was specified | Use a positive integer: `GROUP_SIZE 3` |
 | `Qdrant error during SCROLL: ...` | Qdrant rejected scroll request | Verify collection state, filter, and cursor (`AFTER`) value |
 | `Unknown index type '...'` | Invalid schema type in CREATE INDEX | Use one of: `keyword`, `integer`, `float`, `bool`, `text`, `geo`, `datetime` |
 | `Qdrant error during CREATE INDEX: ...` | Qdrant rejected the index creation | Check field name and collection state |
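Note on the two vector-element errors added above: the boolean case needs an explicit check because `bool` is a subclass of `int` in Python, so `float(True)` returns `1.0` without complaint. A minimal sketch of such a check follows. `validate_vector` is a hypothetical name, and `ValueError` stands in for whatever exception type the parser actually raises; only the message texts are taken from the table.

```python
from typing import Any


def validate_vector(values: list[Any]) -> tuple[float, ...]:
    """Hypothetical validator: reject booleans and non-numeric elements,
    then coerce the rest to floats."""
    checked: list[float] = []
    for value in values:
        # bool must be tested first: isinstance(True, int) is True in Python.
        if isinstance(value, bool):
            raise ValueError(
                "Vector elements must be numeric floats; boolean values are not allowed"
            )
        if not isinstance(value, (int, float)):
            raise ValueError(
                f"Vector elements must be numeric; got invalid value: {value!r}"
            )
        checked.append(float(value))
    return tuple(checked)
```

Returning a tuple matches the immutable `vector` field on `UpdateVectorStmt` in this PR.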
diff --git a/docs/search.md b/docs/search.md
@@ -343,3 +343,68 @@ SEARCH articles SIMILAR TO 'semantic search' LIMIT 5
 | Large collections with keyword-heavy queries | `USING HYBRID RERANK` |
 
 > **Note on scores:** After reranking, the `score` column shows the cross-encoder's raw logit (can be any real number, unbounded). Do not compare reranked scores to non-reranked cosine similarity scores.
+
+---
+
+## SEARCH … GROUP BY — grouped results
+
+Returns the top-scoring points **grouped by a payload field value**. Instead of a single flat ranked list, results are organised into groups — each group contains the top-scoring points that share the same value for the specified field.
+
+Useful for **result diversification**: e.g. "return the 3 best articles from each category", or "show the top 2 papers per author".
+
+**Syntax:**
+```
+SEARCH <collection> SIMILAR TO '<query>' LIMIT <n> GROUP BY <field>
+SEARCH <collection> SIMILAR TO '<query>' LIMIT <n> GROUP BY <field> GROUP_SIZE <m>
+SEARCH <collection> SIMILAR TO '<query>' LIMIT <n> [WHERE <filter>] GROUP BY <field> [GROUP_SIZE <m>]
+SEARCH <collection> SIMILAR TO '<query>' LIMIT <n> USING HYBRID GROUP BY <field> [GROUP_SIZE <m>]
+```
+
+- **`LIMIT <n>`** — maximum number of **groups** to return.
+- **`GROUP_SIZE <m>`** — maximum number of points per group (default: **3**).
+- **`GROUP BY <field>`** — the payload field whose values define the groups. **Must be a string (keyword) or number (integer) field** — this is enforced by Qdrant. Dot-notation is supported (e.g. `meta.author`). Array-valued fields are allowed: a point with multiple values for the field can appear in multiple groups. The field should be indexed as `keyword` or `integer` for best performance (see [CREATE INDEX](collections.md)).
+- `WHERE` filters, `USING HYBRID`, and `USING MODEL` are all compatible with GROUP BY.
+- **`GROUP BY` and `RERANK` cannot be combined** in the same statement — this raises a syntax error.
+
+**Examples:**
+
+Top 5 categories, up to 3 articles each (default group_size):
+```sql
+SEARCH articles SIMILAR TO 'machine learning' LIMIT 5 GROUP BY category
+```
+
+Top 3 authors, up to 2 papers each:
+```sql
+SEARCH papers SIMILAR TO 'neural networks' LIMIT 3 GROUP BY author GROUP_SIZE 2
+```
+
+Grouped search with a payload filter:
+```sql
+SEARCH articles SIMILAR TO 'deep learning' LIMIT 5 WHERE year >= 2022 GROUP BY category GROUP_SIZE 4
+```
+
+Grouped hybrid search:
+```sql
+SEARCH articles SIMILAR TO 'vector databases' LIMIT 4 USING HYBRID GROUP BY category GROUP_SIZE 3
+```
+
+**Output:**
+
+```
+✓ Found 2 group(s) by 'category' (grouped)
+Group: machine-learning
+ Score  │ ID                                   │ Payload
+────────┼──────────────────────────────────────┼────────────────────────────────────────
+ 0.9312 │ 3f2e1a4b-8c91-4d0e-b123-abc123def456 │ {'text': '...', 'category': 'machine-learning'}
+ 0.8845 │ 9a1b2c3d-4e5f-6789-abcd-ef0123456789 │ {'text': '...', 'category': 'machine-learning'}
+
+Group: nlp
+ Score  │ ID                                   │ Payload
+────────┼──────────────────────────────────────┼────────────────────────────────────────
+ 0.9100 │ 1a2b3c4d-5e6f-7890-bcde-f01234567890 │ {'text': '...', 'category': 'nlp'}
+```
+
+> **Tip:** For GROUP BY to work efficiently, create a payload index on the grouping field first:
+> ```sql
+> CREATE INDEX ON COLLECTION articles FOR category TYPE keyword
+> ```
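Note on `LIMIT` versus `GROUP_SIZE` in the grouped-search docs above: `LIMIT` caps the number of groups and `GROUP_SIZE` caps the points inside each group. The pure-Python sketch below illustrates those documented semantics over an in-memory list of hits. It is not how Qdrant computes groups internally, and `group_hits` and the sample data are hypothetical. The output shape (`group_id`, `hits`) mirrors what the CLI printer in this PR reads.

```python
from typing import Any


def group_hits(
    hits: list[dict[str, Any]], field: str, limit: int, group_size: int = 3
) -> list[dict[str, Any]]:
    """Hypothetical model of GROUP BY: at most `limit` groups,
    at most `group_size` hits per group, best scores first."""
    ranked = sorted(hits, key=lambda h: h["score"], reverse=True)
    groups: dict[Any, list[dict[str, Any]]] = {}
    for hit in ranked:
        value = hit["payload"].get(field)
        if value is None:
            continue  # points without the field belong to no group
        # Array-valued fields place the point in one group per value.
        keys = value if isinstance(value, list) else [value]
        for key in keys:
            if key not in groups:
                if len(groups) >= limit:
                    continue  # group limit reached; ignore new group values
                groups[key] = []
            if len(groups[key]) < group_size:
                groups[key].append(hit)
    return [{"group_id": k, "hits": v} for k, v in groups.items()]


sample = [
    {"id": 1, "score": 0.9, "payload": {"category": "ml"}},
    {"id": 2, "score": 0.8, "payload": {"category": "nlp"}},
    {"id": 3, "score": 0.7, "payload": {"category": "ml"}},
    {"id": 4, "score": 0.6, "payload": {"category": "ml"}},
    {"id": 5, "score": 0.5, "payload": {"category": "db"}},
]

# Models: ... LIMIT 2 GROUP BY category GROUP_SIZE 2
result = group_hits(sample, "category", limit=2, group_size=2)
```

With this sample, the `db` group is dropped by `LIMIT 2` and the third `ml` point is dropped by `GROUP_SIZE 2`.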
diff --git a/pyproject.toml b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "qql-cli"
-version = "2.2.0"
+version = "2.3.0"
 description = "QQL is a SQL-like query language and CLI for Qdrant vector database. Write INSERT, SEARCH, RECOMMEND, DELETE, and CREATE COLLECTION statements instead of Python SDK calls. Supports hybrid dense+sparse vector search, cross-encoder reranking, quantization (scalar, turbo, binary, product), WHERE clause filters, script execution, and collection dump/restore."
 readme = "README.md"
 license = { file = "LICENSE" }

diff --git a/src/qql/ast_nodes.py b/src/qql/ast_nodes.py
@@ -213,6 +213,8 @@ class SearchStmt:
     rerank: bool = False                    # if True, apply cross-encoder reranking post-Qdrant
     rerank_model: str | None = None         # cross-encoder model; None → CrossEncoderEmbedder.DEFAULT_MODEL
     with_clause: SearchWith | None = None
+    group_by: str | None = None             # GROUP BY field name; None → normal flat search
+    group_size: int = 3                     # max points per group (ignored when group_by is None)
 
 
 @dataclass(frozen=True)
@@ -237,6 +239,23 @@ class DeleteStmt:
     query_filter: FilterExpr | None = None
 
 
+@dataclass(frozen=True)
+class UpdateVectorStmt:
+    """UPDATE <collection> SET VECTOR WHERE id = <id> [vector...]"""
+    collection: str
+    point_id: str | int
+    vector: tuple[float, ...]   # dense vector as immutable tuple (frozen=True compatible)
+
+
+@dataclass(frozen=True)
+class UpdatePayloadStmt:
+    """UPDATE <collection> SET PAYLOAD WHERE <filter|id> {payload}"""
+    collection: str
+    payload: dict[str, Any]
+    point_id: str | int | None = None        # mutually exclusive with query_filter
+    query_filter: FilterExpr | None = None
+
+
 # Union type for all top-level statement nodes
 ASTNode = (
     InsertStmt
@@ -251,4 +270,6 @@ class DeleteStmt:
     | SearchStmt
     | RecommendStmt
     | DeleteStmt
+    | UpdateVectorStmt
+    | UpdatePayloadStmt
 )
diff --git a/src/qql/cli.py b/src/qql/cli.py
@@ -71,6 +71,9 @@
       Optional: [yellow]RERANK[/yellow] [MODEL '<model>']   rerank results with a cross-encoder
       Optional: [yellow]EXACT[/yellow]   bypass HNSW and perform exact search
       Optional: [yellow]WITH[/yellow] { hnsw_ef: <int>, exact: <bool>, acorn: <bool> }   search parameters
+      Optional: [yellow]GROUP BY[/yellow] <field> [[yellow]GROUP_SIZE[/yellow] <n>]
+                  Group results by a payload field value (default GROUP_SIZE: 3).
+                  Field must be keyword or integer type. RERANK and GROUP BY cannot be combined.
 
   [yellow]RECOMMEND FROM[/yellow] <name> [yellow]POSITIVE IDS[/yellow] (<id>, ...)
       Find points similar to known examples.
@@ -82,6 +85,15 @@
   [yellow]DELETE FROM[/yellow] <name> [yellow]WHERE id =[/yellow] '<id>'
       Delete a point by its ID.
 
+  [yellow]UPDATE[/yellow] <name> [yellow]SET VECTOR WHERE id =[/yellow] '<id>'|<int> [<vector>]
+      Replace the dense vector for a single point by ID.
+      The point must already exist. Vector is a float array: [0.1, 0.2, ..., 0.N]
+
+  [yellow]UPDATE[/yellow] <name> [yellow]SET PAYLOAD WHERE id =[/yellow] '<id>'|<int> {<payload>}
+  [yellow]UPDATE[/yellow] <name> [yellow]SET PAYLOAD WHERE[/yellow] <filter> {<payload>}
+      Merge new key/value pairs into a point's payload (additive; existing fields preserved).
+      Supports all WHERE filter operators. Filter-based updates affect all matching points.
+
 Script files (in-shell):
   [yellow]EXECUTE[/yellow] <path>   or   [yellow]\\e[/yellow] <path>
       Run a .qql script file. Statements are executed in order.
@@ -458,6 +470,32 @@ def _run_and_print(executor: Executor, query: str) -> None:
         console.print(_format_collection_diagnostics(result.data))
         return
 
+    # Pretty-print grouped search results (GROUP BY)
+    if (
+        isinstance(result.data, list)
+        and result.data
+        and isinstance(result.data[0], dict)
+        and "group_id" in result.data[0]
+    ):
+        for group in result.data:
+            console.print(f"\n[bold cyan]Group: {group['group_id']}[/bold cyan]")
+            hits = group.get("hits", [])
+            if hits:
+                tbl = Table(show_header=True, header_style="bold")
+                tbl.add_column("Score", style="green", no_wrap=True, justify="right")
+                tbl.add_column("ID")
+                tbl.add_column("Payload")
+                for hit in hits:
+                    tbl.add_row(
+                        str(hit["score"]),
+                        str(hit["id"]),
+                        str(hit.get("payload", {})),
+                    )
+                console.print(tbl)
+            else:
+                console.print("  (no hits)")
+        return
+
     # Pretty-print search results
     if isinstance(result.data, list) and result.data and isinstance(result.data[0], dict) and "score" in result.data[0]:
         table = Table(show_header=True, header_style="bold cyan")