Merged
3 changes: 2 additions & 1 deletion README.md
@@ -84,7 +84,7 @@ Full documentation lives in the [`docs/`](docs/) folder and at **[pavanjava.gith
| [INSERT / INSERT BULK](docs/insert.md) | Adding documents, batch inserts, payload types |
| [SEARCH / SELECT / SCROLL / RECOMMEND / Hybrid / RERANK](docs/search.md) | Semantic search, point retrieval, pagination, hybrid, reranking, recommendations |
| [WHERE Filters](docs/filters.md) | Full SQL-style filter operators |
-| [Collections & Quantization](docs/collections.md) | CREATE, DROP, QUANTIZE (scalar/turbo/binary/product), CREATE INDEX |
+| [Collections & Quantization](docs/collections.md) | SHOW, CREATE, DROP, QUANTIZE (scalar/turbo/binary/product), CREATE INDEX |
| [Scripts: EXECUTE / DUMP](docs/scripts.md) | Script files, collection backup/restore |
| [Programmatic Usage](docs/programmatic.md) | Use QQL as a Python library |
| [Reference: Models / Config / Errors](docs/reference.md) | Embedding models, config file, error reference |
@@ -125,6 +125,7 @@ CREATE COLLECTION articles QUANTIZE TURBO BITS 2
CREATE COLLECTION articles QUANTIZE TURBO BITS 1.5 ALWAYS RAM
CREATE INDEX ON COLLECTION articles FOR year TYPE integer
SHOW COLLECTIONS
SHOW COLLECTION articles
DROP COLLECTION articles

-- Delete
58 changes: 58 additions & 0 deletions docs/collections.md
@@ -24,6 +24,64 @@ SHOW COLLECTIONS

---

## SHOW COLLECTION — inspect one collection

Returns collection diagnostics for a single collection using Qdrant's collection info.

**Syntax:**
```sql
SHOW COLLECTION <collection_name>
```

**What it shows:**

- Point count
- Indexed vector count
- Segment count
- Vector names, dimensions, and distance metrics
- Dense vs hybrid topology
- Sparse vector modifiers when present
- Quantization mode
- HNSW configuration
- Payload indexes detected by Qdrant
- Shard, replica, and write consistency settings

**Example:**
```sql
SHOW COLLECTION research_papers
```

**Output:**
```
OK Collection 'research_papers' diagnostics
Collection: research_papers
Status : green
Points : 12450
Indexed vectors : 12450
Segments : 3
Topology : hybrid
Vector 'dense' : 768 dims, Cosine distance
Sparse 'sparse' : modifier=idf
Quantization : scalar
HNSW M : 16
HNSW ef_construct : 100
Payload indexes:
category: keyword
year: integer
Shards : 1
Replicas : 1
Write consistency : 1
```

**Notes:**

- `Topology` is `dense` for standard collections and `hybrid` when sparse vectors are configured alongside dense vectors.
- Dense collections with named vectors still report their vector names and dimensions.
- If no payload indexes exist, QQL prints `Payload indexes : none`.
- Raises an error if the collection does not exist.
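
The topology and payload-index rules in the notes above can be sketched in a few lines of Python. This is a minimal illustration rather than QQL's own code: `infer_topology` and `payload_index_lines` are hypothetical helper names, and plain dicts stand in for Qdrant's collection parameters.

```python
from typing import List, Mapping, Optional


def infer_topology(sparse_vectors: Optional[Mapping[str, object]]) -> str:
    """Return "hybrid" when at least one sparse vector is configured, else "dense"."""
    return "hybrid" if sparse_vectors else "dense"


def payload_index_lines(schema: Optional[Mapping[str, str]]) -> List[str]:
    """Render payload indexes, falling back to the 'none' line when there are none."""
    if not schema:
        return ["Payload indexes : none"]
    lines = ["Payload indexes:"]
    for field, dtype in schema.items():
        lines.append(f"  {field}: {dtype}")
    return lines


print(infer_topology({"sparse": {"modifier": "idf"}}))
print(payload_index_lines(None))
```

Both an empty mapping and `None` count as "no sparse vectors" here, so either input yields `dense`.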

---

## CREATE COLLECTION — create a collection

Explicitly creates a new empty collection. Collections are also created automatically on the first INSERT, so this command is optional — use it when you want to pre-create a collection before inserting data.
3 changes: 3 additions & 0 deletions docs/getting-started.md
@@ -144,6 +144,9 @@ SCROLL FROM notes LIMIT 10
-- List all collections
SHOW COLLECTIONS

-- Inspect one collection's diagnostics
SHOW COLLECTION notes

-- Retrieve a point by ID
SELECT * FROM notes WHERE id = 1
```
10 changes: 10 additions & 0 deletions docs/programmatic.md
@@ -80,6 +80,15 @@ result = run_query(
    url="http://localhost:6333",
)
print(result.message)  # "Deleted N point(s)"

# Inspect collection diagnostics
result = run_query(
    "SHOW COLLECTION notes",
    url="http://localhost:6333",
)
print(result.data["topology"])        # "dense" or "hybrid"
print(result.data["vectors"])         # {"": {...}} or {"dense": {...}, ...}
print(result.data["payload_schema"])  # {"field": "keyword", ...} or None
```
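
When consuming the `SHOW COLLECTION` result in code, two shapes are worth handling explicitly: an unnamed vector is keyed by the empty string, and `payload_schema` may be `None`. The sketch below works on a hand-written sample dict instead of a live `run_query` call. `describe_vectors` and `indexed_fields` are hypothetical helpers, and the sample values are illustrative only.

```python
from typing import Any, Dict, List


def describe_vectors(data: Dict[str, Any]) -> List[str]:
    """One summary line per vector; the unnamed vector uses the key ""."""
    out = []
    for name, conf in data["vectors"].items():
        label = name or "<unnamed>"
        out.append(f"{label}: {conf['size']} dims, {conf['distance']}")
    return out


def indexed_fields(data: Dict[str, Any]) -> List[str]:
    """Sorted payload index field names; tolerates payload_schema being None."""
    return sorted((data.get("payload_schema") or {}).keys())


# Illustrative stand-in for result.data (values are made up for this sketch).
sample = {
    "topology": "dense",
    "vectors": {"": {"size": 384, "distance": "Cosine"}},
    "payload_schema": None,
}
print(describe_vectors(sample))
print(indexed_fields(sample))
```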

---
@@ -132,6 +141,7 @@ class ExecutionResult:
| SCROLL | `{"points": [{"id": str, "payload": dict}, ...], "next_offset": str \| None}` |
| RECOMMEND | `[{"id": str, "score": float, "payload": dict}, ...]` |
| SHOW COLLECTIONS | `["name1", "name2", ...]` |
| SHOW COLLECTION | `{"name": str, "status": str, "points_count": int \| None, "indexed_vectors_count": int \| None, "segments_count": int, "topology": str, "vectors": dict, "sparse_vectors": dict \| None, "quantization": str \| None, "hnsw_config": dict, "payload_schema": dict \| None, "sharding": dict}` |
| CREATE COLLECTION | `None` |
| CREATE INDEX | `None` |
| DROP COLLECTION | `None` |
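
For callers who want static checking, the `SHOW COLLECTION` shape listed in the table can be written as a `TypedDict`. This sketch is derived from that table row only. Nothing in this diff shows QQL exporting such a type, so the name `CollectionDiagnostics` and the helper `missing_keys` are assumptions, and the nested dicts are left loosely typed.

```python
from typing import Any, Dict, List, Optional, TypedDict


class CollectionDiagnostics(TypedDict):
    # Field names and optionality follow the SHOW COLLECTION table row.
    name: str
    status: str
    points_count: Optional[int]
    indexed_vectors_count: Optional[int]
    segments_count: int
    topology: str
    vectors: Dict[str, Dict[str, Any]]
    sparse_vectors: Optional[Dict[str, Dict[str, Any]]]
    quantization: Optional[str]
    hnsw_config: Dict[str, Any]
    payload_schema: Optional[Dict[str, str]]
    sharding: Dict[str, Any]


def missing_keys(data: Dict[str, Any]) -> List[str]:
    """Return the documented keys that are absent from a result dict."""
    return sorted(set(CollectionDiagnostics.__annotations__) - set(data))


print(missing_keys({"name": "notes", "topology": "dense"}))
```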
1 change: 1 addition & 0 deletions src/qql/__init__.py
@@ -21,6 +21,7 @@
    "Executor",
    "Lexer",
    "Parser",
    "load_config",
    "run_query",
]

6 changes: 6 additions & 0 deletions src/qql/ast_nodes.py
@@ -180,6 +180,11 @@ class ShowCollectionsStmt:
    pass


@dataclass(frozen=True)
class ShowCollectionStmt:
    collection: str

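
The parser change that produces `ShowCollectionStmt` is not visible in this part of the diff. As a rough illustration of how the two `SHOW` forms could be told apart, the sketch below splits on whitespace instead of using QQL's real `Lexer` and `Parser`. `parse_show` is a hypothetical function, and the dataclasses are local stand-ins that mirror the AST nodes above.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class ShowCollectionsStmt:  # stand-in for the existing AST node
    pass


@dataclass(frozen=True)
class ShowCollectionStmt:  # stand-in for the node added in this PR
    collection: str


def parse_show(query: str) -> Union[ShowCollectionsStmt, ShowCollectionStmt]:
    """Distinguish SHOW COLLECTIONS from SHOW COLLECTION <name> (simplified)."""
    tokens = query.split()
    if len(tokens) < 2 or tokens[0].upper() != "SHOW":
        raise ValueError("expected a SHOW statement")
    keyword = tokens[1].upper()
    if keyword == "COLLECTIONS" and len(tokens) == 2:
        return ShowCollectionsStmt()
    if keyword == "COLLECTION" and len(tokens) == 3:
        return ShowCollectionStmt(collection=tokens[2])
    raise ValueError(f"unsupported SHOW statement: {query!r}")


print(parse_show("SHOW COLLECTION notes"))
```

Because both dataclasses are frozen, the resulting nodes are hashable and compare by value, which keeps tests on parser output simple.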

@dataclass(frozen=True)
class SelectStmt:
    collection: str
@@ -240,6 +245,7 @@ class DeleteStmt:
    | CreateIndexStmt
    | DropCollectionStmt
    | ShowCollectionsStmt
    | ShowCollectionStmt
    | SelectStmt
    | ScrollStmt
    | SearchStmt
69 changes: 67 additions & 2 deletions src/qql/cli.py
@@ -49,6 +49,11 @@
[yellow]SHOW COLLECTIONS[/yellow]
List all collections in the connected Qdrant instance.

[yellow]SHOW COLLECTION[/yellow] <name>
Show detailed diagnostics for a single collection: point count, vector
config, distance metric, quantization, HNSW parameters, payload indexes,
and sharding info.

[yellow]SCROLL FROM[/yellow] <name> [yellow]LIMIT[/yellow] <n>
Paginate points by ID order.
Optional: [yellow]WHERE[/yellow] <filter>
@@ -136,7 +141,7 @@ def connect(url: str, secret: str | None) -> None:

    cfg = QQLConfig(url=url, secret=secret)
    save_config(cfg)
-    console.print(f"[bold green]Connected.[/bold green] Config saved to ~/.qql/config.json\n")
+    console.print("[bold green]Connected.[/bold green] Config saved to ~/.qql/config.json\n")
    _launch_repl(cfg)


@@ -257,7 +262,7 @@ def dump(collection: str, output: str, batch_size: int) -> None:
        f"\n[bold green]Done.[/bold green] "
        f"{written} point(s) written"
        + (f", [yellow]{skipped} skipped[/yellow] (no 'text' field)" if skipped else "")
-        + f"."
+        + "."
    )


@@ -362,6 +367,61 @@ def _launch_repl(cfg: QQLConfig) -> None:
_run_and_print(executor, query)


def _format_collection_diagnostics(data: dict) -> str:
    """Format SHOW COLLECTION <name> diagnostics into a rich string."""
    lines = []

    lines.append(f"[bold cyan]Collection:[/bold cyan] {data['name']}")
    lines.append(f" Status : {data['status']}")
    lines.append(f" Points : {data['points_count']}")
    lines.append(f" Indexed vectors : {data['indexed_vectors_count']}")
    lines.append(f" Segments : {data['segments_count']}")
    lines.append(f" Topology : {data['topology']}")

    # Vectors
    vectors = data["vectors"]
    for vname, vconf in vectors.items():
        label = f" Vector '{vname}'" if vname else " Vector"
        lines.append(f"{label} : {vconf['size']} dims, {vconf['distance']} distance")

    # Sparse vectors
    if data["sparse_vectors"]:
        for sname, sconf in data["sparse_vectors"].items():
            lines.append(f" Sparse '{sname}' : modifier={sconf['modifier']}")

    lines.append(f" Quantization : {data['quantization'] or 'none'}")

    # HNSW config
    hnsw = data["hnsw_config"]
    lines.append(f" HNSW M : {hnsw['m']}")
    lines.append(f" HNSW ef_construct : {hnsw['ef_construct']}")
    if hnsw.get("full_scan_threshold") is not None:
        lines.append(f" HNSW full_scan_thres : {hnsw['full_scan_threshold']}")
    if hnsw.get("max_indexing_threads") is not None:
        lines.append(f" HNSW max_idx_threads : {hnsw['max_indexing_threads']}")
    if hnsw.get("on_disk") is not None:
        lines.append(f" HNSW on_disk : {hnsw['on_disk']}")
    if hnsw.get("payload_m") is not None:
        lines.append(f" HNSW payload_m : {hnsw['payload_m']}")

    # Payload schema
    schema = data["payload_schema"]
    if schema:
        lines.append(" Payload indexes:")
        for field, dtype in schema.items():
            lines.append(f" {field}: {dtype}")
    else:
        lines.append(" Payload indexes : none")

    # Sharding
    sh = data["sharding"]
    lines.append(f" Shards : {sh['shard_number']}")
    lines.append(f" Replicas : {sh['replication_factor']}")
    lines.append(f" Write consistency : {sh['write_consistency_factor']}")

    return "\n".join(lines)


def _run_and_print(executor: Executor, query: str) -> None:
    try:
        tokens = Lexer().tokenize(query)
@@ -393,6 +453,11 @@ def _run_and_print(executor: Executor, query: str) -> None:
        console.print(table)
        return

    # Pretty-print SHOW COLLECTION <name> diagnostics
    if isinstance(result.data, dict) and "topology" in result.data:
        console.print(_format_collection_diagnostics(result.data))
        return

    # Pretty-print search results
    if isinstance(result.data, list) and result.data and isinstance(result.data[0], dict) and "score" in result.data[0]:
        table = Table(show_header=True, header_style="bold cyan")
1 change: 0 additions & 1 deletion src/qql/config.py
@@ -1,7 +1,6 @@
from __future__ import annotations

import json
-import os
from dataclasses import asdict, dataclass
from pathlib import Path

107 changes: 106 additions & 1 deletion src/qql/executor.py
@@ -80,16 +80,17 @@
    ScrollStmt,
    SearchStmt,
    SearchWith,
+    ShowCollectionStmt,
    ShowCollectionsStmt,
)
from .config import QQLConfig
from .embedder import CrossEncoderEmbedder, Embedder, SparseEmbedder
+from .exceptions import QQLRuntimeError

_RERANK_FETCH_MULTIPLIER = 4
_HYBRID_PREFETCH_MULTIPLIER = 4
_COLLECTION_VISIBILITY_TIMEOUT_SECONDS = 5.0
_COLLECTION_VISIBILITY_POLL_SECONDS = 0.05
-from .exceptions import QQLRuntimeError


@dataclass
@@ -117,6 +118,8 @@ def execute(self, node: ASTNode) -> ExecutionResult:
            return self._execute_drop(node)
        if isinstance(node, ShowCollectionsStmt):
            return self._execute_show(node)
        if isinstance(node, ShowCollectionStmt):
            return self._execute_show_collection(node)
        if isinstance(node, ScrollStmt):
            return self._execute_scroll(node)
        if isinstance(node, SelectStmt):
@@ -418,6 +421,108 @@ def _execute_show(self, node: ShowCollectionsStmt) -> ExecutionResult:
            data=names,
        )

    def _execute_show_collection(self, node: ShowCollectionStmt) -> ExecutionResult:
        if not self._client.collection_exists(node.collection):
            raise QQLRuntimeError(f"Collection '{node.collection}' does not exist")

        info = self._client.get_collection(node.collection)
        config = info.config
        params = config.params

        # ── Vector topology ────────────────────────────────────────────────
        vectors = params.vectors  # type: ignore[union-attr]
        sparse_vector_params = params.sparse_vectors or {}
        if isinstance(vectors, dict):
            vector_details = {}
            for vname, vconfig in vectors.items():
                vector_details[vname] = {
                    "size": vconfig.size,
                    "distance": str(vconfig.distance) if vconfig.distance else None,
                }
        elif vectors is None:
            raise QQLRuntimeError(
                f"Collection '{node.collection}' has no vector configuration"
            )
        else:
            vector_details = {
                "": {
                    "size": vectors.size,
                    "distance": str(vectors.distance) if vectors.distance else None,
                }
            }
        topology = "hybrid" if sparse_vector_params else "dense"

        # ── Sparse vector config ───────────────────────────────────────────
        sparse_vectors = {}
        if sparse_vector_params:
            for sname, sconfig in sparse_vector_params.items():
                sparse_vectors[sname] = {
                    "modifier": str(sconfig.modifier) if sconfig.modifier else None,
                }

        # ── Quantization ───────────────────────────────────────────────────
        quant_config = config.quantization_config
        quantization = None
        if quant_config is not None:
            qtype = type(quant_config).__name__
            if hasattr(quant_config, "scalar"):
                quantization = "scalar"
            elif hasattr(quant_config, "binary"):
                quantization = "binary"
            elif hasattr(quant_config, "product"):
                quantization = "product"
            elif hasattr(quant_config, "turbo"):
                quantization = "turbo"
            else:
                quantization = qtype

        # ── HNSW config ────────────────────────────────────────────────────
        hnsw = {
            "m": config.hnsw_config.m,
            "ef_construct": config.hnsw_config.ef_construct,
        }
        if config.hnsw_config.full_scan_threshold is not None:
            hnsw["full_scan_threshold"] = config.hnsw_config.full_scan_threshold
        if config.hnsw_config.max_indexing_threads is not None:
            hnsw["max_indexing_threads"] = config.hnsw_config.max_indexing_threads
        if config.hnsw_config.on_disk is not None:
            hnsw["on_disk"] = config.hnsw_config.on_disk
        if config.hnsw_config.payload_m is not None:
            hnsw["payload_m"] = config.hnsw_config.payload_m

        # ── Payload schema / indexes ───────────────────────────────────────
        payload_indexes = {}
        for field_name, idx_info in (info.payload_schema or {}).items():
            payload_indexes[field_name] = str(idx_info.data_type)

        # ── Sharding / replication ─────────────────────────────────────────
        sharding = {
            "shard_number": params.shard_number,
            "replication_factor": params.replication_factor,
            "write_consistency_factor": params.write_consistency_factor,
        }

        data = {
            "name": node.collection,
            "status": str(info.status),
            "points_count": info.points_count,
            "indexed_vectors_count": info.indexed_vectors_count,
            "segments_count": info.segments_count,
            "topology": topology,
            "vectors": vector_details,
            "sparse_vectors": sparse_vectors or None,
            "quantization": quantization,
            "hnsw_config": hnsw,
            "payload_schema": payload_indexes or None,
            "sharding": sharding,
        }

        return ExecutionResult(
            success=True,
            message=f"Collection '{node.collection}' diagnostics",
            data=data,
        )

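The quantization branch in `_execute_show_collection` identifies the mode by checking which attribute the config object carries, and falls back to the class name otherwise. A self-contained sketch of that attribute-probing pattern follows. The `Fake*` classes are stand-ins invented for this example, not Qdrant client types, and `quantization_label` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeScalarQuantization:  # stand-in: wrapper exposing a `scalar` field
    scalar: dict


@dataclass
class FakeBinaryQuantization:  # stand-in: wrapper exposing a `binary` field
    binary: dict


@dataclass
class FakeUnknownQuantization:  # stand-in: none of the probed attributes
    settings: dict


def quantization_label(cfg: Optional[object]) -> Optional[str]:
    """Probe attributes in the same order as the executor; fall back to the type name."""
    if cfg is None:
        return None
    for attr in ("scalar", "binary", "product", "turbo"):
        if hasattr(cfg, attr):
            return attr
    return type(cfg).__name__


print(quantization_label(FakeScalarQuantization(scalar={})))
print(quantization_label(FakeUnknownQuantization(settings={})))
```

Probing attributes avoids importing each concrete config class, at the cost of relying on those attribute names staying stable in the client library.
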
    def _execute_scroll(self, node: ScrollStmt) -> ExecutionResult:
        if not self._client.collection_exists(node.collection):
            raise QQLRuntimeError(f"Collection '{node.collection}' does not exist")