95 changes: 95 additions & 0 deletions docs/codedocs/api-reference/embeddings.md
@@ -0,0 +1,95 @@
---
title: "Embeddings"
description: "Reference for the TF-IDF vectorizer and vector utilities exported by pmll_memory_mcp."
---

The embeddings module provides the long-term layer's local vectorization primitives.

## Import Path

```python
from pmll_memory_mcp import TfIdfVectorizer, embed, cosine_similarity
```

Source file: `mcp/pmll_memory_mcp/embeddings.py`

## `TfIdfVectorizer`

Constructor:

```python
TfIdfVectorizer() -> None
```

### Property: `vocab_size`

```python
vocab_size: int
```

Current number of terms in the vocabulary.

### `add_document`

```python
add_document(text: str) -> None
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `text` | `str` | — | Document text to fold into corpus statistics. |

### `vectorize`

```python
vectorize(text: str) -> list[float]
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `text` | `str` | — | Text to convert into a normalized TF-IDF vector. |

## Functions

### `embed`

```python
embed(text: str) -> list[float]
```

Adds the text to the module-level vectorizer and returns its vector.

### `cosine_similarity`

```python
cosine_similarity(a: list[float], b: list[float]) -> float
```

Returns a score between `0.0` and `1.0` for aligned non-negative vectors.

## Behavior Notes

- `TfIdfVectorizer` gives you an isolated corpus. That is the right choice when you need reproducible vector dimensions inside one test or workflow.
- `embed()` uses the module-level singleton managed by `get_vectorizer()` in the source. That is convenient for the graph layer because every new node contributes to the shared vocabulary.
- `cosine_similarity()` compares only the overlapping length of the two vectors, so trailing components on the longer vector are ignored. In practice that works because both vectors usually come from the same vectorizer instance.
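
The overlap rule in the last bullet can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the module's implementation: `cosine_overlap` is a hypothetical name, and both the truncation of norms to the shared length and the zero-vector fallback are assumptions.

```python
import math


def cosine_overlap(a: list[float], b: list[float]) -> float:
    """Cosine similarity over the first min(len(a), len(b)) components.

    Hypothetical helper for illustration; not part of pmll_memory_mcp.
    """
    n = min(len(a), len(b))
    dot = sum(x * y for x, y in zip(a[:n], b[:n]))
    norm_a = math.sqrt(sum(x * x for x in a[:n]))
    norm_b = math.sqrt(sum(y * y for y in b[:n]))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumption: an all-zero vector has no similarity to anything
    return dot / (norm_a * norm_b)


short_vec = [1.0, 0.0]
long_vec = [1.0, 0.0, 5.0]  # the trailing component falls outside the overlap
print(cosine_overlap(short_vec, long_vec))
```

Under this sketch, two vectors of different lengths can still score as identical, which is why comparing vectors from the same vectorizer instance is the safer habit.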

## Example

```python
from pmll_memory_mcp import TfIdfVectorizer, embed, cosine_similarity

vectorizer = TfIdfVectorizer()
vectorizer.add_document("authentication login user")
vectorizer.add_document("authentication login password")

a = vectorizer.vectorize("authentication login user")
b = vectorizer.vectorize("authentication login password")
print(cosine_similarity(a, b))
print(embed("session cache and semantic search"))
```

## Notes

- `embed()` uses a module-level shared vectorizer, while `TfIdfVectorizer()` gives you an isolated one.
- Vector dimensions grow as the vocabulary grows.
- The module also defines `tokenize()`, `get_vectorizer()`, and `reset_vectorizer()` in `mcp/pmll_memory_mcp/embeddings.py`; they are useful for testing and internals even though `__init__.py` does not re-export them.
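
To see why vector dimensions grow with the vocabulary, here is a toy vectorizer written from scratch. It is a sketch only: `TinyTfIdf`, its whitespace tokenizer, and its smoothed IDF formula are assumptions chosen for illustration and may differ from what `TfIdfVectorizer` actually does.

```python
import math
from collections import Counter


class TinyTfIdf:
    """Toy TF-IDF vectorizer; hypothetical, for illustration only."""

    def __init__(self):
        self.vocab = {}            # term -> vector index
        self.doc_freq = Counter()  # term -> number of documents containing it
        self.doc_count = 0

    def add_document(self, text):
        self.doc_count += 1
        for term in set(text.lower().split()):
            self.vocab.setdefault(term, len(self.vocab))
            self.doc_freq[term] += 1

    def vectorize(self, text):
        # Terms outside the vocabulary are dropped (assumption).
        counts = Counter(t for t in text.lower().split() if t in self.vocab)
        vec = [0.0] * len(self.vocab)
        for term, tf in counts.items():
            idf = math.log((1 + self.doc_count) / (1 + self.doc_freq[term])) + 1.0
            vec[self.vocab[term]] = tf * idf
        norm = math.sqrt(sum(v * v for v in vec))
        return [v / norm for v in vec] if norm else vec


v = TinyTfIdf()
v.add_document("authentication login user")
before = v.vectorize("login user")
v.add_document("session cache search")
after = v.vectorize("login user")
print(len(before), len(after))  # the same text yields a longer vector later
```

Vectors produced at different times can therefore have different lengths, which is the situation the overlap behavior of `cosine_similarity()` has to tolerate.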
139 changes: 139 additions & 0 deletions docs/codedocs/api-reference/memory-graph.md
@@ -0,0 +1,139 @@
---
title: "Memory Graph"
description: "Reference for the long-term graph functions exported by pmll_memory_mcp."
---

The memory graph module is the long-term retrieval engine behind semantic search and traversal.

## Import Path

```python
from pmll_memory_mcp import (
    upsert_node,
    create_relation,
    search_graph,
    prune_stale_links,
    add_interlinked_context,
    retrieve_with_traversal,
    get_graph_stats,
    clear_graph,
)
```

Source file: `mcp/pmll_memory_mcp/memory_graph.py`

## Functions

### `upsert_node`

```python
upsert_node(
    session_id: str,
    node_type: NodeType,
    label: str,
    content: str,
    metadata: dict[str, str] | None = None,
) -> MemoryNode
```

Creates or updates a typed node.

### `create_relation`

```python
create_relation(
    session_id: str,
    source_id: str,
    target_id: str,
    relation: RelationType,
    weight: float | None = None,
    metadata: dict[str, str] | None = None,
) -> MemoryEdge | None
```

Creates a typed edge or updates the weight of an existing duplicate.
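
The "create or update the duplicate" behavior can be pictured as an upsert keyed on source, target, and relation. The sketch below is an assumption-heavy illustration: `link`, the `edges` table, the choice of key, and the default weight of `1.0` are all hypothetical and are not taken from `memory_graph.py`.

```python
# Hypothetical in-memory edge table: (source_id, target_id, relation) -> weight.
edges: dict[tuple[str, str, str], float] = {}


def link(source_id: str, target_id: str, relation: str, weight: float = 1.0) -> bool:
    """Insert the edge, or overwrite the weight of an existing duplicate.

    Returns True when a new edge was created, False when one was updated.
    """
    key = (source_id, target_id, relation)
    created = key not in edges
    edges[key] = weight
    return created


print(link("n1", "n2", "depends_on"))       # first call creates the edge
print(link("n1", "n2", "depends_on", 0.4))  # duplicate: only the weight changes
print(len(edges), edges[("n1", "n2", "depends_on")])
```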

### `search_graph`

```python
search_graph(
    session_id: str,
    query: str,
    max_depth: int = 1,
    top_k: int = 5,
    edge_filter: list[RelationType] | None = None,
) -> GraphSearchResult
```

Runs semantic search, then neighbor traversal.

### `prune_stale_links`

```python
prune_stale_links(
    session_id: str,
    threshold: float | None = None,
) -> dict[str, int]
```

Removes decayed edges and old orphan nodes.

### `add_interlinked_context`

```python
add_interlinked_context(
    session_id: str,
    items: list[dict[str, Any]],
    auto_link: bool = True,
) -> dict[str, Any]
```

Bulk-adds nodes and optional similarity edges.

### `retrieve_with_traversal`

```python
retrieve_with_traversal(
    session_id: str,
    start_node_id: str,
    max_depth: int = 2,
    edge_filter: list[RelationType] | None = None,
) -> list[TraversalResult]
```

Walks outward from a starting node.
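
A depth-limited walk with an optional relation filter can be sketched as a breadth-first search. This is a standalone illustration, not the library's algorithm: the `EDGES` data, the `traverse` helper, the `related_to` relation name, and the choice to follow edges in one direction only are assumptions.

```python
from collections import deque

# Hypothetical graph data as (source, relation, target) triples.
EDGES = [
    ("service", "depends_on", "queue"),
    ("queue", "depends_on", "broker"),
    ("service", "related_to", "dashboard"),
]


def traverse(start, max_depth=2, edge_filter=None):
    """Breadth-first walk returning (node, depth) pairs, excluding the start."""
    seen = {start}
    frontier = deque([(start, 0)])
    results = []
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue  # do not expand nodes that sit at the depth limit
        for source, relation, target in EDGES:
            if source != node or target in seen:
                continue
            if edge_filter is not None and relation not in edge_filter:
                continue
            seen.add(target)
            results.append((target, depth + 1))
            frontier.append((target, depth + 1))
    return results


print(traverse("service"))
print(traverse("service", max_depth=1, edge_filter=["depends_on"]))
```

The `seen` set keeps cycles from producing repeated results, and the filter is applied per edge so that excluded relations also block everything reachable only through them.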

### `get_graph_stats`

```python
get_graph_stats(session_id: str) -> dict[str, Any]
```

Returns node, edge, type, and relation counts.

### `clear_graph`

```python
clear_graph(session_id: str) -> int
```

Clears the graph for the session and returns the removed object count.

## Example

```python
from pmll_memory_mcp import (
    upsert_node,
    create_relation,
    search_graph,
    get_graph_stats,
)

sid = "api-ref-graph"
service = upsert_node(sid, "concept", "service", "Processes requests")
queue = upsert_node(sid, "concept", "queue", "Buffers jobs")
create_relation(sid, service.id, queue.id, "depends_on")

print(search_graph(sid, "job processing").direct[0].node.label)
print(get_graph_stats(sid))
```
119 changes: 119 additions & 0 deletions docs/codedocs/api-reference/pmmemorystore.md
@@ -0,0 +1,119 @@
---
title: "PMMemoryStore"
description: "Reference for the short-term KV store class exported by pmll_memory_mcp."
---

`PMMemoryStore` is the short-term session cache exported from `pmll_memory_mcp` and implemented in `mcp/pmll_memory_mcp/kv_store.py`.

## Import Path

```python
from pmll_memory_mcp import PMMemoryStore
```

Source file: `mcp/pmll_memory_mcp/kv_store.py`

## Constructor

```python
PMMemoryStore(silo_size: int = 256) -> None
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `silo_size` | `int` | `256` | Informational silo capacity carried on the instance. |

## Public Methods

### `peek`

```python
peek(key: str) -> tuple[bool, str | None, int | None]
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `key` | `str` | — | Cache key to inspect. |

Returns a tuple `(hit, value, index)`.

Example:

```python
store = PMMemoryStore()
print(store.peek("user:1"))
```

### `set`

```python
set(key: str, value: str) -> int
```

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `key` | `str` | — | Cache key to store. |
| `value` | `str` | — | Resolved string payload. |

Returns the slot index used for the entry.

Example:

```python
store = PMMemoryStore()
slot = store.set("user:1", '{"name": "Ada"}')
print(slot)
```

### `flush`

```python
flush() -> int
```

Returns the number of slots cleared.

Example:

```python
store = PMMemoryStore()
store.set("a", "1")
store.set("b", "2")
print(store.flush())
```

### `__len__`

```python
__len__() -> int
```

Returns the number of stored slots.

### `__contains__`

```python
__contains__(key: object) -> bool
```

Returns `True` when the key exists in the store.

## Common Combined Pattern

```python
from pmll_memory_mcp import PMMemoryStore

store = PMMemoryStore()

# peek() returns (hit, value, index); write only on a miss.
if not store.peek("docs")[0]:
    store.set("docs", "cached docs payload")

print(len(store), "docs" in store)
```

## Notes

- Existing keys are updated in place and keep their original slot index.
- `silo_size` is informational only: the store does not enforce it as a hard limit on writes.
- The server wrappers usually create instances indirectly through `get_store(session_id)` in the same source module, which is why application code should think in terms of session lifecycle rather than a single global cache.
- Because `peek()` only reports resolved values, it pairs naturally with `peek_context()` when you also need to account for in-flight work.
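
The slot behavior described in these notes can be illustrated with a small stand-in class. This is a sketch under assumptions, not `PMMemoryStore` itself: `SlotStore` is a hypothetical name, and assigning slot indexes by insertion count is a guess made for illustration.

```python
class SlotStore:
    """Toy slot-indexed key/value cache; hypothetical, not PMMemoryStore."""

    def __init__(self):
        self._slots = {}  # key -> (slot index, value)

    def peek(self, key):
        """Return (hit, value, index) without modifying the store."""
        if key in self._slots:
            index, value = self._slots[key]
            return True, value, index
        return False, None, None

    def set(self, key, value):
        """Store the value; an existing key keeps its original slot index."""
        if key in self._slots:
            index = self._slots[key][0]
        else:
            index = len(self._slots)  # assumption: next index = current size
        self._slots[key] = (index, value)
        return index

    def flush(self):
        """Clear every slot and return how many were removed."""
        cleared = len(self._slots)
        self._slots.clear()
        return cleared


demo = SlotStore()
first = demo.set("a", "1")
second = demo.set("b", "2")
again = demo.set("a", "3")  # update in place, same slot as the first write
print(first, second, again, demo.peek("a"))
```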