src/app/docs/kagent/concepts/agent-memory/page.mdx
---
title: "Agent Memory"
pageOrder: 5
description: "Enable vector-backed long-term memory for agents to learn from past interactions."
---

export const metadata = {
  title: "Agent Memory",
  description: "Enable vector-backed long-term memory for agents to learn from past interactions.",
  author: "kagent.dev"
};

# Agent Memory

kagent provides long-term memory for agents using vector similarity search. Agents can automatically save and retrieve relevant context across conversations.

## Overview

Memory in kagent is:
- **Vector-backed** — Uses embedding models to encode memories as 768-dimensional vectors
- **Searchable** — Retrieves relevant memories via cosine similarity
- **Automatic** — Extracts and saves memories periodically without explicit user action
- **Time-bounded** — Memories expire after a configurable TTL (default 15 days)

## Supported Storage Backends

| Backend | Description |
|---------|-------------|
| **pgvector** (PostgreSQL) | Full-featured vector search using the pgvector extension |
| **Turso/libSQL** (SQLite) | Lightweight alternative using SQLite-compatible storage |

## Configuration

### Enable Memory on an Agent

Memory is enabled by adding a `memory` field to the agent spec. It references a `ModelConfig` resource whose embedding provider generates memory vectors.

```yaml
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: memory-agent
spec:
  type: Declarative
  declarative:
    modelConfig: default-model-config
    systemMessage: "You are a helpful assistant with long-term memory."
    memory:
      modelConfig: embedding-model-config
```

### Memory with Custom TTL

```yaml
memory:
  modelConfig: embedding-model-config
  ttlDays: 30
```

### MemorySpec Fields

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `modelConfig` | string | Yes | Name of the `ModelConfig` resource for embedding generation |
| `ttlDays` | int | No | Memory entry retention in days (default: 15, minimum: 1) |

## How Memory Works

### Automatic Save Cycle

1. The agent processes user messages normally
2. Every 5th user message, the agent automatically extracts key information
3. Extracted memories are summarized and encoded as embedding vectors
4. Vectors are stored in the database with metadata and timestamps
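
The save cycle above can be sketched as follows. This is an illustrative outline, not kagent's actual implementation — the class, function names, and storage shape are assumptions; only the every-5th-message interval and the extract → embed → store flow come from the docs.

```python
import time

SAVE_INTERVAL = 5  # every 5th user message triggers extraction


class MemoryStore:
    def __init__(self, embed):
        self.embed = embed          # callable: text -> embedding vector
        self.entries = []           # (vector, summary, timestamp) tuples
        self.message_count = 0

    def on_user_message(self, history):
        self.message_count += 1
        if self.message_count % SAVE_INTERVAL == 0:
            summary = extract_key_info(history)   # in practice, an LLM call
            vector = self.embed(summary)
            self.entries.append((vector, summary, time.time()))


def extract_key_info(history):
    # Placeholder for the LLM-based extraction step.
    return " / ".join(history[-SAVE_INTERVAL:])
```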

### Memory Retrieval (Prefetch)

Before generating a response, the agent:
1. Encodes the current user message as an embedding vector
2. Searches stored memories by cosine similarity
3. Injects the most relevant memories into the agent's context
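
The retrieval step is standard cosine-similarity ranking. A minimal sketch (the `top_k` cutoff and the `(vector, text)` memory shape are assumptions for the example):

```python
import math


def cosine(a, b):
    # Cosine similarity between two vectors; 0.0 for a zero vector.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def prefetch(query_vec, memories, top_k=3):
    # memories: list of (vector, text) pairs
    ranked = sorted(memories, key=lambda m: cosine(query_vec, m[0]), reverse=True)
    return [text for _, text in ranked[:top_k]]
```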

### Memory Tools

When memory is enabled, three tools are injected into the agent:

| Tool | Description |
|------|-------------|
| `save_memory` | Explicitly save a piece of information |
| `load_memory` | Search for relevant memories by query |
| `prefetch_memory` | Automatically run before response generation |
`prefetch_memory` runs automatically; `save_memory` and `load_memory` are available for the model to call explicitly during a conversation.

## Memory Management via API

```
POST /api/memories/sessions # Add a memory entry
POST /api/memories/sessions/batch # Add multiple memory entries
POST /api/memories/search # Search memories via vector similarity
GET /api/memories?agent_name=... # List memories for an agent
DELETE /api/memories?agent_name=... # Clear memories for an agent
```
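
A client addressing these endpoints might look like the sketch below. Only the paths come from the list above; the base URL, port, and search-payload field names are hypothetical assumptions.

```python
from urllib.parse import urlencode

BASE = "http://localhost:8083"  # hypothetical kagent API address


def memories_url(agent_name):
    # GET to list, DELETE to clear memories for an agent.
    return f"{BASE}/api/memories?{urlencode({'agent_name': agent_name})}"


def search_url():
    # POST target for vector-similarity search.
    return f"{BASE}/api/memories/search"


def search_payload(agent_name, query, limit=5):
    # Field names are illustrative, not the documented request schema.
    return {"agent_name": agent_name, "query": query, "limit": limit}
```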

## Technical Details

- Embedding vectors are normalized to 768 dimensions
- Background TTL pruning runs periodically (default retention: 15 days)
- Memory is per-agent — each agent has its own isolated memory store
- Memories include timestamps; session references are stored for explicit `save_memory` calls
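
The background TTL pruning pass amounts to filtering entries older than the retention cutoff. A simplified sketch, assuming each entry carries a `created_at` unix timestamp (the entry shape is illustrative):

```python
import time

DEFAULT_TTL_DAYS = 15  # matches the default retention above


def prune_expired(entries, ttl_days=DEFAULT_TTL_DAYS, now=None):
    # entries: list of dicts with a 'created_at' unix timestamp
    now = time.time() if now is None else now
    cutoff = now - ttl_days * 86400
    return [e for e in entries if e["created_at"] >= cutoff]
```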
src/app/docs/kagent/concepts/context-management/page.mdx
---
title: "Context Management"
pageOrder: 9
description: "Automatically manage conversation history to stay within LLM context windows during long conversations."
---

export const metadata = {
  title: "Context Management",
  description: "Automatically manage conversation history to stay within LLM context windows during long conversations.",
  author: "kagent.dev"
};

# Context Management

Long conversations can exceed LLM context windows. kagent provides **event compaction** to manage older messages, reducing token count while optionally preserving key information through summarization.

## Configuration

```yaml
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: long-conversation-agent
spec:
  type: Declarative
  declarative:
    modelConfig: default-model-config
    systemMessage: "You are a helpful agent for extended sessions."
    context:
      compaction:
        compactionInterval: 5
```

### Compaction Fields

| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `compactionInterval` | int | 5 | Number of agent invocations between compaction runs (minimum: 1) |
| `overlapSize` | int | 2 | Number of recent events to preserve uncompacted (minimum: 0) |
| `tokenThreshold` | int | — | Optional token count that triggers compaction |
| `eventRetentionSize` | int | — | Optional max number of events to retain |
| `summarizer` | object | — | Optional summarizer configuration (see below) |

### Adding Summarization

By default, compacted events are **dropped** from the context. To instead summarize them, configure a `summarizer`:

```yaml
context:
  compaction:
    compactionInterval: 5
    summarizer:
      modelConfig: summary-model-config
      promptTemplate: "Summarize the following conversation, preserving key decisions and outcomes."
```

When a summarizer is configured, older events are replaced with an LLM-generated summary rather than being discarded.

## How It Works

1. As the conversation progresses, events accumulate in the session history
2. Every N invocations (controlled by `compactionInterval`), compaction triggers
3. Older events beyond the `overlapSize` window are either dropped or summarized
4. The agent continues with the compacted history seamlessly
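
The compaction step can be sketched as: keep the last `overlapSize` events, then either drop the older ones or replace them with a single summary event. Function and parameter names here are illustrative, not kagent's API:

```python
def compact(events, overlap_size=2, summarize=None):
    # events: ordered session history, oldest first.
    if len(events) <= overlap_size:
        return list(events)
    recent = events[-overlap_size:] if overlap_size else []
    old = events[:len(events) - overlap_size]
    if summarize is None:
        return recent                      # default: older events are dropped
    return [summarize(old)] + recent       # summarizer configured: replace with summary
```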

## When to Use

Enable event compaction when:
- Agents handle long-running conversations (debugging sessions, investigations)
- Agents call many tools that generate large outputs
- You want to support extended interactions without hitting context limits

You may not need it for:
- Short, single-turn interactions
- Agents with small tool sets that generate compact outputs

## Context Caching Note

Prompt caching (a separate optimization that caches the prompt prefix for faster responses) is **not** configured at the agent level. Most LLM providers enable prompt caching by default.
src/app/docs/kagent/concepts/git-based-skills/page.mdx
---
title: "Git-Based Skills"
pageOrder: 8
description: "Load markdown knowledge documents from Git repositories to guide agent behavior."
---

export const metadata = {
  title: "Git-Based Skills",
  description: "Load markdown knowledge documents from Git repositories to guide agent behavior.",
  author: "kagent.dev"
};

# Git-Based Skills

Skills are markdown-based knowledge documents that agents load at startup. They provide domain-specific instructions, best practices, and procedures that guide agent behavior.

kagent supports two sources for skills:
- **OCI images** — Container images containing skill files (original approach)
- **Git repositories** — Clone skills directly from Git repos

## Skill File Format

Each skill is a directory containing a `SKILL.md` file with YAML frontmatter:

```markdown
---
name: kubernetes-troubleshooting
description: Guide for diagnosing and fixing common Kubernetes issues
---

# Kubernetes Troubleshooting

## Pod Crash Loops

When a pod is in CrashLoopBackOff:

1. Check logs: `kubectl logs <pod> --previous`
2. Check events: `kubectl describe pod <pod>`
3. Verify resource limits...
```

## Git Repository Configuration

### Basic Example

```yaml
apiVersion: kagent.dev/v1alpha2
kind: Agent
metadata:
  name: my-agent
spec:
  type: Declarative
  declarative:
    modelConfig: default-model-config
    systemMessage: "You are a helpful agent."
    skills:
      gitRefs:
        - url: https://github.com/myorg/agent-skills.git
          ref: main
```

### With Subdirectory

```yaml
skills:
  gitRefs:
    - url: https://github.com/myorg/monorepo.git
      ref: main
      path: skills/kubernetes
```

### Multiple Sources

Combine Git and OCI skills. OCI skill references are plain image strings:

```yaml
skills:
  refs:
    - ghcr.io/myorg/k8s-skills:latest
  gitRefs:
    - url: https://github.com/myorg/skills-repo.git
      ref: main
    - url: https://github.com/myorg/another-repo.git
      ref: develop
      path: agent-skills
```

## Authentication

### HTTPS Token Auth

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: git-credentials
type: Opaque
stringData:
  token: ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
```

```yaml
skills:
  gitAuthSecretRef:
    name: git-credentials
  gitRefs:
    - url: https://github.com/myorg/private-skills.git
      ref: main
```

### SSH Key Auth

```yaml
skills:
  gitAuthSecretRef:
    name: git-ssh-key
  gitRefs:
    - url: git@github.com:myorg/private-skills.git
      ref: main
```

> A single `gitAuthSecretRef` applies to all Git repositories in the agent. All repos must use the same authentication method.

## How It Works

Under the hood, kagent uses a lightweight init container containing Git and krane tools:

1. Before the agent pod starts, the `skills-init` container runs
2. It clones each Git repository to the skills volume
3. It also pulls any OCI skill images
4. The agent runtime discovers skills from the mounted volume at startup

## Skill Discovery at Runtime

Once loaded, skills are available through the built-in `SkillsTool`:
- **List skills:** The agent calls the tool with no arguments to see available skills
- **Load skill:** The agent calls the tool with a skill name to get the full content