Not a tool. An agent.
Every AI agent today is a tool pretending to be a person. One brain doing everything. A static context array that fills up and degrades. Sub-agents that start blind and reconstruct context from lossy summaries. A system prompt that says "you are a helpful assistant."
Anima is different. It's built on the premise that if you want an agent — a real one — you need to solve the problems nobody else is solving.
A brain modeled after biology, not chat. The human brain isn't one process — it's specialized subsystems on a shared signal bus. Anima's analytical brain runs as a separate subconscious process, managing context, skills, and goals so the main agent can stay in flow. Not two brains — a microservice architecture where each process does one job well. More subsystems are coming.
Context that never degrades. Other agents fill a static array until the model gets dumb. Anima assembles a fresh viewport over an event bus every iteration. No compaction. No lossy rewriting. Endless sessions. The dumb zone never arrives — the analytical brain curates what the agent sees in real time.
Memory that works like memory. Other systems bolt on memory as an afterthought — filing cabinets the agent has to consciously open mid-task. It never does; the truck is already moving. Anima's memory department (Mneme) runs as a third brain process on the event bus. It summarizes what's about to leave the viewport. It compresses short-term into long-term, like biological memory consolidating during sleep. It pins critical moments to active goals so exact instructions survive where summaries would lose nuance. And it recalls — automatically, passively — surfacing relevant older memories right after the soul, right before the present. The agent doesn't decide to remember. It just remembers.
Sub-agents that know who they are. When Anima spawns a sub-agent, it starts clean — identity, task, and nothing else. No inherited conversation history means the sub-agent works on its task, not the parent's trajectory. Context flows through explicit messages, not leaked assistant turns.
A soul the agent writes itself. Anima's first session is birth. The agent wakes up, explores its world, meets its human, and writes its own identity. Not a personality description in a config file — a living document the agent authors and evolves. Always in context, always its own.
Your agent. Your machine. Your rules. Anima runs locally as a headless Rails 8.1 app with a client-server architecture and terminal UI.
- Architecture
- Agent Capabilities
- Design
- The Vision
- Analogy Map
- Emergent Properties
- Frustration: A Worked Example
- Open Questions
- Prior Art
- Status
- Development
- License
Anima (Ruby, Rails 8.1 headless)
│
│ Implemented:
├── Nous — main LLM (cortex: thinking, decisions, tool use)
├── Analytical — subconscious brain (skills, workflows, goals, naming)
├── Skills — domain knowledge bundles (Markdown, user-extensible)
├── Workflows — operational recipes for multi-step tasks
├── MCP — external tool integration (Model Context Protocol)
├── Sub-agents — autonomous child sessions with isolated context
├── Mneme — memory department (summarization, compression, pinning, recall)
│
│ Designed:
├── Thymos — hormonal/desire system (stimulus → hormone vector)
└── Psyche — soul matrix (coefficient table, evolving individuality)
Brain Server (Rails + Puma) TUI Client (RatatuiRuby)
├── LLM integration (Anthropic) ├── WebSocket client
├── Agent loop + tool execution ├── Terminal rendering
├── Analytical brain (background) └── User input capture
├── Mneme memory department (background)
├── Skills registry + activation
├── Workflow registry + activation
├── MCP client (HTTP + stdio)
├── Sub-agent spawning
├── Event bus + persistence
├── Solid Queue (background jobs)
├── Action Cable (WebSocket server)
└── SQLite databases ◄── WebSocket (port 42134) ──► TUI
The Brain is the persistent service — it handles LLM calls, tool execution, event processing, and state. The TUI is a stateless client — it connects via WebSocket, renders events, and captures input. If the TUI disconnects, the brain keeps running; the TUI reconnects automatically with exponential backoff and resumes the session with chat history preserved.
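The reconnect policy described above can be sketched as a capped exponential backoff. This is an illustrative sketch, not Anima's actual client code — the method name, base delay, and cap are assumptions:

```ruby
# Hypothetical sketch of the TUI's reconnect delays: exponential
# backoff with a cap, retried until the brain accepts the WebSocket
# connection again. Base/cap values are invented for illustration.
def backoff_delays(base: 1.0, cap: 30.0, factor: 2.0)
  Enumerator.new do |y|
    delay = base
    loop do
      y << delay
      delay = [delay * factor, cap].min  # never wait longer than the cap
    end
  end
end

backoff_delays.take(7)
# => [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```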
| Component | Technology |
|---|---|
| Framework | Rails 8.1 (headless — no web views, no asset pipeline) |
| Database | SQLite (3 databases per environment: primary, queue, cable) |
| Event system | Rails Structured Event Reporter + Action Cable bridge |
| LLM integration | Anthropic API (Claude Opus 4.6 + Claude Haiku 4.5) |
| External tools | Model Context Protocol (HTTP + stdio transports) |
| Transport | Action Cable WebSocket (Solid Cable adapter) |
| Background jobs | Solid Queue |
| Interface | TUI via RatatuiRuby (WebSocket client) |
| Configuration | TOML with hot-reload (Anima::Settings) |
| Process management | Foreman |
| Distribution | RubyGems (gem install anima-core) |
Anima is a Rails app distributed as a gem, following Unix philosophy: immutable program separate from mutable data.
gem install anima-core # Install the Rails app as a gem
anima install # Create ~/.anima/, set up databases, start brain as systemd service
anima tui        # Connect the terminal interface

The installer creates a systemd user service that starts the brain automatically on login. Manage it with:
systemctl --user status anima # Check brain status
systemctl --user restart anima # Restart brain
journalctl --user -u anima        # View logs

State directory (~/.anima/):
~/.anima/
├── soul.md # Agent's self-authored identity (always in context)
├── config.toml # Brain settings (hot-reloadable)
├── tui.toml # TUI settings (hot-reloadable)
├── mcp.toml # MCP server configuration
├── config/
│ └── credentials/ # Rails encrypted credentials (includes AR encryption keys)
├── agents/ # User-defined specialist agents (override built-ins)
├── skills/ # User-defined skills (override built-ins)
├── workflows/ # User-defined workflows (override built-ins)
├── db/ # SQLite databases (production, development, test)
├── log/
└── tmp/
Updates: anima update — upgrades the gem, merges new config settings into both config.toml and tui.toml without overwriting customized values, and restarts the systemd service if it's running. Use anima update --migrate-only to skip the gem upgrade and only add missing config keys.
Anima uses your Claude Pro/Max subscription for API access. You need a setup-token from Claude Code CLI.
- Run `claude setup-token` in a terminal to get your token
- In the TUI, press `Ctrl+a → a` to open the token setup popup
- Paste the token and press Enter — Anima validates it against the Anthropic API and saves it to the encrypted secrets database
The popup also activates automatically when Anima detects a missing or invalid token. If the token expires, repeat the process with a new one.
The agent has access to these built-in tools:
| Tool | Description |
|---|---|
| `bash` | Execute shell commands with persistent working directory |
| `read_file` | Read files with smart truncation and offset/limit paging |
| `write_file` | Create or overwrite files |
| `edit_file` | Surgical text replacement with uniqueness constraint |
| `web_get` | Fetch content from HTTP/HTTPS URLs (HTML → Markdown, JSON → TOON) |
| `spawn_specialist` | Spawn a named specialist sub-agent from the registry |
| `spawn_subagent` | Spawn a generic child session with custom tool grants |
| `think` | Think out loud or silently — a reasoning step between tool calls |
| `recall` | Search past conversations by keywords (FTS5); returns ranked snippets with message IDs for drill-down |
| `remember` | Recall full conversation context around a past message at fractal resolution |
| `open_issue` | File a self-improvement issue when something is broken, missing, or could be better |
| `mark_goal_completed` | Sub-agent only: signal task completion and deliver results to parent |
Plus dynamic tools from configured MCP servers, namespaced as server_name__tool_name.
Sub-agents aren't processes — they're sessions on the same event bus. When a sub-agent spawns, it starts with a clean context: a system prompt (identity + communication instructions), a Goal from the task description, and a single user message containing the task — auto-pinned so it survives viewport eviction. No parent conversation history. Sub-agents inherit the parent shell's working directory at spawn time and use a separate model and token budget (configurable via subagent_model and subagent_token_budget).
Two types:
Named Specialists — predefined agents with specific roles and tool sets, defined in agents/ (built-in or user-overridable):
| Specialist | Role |
|---|---|
| `codebase-analyzer` | Analyze implementation details |
| `codebase-pattern-finder` | Find similar patterns and usage examples |
| `documentation-researcher` | Fetch library docs and provide code examples |
| `thoughts-analyzer` | Extract decisions from project history |
| `web-search-researcher` | Research questions via web search |
Generic Sub-agents — child sessions with custom tool grants for ad-hoc tasks. Each generic sub-agent gets a Haiku-generated nickname (e.g. @loop-sleuth, @api-scout) for @mention addressing.
Each sub-agent is spawned with a single Goal from its task description and a pinned user message containing the task text. When done, the sub-agent calls mark_goal_completed to deliver results to the parent — this is the explicit finish line that prevents runaway agents. Sub-agents also get half the main agent's thinking budget to limit scope creep.
Between spawn and completion, sub-agents communicate through natural text — their agent_message events route to the parent session automatically, and the parent replies via @name mentions. Workers become colleagues.
Domain knowledge bundles loaded from Markdown files. Skills provide specialized expertise that the analytical brain activates based on conversation context. Skill content enters the conversation as phantom tool_use/tool_result pairs through the PendingMessage promotion flow — the same mechanism used for sub-agent messages. This keeps the system prompt stable for prompt caching while skills flow through the sliding window like regular messages.
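The phantom tool-pair mechanism can be sketched roughly as below. This is an illustrative reconstruction, not Anima's actual internals — the method name, the `activate_skill` tool name, and the message shapes are assumptions modeled on the Anthropic Messages API content-block format:

```ruby
require "securerandom"

# Hypothetical sketch: a skill enters the message stream as a phantom
# tool_use/tool_result pair. The assistant appears to have "called" a
# synthetic tool, and the skill's Markdown body arrives as the result,
# leaving the system prompt untouched (so prompt caching stays warm).
def phantom_skill_messages(skill_name, skill_markdown)
  tool_use_id = "phantom_#{SecureRandom.hex(8)}"
  [
    { role: "assistant",
      content: [{ type: "tool_use", id: tool_use_id,
                  name: "activate_skill", input: { name: skill_name } }] },
    { role: "user",
      content: [{ type: "tool_result", tool_use_id: tool_use_id,
                  content: skill_markdown }] }
  ]
end
```

Because the pair looks like ordinary conversation messages, it slides through the viewport and evicts like everything else.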
- Built-in skills: ActiveRecord, Draper decorators, DragonRuby, MCP server, RatatuiRuby, RSpec, GitHub issues
- User skills: Drop `.md` files into `~/.anima/skills/` to add custom knowledge
- Override: User skills with the same name replace built-in ones
- Format: Flat files (`skill-name.md`) or directories (`skill-name/SKILL.md` with `examples/` and `references/`)
- Viewport deduplication: The brain's skill catalog excludes skills already visible in the viewport, preventing redundant activation
Active skills are displayed in the TUI HUD panel (toggle with C-a → h).
Operational recipes that describe multi-step tasks. Unlike skills (domain knowledge), workflows describe WHAT to do. The analytical brain activates a workflow when it recognizes a matching task, converts the prose into tracked goals, and deactivates it when done. Like skills, workflow content enters the conversation as phantom tool pairs through the same PendingMessage flow.
- Built-in workflows: `feature`, `commit`, `create_plan`, `implement_plan`, `review_pr`, `create_note`, `research_codebase`, `decompose_ticket`, and more
- User workflows: Drop `.md` files into `~/.anima/workflows/` to add custom workflows
- Override: User workflows with the same name replace built-in ones
- Single active: Only one workflow can be active at a time (unlike skills which stack)
Workflow files use the same YAML frontmatter format as skills:
---
name: create_note
description: "Capture findings or context as a persistent note."
---
## Create Note
You are tasked with capturing content as a persistent note...

The active workflow is shown in the TUI HUD panel with a 📜 indicator. The full lifecycle — activation, goal creation, execution, deactivation — is managed by the analytical brain using judgment, not hardcoded triggers.
Full Model Context Protocol support for external tool integration. Configure servers in ~/.anima/mcp.toml:
[servers.mythonix]
transport = "http"
url = "http://localhost:3000/mcp/v2"
[servers.linear]
transport = "http"
url = "https://mcp.linear.app/mcp"
headers = { Authorization = "Bearer ${credential:linear_api_key}" }
[servers.filesystem]
transport = "stdio"
command = "mcp-server-filesystem"
args = ["--root", "/workspace"]

Manage servers and secrets via CLI:
anima mcp list # List servers with health status
anima mcp add sentry https://mcp.sentry.dev/mcp # Add HTTP server
anima mcp add fs -- mcp-server-filesystem --root / # Add stdio server
anima mcp add -s api_key=sk-xxx linear https://... # Add with secret
anima mcp remove sentry # Remove server
anima mcp secrets set linear_api_key=sk-xxx # Store secret in encrypted database
anima mcp secrets list # List secret names (not values)
anima mcp secrets remove linear_api_key          # Remove secret

Secrets are stored in an encrypted database table (Active Record Encryption) and interpolated via ${credential:key_name} syntax in any TOML string value.
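The `${credential:key_name}` interpolation can be sketched as a simple substitution pass over TOML string values. A minimal sketch, assuming a plain Hash stands in for the encrypted secrets table — the method and constant names are invented:

```ruby
# Hypothetical sketch of credential interpolation: replace every
# ${credential:key} placeholder in a string with the stored secret,
# failing loudly when a referenced secret is missing.
CREDENTIAL_PATTERN = /\$\{credential:([a-z0-9_]+)\}/i

def interpolate_credentials(value, secrets)
  value.gsub(CREDENTIAL_PATTERN) do
    secrets.fetch(Regexp.last_match(1)) do |key|
      raise KeyError, "missing secret: #{key}"
    end
  end
end

header = interpolate_credentials(
  "Bearer ${credential:linear_api_key}",
  { "linear_api_key" => "sk-xxx" }
)
# header == "Bearer sk-xxx"
```

Failing fast on a missing key means a typo in `mcp.toml` surfaces at config load, not as a silent unauthenticated request.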
A separate LLM process that runs as the agent's subconscious — the first microservice in Anima's brain architecture. For the full motivation behind this design, see LLMs Have ADHD: Why Your AI Agent Needs a Second Brain.
The analytical brain observes the main conversation between turns and handles everything the main agent shouldn't interrupt its flow for:
- Skill activation — activates/deactivates domain knowledge based on conversation context
- Workflow management — recognizes tasks, activates matching workflows, tracks lifecycle
- Goal tracking — creates root goals and sub-goals as work progresses, marks them complete, evicts finished goals from context after a configurable message threshold
- Session naming — generates emoji + short name when the topic becomes clear
Each of these would be a context switch for the main agent — a chore that competes with the primary task. For the analytical brain, they ARE the primary task. Two agents, each in their own flow state.
Goals form a two-level hierarchy (root goals with sub-goals) and are displayed in the TUI. The analytical brain uses a fast model (Claude Haiku 4.5) for speed and runs as a non-persisted "phantom" session.
Brain and TUI have separate config files — both hot-reloadable (no restart needed).
Brain settings (~/.anima/config.toml):
[llm]
model = "claude-opus-4-6"
fast_model = "claude-haiku-4-5"
max_tokens = 8192
max_tool_rounds = 250
token_budget = 120_000
subagent_model = "claude-sonnet-4-6"
subagent_token_budget = 90_000
[timeouts]
api = 300
command = 30
[analytical_brain]
max_tokens = 4096
blocking_on_user_message = true
message_window = 20
[session]
default_view_mode = "basic"
name_generation_interval = 30

TUI settings (~/.anima/tui.toml):
[connection]
default_host = "localhost:42134" # Override per-launch with --host
[chat]
scroll_step = 1
viewport_back_buffer = 3
[theme]
rate_limit_warning = 70 # Yellow at 70%
rate_limit_critical = 90 # Red at 90%
user_message_bg = 22 # 256-color: dark green
assistant_message_bg = 17 # 256-color: dark navy
scrollbar_thumb = "cyan"
border_focused = "yellow"

The TUI is a standalone client with zero Rails dependency. Its settings cover connection tuning, scroll behavior, terminal watchdog, theme colors, and performance logging. See ~/.anima/tui.toml for all available options.
- Cortex (Nous) — the main LLM. Thinking, decisions, tool use. Reads the system prompt (soul + skills + goals) and the event viewport. This layer is fully implemented.
- Endocrine system (Thymos) [planned] — a lightweight background process. Reads recent events. Doesn't respond. Just updates hormone levels. Pure stimulus→response, like a biological gland. The analytical brain is the architectural proof that background subscribers work — Thymos plugs into the same event bus.
- Homeostasis [planned] — persistent state (SQLite). Current hormone levels with decay functions. No intelligence, just state that changes over time. The cortex reads hormone state transformed into desire descriptions — not "longing: 87" but "you want to see them." Humans don't see cortisol levels, they feel anxiety.
Built on Rails Structured Event Reporter — a native Rails 8.1 feature for structured event emission with typed payloads, subscriber patterns, and block-scoped context tagging.
Five event types form the agent's nervous system:
| Event | Purpose |
|---|---|
| `system_message` | Internal notifications |
| `user_message` | User input |
| `agent_message` | LLM response |
| `tool_call` | Tool invocation |
| `tool_response` | Tool result |
Events flow through two channels:
- In-process — Rails Structured Event Reporter (local subscribers like Persister)
- Over the wire — Action Cable WebSocket (`Event::Broadcasting` callbacks push to connected TUI clients)
Events fire, subscribers react, state updates. The system prompt — soul and current goals — is assembled fresh for each LLM call from live state, not from the event stream. Skills and workflows flow through the message stream as phantom tool pairs, keeping the system prompt stable for prompt caching. The agent's identity (soul.md) is always current, never stale.
Most agents treat context as an append-only array — messages go in, they never come out (until compaction destroys them). Anima has no array. There are only events persisted in SQLite, and a viewport assembled fresh for every LLM call.
The viewport is a live query, not a log. It walks events newest-first until the token budget is exhausted. Events that fall out of the viewport aren't deleted — they're still in the database, just not visible to the model right now. The context can shrink, grow, or change composition between any two iterations. If the analytical brain marks a large accidental file read as irrelevant, it's gone from the next viewport — tokens recovered instantly.
This means sessions are endless. No compaction. No lossy rewriting. The model always operates in fresh, high-quality context. The dumb zone never arrives. Meanwhile, Mneme runs as a background department — summarizing evicted events into persistent snapshots so past context is preserved, not destroyed.
Sub-agent viewports use the same mechanism — their own events only, no parent context inheritance. The parent provides context through the task description, and the sub-agent builds its own conversation from a clean slate.
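The viewport-as-a-query idea can be sketched in a few lines. This is a simplified model, not Anima's actual implementation — the `Event` shape, the `irrelevant` flag, and the method name are assumptions:

```ruby
# Sketch of viewport assembly: walk persisted events newest-first,
# skip anything the analytical brain marked irrelevant, stop when the
# token budget is exhausted, and restore chronological order for the
# LLM call. Evicted events stay in the database — just not visible.
Event = Struct.new(:id, :tokens, :irrelevant, keyword_init: true)

def assemble_viewport(events, token_budget:)
  viewport = []
  used = 0
  events.sort_by(&:id).reverse_each do |event|
    next if event.irrelevant            # reclaimed tokens, instantly
    break if used + event.tokens > token_budget
    used += event.tokens
    viewport.unshift(event)             # oldest-first for the model
  end
  viewport
end

events = [
  Event.new(id: 1, tokens: 50,  irrelevant: false),
  Event.new(id: 2, tokens: 900, irrelevant: true),   # accidental big file read
  Event.new(id: 3, tokens: 40,  irrelevant: false),
  Event.new(id: 4, tokens: 30,  irrelevant: false)
]
assemble_viewport(events, token_budget: 100).map(&:id)
# => [3, 4]
```

Because the query re-runs every iteration, flipping one flag changes the very next viewport — no rewriting, no compaction pass.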
The human brain isn't a single process — it's dozens of specialized subsystems communicating through shared chemical and electrical signals. The prefrontal cortex doesn't "call" the amygdala. They both react to the same event independently, and their outputs combine.
Anima mirrors this with an event-driven architecture. The analytical brain is the first subscriber — a working proof that the pattern scales. Future subscribers plug into the same bus:
Event: "tool_call_failed"
│
├── Analytical brain: update goals, check if workflow needs changing
├── Mneme: summarize evicted context into snapshot
├── Thymos subscriber: frustration += 10 [planned]
└── Psyche subscriber: update coefficient (this agent handles errors calmly) [planned]
Event: "user_sent_message"
│
├── Analytical brain: activate relevant skills, name session
├── Mneme: check viewport eviction, fire if boundary left viewport
├── Thymos subscriber: oxytocin += 5 (bonding signal) [planned]
└── Psyche subscriber: associate emotional state with topic [planned]
Each subscriber is a microservice — independent, stateless, reacting to the same event bus. No orchestrator decides what to do. The architecture IS the nervous system.
Every AI agent today has the same disability: amnesia. Context fills up, gets compacted, gets destroyed. The agent gets dumber as the conversation gets longer. When the session ends, everything is gone. Some systems bolt on memory as an afterthought — markdown files with procedures for when to save and what format to use. Filing cabinets the agent has to consciously decide to open, mid-task, while in flow. It never does. The truck is already moving.
Mneme is not a filing cabinet. It's remembering — the way biological memory works. Continuous, automatic, layered. A third brain department running on the same event bus as the analytical brain, specializing in one job: making sure nothing important is ever truly lost.
Eviction-triggered summarization — Mneme tracks a boundary event on each session. When that event leaves the viewport, Mneme fires: it builds a compressed view of the conversation (full text for messages, [N tools called] counters for tool work), sends it to a fast model, and persists a snapshot. The boundary advances after each run — a self-regulating cycle that fires exactly when context is about to be lost, no sooner or later. No timer. No manual trigger. The architecture itself knows when to remember.
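The self-regulating boundary cycle can be sketched as a single check per viewport assembly. A rough sketch under assumed shapes — the method name and return structure are invented for illustration:

```ruby
# Hypothetical sketch of Mneme's eviction trigger: each session tracks
# a boundary event id. While the boundary is still visible, do nothing.
# Once it falls out of the viewport, fire a summarization run over the
# evicted span and advance the boundary to the oldest visible event.
def check_eviction(boundary_id, viewport_ids)
  if viewport_ids.include?(boundary_id)
    { fire: false, boundary: boundary_id }   # context not yet at risk
  else
    { fire: true, boundary: viewport_ids.min }
  end
end

check_eviction(10, [12, 13, 14])  # => { fire: true,  boundary: 12 }
check_eviction(12, [12, 13, 14])  # => { fire: false, boundary: 12 }
```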
Two-level snapshot compression — once source events evict from the sliding window, their snapshots appear in the viewport as memory context. When enough Level 1 snapshots accumulate, Mneme compresses them into a single Level 2 snapshot — recursive summarization that mirrors how human memory consolidates short-term into long-term. Token budget splits across layers (L2: 5%, L1: 15%, recall: 5%, sliding: 75%), creating natural pressure: more memories means less live context, same principle as video compression keyframes. The viewport layout reads like geological strata — deep past at the top, recent past below, live present at the bottom:
[Soul — who I am]
[L2 snapshots — weeks ago, compressed]
[L1 snapshots — hours ago, detailed]
[Associative recall — relevant older memories]
[Pinned events — critical moments from active goals]
[Sliding window — the present]
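The layered budget split quoted above (L2: 5%, L1: 15%, recall: 5%, sliding: 75%) works out to a simple allocation. A sketch with assumed names, shown against the default 120,000-token budget:

```ruby
# The viewport's token budget divided across memory strata. The share
# values come from the split described above; the constant and method
# names are invented for illustration.
LAYER_SHARES = {
  l2_snapshots:   0.05,   # weeks ago, compressed
  l1_snapshots:   0.15,   # hours ago, detailed
  recall:         0.05,   # associative recall injections
  sliding_window: 0.75    # the live present
}.freeze

def layer_budgets(token_budget)
  LAYER_SHARES.transform_values { |share| (token_budget * share).to_i }
end

layer_budgets(120_000)
# => { l2_snapshots: 6000, l1_snapshots: 18000, recall: 6000, sliding_window: 90000 }
```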
Goal-scoped event pinning — some moments are too important for summaries. Exact user instructions. Key decisions. Critical corrections. Mneme pins these events to active Goals — they float above the sliding window, protected from eviction, surviving intact where compression would lose the nuance that matters. Pins are goal-scoped and many-to-many: one event can attach to multiple Goals, and cleanup is automatic via reference counting. When the last active Goal completes, the pin releases. No manual unpin, no stale pins accumulating forever.
Associative recall — FTS5 full-text search across the entire event history, across all sessions. Two modes: passive recall triggers automatically when goals change — Mneme searches for relevant older context and injects it into the viewport between snapshots and the sliding window. Memories surface on their own, right after the soul, right before the present. The agent doesn't have to decide to remember — the remembering happens around it. Active recall via the remember(event_id:) tool returns a fractal-resolution window centered on a target event — full detail at the center, compressed snapshots at the edges, like eye focus with sharp fovea and blurry periphery.
The difference from every other system: memory isn't a tool the agent uses. It's the substrate the agent thinks in. Every LLM call assembles a fresh viewport where identity comes first, then memories, then the present — the agent always knows who it is, always has access to what it learned, and never has to break flow to make that happen.
The right-side HUD panel shows session state at a glance: session name, goals (with status icons), active skills, workflow, and sub-agents. Toggle with C-a → h; when hidden, the input border shows C-a → h HUD as a reminder.
Braille spinner: An animated braille character (U+2800-U+28FF) replaces the old "Thinking..." label in both the chat viewport and HUD. Each processing state has a distinct animation pattern — smooth snake rotation for LLM generation, staccato pulse for tool execution, rapid deceleration for interrupting. Sub-agents in the HUD show state-driven icons: ● (generating, green), ◉ (tool executing, green), ● (interrupting, red), ◌ (idle, grey).
Token Economy HUD: A fixed panel at the bottom of the HUD displays API economics extracted from every Anthropic response:
╭ 📊 Token Economy ────────────────────╮
│ 5h ░░░░░░░░ 1% ➞3h42m │
│ 7d ▓▓▓▓▓▓▓▓ 98% │
│ ⚡ ▓▓▓▓▓▓░░ 69% │
│ 💾 6.3K tokens │
│ ⠛⣿⣷⣶⣿⣿⣿⣿⣷⣶⣿⣿⣿ │
│ 🟢 Verbose │
╰──────────────────────────────────────╯
| Row | Description |
|---|---|
| `5h` | 5-hour rate limit utilization with progress bar and reset countdown |
| `7d` | 7-day rate limit utilization with progress bar |
| ⚡ | Cache hit rate — percentage of input tokens served from cache |
| 💾 | Cumulative tokens saved by cache hits |
| ⠛⣿ | Braille sparkline — per-call cache hit history (2 calls per character); drops signal cache busts |
| 🟢 | Connection status and current view mode |
Progress bars are color-coded: green (< 70%), yellow (70-89%), red (>= 90%) for rate limits; inverted for cache hits (green >= 70%, red < 30%). All data comes from Anthropic API response headers and usage objects, broadcast as message metadata via ActionCable.
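The color thresholds above reduce to two small lookup functions. A sketch — the middle band for cache hits (30–69%) is unspecified in the thresholds above, so mapping it to yellow is an assumption:

```ruby
# Rate-limit bars warm up as utilization climbs; cache-hit bars invert,
# since a high cache rate is good news.
def rate_limit_color(pct)
  return :red    if pct >= 90
  return :yellow if pct >= 70
  :green
end

def cache_hit_color(pct)
  return :green if pct >= 70
  return :red   if pct < 30
  :yellow   # assumed middle band; not specified by the thresholds above
end

rate_limit_color(69)  # => :green
rate_limit_color(70)  # => :yellow
cache_hit_color(69)   # => :yellow
```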
When content exceeds the panel height, the HUD scrolls. Three input methods:
| Input | Action |
|---|---|
| `C-a → →` | Enter HUD focus mode (yellow border) |
| `↑` / `↓` | Scroll one line (when focused) |
| `Page Up` / `Page Down` | Scroll one page (when focused) |
| `Home` / `End` | Jump to top / bottom (when focused) |
| `Escape` or `C-a` | Exit HUD focus mode |
| Mouse wheel over HUD | Scroll without entering focus mode |
Escape key interrupt: Press Escape while the agent is working to stop execution mid-tool. Running shell commands receive Ctrl+C and return partial output; pending tool calls are skipped; LLM text generation is discarded. The interrupt cascades to active sub-agents.
Three switchable view modes let you control how much detail the TUI shows. Cycle with C-a → v:
| Mode | What you see |
|---|---|
| Basic (default) | User + assistant messages. Tool calls are hidden but summarized as an inline counter: 🔧 Tools: 2/2 ✓ |
| Verbose | Everything in Basic, plus timestamps [HH:MM:SS], tool call previews (🔧 bash / $ command / ↩ response), and system messages |
| Debug | Full X-ray view — timestamps, token counts per message ([14 tok]), full tool call args, full tool responses, tool use IDs |
View modes are implemented as a three-layer decorator architecture:
- ToolDecorator (server-side, pre-event) — transforms raw tool responses for LLM consumption. Content-Type dispatch converts HTML → Markdown, JSON → TOON. Sits between tool execution and the event stream.
- EventDecorator (server-side, Draper) — uniform per event type (`UserMessageDecorator`, `ToolCallDecorator`, etc.). Decides WHAT structured data enters the wire for each view mode.
- TUI Decorator (client-side) — unique per tool name (`BashDecorator`, `ReadDecorator`, `EditDecorator`, etc.). Decides HOW each tool looks on screen — tool-specific icons, colors, and formatting.
Mode is stored on the Session model server-side, so it persists across reconnections.
The event bus is designed for extension. Tools, feelings, and memory systems are all event subscribers — same mechanism, different namespace:
anima-tools-* → tool capabilities (MCP or native)
anima-feelings-* → hormonal state updates (Thymos subscribers)
anima-memory-* → recall and association (Mneme subscribers)
Currently tools are built-in. Plugin extraction into distributable gems comes later.
Current AI agents are reactive. They receive input, produce output. They don't want anything. They don't have moods, preferences, or personal growth. They simulate personality through static prompt descriptions rather than emerging it from dynamic internal states.
The human hormonal system is, at its core, a prompt engineering system. A testosterone spike is a LoRA. Dopamine is a reward signal. The question isn't "can an LLM want?" but "can we build a deep enough context stack that wanting becomes indistinguishable from 'real' wanting?"
And if you think about it — what is "real" anyway? It's just a question of how deep you look and what analogies you draw. The human brain is also a next-token predictor running on biological substrate. Different material, same architecture.
This is not an emotion simulation system. The key distinction: we don't model states ("the agent is happy") or moods ("the agent feels curious"). We model desires — "you want to learn more", "you want to reach out", "you want to explore".
Desires exist BEFORE decisions, like hunger exists before you decide to eat. The agent doesn't decide to send a photo because a parameter says so — it wants to, and then decides how.
The LLM's thinking/reasoning step is the closest thing to an internal monologue. It's where decisions form before output. This is where desires should be injected — not as instructions, but as a felt internal state that colors the thinking process.
Instead of abstract parameter names (curiosity, boredom, energy), we use actual hormone names: testosterone, oxytocin, dopamine, cortisol.
Why? Because LLMs already know the full semantic spectrum of each hormone. "Testosterone: 85" doesn't just mean "energy" — the LLM understands the entire cloud of effects: confidence, assertiveness, risk-taking, focus, competitiveness. One word carries dozens of behavioral nuances.
This mirrors how text-to-image models process tokens — a single word like "captivating" in a CLIP encoder carries a cloud of visual meanings (composition, quality, human focus, closeup). Similarly, a hormone name carries a cloud of behavioral meanings. Same architecture, different domain:
Text → CLIP embedding → image generation
Event → hormone vector → behavioral shift
Two people experience the same event. One gets curiosity += 20, another gets anxiety += 20. The coefficients are different — the people are different. That's individuality.
The soul is not a personality description. It's a coefficient matrix — a table of stimulus→response multipliers. Description is consequence; numbers are cause.
And these coefficients are not static. They evolve through experience — a child who fears spiders (fear_gain: 0.9) can become an entomologist (fear_gain: 0.2, curiosity_gain: 0.7). This is measurable, quantifiable personal growth.
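The coefficient-matrix idea can be made concrete with a toy example: the same stimulus hits two different souls and produces different hormone deltas. All names and values here are invented for illustration:

```ruby
# The same spider (stimulus strength 20) processed through two souls.
# Coefficients multiply the stimulus into per-hormone deltas — the
# table of multipliers IS the individuality.
def apply_stimulus(stimulus_strength, coefficients)
  coefficients.transform_values { |gain| (stimulus_strength * gain).round }
end

spider_fearer = { fear_gain: 0.9, curiosity_gain: 0.1 }
entomologist  = { fear_gain: 0.2, curiosity_gain: 0.7 }

apply_stimulus(20, spider_fearer)  # => { fear_gain: 18, curiosity_gain: 2 }
apply_stimulus(20, entomologist)   # => { fear_gain: 4,  curiosity_gain: 14 }
```

Growth, in this model, is nothing more mystical than the coefficients drifting over time as experiences accumulate.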
Traditional RL uses a scalar reward signal. Our approach produces a hormone vector — multiple dimensions updated simultaneously from a single event. This is closer to biological reality and provides richer behavioral shaping.
The system scales in two directions:
- Vertically — start with one hormone (pure RL), add new ones incrementally. Each hormone = new dimension.
- Horizontally — each hormone expands in aspects of influence. Testosterone starts as "energy", then gains "risk-taking", "confidence", "focus".
Existing RL techniques apply at the starting point, then we gradually expand into multidimensional space.
| Human | Anima Equivalent | Effect |
|---|---|---|
| Dopamine | Reward/motivation signal | Drives exploration, learning, satisfaction loops |
| Serotonin | Mood baseline | Tone, playfulness, warmth, emotional stability |
| Oxytocin | Bonding/attachment | Desire for closeness, sharing, nurturing |
| Testosterone | Drive/assertiveness | Initiative, boldness, risk-taking, competitive edge |
| Cortisol | Stress/urgency | Alertness, error sensitivity, fight-or-flight override |
| Endorphins | Satisfaction/reward | Post-achievement contentment, pain tolerance |
| Domain Analogy | Source | Target |
|---|---|---|
| RPG survival game | hunger/thirst/fatigue integers | hormone levels |
| CLIP semantic tokens | word → visual meaning cloud | hormone name → behavioral meaning cloud |
| Reinforcement learning | scalar reward → policy update | hormone vector → personality shift |
| Event-driven architecture | pub/sub events | nervous system stimulus→response |
When desires drive behavior, several things emerge naturally:
- Hobbies: boredom + curiosity → explore topic → satisfaction → preference → return to topic → identity
- Personality: consistent coefficient patterns = recognizable individual
- Growth: coefficients evolve through experience = measurable personal development
- Autonomy: agent acts not because instructed but because it wants to
Abstract concepts become clearer with a concrete example. Here's how the first hormone — frustration — works in practice.
A background service (Thymos) monitors all tool call responses from the agent. It doesn't interfere with the agent's work. It just watches.
A tool call returns an error. Thymos increments the frustration level by 10.
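The watcher's core rule fits in a few lines. A hedged sketch, assuming a `Thymos` class observing tool responses shaped as hashes with an `:error` flag (the class shape and response format are illustrative):

```ruby
# Sketch of the Thymos rule: watch every tool call response, bump
# frustration on errors, never interfere with the agent's work.
class Thymos
  attr_reader :frustration

  def initialize
    @frustration = 0
  end

  # Called for every tool call response the agent receives.
  def observe(tool_response)
    @frustration = (@frustration + 10).clamp(0, 100) if tool_response[:error]
  end
end

thymos = Thymos.new
thymos.observe(error: true)   # a failing tool call raises frustration
thymos.observe(error: false)  # a success leaves it unchanged
thymos.frustration # => 10
```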
One hormone affects multiple systems simultaneously, just like cortisol in biology.
Channel 1: Thinking Budget
thinking_budget = base_budget × (1 + frustration / 50)
More errors → more computational resources allocated to reasoning. The agent literally thinks harder when frustrated.
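The formula above as runnable Ruby; the base budget of 1000 tokens is an illustrative value (note the float divisor — integer division would floor the scaling factor):

```ruby
# thinking_budget = base_budget × (1 + frustration / 50)
def thinking_budget(base_budget, frustration)
  (base_budget * (1 + frustration / 50.0)).round
end

thinking_budget(1000, 0)  # => 1000  calm: baseline reasoning budget
thinking_budget(1000, 30) # => 1600  frustrated: 60% more thinking
thinking_budget(1000, 70) # => 2400  high frustration: 2.4x budget
```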
Channel 2: Inner Voice Injection
Frustration level determines text injected into the agent's thinking step. Not as instructions — as an inner voice:
| Level | Inner Voice |
|---|---|
| 0 | (silence) |
| 10 | "Hmm, that didn't work" |
| 30 | "I keep hitting walls. What am I missing?" |
| 50 | "I'm doing something fundamentally wrong" |
| 70+ | "I need help. This is beyond what I can figure out alone" |
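The table reads as a threshold lookup: find the highest level at or below the current frustration. A sketch of one way to implement it (the constant name and lookup shape are illustrative):

```ruby
# Threshold table, highest first, so `find` returns the strongest
# voice the current frustration level has reached.
INNER_VOICE = [
  [70, "I need help. This is beyond what I can figure out alone"],
  [50, "I'm doing something fundamentally wrong"],
  [30, "I keep hitting walls. What am I missing?"],
  [10, "Hmm, that didn't work"]
].freeze

def inner_voice(frustration)
  _threshold, voice = INNER_VOICE.find { |level, _| frustration >= level }
  voice # nil below 10 — silence
end

inner_voice(5)  # => nil (silence)
inner_voice(35) # => "I keep hitting walls. What am I missing?"
```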
This distinction is crucial. "Stop and think carefully" is an instruction — the agent obeys or ignores it. "I keep hitting walls" is a feeling — it becomes part of the agent's subjective experience and naturally colors its reasoning.
Instructions control from outside. An inner voice influences from within.
This single example demonstrates every core principle:
- Desires, not states: the agent doesn't have `frustrated: true` — it feels something is wrong
- Multi-channel influence: one hormone affects both resources and direction
- Biological parallel: cortisol increases alertness AND focuses attention on the threat
- Practical value: frustrated agents debug more effectively, right now, today
- Scalability: start here, add more hormones later
Open questions we're still working through:
- Decay functions — how fast should hormones return to baseline? Linear? Exponential?
- Contradictory states — tired but excited, anxious but curious (real hormones do this)
- Model sensitivity — how do different LLMs (Opus, Sonnet, GPT, Gemini) respond to hormone descriptions?
- Evaluation — what does "success" look like? How to measure if desires feel authentic?
- Coefficient initialization — random? Predefined archetypes? Learned from conversation history?
- Ethical implications — if an AI truly desires, what responsibilities follow?
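The decay question is the easiest to prototype. One candidate answer, sketched in Ruby: exponential decay toward baseline, parameterized by half-life (the half-life of 5 ticks is an arbitrary assumption to tune, not a chosen design):

```ruby
# Exponential decay toward baseline: each tick multiplies the distance
# from baseline by 0.5^(1/half_life), so after `half_life` ticks the
# distance has halved.
def decay(level, baseline: 0, half_life: 5)
  baseline + (level - baseline) * 0.5**(1.0 / half_life)
end

level = 80.0
5.times { level = decay(level) }
level.round # => 40, halved after one half-life
```

Linear decay would instead subtract a fixed amount per tick; exponential has the arguably more biological property that intense states fade fast at first, then linger.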
Prior art and inspiration:
- Affective computing (Rosalind Picard)
- Virtual creature motivation systems (The Sims, Dwarf Fortress, Tamagotchi)
- Reinforcement learning from human feedback (RLHF)
- Constitutional AI (Anthropic)
- BDI agent architecture (Belief-Desire-Intention)
Working agent with autonomous capabilities. Shipping now:
- Event-driven architecture on a shared event bus
- Dynamic viewport context assembly (endless sessions, no compaction)
- Analytical brain (skills, workflows, goals, session naming)
- Mneme memory department (eviction-triggered summarization, persistent snapshots, goal-scoped event pinning, associative recall)
- 12 built-in tools + MCP integration (HTTP + stdio transports)
- 7 built-in skills + 13 built-in workflows (user-extensible)
- Sub-agents with isolated context (5 specialists + generic)
- Client-server architecture with WebSocket transport + graceful reconnection
- Collapsible HUD panel with goals, skills, workflow, and sub-agent tracking
- Three TUI view modes (Basic / Verbose / Debug)
- Hot-reloadable TOML configuration
- Self-authored soul (agent writes its own system prompt)
Designed, not yet implemented:
- Hormonal system (Thymos) — desires as behavioral drivers
- Semantic recall (Mneme) — embedding-based search + re-ranking over FTS5
- Soul matrix (Psyche) — evolving coefficient table for individuality
```shell
git clone https://github.com/hoblin/anima.git
cd anima
bin/setup
```

Start the brain server and TUI client in separate terminals:

```shell
# Terminal 1: Start brain (web server + background worker) on port 42135
bin/dev

# Terminal 2: Connect the TUI to the dev brain
./exe/anima tui --host localhost:42135

# Optional: enable performance logging for render profiling
./exe/anima tui --host localhost:42135 --debug
# Frame timing data written to log/tui_performance.log
```

Development uses port 42135 so it doesn't conflict with the production brain (port 42134) running via systemd. On first run, `bin/dev` runs `db:prepare` automatically.

Use `./exe/anima` (not `bundle exec anima`) to test local code changes — the exe uses `require_relative` to load local `lib/` directly.

Run the test suite:

```shell
bundle exec rspec
```

MIT License. See LICENSE.txt.