🚀 Introduction to ROMA

🌟 Universal Task Execution Framework

ROMA is a general-purpose, hierarchical task execution framework that empowers you to build ANY intelligent system by decomposing complex tasks into fundamental primitives. There are currently three primitives: Think 🤔, Write ✍️, and Search 🔍.

This isn't just another AI tool—it's a universal scaffold for building high-performance agents that can handle ANY task that can be broken down using the MECE (Mutually Exclusive, Collectively Exhaustive) principle. From podcast generation to market analysis, from story writing to code generation—if you can imagine it, you can build it.

🎯 The MECE Framework

At the heart of ROMA lies the MECE principle (Mutually Exclusive, Collectively Exhaustive), which provides a universal framework for decomposing ANY task into three fundamental operations.

What is MECE?

MECE is a problem-solving principle that ensures complete coverage without overlap:

Mutually Exclusive: Each operation type is distinct—there's no ambiguity about whether something is a THINK, WRITE, or SEARCH operation
Collectively Exhaustive: These three operations cover ALL possible tasks—there's nothing you need to do that doesn't fit into one of these categories

The Three Universal Operations

🤔 THINK - Reasoning & Analysis

Any cognitive operation that processes information without creating new content or retrieving external data:

Data Analysis: Finding patterns, trends, insights
Decision Making: Choosing between options, evaluating trade-offs
Planning: Breaking down problems, creating strategies
Evaluation: Assessing quality, checking correctness
Problem Solving: Finding solutions, debugging issues

✍️ WRITE - Content Generation & Synthesis

Any operation that creates new content or synthesizes existing information:

Document Creation: Reports, articles, documentation
Code Generation: Writing programs, scripts, configurations
Creative Content: Stories, scripts, marketing copy
Synthesis: Combining multiple sources into cohesive output
Formatting: Structuring and presenting information

🔍 SEARCH - Information Retrieval

Any operation that gathers information from external sources:

Web Research: Finding online information
Database Queries: Retrieving structured data
API Calls: Fetching from external services
Literature Review: Academic or technical research
Data Collection: Gathering raw information

Why MECE Matters

Universal Applicability: ANY task can be broken down into these three operations
Clear Boundaries: No confusion about which operation to use
Complete Coverage: Nothing falls through the cracks
Scalable Complexity: Simple tasks use few operations, complex tasks use many

MECE in Action

Consider building a market analysis report:

Goal: "Create a comprehensive market analysis for electric vehicles"

Decomposition:
1. SEARCH: Gather market data and statistics
2. SEARCH: Find competitor information
3. SEARCH: Collect regulatory information
4. THINK: Analyze market trends
5. THINK: Identify opportunities and threats
6. WRITE: Create executive summary
7. WRITE: Detailed analysis sections
8. THINK: Review and ensure coherence
9. WRITE: Final report with recommendations

Each operation is clearly one type, and together they completely achieve the goal.
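
The decomposition above can also be written down as plain data. The following is a minimal sketch, assuming a hypothetical `Step` tuple and an illustrative `TaskType` enum (the enum name mirrors `TaskType.SEARCH` used later in this document, but the definitions here are assumptions, not ROMA's actual API):

```python
from collections import Counter
from enum import Enum
from typing import NamedTuple


class TaskType(Enum):
    """The three MECE operation types (illustrative definition)."""
    THINK = "THINK"
    WRITE = "WRITE"
    SEARCH = "SEARCH"


class Step(NamedTuple):
    """One step of a decomposition: exactly one type plus a goal (hypothetical)."""
    task_type: TaskType
    goal: str


# The electric-vehicle market analysis from the example above.
plan = [
    Step(TaskType.SEARCH, "Gather market data and statistics"),
    Step(TaskType.SEARCH, "Find competitor information"),
    Step(TaskType.SEARCH, "Collect regulatory information"),
    Step(TaskType.THINK, "Analyze market trends"),
    Step(TaskType.THINK, "Identify opportunities and threats"),
    Step(TaskType.WRITE, "Create executive summary"),
    Step(TaskType.WRITE, "Detailed analysis sections"),
    Step(TaskType.THINK, "Review and ensure coherence"),
    Step(TaskType.WRITE, "Final report with recommendations"),
]

# Count how many steps of each type the plan uses.
counts = Counter(step.task_type for step in plan)
for task_type in TaskType:
    print(f"{task_type.value}: {counts[task_type]}")
```

Because each step carries exactly one `TaskType`, the "mutually exclusive" half of MECE is enforced by the data structure itself; whether the steps are "collectively exhaustive" for the goal is still a judgment the planner has to make.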

🔄 Execution Flow Architecture

ROMA uses a sophisticated three-directional execution flow that mirrors natural problem-solving:

1. Top-Down Decomposition ⬇️

Tasks flow from general to specific:

"Build a mobile app" (General)
    ↓
"Design UI" + "Build Backend" + "Write Tests" (Specific)
    ↓
"Create login screen" + "Design dashboard" + ... (More Specific)

2. Bottom-Up Aggregation ⬆️

Results flow from specific to general:

Individual UI screens (Specific)
    ↑
Complete UI design (Less Specific)
    ↑
Fully functional app (General)

3. Left-to-Right Dependencies ➡️

Tasks can depend on siblings for context:

"Research users" → "Design features" → "Build MVP"
     (First)           (Uses research)    (Uses both)

The Recursive Process

1. Every task starts at an ATOMIZER
   - Evaluates task complexity
   - Decides: can this task be executed directly (EXECUTE), or does it need planning (PLAN)?
2. If EXECUTE node:
   - Task is atomic (can't be broken down further)
   - Appropriate executor agent is called
   - Result is returned
3. If PLAN node:
   - Task is complex and needs decomposition
   - Planner agent breaks it into subtasks (THINK, WRITE, or SEARCH)
   - Each subtask goes through its own atomizer
   - Process repeats recursively to any depth
4. Horizontal Dependencies (Optional):
   - Tasks can depend on siblings at the same level
   - Dependent tasks wait for predecessors to complete
   - Results flow left-to-right when dependencies exist
   - Independent tasks execute in parallel
5. AGGREGATOR collects results:
   - Once all subtasks complete
   - Combines results intelligently based on context
   - Returns synthesized result to parent
   - Parent may itself be a subtask in a larger tree
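
The recursive process can be sketched in a few lines. This is a simplified illustration, not ROMA's implementation: the `solve` function and the four toy callables are hypothetical stand-ins for the atomizer, planner, executor, and aggregator agents, and the sketch runs subtasks one after another, leaving out parallelism and sibling dependencies.

```python
from typing import Callable, List


def solve(
    goal: str,
    is_atomic: Callable[[str], bool],
    plan: Callable[[str], List[str]],
    execute: Callable[[str], str],
    aggregate: Callable[[str, List[str]], str],
) -> str:
    """Recursively solve a goal: atomize, then execute or plan and aggregate."""
    if is_atomic(goal):
        # ATOMIZER chose EXECUTE: the task is a leaf.
        return execute(goal)
    # ATOMIZER chose PLAN: decompose, then recurse into every subtask.
    subgoals = plan(goal)
    results = [solve(g, is_atomic, plan, execute, aggregate) for g in subgoals]
    # AGGREGATOR combines child results and returns them to the parent.
    return aggregate(goal, results)


# Toy stand-ins for the agents: a goal is "complex" if it contains " and ".
def toy_is_atomic(goal: str) -> bool:
    return " and " not in goal


def toy_plan(goal: str) -> List[str]:
    return goal.split(" and ")


def toy_execute(goal: str) -> str:
    return f"done({goal})"


def toy_aggregate(goal: str, results: List[str]) -> str:
    return " + ".join(results)


result = solve(
    "research users and design features and build MVP",
    toy_is_atomic, toy_plan, toy_execute, toy_aggregate,
)
print(result)  # done(research users) + done(design features) + done(build MVP)
```

In a real system each callable would be backed by an agent, and `plan` could return further complex goals, in which case `solve` simply recurses another level.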

🎚️ Recursive Depth Control

One of ROMA's most powerful features is customizable recursion depth, allowing you to control the granularity of task decomposition.

Note: so far, we have found that most use cases work well at depths of 3 or less.

Controlling Depth

# Shallow depth for quick tasks
# Note: `await` must run inside an async function (or a notebook cell)
agent = SentientAgent.create(max_depth=1)
quick_result = await agent.run("Summarize this article")

Depth Guidelines

| Task Complexity | Recommended Depth | Use Cases |
|---|---|---|
| Simple | 1 | Summaries, quick searches, basic writing |
| Moderate | 2-3 | Blog posts, reports, standard analysis |
| Complex | 3-4 | In-depth reports, comprehensive story generation |
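
To see why depth matters, it helps to look at how a depth cap interacts with the atomizer and how quickly the task tree can grow. The sketch below rests on two assumptions that this document does not confirm: the root task sits at depth 0, and a task that reaches `max_depth` is executed directly even if the atomizer would have preferred to plan. Both function names are hypothetical.

```python
def choose_action(is_atomic: bool, depth: int, max_depth: int) -> str:
    """Return "EXECUTE" or "PLAN" for a task at the given depth.

    Assumption: once depth reaches max_depth, the task is executed directly.
    """
    if is_atomic or depth >= max_depth:
        return "EXECUTE"
    return "PLAN"


def max_leaf_tasks(branching: int, max_depth: int) -> int:
    """Upper bound on leaf tasks if every PLAN node creates `branching` subtasks."""
    return branching ** max_depth


# With three subtasks per plan, each extra level triples the worst case.
for depth in (1, 2, 3, 4):
    print(depth, max_leaf_tasks(3, depth))
```

Under these assumptions a depth of 4 with three subtasks per plan can already produce up to 81 leaf tasks, which is one reason to keep depth low unless the task really needs it.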

🔍 Stage Tracing & Transparency

Stage Tracing is what sets ROMA apart—complete visibility into every step of the execution process.

What is Stage Tracing?

Stage Tracing provides a detailed log of:

Inputs: Exactly what each agent receives
Processing: How the agent interprets and processes the input
Outputs: What the agent produces
Context: The surrounding information used
Decisions: Why certain choices were made
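
As an illustration, a single traced stage could be captured in a record like the one below. The `StageTrace` class and its field names are hypothetical, chosen to mirror the five items listed above; they are not ROMA's actual trace schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, List


@dataclass
class StageTrace:
    """One traced stage of a task (hypothetical record, not ROMA's schema)."""
    stage: str                       # e.g. "atomizer", "planner", "executor"
    inputs: Dict[str, Any]           # exactly what the agent received
    processing: str                  # how the agent interpreted the input
    outputs: Dict[str, Any]          # what the agent produced
    context: Dict[str, Any] = field(default_factory=dict)
    decisions: List[str] = field(default_factory=list)


trace: List[StageTrace] = []

trace.append(StageTrace(
    stage="atomizer",
    inputs={"goal": "Write a blog post about AI"},
    processing="Estimated that the goal needs several distinct steps",
    outputs={"is_atomic": False},
    decisions=["Too complex, route to PLAN"],
))

# Dump the trace as JSON so each step can be inspected or diffed.
print(json.dumps([asdict(t) for t in trace], indent=2))
```

Keeping the trace as plain, serializable data makes it easy to store alongside results and compare runs after a prompt or agent change.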

Benefits of Stage Tracing

Debugging Made Easy
- See exactly where issues occur
- Understand why certain outputs were produced
- Identify bottlenecks or inefficiencies
Trust Through Transparency
- No "black box" mystery
- Understand the reasoning process
- Verify correctness at each step
Rapid Iteration
- See immediate effects of changes
- Test different approaches quickly
- Build confidence in your agents

🌳 Hierarchical Task Decomposition

The core principle of ROMA is hierarchical task decomposition through a recursive atomizer-planner-executor architecture.

The Concept

The framework mirrors human problem-solving through a recursive process:

1. Atomizer evaluates - Is this task atomic or does it need planning?
2. If atomic - Execute directly with appropriate agent
3. If complex - Plan and decompose into subtasks (THINK, WRITE, SEARCH)
4. Recursively process - Each subtask goes through the same evaluation
5. Aggregate results - Combine outputs bottom-up through aggregators

Visual Example with Atomizer Flow

"Write a research paper on climate change" 
            │
            ▼ [ATOMIZER: Too complex → PLAN]
├── Research current climate data
│   │
│   ▼ [ATOMIZER: Too complex → PLAN]
│   ├── Search temperature trends
│   │   ▼ [ATOMIZER: Atomic → EXECUTE]
│   ├── Search sea level data
│   │   ▼ [ATOMIZER: Atomic → EXECUTE]
│   └── Search extreme weather patterns
│       ▼ [ATOMIZER: Atomic → EXECUTE]
│   ▲ [AGGREGATOR: Combine search results]
│
├── Analyze environmental impacts
│   │
│   ▼ [ATOMIZER: Too complex → PLAN]
│   ├── Impact on ecosystems
│   │   ▼ [ATOMIZER: Atomic → EXECUTE]
│   ├── Impact on human societies
│   │   ▼ [ATOMIZER: Atomic → EXECUTE]
│   └── Economic consequences
│       ▼ [ATOMIZER: Atomic → EXECUTE]
│   ▲ [AGGREGATOR: Synthesize analysis]
│
└── Write and format paper
    │
    ▼ [ATOMIZER: Too complex → PLAN]
    ├── Create outline
    │   ▼ [ATOMIZER: Atomic → EXECUTE]
    ├── Write sections
    │   ▼ [ATOMIZER: Atomic → EXECUTE]
    └── Add citations
        ▼ [ATOMIZER: Atomic → EXECUTE]
    ▲ [AGGREGATOR: Compile final paper]

Key Components in Action

ATOMIZER - The gatekeeper that decides task handling:
- Evaluates complexity
- Routes to PLAN or EXECUTE
- Ensures appropriate decomposition depth
PLAN NODE - The decomposer:
- Breaks complex tasks into MECE subtasks
- Assigns task types (THINK, WRITE, SEARCH)
- Defines dependencies
EXECUTE NODE - The worker:
- Handles atomic tasks
- Uses specialized agents
- Returns concrete results
AGGREGATOR - The synthesizer:
- Collects all subtask results
- Combines intelligently based on context
- Returns unified output to parent

Benefits

Intelligent Decomposition: Atomizer ensures optimal task breakdown
Parallelization: Independent subtasks run concurrently
Specialization: Right agent for each task type
Clarity: Complex goals become traceable execution paths
Flexibility: Recursive depth adapts to task complexity

🎭 Node Types

PLAN Nodes

Purpose: Decompose complex tasks into subtasks

# PLAN node example
{
  "node_type": "PLAN",
  "goal": "Analyze market trends",
  "sub_graph_id": "subgraph_123",  # Points to child tasks
  "planned_sub_task_ids": ["root.1", "root.2", "root.3"]
}

Characteristics:

Never execute work directly
Create and manage subtasks
Aggregate results from children
Can be nested (plans within plans)

EXECUTE Nodes

Purpose: Perform actual work

# EXECUTE node example
{
  "node_type": "EXECUTE",
  "goal": "Search for latest AI breakthroughs",
  "agent_name": "SearchAgent",
  "result": "Found 15 relevant papers..."
}

Characteristics:

Leaf nodes in the task tree
Use specialized agents
Produce concrete results
Cannot have subtasks
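
The difference between the two node types can be made concrete with a small tree structure. This is a sketch under stated assumptions: the `NodeType` enum, the fields of `TaskNode`, and the `add_child` method are all hypothetical and only illustrate the rule that PLAN nodes own subtasks while EXECUTE nodes are leaves.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class NodeType(Enum):
    PLAN = "PLAN"
    EXECUTE = "EXECUTE"


@dataclass
class TaskNode:
    """A node in the task tree (illustrative fields only)."""
    task_id: str
    goal: str
    node_type: NodeType
    children: List["TaskNode"] = field(default_factory=list)
    result: Optional[str] = None

    def add_child(self, child: "TaskNode") -> None:
        """Attach a subtask; only PLAN nodes may have children."""
        if self.node_type is not NodeType.PLAN:
            raise ValueError("EXECUTE nodes are leaves and cannot have subtasks")
        self.children.append(child)


root = TaskNode("root", "Analyze market trends", NodeType.PLAN)
leaf = TaskNode("root.1", "Search for latest AI breakthroughs", NodeType.EXECUTE)
root.add_child(leaf)

# Trying to decompose a leaf is rejected.
try:
    leaf.add_child(TaskNode("root.1.1", "Extra step", NodeType.EXECUTE))
except ValueError as err:
    print(f"rejected: {err}")
```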

🏷️ Task Types

Task types in ROMA directly map to the MECE framework operations:

1. SEARCH Tasks 🔍

Purpose: Information retrieval - gathering data from external sources

TaskType.SEARCH

Key Characteristics:

Intended to retrieve information from outside the current context
Not intended to create new content
Not intended to analyze or make decisions

Examples:

Web searches for current information
Database queries for specific data
API calls to external services

Typical Agents: Web searchers, database/KB connectors, API integrators

2. WRITE Tasks ✍️

Purpose: Content generation - creating new content or synthesizing existing information

TaskType.WRITE

Key Characteristics:

Creates new content that didn't exist before
Synthesizes information into new forms

Examples:

Report writing and documentation
Code generation and implementation
Formatting and presentation

Typical Agents: Writers, coders, synthesizers, formatters, content creators

3. THINK Tasks 🤔

Purpose: Analysis and reasoning - processing information to make decisions

TaskType.THINK

Key Characteristics:

Analyzes existing information
Makes decisions and evaluations

Examples:

Data analysis and pattern recognition
Strategic planning and decision making
Mathematical reasoning

Typical Agents: Analyzers, reasoners, consistency checkers
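
One way to picture how task types select agents is a simple dispatch table. The sketch below is an assumption-laden illustration: the `TaskType` enum is redefined locally so the example is self-contained, and the three agent functions and the `EXECUTORS` mapping are hypothetical placeholders rather than ROMA's agent registry.

```python
from enum import Enum
from typing import Callable, Dict


class TaskType(Enum):
    SEARCH = "SEARCH"
    WRITE = "WRITE"
    THINK = "THINK"


# Placeholder agents: real ones would call a search API or a language model.
def search_agent(goal: str) -> str:
    return f"[search results for: {goal}]"


def write_agent(goal: str) -> str:
    return f"[draft for: {goal}]"


def think_agent(goal: str) -> str:
    return f"[analysis of: {goal}]"


# Every task type maps to exactly one kind of executor.
EXECUTORS: Dict[TaskType, Callable[[str], str]] = {
    TaskType.SEARCH: search_agent,
    TaskType.WRITE: write_agent,
    TaskType.THINK: think_agent,
}


def run_task(task_type: TaskType, goal: str) -> str:
    """Dispatch a goal to the executor registered for its task type."""
    return EXECUTORS[task_type](goal)


print(run_task(TaskType.SEARCH, "EV market size"))
```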

🤖 Agent System

Agents are the workers that process tasks. Each agent specializes in specific operations.

Agent Roles

1. Atomizer Agents

Purpose: Determine if a task needs decomposition

Input: "Write a blog post about AI"
Output: {
  "is_atomic": False,  # Too complex, needs planning
  "refined_goal": "Write comprehensive blog post about AI developments"
}

2. Planner Agents

Purpose: Decompose complex tasks

Input: "Research and compare cloud providers"
Output: {
  "subtasks": [
    {"goal": "Research AWS features and pricing", "type": "SEARCH"},
    {"goal": "Research Azure features and pricing", "type": "SEARCH"},
    {"goal": "Research GCP features and pricing", "type": "SEARCH"},
    {"goal": "Create comparison matrix", "type": "THINK"},
    {"goal": "Write recommendation report", "type": "WRITE"}
  ]
}

3. Executor Agents

Purpose: Perform actual work

Input: "Search for quantum computing applications"
Output: {
  "result": "Found 5 key applications: cryptography, drug discovery...",
  "sources": ["Nature 2024", "MIT Research"],
  "confidence": 0.88
}

4. Aggregator Agents

Purpose: Combine results from subtasks

Input: [result1, result2, result3]
Output: {
  "summary": "Comprehensive analysis shows...",
  "key_findings": ["Finding 1", "Finding 2"],
  "conclusion": "Based on all research..."
}

🔗 Context Propagation

Context ensures information flows intelligently between tasks through well-defined propagation strategies.

Context Propagation Strategies

1. Parent-to-Child Propagation

Parent goals pass their context to children, ensuring awareness of the broader objective:

{
 "parent_goal": "Write investment report",
 "parent_constraints": ["Focus on 2024 data", "Include ESG factors"],
 "inherited_context": {
   "overall_objective": "Quarterly portfolio review",
   "style_guide": "formal"
 }
}

2. Sibling-to-Sibling Propagation (Dependency-Based)

When a node depends on previous nodes, context flows between siblings:

{
  "dependency_context": {
    "depends_on": ["research_task_1", "research_task_2"],
    "sibling_results": [
      {"task_id": "research_task_1", "output": "Company A analysis..."},
      {"task_id": "research_task_2", "output": "Company B analysis..."}
    ]
  }
}

Root Task (context: user request)
    ↓ (propagates objective + constraints)
Plan Node (context: parent context + planning constraints)
    ↓ (propagates plan + parent context)
Execute Node 1 (context: plan + parent awareness)
    → (provides results to dependent siblings)
Execute Node 2 (context: plan + parent awareness + Node1 results via dependency)
    → (provides results to dependent siblings)
Execute Node 3 (context: plan + parent awareness + Node1,2 results via dependencies)
    ↑ (all results flow up)
Aggregator (context: all child results + original parent context)
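
The flow above can be sketched as a function that assembles the context for one task from its parent and from the siblings it depends on. The `build_context` helper is hypothetical; the dictionary keys reuse the names from the examples in this section, but the exact structure ROMA passes to agents may differ.

```python
from typing import Any, Dict, List


def build_context(
    parent_context: Dict[str, Any],
    depends_on: List[str],
    completed: Dict[str, str],
) -> Dict[str, Any]:
    """Merge lineage context with results of the siblings this task depends on."""
    missing = [task_id for task_id in depends_on if task_id not in completed]
    if missing:
        # A dependent task must wait until its predecessors have finished.
        raise ValueError(f"dependencies not finished: {missing}")
    return {
        "inherited_context": dict(parent_context),
        "dependency_context": {
            "depends_on": list(depends_on),
            "sibling_results": [
                {"task_id": task_id, "output": completed[task_id]}
                for task_id in depends_on
            ],
        },
    }


parent = {"overall_objective": "Quarterly portfolio review", "style_guide": "formal"}
done = {
    "research_task_1": "Company A analysis...",
    "research_task_2": "Company B analysis...",
}
ctx = build_context(parent, ["research_task_1", "research_task_2"], done)
print(ctx["dependency_context"]["sibling_results"][0]["task_id"])  # research_task_1
```

A task with no dependencies simply passes an empty `depends_on` list and receives only the inherited context.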

Context Types

Lineage Context: Information flowing from parent and ancestor tasks

Dependency Context: Results from sibling tasks that current task depends on

Execution Context: Runtime information and system state

User Context: Preferences and constraints from the original request

⚡ Execution Strategies

1. Parallel Execution

Independent tasks run simultaneously:

        [Task A] ──┐
Root ──→ [Task B] ──┼──→ Aggregator
        [Task C] ──┘

2. Sequential Execution

Tasks with dependencies run in order:

Root ──→ [Task A] ──→ [Task B] ──→ [Task C] ──→ Result

Use Case: When each task depends on the previous one

3. Mixed Strategy

Combination of parallel and sequential:

        ┌─→ [Research A] ─┐
Root ──→│                 ├──→ [Analysis] ──→ [Report]
        └─→ [Research B] ─┘
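
The mixed strategy maps naturally onto Python's `asyncio`. The following sketch is illustrative only: `run_task` is a hypothetical stand-in for an executor agent, and the task names come from the diagram above. It is not how ROMA's scheduler is implemented.

```python
import asyncio
from typing import List, Tuple


async def run_task(name: str, inputs: List[str], log: List[str]) -> str:
    """Pretend to run a task; a real executor would call an agent here."""
    await asyncio.sleep(0.01)  # stand-in for agent latency
    log.append(name)           # record completion order
    return f"{name}({', '.join(inputs)})"


async def main() -> Tuple[str, List[str]]:
    log: List[str] = []
    # Parallel stage: the two research tasks do not depend on each other.
    research_a, research_b = await asyncio.gather(
        run_task("Research A", [], log),
        run_task("Research B", [], log),
    )
    # Sequential stage: each task needs the output of the one before it.
    analysis = await run_task("Analysis", [research_a, research_b], log)
    report = await run_task("Report", [analysis], log)
    return report, log


report, log = asyncio.run(main())
print(report)  # Report(Analysis(Research A(), Research B()))
```

`asyncio.gather` returns results in the order the tasks were passed in, so `research_a` and `research_b` stay correctly labeled even if the second one finishes first.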

🎯 Putting It All Together

Here's how these concepts work in practice:

User submits goal → Creates root TaskNode
Atomizer checks complexity → Determines PLAN vs EXECUTE
Planner decomposes → Creates subtask graph
Scheduler activates tasks → Based on dependencies
Executors process → Using specialized agents
Context flows → Between related tasks
Results aggregate → Bottom-up synthesis
Final result emerges → From hierarchical processing
