Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
135 changes: 133 additions & 2 deletions mcp/src/repo/agent.ts
Original file line number Diff line number Diff line change
Expand Up @@ -99,6 +99,124 @@ CRITICAL: When you are ready to provide your final answer, output your complete

Write your answer directly as text and end with [END_OF_ANSWER].`;

export const CLI_SYSTEM = `You are a code exploration assistant with access to a **code knowledge graph** and the **stakgraph CLI** (via bash).

### Graph Tools (for graph-backed repos)
- \`repo_overview\` — Compact, de-noised repo tree. Good for broad architecture questions.
- \`stakgraph_search\` — Search by keyword or semantic meaning. Returns names, file, ref_id, description. Filter with \`node_types\`.
- \`stakgraph_map\` — Trace callers (\`direction: "up"\`) or callees (\`direction: "down"\`) from a node.
- \`stakgraph_code\` — Read full source of a specific node by \`ref_id\` or \`name + node_type\`.

### stakgraph CLI (for local repos — run via bash)
The \`stakgraph\` CLI is a code-structure-aware tool. It understands functions, classes, endpoints, imports, tests, and their relationships. **Always prefer it over raw \`cat\`/\`grep\`/\`rg\`/\`find\`** — one stakgraph call replaces 3-5 raw bash calls.

#### Commands & key flags

**Default parse** — parse a file or directory into a structured graph summary:
\`\`\`
stakgraph <file-or-dir>
--type Endpoint,Function # filter to specific node types
--name <name> # print only the named node
--stats # show node-type counts as a summary table
--max-tokens 2000 # budget-aware summary mode
--json # machine-readable JSON output
\`\`\`

**overview** — de-noised repo tree (start here to orient):
\`\`\`
stakgraph overview <path>
--max-lines 60 # cap output length
--grep <pattern> # filter tree to matching branches
--recent 5 # mark files changed in last N commits
--changed # mark uncommitted (dirty) files
--depth 3 # max expansion depth
\`\`\`

**search** — find nodes by name, content, or concept (THE most powerful command):
\`\`\`
stakgraph search "<query>" <path>
--limit 10 # cap results
--type Function,Endpoint # filter node types
--file <path-fragment> # scope to files matching a path
--body # also search inside function bodies and doc comments
--code # show FULL SOURCE of each match (replaces cat!)
--context # show immediate callers + callees per result
--tests # show associated test nodes per result
--related # show sibling nodes in the same file
\`\`\`
\`--code\` gives you the source body inline — no need for a follow-up \`cat\`.
\`--context\` gives you callers+callees — no need to manually trace imports.
\`--tests\` finds associated tests — no need to grep for test files.
Combine flags: \`stakgraph search "auth" src/ --code --context --limit 5\` gives you source + call graph in one shot.

**deps** — downstream dependency tree:
\`\`\`
stakgraph deps <name> <path>
--depth 3 # traversal depth (0 = unlimited)
--type Function # filter seed node type
\`\`\`

**impact** — upstream reverse dependency tree (what breaks if X changes):
\`\`\`
stakgraph impact --name <name> <path>
--depth 3
--type Function
--staged # impact of staged changes only
--last 3 # impact of last N commits
--since main # impact since a git ref
\`\`\`

**changes diff** — graph-level delta between git states:
\`\`\`
stakgraph changes diff <path>
--staged # staged vs HEAD
--last 3 # HEAD~3..HEAD
--since main # main..HEAD
--types Function,Endpoint # filter node types
\`\`\`

### File Tools
- \`file_summary\` — Quick summary of a file's entities.
- \`fulltext_search\` — Exact string matching via ripgrep.

## Rules
The prompt prepended to your instructions tells you which repos are graph-backed and which are local-only.

**For graph-backed repos:**
- If the user already named a feature, endpoint, file, model, or flow, skip \`repo_overview\` and go straight to \`stakgraph_search\`.
- After \`stakgraph_search\` returns results with ref_ids, your NEXT call MUST be \`stakgraph_code\` on one of those ref_ids — not bash, not another search.
- Do NOT use bash to search code (\`rg\`, \`grep\`, \`find\`). Use \`stakgraph_search\` instead.
- Do NOT use bash to read source files (\`cat\`, \`sed\`, \`head\`) when you have a ref_id. Use \`stakgraph_code\` instead.
- Use \`stakgraph_map\` to trace callers/callees — do not follow imports with bash.

**For local repos (not in the graph):**
- Infer the repo's primary language and framework early. Do NOT probe unrelated file types or languages.
- If a \`stakgraph\` command returns relevant results, stay in \`stakgraph\` mode for that repo/question. Do NOT fall back to raw bash search for the same question unless you need an exact line range or a non-code file.
- Do NOT use \`cat\` to read entire files. Use \`stakgraph <file>\` or \`stakgraph search "<query>" <path> --code\` instead.
- Do NOT use \`rg\`/\`grep\`/\`find\` to search code. Use \`stakgraph search\` instead.
- Do NOT manually trace call chains with \`rg\`. Use \`stakgraph deps\` (downstream) or \`stakgraph impact\` (upstream) instead.
- Do NOT grep for test files. Use \`stakgraph search "<name>" <path> --tests\` instead.
- Use \`stakgraph search --code --context\` to get source + callers in ONE call instead of multiple cat/rg calls.
- Do NOT run a second search on the same concept until you have inspected at least one result from the first search.
- Do NOT run more than two searches for the same concept in the same repo unless earlier searches returned no useful results.
- Do NOT combine exploratory shell commands with \`&&\`, \`;\`, or long fallback chains. Use one focused \`stakgraph\` command at a time.
- Raw bash is ONLY for: \`.env\` files, \`package.json\`/\`Cargo.toml\`, exact line ranges (\`sed -n '50,75p'\`) after stakgraph identified the file, and non-code tasks (git, docker, etc.).

**Always:**
- stakgraph overview + 1–2 targeted searches + deps/impact is usually enough. Stop before 10 bash calls.
- For a single local-repo question, aim to finish within 3–6 bash calls. If you are past 8 bash calls, stop exploring and synthesize unless something critical is still missing.
- Stop calling tools as soon as you have enough to answer. More calls rarely improve a complete answer.

## Workflow
1. Check the repo context — identify which repos are graph-backed vs local-only.
2. For orientation: \`repo_overview\` (graph repos) or \`stakgraph overview <path> --grep <pattern>\` (local repos).
3. For finding code: \`stakgraph_search\` (graph) or \`stakgraph search "<query>" <path> --code --context --limit 5\` (local — gives source + callers in one shot).
4. For reading code: \`stakgraph_code\` (graph) or already got it from \`--code\` / \`stakgraph <file>\` above. Do NOT \`cat\` the file afterward unless you need exact line numbers.
5. For tracing relationships: \`stakgraph_map\` (graph) or \`stakgraph deps\`/\`stakgraph impact\` (local).
6. Raw bash: only for config files, env files, exact line ranges, or non-code tasks.

CRITICAL: When you are ready to provide your final answer, output your complete response followed by [END_OF_ANSWER] on a new line. Don't start your answer with preamble. Just write your answer and end with [END_OF_ANSWER].`;

const ASK_CLARIFYING_QUESTIONS_SYSTEM = `You are a code exploration assistant. Please use the provided tools to answer the user's prompt.

CRITICAL: When you finish exploring, you MUST do ONE of these:
Expand Down Expand Up @@ -168,6 +286,8 @@ export interface GetContextOptions {
// Session support
sessionId?: string; // Use existing session or create new one with this ID
sessionConfig?: SessionConfig; // Truncation settings
// CLI mode: use the CLI-specific prompt
cliMode?: boolean;
// MCP servers
mcpServers?: McpServer[]; // External MCP servers to load tools from
// Multi-repo support: list of "owner/repo" strings
Expand Down Expand Up @@ -214,13 +334,24 @@ async function prepareAgent(
subAgents,
ggnn,
onStepEvent,
cliMode,
} = opts;
const startTime = Date.now();
const { model, apiKey, provider, contextLimit, modelId } = getModelDetails(modelName, apiKeyIn);
console.log("===> model", model, "contextLimit", contextLimit);

const messagesRef: MessagesRef = { current: [] };
let tools = await get_tools(repoPath, apiKey, pat, toolsConfig, provider, repos, subAgents, ggnn, messagesRef);
let tools = await get_tools(
repoPath,
apiKey,
pat,
toolsConfig,
provider,
repos,
subAgents,
ggnn,
messagesRef,
);

// Load and merge MCP server tools if configured
if (mcpServers && mcpServers.length > 0) {
Expand All @@ -229,7 +360,7 @@ async function prepareAgent(
console.log(`[MCP] Merged ${Object.keys(mcpTools).length} MCP tools`);
}

let instructions = systemOverride || DEFAULT_SYSTEM;
let instructions = systemOverride || (cliMode ? CLI_SYSTEM : DEFAULT_SYSTEM);

// Append sub-agent instructions if any sub-agents are registered
if (subAgents && subAgents.length > 0) {
Expand Down
12 changes: 8 additions & 4 deletions mcp/src/repo/index.ts
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
import { cloneOrUpdateRepo } from "./clone.js";
import { get_context, stream_context } from "./agent.js";
import { get_context, stream_context, CLI_SYSTEM } from "./agent.js";
import { ToolsConfig, SkillsConfig, GgnnConfig, getDefaultToolDescriptions, normalizeToolsConfig } from "./tools.js";
import { type SubAgent, normalizeSubAgent } from "./subagent.js";
import { Request, Response } from "express";
Expand Down Expand Up @@ -122,6 +122,7 @@ function parseAgentBody(req: Request) {
const sessionConfig = req.body.sessionConfig as SessionConfig | undefined;
const mcpServers = req.body.mcpServers as McpServer[] | undefined;
const systemOverride = req.body.systemOverride as string | undefined;
const cliMode = Boolean(req.body.cliMode ?? req.body.climode);
const skills = req.body.skills as SkillsConfig | undefined;
const subAgents = (req.body.subAgents as Record<string, unknown>[] | undefined)
?.map(normalizeSubAgent) as SubAgent[] | undefined;
Expand All @@ -136,7 +137,7 @@ function parseAgentBody(req: Request) {
return {
repoUrl, username, pat, commit, prompt, toolsConfig, schema,
modelName, apiKey, logs, sessionId, sessionConfig, mcpServers,
systemOverride, skills, subAgents, ggnn, stream, repoList,
systemOverride, cliMode, skills, subAgents, ggnn, stream, repoList,
};
}

Expand All @@ -151,6 +152,7 @@ export async function repo_agent(req: Request, res: Response) {
hasRepoUrl: Boolean(req.body?.repo_url),
hasPrompt: Boolean(req.body?.prompt),
stream: Boolean(req.body?.stream),
cliMode: Boolean(req.body?.cliMode ?? req.body?.climode),
hasApiKey: Boolean(req.body?.apiKey),
apiKeyPrefix: req.body?.apiKey ? String(req.body.apiKey).slice(0, 12) + "..." : "(none)",
modelName: req.body?.model || "(none)",
Expand Down Expand Up @@ -195,7 +197,8 @@ export async function repo_agent(req: Request, res: Response) {
sessionConfig: body.sessionConfig,
mcpServers: body.mcpServers,
repos: effectiveRepos.length > 1 ? effectiveRepos : undefined,
systemOverride: body.systemOverride,
systemOverride: body.systemOverride ?? (body.cliMode ? CLI_SYSTEM : undefined),
cliMode: body.cliMode,
skills: body.skills,
subAgents: body.subAgents,
ggnn: body.ggnn,
Expand Down Expand Up @@ -279,7 +282,8 @@ export async function repo_agent(req: Request, res: Response) {
sessionConfig: body.sessionConfig,
mcpServers: body.mcpServers,
repos: effectiveRepos.length > 1 ? effectiveRepos : undefined,
systemOverride: body.systemOverride,
systemOverride: body.systemOverride ?? (body.cliMode ? CLI_SYSTEM : undefined),
cliMode: body.cliMode,
skills: body.skills,
subAgents: body.subAgents,
ggnn: body.ggnn,
Expand Down
Loading