System Overview

github-actions[bot] edited this page Mar 21, 2026 · 4 revisions

wikigen is a single-binary CLI tool that automatically generates GitHub Wiki documentation from source code repositories using Claude Code. It orchestrates a two-phase generation process: determining the optimal wiki structure through AI analysis, then generating individual pages in parallel. The tool requires no Docker, Ollama, embedding infrastructure, or additional services—only Go, git, and the Claude CLI.

Project Purpose and Motivation

The wikigen project addresses a fundamental challenge in software documentation: keeping technical documentation synchronized with rapidly evolving source code. Traditional documentation generators either require manual updates or produce low-quality output without deep code understanding. wikigen leverages Claude Code's native tool use (Read, Grep, Glob, Bash) to give an AI agent direct access to repository source files, enabling it to generate contextually accurate, code-grounded documentation without intermediate representation layers like embeddings or retrieval-augmented generation (RAG).

The project was inspired by DeepWiki-Open, an AI-powered wiki generator built on RAG and embeddings. wikigen replaces that approach with Claude Code's direct source access, eliminating external infrastructure dependencies.

How wikigen Differs from Other Documentation Tools

| Aspect | wikigen | Traditional Generators | RAG-Based Generators |
|---|---|---|---|
| Source Access | Direct via Claude Code tools | Parse and index code | Embedded vectors + retrieval |
| Infrastructure | Single binary | Multiple components | Docker, Ollama, vector DB |
| Accuracy | Full source visibility | Limited to parsers | Lossy (context length limits) |
| Cost | Claude API calls | Build time | Infrastructure + API |
| Customization | Via prompts and language flag | Via config files | Via RAG tuning |
| Speed | Parallel processing | Sequential | Parallel retrieval |

wikigen's distinguishing feature is source-grounded generation: every wiki page is written by Claude Code with full access to the actual repository code, which reduces the risk of hallucinated details and keeps the output anchored to what the code actually does.

Technical Stack

wikigen is implemented in Go 1.22+ with minimal dependencies:

┌─ CLI Layer (flag.FlagSet)
├─ Repository Management (git clone, local directories)
├─ Claude Code Integration (exec.Command with --add-dir)
├─ Parallel Processing (sync.WaitGroup, semaphores)
├─ XML Parsing (strings, custom tag extraction)
├─ Progress Tracking (atomic counters, thread-safe)
└─ Output (Markdown files, JSON serialization)

Key Go Standard Library Usage:

  • encoding/json: Structure and result serialization
  • os/exec: Invoking Claude CLI subprocess
  • sync: WaitGroup and Mutex for concurrency
  • sync/atomic: Lock-free progress counter updates
  • flag: Command-line argument parsing
  • strings, filepath: Text and path manipulation

Sources: main.go:1-17, main.go:916-974

Core Architecture

wikigen operates in two distinct phases:

Phase 1: Structure Determination

graph TD
    A["Clone Repos / Use Local Dir"] -->|Pass to Claude| B["Structure Prompt"]
    B -->|claude -p --add-dir| C["Claude Code + Tools"]
    C -->|Uses: Read, Grep, Glob, Bash| D["Analyze Codebase"]
    D -->|Return XML| E["Parse Pages"]
    E -->|Extract Title, Filename, Description| F["WikiPage List"]
    F --> G["Write Home.md & _Sidebar.md"]
    G -->|Proceed to Phase 2| H["Page Generation"]

In Phase 1, the system invokes Claude Code with a structured prompt asking it to analyze the repository and determine what documentation pages are needed. Claude Code uses its native tools to read source files, search for patterns, discover files, and run shell commands. It returns an XML structure defining all wiki pages.
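
The "Parse Pages" step can be pictured as plain string scanning rather than a full XML parser. The sketch below is illustrative only: the tag names (`<page>`, `<title>`, `<filename>`, `<description>`) and the helpers `extractTag` and `parsePages` are assumptions, not names confirmed by main.go.

```go
package main

import (
	"fmt"
	"strings"
)

// WikiPage holds the three fields Phase 1 is described as extracting.
type WikiPage struct {
	Title       string
	Filename    string
	Description string
}

// extractTag returns the trimmed text between the first <tag> and </tag>
// in s, or "" when the pair is missing. Hypothetical helper.
func extractTag(s, tag string) string {
	openTag, closeTag := "<"+tag+">", "</"+tag+">"
	start := strings.Index(s, openTag)
	if start < 0 {
		return ""
	}
	start += len(openTag)
	end := strings.Index(s[start:], closeTag)
	if end < 0 {
		return ""
	}
	return strings.TrimSpace(s[start : start+end])
}

// parsePages walks the response one <page>...</page> block at a time.
// The tag names are assumed for illustration.
func parsePages(response string) []WikiPage {
	var pages []WikiPage
	rest := response
	for {
		start := strings.Index(rest, "<page>")
		if start < 0 {
			break
		}
		end := strings.Index(rest[start:], "</page>")
		if end < 0 {
			break
		}
		block := rest[start : start+end]
		pages = append(pages, WikiPage{
			Title:       extractTag(block, "title"),
			Filename:    extractTag(block, "filename"),
			Description: extractTag(block, "description"),
		})
		rest = rest[start+end+len("</page>"):]
	}
	return pages
}

func main() {
	response := `<pages>
  <page><title>System Overview</title><filename>System-Overview.md</filename><description>High-level tour</description></page>
  <page><title>Architecture</title><filename>Architecture.md</filename><description>Components</description></page>
</pages>`
	for _, p := range parsePages(response) {
		fmt.Printf("%s -> %s\n", p.Title, p.Filename)
	}
}
```

A scanner like this tolerates stray text around the XML, which is one plausible reason to prefer it over a strict parser when the input comes from a language model.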

Phase 2: Parallel Page Generation

graph TD
    A["WikiPage List"] --> B["Distribute to Workers"]
    B -->|Per-page prompt| C["Worker 1"]
    B -->|Per-page prompt| D["Worker 2"]
    B -->|Per-page prompt| E["Worker N"]
    C -->|claudeCall| F["Generate Page 1"]
    D -->|claudeCall| G["Generate Page 2"]
    E -->|claudeCall| H["Generate Page N"]
    F -->|Write .md| I["Collect Results"]
    G -->|Write .md| I
    H -->|Write .md| I
    I --> J["Error Logging & Results"]

In Phase 2, each page is generated independently by a separate Claude Code invocation with a page-specific prompt. Up to N pages are processed concurrently, where N is set with the -pp flag. A page that fails is attempted up to 3 times in total before it is recorded as failed.

Sources: main.go:495-667, main.go:558-646

Key Data Structures

WikiPage

Represents a single documentation page to be generated.

type WikiPage struct {
    ID          string
    Title       string
    Filename    string
    Description string
    Content     string
}

Sources: main.go:62-68

RepoEntry

Represents a repository to be documented, supporting both remote and local repositories.

type RepoEntry struct {
    Project  string  // group name (empty = standalone)
    Repo     string  // owner/repo or display name for local
    LocalDir string  // local directory path (empty = remote repo)
}

Sources: main.go:72-76

Progress

Thread-safe progress tracking with atomic counters.

type Progress struct {
    mu         sync.Mutex
    totalItems int
    doneItems  int32
    current    map[string]string
}

This structure uses sync.Mutex for critical sections and atomic.AddInt32 for lock-free counter updates, allowing efficient progress updates without blocking concurrent generation.
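
As a rough illustration of that split, the sketch below pairs the struct with three small methods. The method names (`markDone`, `setCurrent`, `done`) are hypothetical and are not taken from main.go; only the field layout comes from this page.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Progress repeats the field layout shown above.
type Progress struct {
	mu         sync.Mutex
	totalItems int
	doneItems  int32
	current    map[string]string
}

// markDone bumps the counter without taking the mutex.
func (p *Progress) markDone() int32 {
	return atomic.AddInt32(&p.doneItems, 1)
}

// setCurrent records the task a project is working on. Maps are not safe
// for concurrent writes, so this path takes the mutex.
func (p *Progress) setCurrent(project, task string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.current[project] = task
}

// done reads the counter atomically.
func (p *Progress) done() int32 {
	return atomic.LoadInt32(&p.doneItems)
}

func main() {
	p := &Progress{totalItems: 100, current: map[string]string{}}
	var wg sync.WaitGroup
	for i := 0; i < p.totalItems; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			p.setCurrent("demo", fmt.Sprintf("page-%d", n))
			p.markDone()
		}(i)
	}
	wg.Wait()
	fmt.Printf("%d/%d done\n", p.done(), p.totalItems) // prints "100/100 done"
}
```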

Sources: main.go:21-26

WikiResult

JSON-serializable result structure for each wiki generation task.

type WikiResult struct {
    Project    string           `json:"project"`
    Repos      []string         `json:"repos"`
    OutputDir  string           `json:"output_dir"`
    Pages      []WikiPageResult `json:"pages"`
    TotalPages int              `json:"total_pages"`
    Failed     int              `json:"failed"`
    Duration   string           `json:"duration"`
    Status     string           `json:"status"`
}

Sources: main.go:117-126
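
To show how the struct tags shape the serialized result, here is a small round-trip sketch. `WikiPageResult` is not shown on this page, so its fields below are invented for illustration, as are the helpers `encodeResult` and `decodeResult` and all sample values.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// WikiPageResult is NOT documented on this page; these fields are hypothetical.
type WikiPageResult struct {
	Title    string `json:"title"`
	Filename string `json:"filename"`
	Status   string `json:"status"`
}

// WikiResult matches the definition shown above.
type WikiResult struct {
	Project    string           `json:"project"`
	Repos      []string         `json:"repos"`
	OutputDir  string           `json:"output_dir"`
	Pages      []WikiPageResult `json:"pages"`
	TotalPages int              `json:"total_pages"`
	Failed     int              `json:"failed"`
	Duration   string           `json:"duration"`
	Status     string           `json:"status"`
}

// encodeResult renders a result as indented JSON.
func encodeResult(r WikiResult) (string, error) {
	b, err := json.MarshalIndent(r, "", "  ")
	return string(b), err
}

// decodeResult parses JSON produced by encodeResult.
func decodeResult(s string) (WikiResult, error) {
	var r WikiResult
	err := json.Unmarshal([]byte(s), &r)
	return r, err
}

func main() {
	out, err := encodeResult(WikiResult{
		Project:    "demo",
		Repos:      []string{"owner/repo"},
		OutputDir:  "wiki-output/demo",
		Pages:      []WikiPageResult{{Title: "Home", Filename: "Home.md", Status: "ok"}},
		TotalPages: 1,
		Duration:   "1m2s",
		Status:     "success",
	})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out)
}
```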

Claude Code Integration

wikigen invokes Claude Code via subprocess with the claudeCall function:

func claudeCall(claudePath, model string, repoDirs []string, systemPrompt, prompt, workDir string) (string, error) {
    args := []string{"-p", "--output-format", "text", "--dangerously-skip-permissions"}
    if model != "" {
        args = append(args, "--model", model)
    }
    for _, dir := range repoDirs {
        args = append(args, "--add-dir", dir)
    }
    if systemPrompt != "" {
        args = append(args, "--system-prompt", systemPrompt)
    }

    cmd := exec.Command(claudePath, args...)
    cmd.Stdin = strings.NewReader(prompt)
    // ... execution and output handling
}

Key aspects:

  • --add-dir: Passes each repository directory so Claude Code can access files
  • --output-format text: Expects plain text output (XML for structure, Markdown for pages)
  • --dangerously-skip-permissions: Bypasses Claude Code's interactive permission prompts so tool calls can run unattended in a subprocess
  • stdin: Passes the generation prompt as standard input
  • stdout: Receives the generated content or XML structure
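
The stdin/stdout hand-off can be sketched with the standard library alone. This is not the elided body of claudeCall; it is a generic helper (hypothetical name `runWithStdin`) that assumes stdout should be captured in memory and stderr folded into the returned error. The demo uses `cat` as a stand-in for the Claude CLI, which assumes a Unix-like system.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
	"strings"
)

// runWithStdin starts bin with args, writes input to its stdin, and returns
// whatever the process printed on stdout. On failure, stderr is attached to
// the error so the caller can log it.
func runWithStdin(bin string, args []string, input string) (string, error) {
	cmd := exec.Command(bin, args...)
	cmd.Stdin = strings.NewReader(input)

	var stdout, stderr bytes.Buffer
	cmd.Stdout = &stdout
	cmd.Stderr = &stderr

	if err := cmd.Run(); err != nil {
		return "", fmt.Errorf("%s failed: %w: %s", bin, err, strings.TrimSpace(stderr.String()))
	}
	return stdout.String(), nil
}

func main() {
	// "cat" simply echoes stdin, standing in for a real CLI invocation.
	out, err := runWithStdin("cat", nil, "Generate the System Overview page")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Passing the prompt on stdin rather than as an argument avoids argument-length limits and shell quoting concerns for long prompts.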

Sources: main.go:137-164

Parallel Processing Strategy

wikigen uses two levels of parallelism:

  1. Repository-Level Parallelism (-p flag): Multiple projects/wikis processed simultaneously via semaphore channel
  2. Page-Level Parallelism (-pp flag): Multiple pages within each wiki generated concurrently

graph TD
    A["Task Queue"] -->|Semaphore Sem1| B["Project 1"]
    A -->|Semaphore Sem1| C["Project 2"]
    A -->|Semaphore Sem1| D["Project N"]
    B -->|Semaphore Sem2| E["Page 1"]
    B -->|Semaphore Sem2| F["Page 2"]
    B -->|Semaphore Sem2| G["Page M"]
    E --> H["Claude Code"]
    F --> H
    G --> H
    H --> I["File System Write"]

Orchestration Pattern:

  • Outer WaitGroup for projects
  • Semaphore channel with capacity N (repository parallelism)
  • Inner semaphore for pages per project
  • Atomic counters for progress updates

Sources: main.go:1063-1093, main.go:595-646

Input Processing and Validation

Repository Format Support

wikigen supports multiple input formats:

# Standalone repositories (one wiki per repo)
owner/repo
/path/to/local/repo

# Grouped repositories (multiple repos in one wiki)
myproject:owner/frontend
myproject:owner/backend
myproject:/path/to/shared
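
One way these formats could map onto RepoEntry is sketched below. The parsing rules are assumptions for illustration: local paths are recognized by a leading `/`, `./`, or `../`, the display name of a local repository is taken from the last path element, and blank or `#` lines are skipped. The helpers `isLocalPath` and `parseRepoLine` are hypothetical.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// RepoEntry matches the structure described under Key Data Structures.
type RepoEntry struct {
	Project  string
	Repo     string
	LocalDir string
}

// isLocalPath reports whether s looks like a filesystem path (assumed rule).
func isLocalPath(s string) bool {
	return strings.HasPrefix(s, "/") || strings.HasPrefix(s, "./") || strings.HasPrefix(s, "../")
}

// parseRepoLine turns one input line into a RepoEntry. The boolean is false
// for blank lines and comments.
func parseRepoLine(line string) (RepoEntry, bool) {
	line = strings.TrimSpace(line)
	if line == "" || strings.HasPrefix(line, "#") {
		return RepoEntry{}, false
	}

	var entry RepoEntry
	target := line
	// Only look for a "project:" prefix when the line is not itself a path.
	if !isLocalPath(line) {
		if i := strings.Index(line, ":"); i > 0 {
			entry.Project = line[:i]
			target = line[i+1:]
		}
	}

	if isLocalPath(target) {
		entry.LocalDir = target
		entry.Repo = filepath.Base(target) // display name for a local repo
	} else {
		entry.Repo = target
	}
	return entry, true
}

func main() {
	lines := []string{
		"# Standalone",
		"owner/repo",
		"/path/to/local/repo",
		"myproject:owner/frontend",
		"myproject:/path/to/shared",
	}
	for _, l := range lines {
		if e, ok := parseRepoLine(l); ok {
			fmt.Printf("%+v\n", e)
		}
	}
}
```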

Input Validation Rules

wikigen validates all inputs to prevent injection attacks:

var validRepoPattern = regexp.MustCompile(`^[a-zA-Z0-9._-]+/[a-zA-Z0-9._-]+$`)

func validateRepo(repo string) error {
    if !validRepoPattern.MatchString(repo) {
        return fmt.Errorf("invalid repo format: %q (expected owner/repo)", repo)
    }
    if strings.Contains(repo, "..") {
        return fmt.Errorf("path traversal detected in repo: %q", repo)
    }
    if strings.ContainsAny(repo, ";&|`$(){}[]!~") {
        return fmt.Errorf("invalid characters in repo: %q", repo)
    }
    return nil
}

  • Format: Must match the owner/repo pattern; each part may contain only alphanumeric characters, dots, underscores, and hyphens
  • Path Traversal: Rejects .. sequences
  • Shell Injection: Blocks shell metacharacters (;, &, |, backtick, $, parentheses, etc.)

Sources: main.go:80-102, main.go:104-113

Output Structure

wikigen generates GitHub Wiki-compatible Markdown files:

wiki-output/
  {project}/
    Home.md              ← Landing page with table of contents
    _Sidebar.md          ← Navigation sidebar
    System-Overview.md
    Architecture.md
    ...other pages...
    _errors.log          ← Created only if errors occurred

Each page is a standalone .md file with GitHub Wiki link format.
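
As an illustration of how a sidebar might be assembled, the sketch below assumes the `[[Title|Page-Name]]` link form that GitHub Wikis accept, with the page name derived from the filename minus `.md`. Whether wikigen emits this form or standard Markdown links is not stated here, and `wikiLink` and `buildSidebar` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"strings"
)

// WikiPage carries only the fields this sketch needs.
type WikiPage struct {
	Title    string
	Filename string
}

// wikiLink formats a page as a GitHub Wiki link: [[Title|Page-Name]].
func wikiLink(p WikiPage) string {
	target := strings.TrimSuffix(p.Filename, ".md")
	return "[[" + p.Title + "|" + target + "]]"
}

// buildSidebar returns the body of a _Sidebar.md file as a bullet list,
// with Home always listed first.
func buildSidebar(pages []WikiPage) string {
	var b strings.Builder
	b.WriteString("* [[Home]]\n")
	for _, p := range pages {
		b.WriteString("* " + wikiLink(p) + "\n")
	}
	return b.String()
}

func main() {
	pages := []WikiPage{
		{Title: "System Overview", Filename: "System-Overview.md"},
		{Title: "Architecture", Filename: "Architecture.md"},
	}
	fmt.Print(buildSidebar(pages))
}
```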

Sources: main.go:576-577, main.go:483-491

Error Handling and Retry Mechanism

Automatic Retry Strategy

Each page generation is attempted up to 3 times in total:

maxRetries := 3
for attempt := 1; attempt <= maxRetries; attempt++ {
    os.Remove(filename)  // Clear previous attempt
    _, err := claudeCall(...)
    if err != nil {
        continue
    }
    // Verify file was written and meets minimum size
    written, readErr := os.ReadFile(filename)
    if readErr == nil && len(written) > 100 {
        success = true
        break
    }
}

Retry conditions:

  • Claude Code process error
  • Missing output file after execution
  • Output file of 100 bytes or fewer (considered too small/incomplete)
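
The three retry conditions above can be exercised without Claude by swapping the generator for a stub. In the sketch below, `generateWithRetry` and `demo` are hypothetical names; the stub writes a 10-byte file on its first attempt and a 500-byte file afterwards, so the second attempt is expected to succeed.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

const minBytes = 100 // outputs of this size or smaller count as incomplete

// generateWithRetry runs generate up to maxAttempts times. An attempt counts
// as successful only if generate returns nil, the file at path exists, and
// it is larger than minBytes. It returns the attempt number that succeeded.
func generateWithRetry(path string, maxAttempts int, generate func(attempt int) error) (int, error) {
	lastErr := errors.New("no attempts made")
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		os.Remove(path) // clear any previous attempt; a missing file is fine
		if err := generate(attempt); err != nil {
			lastErr = err
			continue
		}
		data, err := os.ReadFile(path)
		if err != nil {
			lastErr = fmt.Errorf("output missing: %w", err)
			continue
		}
		if len(data) <= minBytes {
			lastErr = fmt.Errorf("output too small: %d bytes", len(data))
			continue
		}
		return attempt, nil
	}
	return maxAttempts, lastErr
}

// demo simulates a generator whose first output is too short.
func demo() (int, error) {
	dir, err := os.MkdirTemp("", "wikigen-retry")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "Page.md")
	return generateWithRetry(path, 3, func(attempt int) error {
		size := 10
		if attempt >= 2 {
			size = 500
		}
		return os.WriteFile(path, bytes.Repeat([]byte("x"), size), 0o644)
	})
}

func main() {
	attempt, err := demo()
	fmt.Println(attempt, err) // prints "2 <nil>"
}
```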

Error Logging

Failed pages are logged to _errors.log, each entry prefixed with a time-of-day timestamp (HH:MM:SS):

func appendError(dir, msg string) {
    errFile := filepath.Join(dir, "_errors.log")
    f, err := os.OpenFile(errFile, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0644)
    if err != nil {
        return
    }
    defer f.Close()
    fmt.Fprintf(f, "[%s] %s\n", time.Now().Format("15:04:05"), msg)
}

Sources: main.go:611-641, main.go:483-491

Generation Prompts

Structure Determination Prompt

wikigen provides Claude Code with a structured prompt that:

  • Lists documentation categories (System Overview, Architecture, API, etc.)
  • Instructs Claude to analyze ONLY what exists in code
  • Requests XML output with page definitions
  • Emphasizes facts over speculation

A separate system prompt constrains the response to raw XML:

const xmlSystemPrompt = `CRITICAL INSTRUCTIONS FOR XML RESPONSES:
When the user requests XML output:
1. Return ONLY the raw XML - no markdown code fences, no backticks, no explanation
2. Do NOT wrap XML in triple backticks or markdown code blocks
3. Do NOT add any text before or after the XML
4. Start directly with the opening XML tag and end with the closing XML tag
5. Ensure the XML is well-formed and valid`

Sources: main.go:196-202, main.go:204-294

Page Generation Prompt

Each page receives:

  • Project name and repository list
  • Page title and description
  • Complete list of other pages for cross-linking
  • Instructions to use Read, Grep, Glob, Bash tools
  • Output format requirements (Markdown, citations, code snippets, diagrams)
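
A prompt with these ingredients could be assembled with strings.Builder, as in the sketch below. The wording of the prompt, the function names `otherPages` and `buildPagePrompt`, and the sample data are all invented for illustration; the actual prompt text lives in main.go and is not reproduced here.

```go
package main

import (
	"fmt"
	"strings"
)

// WikiPage carries the fields the prompt needs.
type WikiPage struct {
	Title       string
	Filename    string
	Description string
}

// otherPages returns every page except the one being generated, so the
// prompt can offer them as cross-link targets.
func otherPages(current WikiPage, all []WikiPage) []WikiPage {
	var out []WikiPage
	for _, p := range all {
		if p.Filename != current.Filename {
			out = append(out, p)
		}
	}
	return out
}

// buildPagePrompt assembles an illustrative per-page prompt.
func buildPagePrompt(project string, repos []string, page WikiPage, all []WikiPage) string {
	var b strings.Builder
	fmt.Fprintf(&b, "Project: %s\n", project)
	fmt.Fprintf(&b, "Repositories: %s\n", strings.Join(repos, ", "))
	fmt.Fprintf(&b, "Page: %s\nDescription: %s\n", page.Title, page.Description)
	b.WriteString("Other pages (for cross-links):\n")
	for _, p := range otherPages(page, all) {
		fmt.Fprintf(&b, "- %s (%s)\n", p.Title, p.Filename)
	}
	b.WriteString("Use the Read, Grep, Glob, and Bash tools to inspect the source.\n")
	b.WriteString("Write Markdown with source citations, code snippets, and diagrams.\n")
	return b.String()
}

func main() {
	all := []WikiPage{
		{Title: "System Overview", Filename: "System-Overview.md", Description: "High-level tour"},
		{Title: "Architecture", Filename: "Architecture.md", Description: "Components and data flow"},
	}
	fmt.Print(buildPagePrompt("demo", []string{"owner/repo"}, all[0], all))
}
```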

Sources: main.go:296-383

Command-Line Interface

wikigen accepts both environment variables and CLI flags, with CLI flags taking precedence:

# Single repo
./wikigen owner/repo

# Multiple repos
./wikigen owner/repo1 owner/repo2

# From file with parallelism
./wikigen -f repos.txt -p 2 -pp 5

# Dry run (structure only)
./wikigen -dry-run owner/repo

# JSON output
./wikigen -json owner/repo

# Retry failed pages
./wikigen -retry

# Specify model and language
./wikigen -model haiku -lang en owner/repo

Configuration is loaded from .env and .env.local files before CLI parsing.
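
The precedence rule can be sketched by using environment values as flag defaults, so that an explicit flag always overrides them. The environment variable names (`WIKIGEN_MODEL`, `WIKIGEN_LANG`), the `Config` type, and the default language `en` are assumptions for this example. The sketch also assumes any .env files have already been loaded into the environment.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// Config holds the two settings used in this sketch.
type Config struct {
	Model string
	Lang  string
}

// orDefault returns value unless it is empty.
func orDefault(value, fallback string) string {
	if value != "" {
		return value
	}
	return fallback
}

// parseConfig reads defaults through getenv, then lets flags in args
// override them. It returns the config and the remaining positional args.
func parseConfig(args []string, getenv func(string) string) (Config, []string, error) {
	fs := flag.NewFlagSet("wikigen", flag.ContinueOnError)
	var cfg Config
	// Environment values become flag defaults (variable names are hypothetical).
	fs.StringVar(&cfg.Model, "model", getenv("WIKIGEN_MODEL"), "Claude model to use")
	fs.StringVar(&cfg.Lang, "lang", orDefault(getenv("WIKIGEN_LANG"), "en"), "output language")
	if err := fs.Parse(args); err != nil {
		return Config{}, nil, err
	}
	return cfg, fs.Args(), nil
}

func main() {
	cfg, repos, err := parseConfig(os.Args[1:], os.Getenv)
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("model=%q lang=%q repos=%v\n", cfg.Model, cfg.Lang, repos)
}
```

Injecting `getenv` as a parameter keeps the function testable without mutating the process environment.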

Sources: main.go:916-974

Progress Tracking

wikigen displays real-time progress with percentage completion and current task:

[1/3 33%] project1 📝 5/20 (25%) API-Specification | project2 📥 cloning...

Progress updates use:

  • Atomic counters for lock-free updates
  • Mutex-protected current task map
  • ANSI escape codes for terminal clearing
  • Real-time percentage calculation
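
A status line of this shape could be rendered as in the sketch below. The helpers `percent` and `renderLine` are hypothetical, and the per-project task strings are passed in already formatted. The `\r\033[K` prefix returns the cursor to the start of the line and erases to its end, so each update overwrites the previous one.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// percent returns done/total as a whole-number percentage, rounding down.
func percent(done, total int) int {
	if total <= 0 {
		return 0
	}
	return done * 100 / total
}

// renderLine builds one status line. Project names are sorted so the output
// is stable even though Go map iteration order is not.
func renderLine(doneProjects, totalProjects int, current map[string]string) string {
	names := make([]string, 0, len(current))
	for name := range current {
		names = append(names, name)
	}
	sort.Strings(names)

	parts := make([]string, 0, len(names))
	for _, name := range names {
		parts = append(parts, name+" "+current[name])
	}
	return fmt.Sprintf("\r\033[K[%d/%d %d%%] %s",
		doneProjects, totalProjects, percent(doneProjects, totalProjects),
		strings.Join(parts, " | "))
}

func main() {
	current := map[string]string{
		"project1": fmt.Sprintf("5/20 (%d%%) API-Specification", percent(5, 20)),
		"project2": "cloning...",
	}
	fmt.Print(renderLine(1, 3, current))
	fmt.Println()
}
```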

Sources: main.go:19-58
