System Overview
wikigen is a single-binary CLI tool that automatically generates GitHub Wiki documentation from source code repositories using Claude Code. It orchestrates a two-phase generation process: determining the optimal wiki structure through AI analysis, then generating individual pages in parallel. The tool requires no Docker, Ollama, embedding infrastructure, or additional services—only Go, git, and the Claude CLI.
The wikigen project addresses a fundamental challenge in software documentation: keeping technical documentation synchronized with rapidly evolving source code. Traditional documentation generators either require manual updates or produce low-quality output without deep code understanding. wikigen leverages Claude Code's native tool use (Read, Grep, Glob, Bash) to give an AI agent direct access to repository source files, enabling it to generate contextually accurate, code-grounded documentation without intermediate representation layers like embeddings or retrieval-augmented generation (RAG).
The project was inspired by DeepWiki-Open, an AI-powered wiki generator that uses RAG and embeddings. wikigen replaces that approach with Claude Code's direct source access, eliminating external infrastructure dependencies.
| Aspect | wikigen | Traditional Generators | RAG-Based Generators |
|---|---|---|---|
| Source Access | Direct via Claude Code tools | Parse and index code | Embedded vectors + retrieval |
| Infrastructure | Single binary | Multiple components | Docker, Ollama, vector DB |
| Accuracy | Full source visibility | Limited to parsers | Lossy (context length limits) |
| Cost | Claude API calls | Build time | Infrastructure + API |
| Customization | Via prompts and language flag | Via config files | Via RAG tuning |
| Speed | Parallel processing | Sequential | Parallel retrieval |
wikigen's distinguishing feature is source-grounded generation: every wiki page is written by Claude Code with full access to the actual repository code, which reduces hallucinations and keeps each page tied to what the code actually does.
wikigen is implemented in Go 1.22+ with minimal dependencies:
┌─ CLI Layer (flag.FlagSet)
├─ Repository Management (git clone, local directories)
├─ Claude Code Integration (exec.Command with --add-dir)
├─ Parallel Processing (sync.WaitGroup, semaphores)
├─ XML Parsing (strings, custom tag extraction)
├─ Progress Tracking (atomic counters, thread-safe)
└─ Output (Markdown files, JSON serialization)
Key Go Standard Library Usage:
- encoding/json: Structure and result serialization
- os/exec: Invoking Claude CLI subprocess
- sync: WaitGroup and Mutex for concurrency
- sync/atomic: Lock-free progress counter updates
- flag: Command-line argument parsing
- strings, path/filepath: Text and path manipulation
Sources: main.go:1-17, main.go:916-974
wikigen operates in two distinct phases:
graph TD
A["Clone Repos / Use Local Dir"] -->|Pass to Claude| B["Structure Prompt"]
B -->|claude -p --add-dir| C["Claude Code + Tools"]
C -->|Uses: Read, Grep, Glob, Bash| D["Analyze Codebase"]
D -->|Return XML| E["Parse Pages"]
E -->|Extract Title, Filename, Description| F["WikiPage List"]
F --> G["Write Home.md & _Sidebar.md"]
G -->|Proceed to Phase 2| H["Page Generation"]
In Phase 1, the system invokes Claude Code with a structured prompt asking it to analyze the repository and determine what documentation pages are needed. Claude Code uses its native tools to read source files, search for patterns, discover files, and run shell commands. It returns an XML structure defining all wiki pages.
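The XML returned in Phase 1 can be turned into a page list with plain string operations. The following is a minimal sketch, assuming hypothetical tag names (`<page>`, `<title>`, `<filename>`, `<description>`); the tags and parsing logic actually used in main.go are not shown in this page and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// WikiPage holds the fields the structure phase is described as extracting.
type WikiPage struct {
	Title       string
	Filename    string
	Description string
}

// extractTag returns the trimmed text between the first <tag> and </tag>
// in s, or "" if the pair is not found. Tag names are assumptions.
func extractTag(s, tag string) string {
	open, end := "<"+tag+">", "</"+tag+">"
	i := strings.Index(s, open)
	if i < 0 {
		return ""
	}
	rest := s[i+len(open):]
	j := strings.Index(rest, end)
	if j < 0 {
		return ""
	}
	return strings.TrimSpace(rest[:j])
}

// parsePages walks every <page>...</page> block and collects its fields.
func parsePages(xml string) []WikiPage {
	var pages []WikiPage
	for {
		i := strings.Index(xml, "<page>")
		if i < 0 {
			break
		}
		j := strings.Index(xml[i:], "</page>")
		if j < 0 {
			break
		}
		block := xml[i : i+j]
		pages = append(pages, WikiPage{
			Title:       extractTag(block, "title"),
			Filename:    extractTag(block, "filename"),
			Description: extractTag(block, "description"),
		})
		xml = xml[i+j+len("</page>"):]
	}
	return pages
}

func main() {
	sample := `<pages>
<page><title>System Overview</title><filename>System-Overview.md</filename><description>High-level summary</description></page>
<page><title>Architecture</title><filename>Architecture.md</filename><description>Design</description></page>
</pages>`
	for _, p := range parsePages(sample) {
		fmt.Printf("%s -> %s\n", p.Title, p.Filename)
	}
}
```

String-based extraction like this tolerates stray text around the XML, which a strict XML decoder would reject.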
graph TD
A["WikiPage List"] --> B["Distribute to Workers"]
B -->|Per-page prompt| C["Worker 1"]
B -->|Per-page prompt| D["Worker 2"]
B -->|Per-page prompt| E["Worker N"]
C -->|claudeCall| F["Generate Page 1"]
D -->|claudeCall| G["Generate Page 2"]
E -->|claudeCall| H["Generate Page N"]
F -->|Write .md| I["Collect Results"]
G -->|Write .md| I
H -->|Write .md| I
I --> J["Error Logging & Results"]
In Phase 2, each page is generated independently by spawning a separate Claude Code invocation with the page-specific prompt. Up to N pages can be processed concurrently (configurable via the -pp flag). Each generation is attempted up to 3 times in total before the page is reported as failed.
Sources: main.go:495-667, main.go:558-646
Represents a single documentation page to be generated.
type WikiPage struct {
ID string
Title string
Filename string
Description string
Content string
}
Sources: main.go:62-68
Represents a repository to be documented, supporting both remote and local repositories.
type RepoEntry struct {
Project string // group name (empty = standalone)
Repo string // owner/repo or display name for local
LocalDir string // local directory path (empty = remote repo)
}
Sources: main.go:72-76
Thread-safe progress tracking with atomic counters.
type Progress struct {
mu sync.Mutex
totalItems int
doneItems int32
current map[string]string
}
This structure uses sync.Mutex to guard the current task map and atomic.AddInt32 for lock-free updates of the doneItems counter, so workers can report progress without blocking concurrent generation.
Sources: main.go:21-26
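The split between the mutex and the atomic counter can be sketched as follows. The struct matches the definition above, but the method names `SetCurrent` and `Done` are hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// Progress follows the struct shown above; the methods are hypothetical.
type Progress struct {
	mu         sync.Mutex
	totalItems int
	doneItems  int32
	current    map[string]string
}

// SetCurrent records what a project is working on. Map writes are not safe
// for concurrent use, so this path takes the mutex.
func (p *Progress) SetCurrent(project, task string) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.current[project] = task
}

// Done marks one item finished without taking the lock and returns the
// new count.
func (p *Progress) Done() int32 {
	return atomic.AddInt32(&p.doneItems, 1)
}

func main() {
	p := &Progress{totalItems: 50, current: make(map[string]string)}
	var wg sync.WaitGroup
	for i := 0; i < p.totalItems; i++ {
		wg.Add(1)
		go func(n int) {
			defer wg.Done()
			p.SetCurrent("demo", fmt.Sprintf("page-%d", n))
			p.Done()
		}(i)
	}
	wg.Wait()
	fmt.Printf("%d/%d done\n", atomic.LoadInt32(&p.doneItems), p.totalItems)
}
```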
JSON-serializable result structure for each wiki generation task.
type WikiResult struct {
Project string `json:"project"`
Repos []string `json:"repos"`
OutputDir string `json:"output_dir"`
Pages []WikiPageResult `json:"pages"`
TotalPages int `json:"total_pages"`
Failed int `json:"failed"`
Duration string `json:"duration"`
Status string `json:"status"`
}
Sources: main.go:117-126
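To show how the struct tags above shape the JSON output, here is a small sketch. `WikiPageResult` is not shown in this page, so its fields below are assumptions, as are the sample values such as the status string `"ok"`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// WikiPageResult is not shown in the source; these fields are assumptions.
type WikiPageResult struct {
	Title  string `json:"title"`
	Status string `json:"status"`
}

// WikiResult matches the definition shown above.
type WikiResult struct {
	Project    string           `json:"project"`
	Repos      []string         `json:"repos"`
	OutputDir  string           `json:"output_dir"`
	Pages      []WikiPageResult `json:"pages"`
	TotalPages int              `json:"total_pages"`
	Failed     int              `json:"failed"`
	Duration   string           `json:"duration"`
	Status     string           `json:"status"`
}

// encodeResult serializes one result; keys follow the struct tags and
// appear in struct field order.
func encodeResult(r WikiResult) (string, error) {
	b, err := json.Marshal(r)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encodeResult(WikiResult{
		Project:    "demo",
		Repos:      []string{"owner/repo"},
		OutputDir:  "wiki-output/demo",
		Pages:      []WikiPageResult{{Title: "Home", Status: "ok"}},
		TotalPages: 1,
		Duration:   "1m2s",
		Status:     "ok", // placeholder value
	})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out)
}
```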
wikigen invokes Claude Code via subprocess with the claudeCall function:
func claudeCall(claudePath, model string, repoDirs []string, systemPrompt, prompt, workDir string) (string, error) {
args := []string{"-p", "--output-format", "text", "--dangerously-skip-permissions"}
if model != "" {
args = append(args, "--model", model)
}
for _, dir := range repoDirs {
args = append(args, "--add-dir", dir)
}
if systemPrompt != "" {
args = append(args, "--system-prompt", systemPrompt)
}
cmd := exec.Command(claudePath, args...)
cmd.Stdin = strings.NewReader(prompt)
// ... execution and output handling
}
Key aspects:
- --add-dir: Passes each repository directory so Claude Code can access its files
- --output-format text: Requests plain-text output (XML for structure, Markdown for pages)
- --dangerously-skip-permissions: Skips Claude Code's permission prompts, so tool calls run without interactive confirmation
- stdin: Passes the generation prompt as standard input
- stdout: Receives the generated content or XML structure
Sources: main.go:137-164
wikigen uses two levels of parallelism:
- Repository-Level Parallelism (-p flag): Multiple projects/wikis processed simultaneously via semaphore channel
- Page-Level Parallelism (-pp flag): Multiple pages within each wiki generated concurrently
graph TD
A["Task Queue"] -->|Semaphore Sem1| B["Project 1"]
A -->|Semaphore Sem1| C["Project 2"]
A -->|Semaphore Sem1| D["Project N"]
B -->|Semaphore Sem2| E["Page 1"]
B -->|Semaphore Sem2| F["Page 2"]
B -->|Semaphore Sem2| G["Page M"]
E --> H["Claude Code"]
F --> H
G --> H
H --> I["File System Write"]
Orchestration Pattern:
- Outer WaitGroup for projects
- Semaphore channel with capacity N (repository parallelism)
- Inner semaphore for pages per project
- Atomic counters for progress updates
Sources: main.go:1063-1093, main.go:595-646
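The orchestration pattern listed above can be sketched with two buffered channels used as semaphores. This is an illustration of the pattern, not the code from main.go; `runAll` and its `work` callback (a stand-in for the per-page Claude Code call) are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runAll processes every page of every project, allowing at most
// projectLimit projects at a time and pageLimit pages per project.
func runAll(projects map[string][]string, projectLimit, pageLimit int, work func(project, page string)) {
	projectSem := make(chan struct{}, projectLimit)
	var outer sync.WaitGroup
	for name, pages := range projects {
		outer.Add(1)
		go func(name string, pages []string) {
			defer outer.Done()
			projectSem <- struct{}{}        // acquire a project slot
			defer func() { <-projectSem }() // release it when the project ends

			pageSem := make(chan struct{}, pageLimit)
			var inner sync.WaitGroup
			for _, page := range pages {
				inner.Add(1)
				go func(page string) {
					defer inner.Done()
					pageSem <- struct{}{} // acquire a page slot
					defer func() { <-pageSem }()
					work(name, page)
				}(page)
			}
			inner.Wait()
		}(name, pages)
	}
	outer.Wait()
}

func main() {
	projects := map[string][]string{
		"project1": {"Home", "Architecture", "API"},
		"project2": {"Home", "CLI"},
	}
	var done int32
	runAll(projects, 2, 2, func(project, page string) {
		atomic.AddInt32(&done, 1) // lock-free progress update
	})
	fmt.Println("pages generated:", atomic.LoadInt32(&done))
}
```

With this layout the number of simultaneous Claude Code processes is bounded by the product of the two limits, which is worth keeping in mind when choosing values for -p and -pp.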
wikigen supports multiple input formats:
# Standalone repositories (one wiki per repo)
owner/repo
/path/to/local/repo
# Grouped repositories (multiple repos in one wiki)
myproject:owner/frontend
myproject:owner/backend
myproject:/path/to/shared
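A parser for these line formats might look like the sketch below. It is a hedged illustration: the local-path heuristic, the use of the directory base name as the display name, and the skipping of `#` comment lines are assumptions, not behavior confirmed by the source.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// RepoEntry matches the struct documented earlier on this page.
type RepoEntry struct {
	Project  string
	Repo     string
	LocalDir string
}

// isLocalPath is a hypothetical heuristic: anything starting with
// "/", "./", "../" or "~" is treated as a local directory.
func isLocalPath(s string) bool {
	return strings.HasPrefix(s, "/") || strings.HasPrefix(s, "./") ||
		strings.HasPrefix(s, "../") || strings.HasPrefix(s, "~")
}

// parseEntry turns one input line into a RepoEntry.
// ok is false for blank lines and comment lines.
func parseEntry(line string) (entry RepoEntry, ok bool) {
	line = strings.TrimSpace(line)
	if line == "" || strings.HasPrefix(line, "#") {
		return RepoEntry{}, false
	}
	target := line
	// A "project:" prefix groups the repository into a shared wiki.
	if i := strings.Index(line, ":"); i > 0 && !isLocalPath(line) {
		entry.Project, target = line[:i], line[i+1:]
	}
	if isLocalPath(target) {
		entry.LocalDir = target
		entry.Repo = filepath.Base(target) // assumed display name
	} else {
		entry.Repo = target
	}
	return entry, true
}

func main() {
	lines := []string{
		"# Standalone repositories",
		"owner/repo",
		"/path/to/local/repo",
		"myproject:owner/frontend",
		"myproject:/path/to/shared",
	}
	for _, l := range lines {
		if e, ok := parseEntry(l); ok {
			fmt.Printf("%+v\n", e)
		}
	}
}
```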
wikigen validates all inputs to prevent injection attacks:
var validRepoPattern = regexp.MustCompile(`^[a-zA-Z0-9._-]+/[a-zA-Z0-9._-]+$`)
func validateRepo(repo string) error {
if !validRepoPattern.MatchString(repo) {
return fmt.Errorf("invalid repo format: %q (expected owner/repo)", repo)
}
if strings.Contains(repo, "..") {
return fmt.Errorf("path traversal detected in repo: %q", repo)
}
if strings.ContainsAny(repo, ";&|`$(){}[]!~") {
return fmt.Errorf("invalid characters in repo: %q", repo)
}
return nil
}
- Format: Must match the owner/repo pattern; each part may contain alphanumeric characters, dots, underscores, and hyphens
- Path Traversal: Rejects .. sequences
- Shell Injection: Blocks shell metacharacters (;, &, |, backtick, $, parentheses, etc.)
Sources: main.go:80-102, main.go:104-113
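A short usage sketch of the validator above shows which inputs pass. The function body is copied from the excerpt; the sample inputs are illustrative only.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var validRepoPattern = regexp.MustCompile(`^[a-zA-Z0-9._-]+/[a-zA-Z0-9._-]+$`)

// validateRepo is reproduced from the excerpt above.
func validateRepo(repo string) error {
	if !validRepoPattern.MatchString(repo) {
		return fmt.Errorf("invalid repo format: %q (expected owner/repo)", repo)
	}
	if strings.Contains(repo, "..") {
		return fmt.Errorf("path traversal detected in repo: %q", repo)
	}
	if strings.ContainsAny(repo, ";&|`$(){}[]!~") {
		return fmt.Errorf("invalid characters in repo: %q", repo)
	}
	return nil
}

func main() {
	samples := []string{"owner/repo", "../..", "owner/repo;ls", "owner/a/b"}
	for _, in := range samples {
		if err := validateRepo(in); err != nil {
			fmt.Printf("%-16q rejected: %v\n", in, err)
		} else {
			fmt.Printf("%-16q accepted\n", in)
		}
	}
}
```

Note that `../..` satisfies the regular expression, because dots are allowed in both parts, so the explicit `..` check is what rejects it. The metacharacter check is defense in depth, since the pattern already excludes those characters.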
wikigen generates GitHub Wiki-compatible Markdown files:
wiki-output/
{project}/
Home.md ← Landing page with table of contents
_Sidebar.md ← Navigation sidebar
System-Overview.md
Architecture.md
...other pages...
_errors.log ← Created only if errors occurred
Each page is a standalone .md file with GitHub Wiki link format.
Sources: main.go:576-577, main.go:483-491
Each page generation is attempted up to 3 times:
maxRetries := 3
for attempt := 1; attempt <= maxRetries; attempt++ {
os.Remove(filename) // Clear previous attempt
_, err := claudeCall(...)
if err != nil {
continue
}
// Verify file was written and meets minimum size
written, readErr := os.ReadFile(filename)
if readErr == nil && len(written) > 100 {
success = true
break
}
}
Retry conditions:
- Claude Code process error
- Missing output file after execution
- Output file of 100 bytes or fewer (considered too small/incomplete)
Failed pages are logged to _errors.log with timestamps:
func appendError(dir, msg string) {
errFile := filepath.Join(dir, "_errors.log")
f, err := os.OpenFile(errFile, os.O_CREATE|os.O_WRONLY|os.O_APPEND, 0644)
if err != nil {
return
}
defer f.Close()
fmt.Fprintf(f, "[%s] %s\n", time.Now().Format("15:04:05"), msg)
}
Sources: main.go:611-641, main.go:483-491
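The retry excerpt above omits its surrounding function. A self-contained sketch of the same idea follows, assuming a hypothetical `generateWithRetry` helper whose `generate` callback stands in for the `claudeCall` invocation; the fake generator in `demo` exists only to exercise each retry condition.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

const minPageBytes = 100 // threshold taken from the excerpt above

// generateWithRetry calls generate up to maxAttempts times and accepts the
// result only if the output file exists and is larger than minPageBytes.
// It returns the number of attempts used.
func generateWithRetry(filename string, maxAttempts int, generate func(attempt int) error) (int, error) {
	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		os.Remove(filename) // clear any output from a previous attempt
		if err := generate(attempt); err != nil {
			lastErr = err
			continue
		}
		written, err := os.ReadFile(filename)
		if err != nil {
			lastErr = err // file missing or unreadable
			continue
		}
		if len(written) > minPageBytes {
			return attempt, nil
		}
		lastErr = fmt.Errorf("output too small: %d bytes", len(written))
	}
	return maxAttempts, fmt.Errorf("giving up after %d attempts: %w", maxAttempts, lastErr)
}

// demo runs the loop against a fake generator that fails once, writes a
// too-small stub once, and then writes a full-size page.
func demo() (int, error) {
	dir, err := os.MkdirTemp("", "wikigen-demo")
	if err != nil {
		return 0, err
	}
	defer os.RemoveAll(dir)
	out := filepath.Join(dir, "Page.md")
	return generateWithRetry(out, 3, func(attempt int) error {
		switch attempt {
		case 1:
			return errors.New("simulated process error")
		case 2:
			return os.WriteFile(out, []byte("stub"), 0o644)
		default:
			return os.WriteFile(out, make([]byte, 500), 0o644)
		}
	})
}

func main() {
	attempts, err := demo()
	fmt.Println("attempts:", attempts, "err:", err)
}
```

Keeping the last error lets a caller pass a specific reason to the error log instead of a generic failure message.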
wikigen provides Claude Code with a structured prompt that:
- Lists documentation categories (System Overview, Architecture, API, etc.)
- Instructs Claude to analyze ONLY what exists in code
- Requests XML output with page definitions
- Emphasizes facts over speculation
The system uses an XML constraint prompt:
const xmlSystemPrompt = `CRITICAL INSTRUCTIONS FOR XML RESPONSES:
When the user requests XML output:
1. Return ONLY the raw XML - no markdown code fences, no backticks, no explanation
2. Do NOT wrap XML in triple backticks or markdown code blocks
3. Do NOT add any text before or after the XML
4. Start directly with the opening XML tag and end with the closing XML tag
5. Ensure the XML is well-formed and valid`
Sources: main.go:196-202, main.go:204-294
Each page receives:
- Project name and repository list
- Page title and description
- Complete list of other pages for cross-linking
- Instructions to use Read, Grep, Glob, Bash tools
- Output format requirements (Markdown, citations, code snippets, diagrams)
Sources: main.go:296-383
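Assembling a per-page prompt from the inputs listed above could look like this sketch. The wording of the prompt and the `buildPagePrompt` name are illustrative assumptions and do not reproduce the prompt in main.go.

```go
package main

import (
	"fmt"
	"strings"
)

// WikiPage carries the fields needed to describe and cross-link a page.
type WikiPage struct {
	Title       string
	Filename    string
	Description string
}

// buildPagePrompt assembles a prompt for one page. The text is
// illustrative and is not the prompt used by wikigen.
func buildPagePrompt(project string, repos []string, page WikiPage, all []WikiPage) string {
	var b strings.Builder
	fmt.Fprintf(&b, "Project: %s\n", project)
	fmt.Fprintf(&b, "Repositories: %s\n\n", strings.Join(repos, ", "))
	fmt.Fprintf(&b, "Write the wiki page %q.\n", page.Title)
	fmt.Fprintf(&b, "Description: %s\n\n", page.Description)
	b.WriteString("Other pages you may link to:\n")
	for _, p := range all {
		if p.Filename == page.Filename {
			continue // do not list the page itself
		}
		fmt.Fprintf(&b, "- %s (%s)\n", p.Title, strings.TrimSuffix(p.Filename, ".md"))
	}
	b.WriteString("\nUse the Read, Grep, Glob, and Bash tools to inspect the source.\n")
	b.WriteString("Output Markdown with source citations, code snippets, and diagrams.\n")
	return b.String()
}

func main() {
	pages := []WikiPage{
		{Title: "Home", Filename: "Home.md", Description: "Landing page"},
		{Title: "API", Filename: "API.md", Description: "Endpoints"},
	}
	fmt.Print(buildPagePrompt("demo", []string{"owner/repo"}, pages[1], pages))
}
```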
wikigen accepts both environment variables and CLI flags, with CLI flags taking precedence:
# Single repo
./wikigen owner/repo
# Multiple repos
./wikigen owner/repo1 owner/repo2
# From file with parallelism
./wikigen -f repos.txt -p 2 -pp 5
# Dry run (structure only)
./wikigen -dry-run owner/repo
# JSON output
./wikigen -json owner/repo
# Retry failed pages
./wikigen -retry
# Specify model and language
./wikigen -model haiku -lang en owner/repo
Configuration is loaded from .env and .env.local files before CLI parsing.
Sources: main.go:916-974
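One common way to give CLI flags precedence over environment variables is to use the environment value as each flag's default. The sketch below illustrates that approach under stated assumptions: the variable names `WIKIGEN_MODEL` and `WIKIGEN_LANG` and the default of 1 for -pp are hypothetical, and .env file loading is left out.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// Config holds a subset of the settings mentioned on this page.
type Config struct {
	Model   string
	Lang    string
	PagePar int
}

// parseConfig uses environment values as flag defaults, so an explicit
// flag always wins. getenv is injected to keep the function testable.
func parseConfig(args []string, getenv func(string) string) (Config, []string, error) {
	fs := flag.NewFlagSet("wikigen", flag.ContinueOnError)
	var cfg Config
	fs.StringVar(&cfg.Model, "model", getenv("WIKIGEN_MODEL"), "Claude model")
	fs.StringVar(&cfg.Lang, "lang", getenv("WIKIGEN_LANG"), "output language")
	fs.IntVar(&cfg.PagePar, "pp", 1, "pages generated in parallel (assumed default)")
	if err := fs.Parse(args); err != nil {
		return Config{}, nil, err
	}
	// Remaining positional arguments are the repositories to document.
	return cfg, fs.Args(), nil
}

func main() {
	cfg, repos, err := parseConfig(os.Args[1:], os.Getenv)
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("model=%q lang=%q pp=%d repos=%v\n", cfg.Model, cfg.Lang, cfg.PagePar, repos)
}
```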
wikigen displays real-time progress with percentage completion and current task:
[1/3 33%] project1 📝 5/20 (25%) API-Specification | project2 📥 cloning...
Progress updates use:
- Atomic counters for lock-free updates
- Mutex-protected current task map
- ANSI escape codes for terminal clearing
- Real-time percentage calculation
Sources: main.go:19-58
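Rendering a status line of this shape can be sketched as follows. The `renderLine` helper, the sorted project order, and the `\r\033[K` sequence used to redraw the line are assumptions for illustration; the exact escape codes used in main.go are not shown here.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// percent returns done/total as a whole percentage, guarding against a
// zero total.
func percent(done, total int) int {
	if total == 0 {
		return 0
	}
	return done * 100 / total
}

// renderLine builds one status line: overall progress followed by each
// project's current task, sorted by name so redraws are stable.
func renderLine(done, total int, current map[string]string) string {
	names := make([]string, 0, len(current))
	for name := range current {
		names = append(names, name)
	}
	sort.Strings(names)
	parts := make([]string, 0, len(names))
	for _, name := range names {
		parts = append(parts, name+" "+current[name])
	}
	return fmt.Sprintf("[%d/%d %d%%] %s", done, total, percent(done, total), strings.Join(parts, " | "))
}

func main() {
	line := renderLine(1, 3, map[string]string{
		"project1": "5/20 (25%) API-Specification",
		"project2": "cloning...",
	})
	// "\r" returns to column 0 and "\033[K" clears to the end of the line
	// before the new text is drawn.
	fmt.Print("\r\033[K" + line + "\n")
}
```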
- Architecture & Design — Detailed system design and generation pipeline
- CLI Usage & Commands — Complete command-line reference
- Configuration & Environment — Environment variables and .env setup
- Input Formats & Repository Configuration — Repository input formats and parsing
- Authentication & Git Integration — SSH and PAT authentication methods
- Output Format & Wiki Structure — GitHub Wiki-compatible output structure
- Error Handling & Retry Mechanism — Error handling and automatic retries
- Parallel Processing & Performance — Concurrency and performance tuning
- Input Validation & Security — Input validation and security measures
- Claude Code Integration — Claude Code invocation and tool use
- Wiki Generation Processing Flow — Two-phase generation process details