github-actions[bot] edited this page Mar 21, 2026 · 1 revision

Input Formats & Repository Configuration

This page documents how wikigen accepts repository inputs, how it parses configuration files, and how it validates and processes repositories for wiki generation. It covers the repos.txt format, standalone versus multi-repository grouped wikis, local directory support, and command-line argument handling.

Overview

wikigen accepts repository inputs through multiple channels: command-line positional arguments, the -f flag (file-based), the -r flag (comma-separated list), and the -local flag (local directory without cloning). The input processing pipeline validates all repositories, determines whether each is local or remote, and coordinates cloning and wiki generation tasks accordingly.

See also: CLI Usage & Commands for complete flag reference, Authentication & Git Integration for cloning mechanisms, and Input Validation & Security for security measures.

Input Methods

wikigen supports four primary input methods:

flowchart TD
    A["Input Decision"] --> B["Positional Args"]
    A --> C["File: -f repos.txt"]
    A --> D["Flag: -r repos"]
    A --> E["Local Dir: -local path"]

    B --> F["Parse as owner/repo"]
    C --> G["Read & Parse File"]
    D --> H["Split by Comma"]
    E --> I["Validate & Use Direct"]

    F --> J["Build Task List"]
    G --> J
    H --> J
    I --> J

    style A fill:#e1f5ff
    style J fill:#c8e6c9

Positional Arguments

When wikigen is invoked with repository identifiers as positional arguments (not preceded by flags), they are collected and processed:

./wikigen owner/repo1 owner/repo2

Multiple positional arguments are combined into a comma-separated list and parsed identically to the -r flag. This is the simplest input method for generating wikis for one or two repositories at a time.

Sources: main.go:954-960

File-Based Input: -f Flag

The -f flag accepts a file path containing one repository specification per line:

./wikigen -f repos.txt

The entire file is read, and each non-empty, non-comment line becomes a repository specification. Lines beginning with # are treated as comments and ignored.

Sources: main.go:984-990
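
The line filtering described above can be sketched in Go as follows. This is a minimal illustration, not the code from main.go: the helper name specLines is hypothetical, and it assumes only what this page states, namely that blank lines and lines beginning with # are skipped. Trailing inline comments are not stripped, because the page does not say how the parser treats them.

```go
package main

import (
	"fmt"
	"strings"
)

// specLines is a hypothetical helper (not taken from main.go). It returns
// every line of content that is non-empty and does not begin with '#',
// trimmed of surrounding whitespace.
func specLines(content string) []string {
	var specs []string
	for _, raw := range strings.Split(content, "\n") {
		line := strings.TrimSpace(raw)
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and comment lines
		}
		specs = append(specs, line)
	}
	return specs
}

func main() {
	content := "# frontend services\nowner/repo1\n\nmyproject:owner/api\n"
	for _, spec := range specLines(content) {
		fmt.Println(spec)
	}
}
```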

Comma-Separated: -r Flag

The -r flag accepts a comma-separated list of repositories:

./wikigen -r "owner/repo1,owner/repo2,project:owner/repo3"

Each comma-separated value is treated as a single repository specification. This method is useful for scripting and one-liner commands but scales poorly beyond a few repositories.

Sources: main.go:942
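
As a rough sketch of how positional arguments and the -r value might converge on the same list, the Go snippet below joins positional arguments with commas and then splits the result. The helper name splitSpecs is an assumption made for illustration, and trimming whitespace around each item is a choice of this sketch rather than documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// splitSpecs is a hypothetical helper (not taken from main.go). It splits a
// comma-separated value, as passed to -r, into trimmed, non-empty items.
func splitSpecs(value string) []string {
	var specs []string
	for _, part := range strings.Split(value, ",") {
		if spec := strings.TrimSpace(part); spec != "" {
			specs = append(specs, spec)
		}
	}
	return specs
}

func main() {
	// Positional arguments are joined with commas, then handled like -r.
	positional := []string{"owner/repo1", "owner/repo2"}
	joined := strings.Join(positional, ",")
	fmt.Println(splitSpecs(joined))

	// The -r example from this section.
	fmt.Println(splitSpecs("owner/repo1,owner/repo2,project:owner/repo3"))
}
```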

Local Directory: -local Flag

The -local flag specifies a local directory to document without cloning:

./wikigen -local /path/to/local/repo

This bypasses git cloning entirely, allowing wikigen to analyze a repository already present on the filesystem. The project name is derived from the directory basename or can be overridden by a positional argument:

./wikigen -local /path/to/repo myproject

Sources: main.go:940, main.go:1006-1012

repos.txt Format and Parsing

The repos.txt file format supports multiple repository specification styles, allowing a single file to describe both standalone and grouped multi-repository wikis.

File Structure

# Comment line — ignored by parser
owner/repo1         # Standalone wiki, remote repository

# Local directories
/absolute/path/repo
./relative/path/repo
~/home/path/repo

# Multi-repo grouped wiki
myproject:owner/frontend
myproject:owner/backend
myproject:owner/shared

# Mixed: grouped wiki with local directory
myproject:/local/path
myproject:owner/remote-repo

Empty lines and lines beginning with # are skipped. Each line represents one repository specification.

Sources: main.go:700-725

Parsing Algorithm

The parseRepoList function processes each line using the following decision tree:

flowchart TD
    A["Line from repos.txt"] --> B["Trim whitespace"]
    B --> C{"Empty or comment?"}
    C -->|Yes| D["Skip"]
    C -->|No| E{"Contains colon?"}

    E -->|Yes| F{"Prefix without '/' and<br/>not a local path?"}
    F -->|Yes| G["Parse as group<br/>project:repo"]
    F -->|No| H["Parse as standalone"]

    E -->|No| I{"Local path?<br/>'/','./','../',<br/>'~/', or<br/>existing dir?"}
    I -->|Yes| J["Parse as<br/>local standalone"]
    I -->|No| K["Parse as<br/>remote standalone"]

    G --> L["Add to 'groups'<br/>map"]
    H --> M["Add to<br/>'standalone'<br/>slice"]
    J --> M
    K --> M

    style A fill:#e1f5ff
    style L fill:#fff9c4
    style M fill:#c8e6c9

The parser distinguishes between three repository types:

  1. Standalone Remote: owner/repo — cloned from GitHub
  2. Standalone Local: /path, ./path, ~/path, or existing directory — used directly
  3. Grouped: project:owner/repo or project:/path — multiple repos merged into one wiki

Sources: main.go:700-725
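
The decision tree above can be approximated in Go as shown below. This is a sketch under stated assumptions, not the actual parseRepoList: the names classify and newEntry are hypothetical, RepoEntry and isLocalPath are reproduced from the listings on this page, and using the directory basename as the display name for a local entry is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// RepoEntry mirrors the struct documented on this page.
type RepoEntry struct {
	Project  string // group name (empty = standalone)
	Repo     string // owner/repo or display name for local
	LocalDir string // local directory path (empty = remote repo)
}

// isLocalPath is reproduced from the listing on this page.
func isLocalPath(s string) bool {
	if strings.HasPrefix(s, "/") || strings.HasPrefix(s, "./") ||
		strings.HasPrefix(s, "../") || strings.HasPrefix(s, "~/") {
		return true
	}
	info, err := os.Stat(s)
	return err == nil && info.IsDir()
}

// newEntry is hypothetical: it fills LocalDir for local specs and uses the
// directory basename as the display name (an assumption of this sketch).
func newEntry(project, spec string) RepoEntry {
	if isLocalPath(spec) {
		return RepoEntry{Project: project, Repo: filepath.Base(spec), LocalDir: spec}
	}
	return RepoEntry{Project: project, Repo: spec}
}

// classify is a hypothetical stand-in for the decision tree: it sorts each
// line into the standalone slice or the groups map.
func classify(lines []string) (standalone []RepoEntry, groups map[string][]RepoEntry) {
	groups = make(map[string][]RepoEntry)
	for _, raw := range lines {
		line := strings.TrimSpace(raw)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		prefix, rest, hasColon := strings.Cut(line, ":")
		// A group needs a colon whose prefix has no '/' and is not a local path.
		if hasColon && !strings.Contains(prefix, "/") && !isLocalPath(prefix) {
			groups[prefix] = append(groups[prefix], newEntry(prefix, rest))
			continue
		}
		standalone = append(standalone, newEntry("", line))
	}
	return standalone, groups
}

func main() {
	standalone, groups := classify([]string{
		"# comment",
		"owner/repo1",
		"./relative/path/repo",
		"myproject:owner/frontend",
		"myproject:/local/path",
	})
	fmt.Printf("standalone: %+v\n", standalone)
	fmt.Printf("groups: %+v\n", groups)
}
```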

Local Path Detection

The isLocalPath function detects local paths by checking for common prefixes or attempting filesystem access:

func isLocalPath(s string) bool {
    if strings.HasPrefix(s, "/") || strings.HasPrefix(s, "./") ||
       strings.HasPrefix(s, "../") || strings.HasPrefix(s, "~/") {
        return true
    }
    // Check if it's an existing directory
    info, err := os.Stat(s)
    return err == nil && info.IsDir()
}

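The prefix check recognizes paths such as ../sibling-repo without touching the filesystem. The os.Stat fallback additionally recognizes bare directory names like my-repo (if the directory exists in the current working directory) without requiring an explicit prefix. A consequence of the fallback is that an owner/repo specification is also treated as local whenever a directory with that relative path exists in the current working directory.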

Sources: main.go:82-89

Repository Entry Structure

Each repository input is parsed into a RepoEntry struct:

type RepoEntry struct {
    Project  string // group name (empty = standalone)
    Repo     string // owner/repo or display name for local
    LocalDir string // local directory path (empty = remote repo)
}
  • Project: Non-empty only for grouped wikis. All entries with the same Project value are merged into one wiki.
  • Repo: Either owner/repo format for remote repositories or a display name for local directories.
  • LocalDir: Non-empty only for local repositories. Contains the absolute or relative path to the directory.

Sources: main.go:72-76

Standalone Wikis

A standalone wiki is generated for each unique repository not assigned to a project group. Each repository produces one wiki output directory.

Format in repos.txt

owner/frontend
owner/backend
owner/docs

Three separate wiki directories are created: wiki-output/frontend, wiki-output/backend, and wiki-output/docs.

Naming

The project name (wiki directory name) is derived from the repository name:

  • owner/my-repo → my-repo
  • /path/to/my-service → my-service

The repository name is extracted from the trailing component of either the owner/repo format or the local directory path.

Sources: main.go:1014-1026
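
A minimal Go sketch of the trailing-component rule follows. The helper name projectName is hypothetical, and the actual derivation in main.go may handle edge cases differently.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
)

// projectName is a hypothetical helper (not taken from main.go). It returns
// the last path component of an owner/repo spec or a local directory path.
func projectName(spec string) string {
	// ToSlash normalizes OS-specific separators so path.Base sees '/' only.
	return path.Base(filepath.ToSlash(spec))
}

func main() {
	for _, spec := range []string{"owner/my-repo", "/path/to/my-service", "./docs/"} {
		fmt.Printf("%s -> %s\n", spec, projectName(spec))
	}
}
```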

Multi-Repository Grouped Wikis

Multiple repositories can be grouped into a single wiki by prefixing them with a project name and colon. This enables cross-repository documentation that spans multiple codebases.

Format in repos.txt

myproject:owner/frontend-repo
myproject:owner/backend-repo
myproject:owner/shared-library

All three repositories are analyzed together and generate a single wiki at wiki-output/myproject/. Pages document the system as a whole, including inter-service interactions and shared components.

Multi-Repo Documentation

When multiple repositories form one project, wikigen creates cross-repository pages showing:

  • System architecture across all repositories
  • How repositories interact with each other (e.g., frontend API calls to backend)
  • Shared data models and protocols
  • Deployment topology

All repositories are passed together to Claude Code during both the structure determination and page generation phases, enabling the AI to understand and document the entire system holistically.

Sources: main.go:1028-1030, main.go:246-250

Mixing Remote and Local in Groups

Groups can mix remote and local repositories:

myproject:owner/frontend
myproject:/local/path/backend
myproject:owner/shared-lib

All three are documented together in a single wiki. Remote repositories are cloned; local ones are used directly.

Local Directory Support

wikigen can analyze local repositories without cloning them. This is useful for documenting unreleased code, private repositories, or any repository already checked out on disk.

Via repos.txt

Include any local path in repos.txt:

/absolute/path/to/repo          # Absolute path
./relative/path                 # Relative to current directory
../sibling-repo                 # Relative with parent traversal
~/myproject                      # Home directory (tilde expansion planned)
localrepo                        # If localrepo/ exists, treated as local

Each path is validated to ensure that it exists and is a directory.
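
The listing above notes that tilde expansion is planned. Go's os.Stat does not expand a leading ~ (that is normally done by the shell), so a literal ~/myproject read from a file needs explicit expansion before it can be checked. One possible way to do that is sketched below. The helper name expandTilde is hypothetical and is not part of wikigen as documented here.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// expandTilde is a hypothetical helper (not part of wikigen as documented).
// It replaces a leading "~/" with the current user's home directory and
// returns every other path unchanged.
func expandTilde(p string) (string, error) {
	if !strings.HasPrefix(p, "~/") {
		return p, nil
	}
	home, err := os.UserHomeDir()
	if err != nil {
		return "", fmt.Errorf("resolve home directory: %w", err)
	}
	return filepath.Join(home, strings.TrimPrefix(p, "~/")), nil
}

func main() {
	for _, p := range []string{"~/myproject", "./relative/path", "/absolute/path"} {
		expanded, err := expandTilde(p)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", p, expanded)
	}
}
```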

Via -local Flag

./wikigen -local /path/to/repo
./wikigen -local ./local-copy myproject

When -local is specified, the directory is used directly without cloning. A project name can optionally be provided as a positional argument; otherwise, it defaults to the directory basename.

Validation

Local directories are validated via validateLocalDir:

func validateLocalDir(dir string) error {
    info, err := os.Stat(dir)
    if err != nil {
        return fmt.Errorf("local directory not found: %q", dir)
    }
    if !info.IsDir() {
        return fmt.Errorf("not a directory: %q", dir)
    }
    return nil
}

If os.Stat fails (for example, because the directory does not exist) or the path is not a directory, an error is returned and generation is skipped for that repository.

Sources: main.go:104-113

Input Validation and Security

All repository inputs are validated before cloning or processing to prevent injection attacks and path traversal.

Remote Repository Validation

Remote repositories (specified as owner/repo) must match the pattern ^[a-zA-Z0-9._-]+/[a-zA-Z0-9._-]+$:

var validRepoPattern = regexp.MustCompile(`^[a-zA-Z0-9._-]+/[a-zA-Z0-9._-]+$`)

func validateRepo(repo string) error {
    if !validRepoPattern.MatchString(repo) {
        return fmt.Errorf("invalid repo format: %q (expected owner/repo)", repo)
    }
    if strings.Contains(repo, "..") {
        return fmt.Errorf("path traversal detected in repo: %q", repo)
    }
    if strings.ContainsAny(repo, ";&|`$(){}[]!~") {
        return fmt.Errorf("invalid characters in repo: %q", repo)
    }
    return nil
}

The validation checks:

  1. Format matches owner/repo, where each part contains only alphanumerics, dots, underscores, and hyphens
  2. No path traversal sequences (..)
  3. No shell injection characters (;, &, |, backtick, $, parentheses, braces, brackets, !, ~)

Sources: main.go:80-102
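
To see how the three checks behave on concrete inputs, the following Go program copies validateRepo from the listing above and runs it over a few sample values. The sample inputs are illustrative only.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var validRepoPattern = regexp.MustCompile(`^[a-zA-Z0-9._-]+/[a-zA-Z0-9._-]+$`)

// validateRepo is copied from the listing on this page.
func validateRepo(repo string) error {
	if !validRepoPattern.MatchString(repo) {
		return fmt.Errorf("invalid repo format: %q (expected owner/repo)", repo)
	}
	if strings.Contains(repo, "..") {
		return fmt.Errorf("path traversal detected in repo: %q", repo)
	}
	if strings.ContainsAny(repo, ";&|`$(){}[]!~") {
		return fmt.Errorf("invalid characters in repo: %q", repo)
	}
	return nil
}

func main() {
	samples := []string{"owner/repo", "my_org/my.repo", "owner/..", "owner/repo;rm", "no-slash"}
	for _, repo := range samples {
		if err := validateRepo(repo); err != nil {
			fmt.Printf("%-16s rejected: %v\n", repo, err)
			continue
		}
		fmt.Printf("%-16s ok\n", repo)
	}
}
```

Note that owner/.. passes the pattern (dots are allowed) and is caught only by the second check. The characters listed in the third check are already excluded by the pattern's character class, so that check acts as a second line of defense.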

Validation Flow

graph TD
    A["Repository Input"] --> B{"Type?"}

    B -->|Remote| C["validateRepo"]
    B -->|Local| D["validateLocalDir"]

    C --> E{"Valid format?"}
    E -->|No| F["Reject: Invalid format"]
    E -->|Yes| G{"Contains '..'?"}

    G -->|Yes| H["Reject: Path traversal"]
    G -->|No| I{"Contains shell chars?"}

    I -->|Yes| J["Reject: Injection chars"]
    I -->|No| K["Approve & Clone"]

    D --> L{"Exists and<br/>is dir?"}
    L -->|No| M["Reject: Not found or not dir"]
    L -->|Yes| N["Approve & Use Direct"]

    K --> O["Proceed to Generation"]
    N --> O
    F --> P["Error & Skip"]
    H --> P
    J --> P
    M --> P

    style A fill:#e1f5ff
    style O fill:#c8e6c9
    style P fill:#ffcdd2

Repository Resolution and Task Generation

After parsing all inputs, wikigen builds a task list grouping repositories by project:

sequenceDiagram
    participant CLI as CLI Input
    participant Parse as parseRepoList
    participant Gen as generateWiki

    CLI ->> Parse: repos.txt / flags / positional args
    Parse ->> Parse: Classify each line
    Parse -->> CLI: standalone[], groups{}

    CLI ->> Gen: For each standalone
    Gen ->> Gen: Resolve repo (clone or validate local)
    Gen ->> Gen: Generate wiki
    Gen -->> CLI: result

    CLI ->> Gen: For each group
    Gen ->> Gen: Resolve all repos in group
    Gen ->> Gen: Generate merged wiki
    Gen -->> CLI: result

The task list determines how many wiki outputs will be created and which repositories belong to each output.

Sources: main.go:999-1030
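
One way to picture the task list is as one task per standalone entry plus one task per group. The Go sketch below illustrates that shape. The wikiTask type and buildTasks function are hypothetical names, and sorting group names is a choice made here only to get a deterministic order (Go map iteration order is unspecified); the real ordering in main.go is not documented on this page.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// RepoEntry mirrors the struct documented on this page.
type RepoEntry struct {
	Project  string
	Repo     string
	LocalDir string
}

// wikiTask is a hypothetical type: one wiki output and the repos feeding it.
type wikiTask struct {
	Project string
	Repos   []RepoEntry
}

// buildTasks is a hypothetical helper: each standalone entry becomes its own
// task, and each group becomes a single task containing all of its repos.
func buildTasks(standalone []RepoEntry, groups map[string][]RepoEntry) []wikiTask {
	tasks := make([]wikiTask, 0, len(standalone)+len(groups))
	for _, entry := range standalone {
		// The project name is the trailing component of the repo name.
		tasks = append(tasks, wikiTask{Project: path.Base(entry.Repo), Repos: []RepoEntry{entry}})
	}
	names := make([]string, 0, len(groups))
	for name := range groups {
		names = append(names, name)
	}
	sort.Strings(names) // deterministic order for this sketch
	for _, name := range names {
		tasks = append(tasks, wikiTask{Project: name, Repos: groups[name]})
	}
	return tasks
}

func main() {
	tasks := buildTasks(
		[]RepoEntry{{Repo: "owner/docs"}},
		map[string][]RepoEntry{
			"myproject": {
				{Project: "myproject", Repo: "owner/frontend"},
				{Project: "myproject", Repo: "owner/backend"},
			},
		},
	)
	for _, task := range tasks {
		fmt.Printf("%s: %d repo(s)\n", task.Project, len(task.Repos))
	}
}
```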

Repository Processing During Wiki Generation

Once a task is dispatched for generation, wikigen processes its repositories in sequence:

flowchart TD
    A["generateWiki Task<br/>projectName + repos[]"] --> B{"For each repo"}

    B -->|Local path| C["validateLocalDir"]
    B -->|Remote| D["validateRepo"]

    C -->|Valid| E["Add to repoDirs"]
    D -->|Valid| F["gitClone"]

    F --> G["Add to repoDirs"]

    C -->|Invalid| H["Return error"]
    D -->|Invalid| H

    E --> I["All repoDirs collected"]
    G --> I
    H --> J["Skip wiki"]

    I --> K["Create output dir"]
    K --> L["Call Claude Code<br/>with all repoDirs"]
    L --> M["Proceed to<br/>structure & pages"]

    style A fill:#e1f5ff
    style M fill:#c8e6c9
    style J fill:#ffcdd2

For grouped wikis, all repositories in the group are cloned (or validated if local) before Claude Code is invoked. This allows Claude Code to see the entire project structure and relationships.

Sources: main.go:502-546
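
The all-or-nothing resolution step in the flowchart can be sketched in Go as below. This is an illustration under assumptions rather than the code at main.go:502-546: resolveRepoDirs and cloneFunc are hypothetical names, the clone step is injected as a function so the sketch runs without git, and the two validators are condensed stand-ins for validateRepo and validateLocalDir.

```go
package main

import (
	"fmt"
	"os"
	"regexp"
	"strings"
)

// RepoEntry mirrors the struct documented on this page.
type RepoEntry struct {
	Project  string
	Repo     string
	LocalDir string
}

var validRepoPattern = regexp.MustCompile(`^[a-zA-Z0-9._-]+/[a-zA-Z0-9._-]+$`)

// Condensed stand-in for validateRepo as documented on this page.
func validateRepo(repo string) error {
	if !validRepoPattern.MatchString(repo) || strings.Contains(repo, "..") {
		return fmt.Errorf("invalid repo: %q", repo)
	}
	return nil
}

// Condensed stand-in for validateLocalDir as documented on this page.
func validateLocalDir(dir string) error {
	info, err := os.Stat(dir)
	if err != nil || !info.IsDir() {
		return fmt.Errorf("not a usable directory: %q", dir)
	}
	return nil
}

// cloneFunc is a hypothetical stand-in for the gitClone step.
type cloneFunc func(repo string) (dir string, err error)

// resolveRepoDirs (hypothetical) returns one directory per entry, or an
// error as soon as any entry fails, so the whole wiki is skipped.
func resolveRepoDirs(entries []RepoEntry, clone cloneFunc) ([]string, error) {
	dirs := make([]string, 0, len(entries))
	for _, entry := range entries {
		if entry.LocalDir != "" {
			if err := validateLocalDir(entry.LocalDir); err != nil {
				return nil, err
			}
			dirs = append(dirs, entry.LocalDir)
			continue
		}
		if err := validateRepo(entry.Repo); err != nil {
			return nil, err
		}
		dir, err := clone(entry.Repo)
		if err != nil {
			return nil, fmt.Errorf("clone %s: %w", entry.Repo, err)
		}
		dirs = append(dirs, dir)
	}
	return dirs, nil
}

func main() {
	fakeClone := func(repo string) (string, error) {
		return "./.repos/" + repo, nil // pretend the clone succeeded
	}
	dirs, err := resolveRepoDirs([]RepoEntry{
		{Project: "myapp", Repo: "owner/frontend"},
		{Project: "myapp", Repo: "backend", LocalDir: "."},
	}, fakeClone)
	fmt.Println(dirs, err)
}
```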

Configuration Priority

Repository specifications can be provided via multiple input methods. When multiple methods are used together, they are combined:

./wikigen -f repos.txt -r "extra/repo1,extra/repo2" owner/repo3

This command:

  1. Reads all entries from repos.txt
  2. Adds entries from -r flag
  3. Adds positional arguments

All are merged into a single task list. If the same repository is specified by more than one input, it may be processed multiple times, with each run overwriting the previous output.

Sources: main.go:982-995
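
Because repeated specifications may lead to repeated work, it can help to check a combined list before invoking wikigen. The Go sketch below merges lists in the order described above and reports any specification that appears more than once. Both helper names, mergeSpecs and repeatedSpecs, are hypothetical, and this check is not something the page says wikigen performs itself.

```go
package main

import "fmt"

// mergeSpecs is a hypothetical helper: it concatenates spec lists in the
// order given (file entries, then -r entries, then positional arguments).
func mergeSpecs(sources ...[]string) []string {
	var merged []string
	for _, source := range sources {
		merged = append(merged, source...)
	}
	return merged
}

// repeatedSpecs is a hypothetical helper: it returns each spec that occurs
// more than once, in the order its second occurrence appears.
func repeatedSpecs(specs []string) []string {
	counts := make(map[string]int)
	var repeated []string
	for _, spec := range specs {
		counts[spec]++
		if counts[spec] == 2 {
			repeated = append(repeated, spec)
		}
	}
	return repeated
}

func main() {
	fromFile := []string{"owner/repo1", "owner/repo3"}
	fromFlag := []string{"extra/repo1", "extra/repo2"}
	positional := []string{"owner/repo3"}

	merged := mergeSpecs(fromFile, fromFlag, positional)
	fmt.Println("merged:", merged)
	fmt.Println("repeated:", repeatedSpecs(merged))
}
```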

Error Handling in Input Processing

If any repository fails validation or cloning, the wiki for that repository (or for its entire group) is skipped. Errors are logged and reported at completion.

graph TD
    A["Process Repository"] --> B{"Validation<br/>passes?"}
    B -->|No| C["Log error"]
    B -->|Yes| D{"Cloning<br/>succeeds?<br/>local exists?"}
    D -->|No| E["Log clone error"]
    D -->|Yes| F["Generate wiki"]

    C --> G["Mark as failed"]
    E --> G
    F --> H{"Generation<br/>succeeds?"}
    H -->|No| I["Mark as failed"]
    H -->|Yes| J["Mark as succeeded"]

    G --> K["Report in stderr<br/>& JSON"]
    I --> K
    J --> K

    style A fill:#e1f5ff
    style K fill:#fff9c4

Sources: main.go:1074-1092

Environment Variables and Configuration Files

Repository authentication and cloning behavior are controlled by environment variables and configuration files, not by the repository specifications themselves.

  • GITHUB_TOKEN: GitHub Personal Access Token for HTTPS cloning. If not set, SSH is used.
  • WIKI_CLONE_DIR: Directory where repositories are cloned. Default: ./.repos
  • WIKI_OUTPUT_DIR: Directory where wikis are written. Default: ./wiki-output

These are loaded from the .env and .env.local files in the current directory. Command-line flags override environment variables.

See Configuration & Environment for complete details.
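
The defaults listed above could be resolved along the following lines. This Go sketch is illustrative only: the helper name envOrDefault is hypothetical, treating an empty value as unset is an assumption, and the sketch reads the process environment directly without loading .env or .env.local files.

```go
package main

import (
	"fmt"
	"os"
)

// envOrDefault is a hypothetical helper: it returns the value of key, or
// fallback when the variable is unset or empty (an assumption of this sketch).
func envOrDefault(key, fallback string) string {
	if value, ok := os.LookupEnv(key); ok && value != "" {
		return value
	}
	return fallback
}

func main() {
	cloneDir := envOrDefault("WIKI_CLONE_DIR", "./.repos")
	outputDir := envOrDefault("WIKI_OUTPUT_DIR", "./wiki-output")

	// Per this page: HTTPS cloning when GITHUB_TOKEN is set, SSH otherwise.
	protocol := "ssh"
	if os.Getenv("GITHUB_TOKEN") != "" {
		protocol = "https"
	}
	fmt.Println(cloneDir, outputDir, protocol)
}
```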

Example Configurations

Single Repository

./wikigen owner/repo

Multiple Repositories (Separate Wikis)

Command line:

./wikigen owner/repo1 owner/repo2 owner/repo3

repos.txt:

owner/repo1
owner/repo2
owner/repo3

Result: Three wiki directories, one per repository.

Multi-Repository Wiki

repos.txt:

# Microservice project with three repos
myservice:owner/api
myservice:owner/frontend
myservice:owner/shared-types

Command:

./wikigen -f repos.txt

Result: One wiki directory (wiki-output/myservice/) with cross-repository documentation.

Mixed Local and Remote

repos.txt:

# Main project: one local repo, two remote
myapp:./backend
myapp:owner/frontend
myapp:owner/cli-tools

# Separate wiki for a local repo
./docs

Result: wiki-output/myapp/ (three repos) and wiki-output/docs/ (one repo).

Batch Processing with Parallelism

./wikigen -f repos.txt -p 3 -pp 5

Processes up to 3 wikis in parallel, with up to 5 pages in parallel per wiki. Useful for large batches of repositories.

See Parallel Processing & Performance for details.
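
The two limits can be pictured as nested bounded worker pools: an outer limit on wikis and an inner limit on pages within each wiki. The Go sketch below shows one common way to bound concurrency with a buffered channel used as a semaphore. The function runBounded and the sample wiki and page names are hypothetical, and the page does not confirm that wikigen is implemented this way.

```go
package main

import (
	"fmt"
	"sync"
)

// runBounded is a hypothetical helper: it calls fn once per item, with at
// most limit calls running at the same time, and waits for all to finish.
func runBounded(items []string, limit int, fn func(string)) {
	if limit < 1 {
		limit = 1
	}
	sem := make(chan struct{}, limit) // buffered channel used as a semaphore
	var wg sync.WaitGroup
	for _, item := range items {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit workers are already active
		go func(it string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot when done
			fn(it)
		}(item)
	}
	wg.Wait()
}

func main() {
	wikis := []string{"frontend", "backend", "docs", "myproject"}
	pages := []string{"Overview", "Architecture"}

	runBounded(wikis, 3, func(wiki string) { // analogous to -p 3
		runBounded(pages, 5, func(page string) { // analogous to -pp 5
			fmt.Printf("generated %s/%s\n", wiki, page)
		})
	})
}
```

The output order varies between runs because the work is concurrent.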
