Repository Access Authentication
This page documents how wikigen authenticates with GitHub and accesses repositories using either SSH or Personal Access Token (PAT) authentication. It covers the git clone implementation, token substitution mechanisms, optimization techniques, and security considerations for credential handling.
wikigen supports two mutually exclusive authentication methods for cloning repositories: SSH (default) and GitHub Personal Access Token (PAT). The choice of authentication method is configured via the -token CLI flag or GITHUB_TOKEN environment variable.
SSH authentication is the default method, used when no PAT is provided. In this mode, wikigen converts HTTPS GitHub URLs to SSH URLs and relies on a pre-configured SSH key registered with GitHub.
SSH URL Format:
git@github.com:{owner}/{repo}.git
SSH authentication requires:
- A valid SSH key pair
- The public key registered in GitHub account settings (Settings → SSH and GPG keys)
- SSH agent running (or SSH key in ~/.ssh/id_rsa)
The conversion from HTTPS to SSH occurs in the gitClone function (main.go:160-164). When no token is provided, any HTTPS GitHub URLs are replaced with the SSH equivalent format and .git suffix is appended if missing.
Sources: wikigen/main.go:160-164
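The conversion described above can be sketched as follows. This is a minimal illustration, not wikigen's actual code: the helper name toSSHURL and the exact prefix handling are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// toSSHURL is a hypothetical helper: it rewrites an HTTPS GitHub URL into the
// git@github.com:{owner}/{repo}.git form and appends ".git" when it is missing.
func toSSHURL(repoURL string) string {
	const httpsPrefix = "https://github.com/"
	if !strings.HasPrefix(repoURL, httpsPrefix) {
		return repoURL // leave non-GitHub or already-SSH URLs untouched
	}
	path := strings.TrimPrefix(repoURL, httpsPrefix)
	path = strings.TrimSuffix(path, "/")
	if !strings.HasSuffix(path, ".git") {
		path += ".git"
	}
	return "git@github.com:" + path
}

func main() {
	fmt.Println(toSSHURL("https://github.com/owner/repo")) // git@github.com:owner/repo.git
}
```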
PAT authentication is used when the -token flag is provided or GITHUB_TOKEN environment variable is set. This method is useful in environments where SSH key configuration is impractical, such as CI/CD pipelines or containerized environments.
HTTPS URL Format with Token Substitution:
https://{PAT}@github.com/{owner}/{repo}.git
To use PAT authentication:
- Create a GitHub Personal Access Token:
  - Navigate to GitHub → Settings → Developer settings → Personal access tokens
  - Click "Generate new token"
  - Assign minimum required scopes: repo (full control of private repositories) or public_repo (for public repositories only)
  - Copy the token value
- Provide token to wikigen:

      # Via command-line flag
      ./wikigen -token ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxx owner/repo

      # Via environment variable
      export GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxx
      ./wikigen owner/repo

      # Via .env file
      # .env
      GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxx

The token is substituted into the HTTPS URL at git clone time via string replacement in the gitClone function (main.go:158). The original https:// scheme is replaced with https://{token}@ before the clone operation.
Sources: wikigen/main.go:156-158, .env.example:1-2
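The string replacement described above might look roughly like this. The helper name withToken is an assumption made for illustration; only the https:// to https://{token}@ substitution comes from the text.

```go
package main

import (
	"fmt"
	"strings"
)

// withToken is a hypothetical helper: it embeds a PAT into an HTTPS URL by
// replacing the first "https://" with "https://{token}@".
// An empty token leaves the URL unchanged.
func withToken(repoURL, token string) string {
	if token == "" {
		return repoURL
	}
	return strings.Replace(repoURL, "https://", "https://"+token+"@", 1)
}

func main() {
	// Placeholder token, not a real credential.
	fmt.Println(withToken("https://github.com/owner/repo.git", "ghp_example"))
	// https://ghp_example@github.com/owner/repo.git
}
```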
The gitClone function implements the repository cloning logic with support for both authentication methods, incremental updates via git pull, and clone optimization.
flowchart TD
A["gitClone called<br/>repo_url, token, dest_dir"] --> B{"Repo exists<br/>at dest_dir?"}
B -->|Yes| C["git pull --ff-only<br/>Update existing"]
B -->|No| D{"Token<br/>provided?"}
D -->|Yes| E["HTTPS Clone<br/>with PAT"]
D -->|No| F["SSH Clone<br/>Convert URL"]
E --> G["git clone --depth=1<br/>--single-branch<br/>Clone URL"]
F --> G
G --> H["Return"]
C --> H
If the destination directory already contains a .git directory, wikigen updates the existing repository using git pull --ff-only instead of re-cloning. This enables incremental updates without full re-download and preserves any local state.
Sources: wikigen/main.go:147-152
For new repositories, wikigen performs a fresh shallow clone, selecting the authentication method and applying optimization flags.
Default Behavior:
- If no token is provided, convert HTTPS URL to SSH format (main.go:160-164)
- If token is provided, substitute token into HTTPS URL (main.go:156-158)
- Execute git clone with optimization flags (main.go:167)
Sources: wikigen/main.go:147-171
wikigen executes git clone with two optimization flags:
| Flag | Purpose | Effect |
|---|---|---|
| --depth=1 | Shallow clone | Downloads only the latest commit, reducing bandwidth and storage |
| --single-branch | Single branch clone | Fetches only the default branch (typically main/master), not all branches |
Combined, these flags cut clone time and disk usage while still providing the complete current source tree needed for documentation generation.
Sources: wikigen/main.go:167
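As an illustration, the clone invocation with both flags could be assembled with os/exec as below. The function name cloneCommand is an assumption, and the command is only constructed here, not executed.

```go
package main

import (
	"fmt"
	"os/exec"
)

// cloneCommand is a hypothetical helper that builds (but does not run) a
// shallow, single-branch git clone of cloneURL into destDir.
func cloneCommand(cloneURL, destDir string) *exec.Cmd {
	return exec.Command("git", "clone", "--depth=1", "--single-branch", cloneURL, destDir)
}

func main() {
	cmd := cloneCommand("git@github.com:owner/repo.git", "/tmp/repo")
	fmt.Println(cmd.Args)
	// [git clone --depth=1 --single-branch git@github.com:owner/repo.git /tmp/repo]
}
```

Passing arguments as separate strings to exec.Command avoids invoking a shell, so the URL is never subject to shell interpretation. Note that git documents --depth as implying --single-branch unless --no-single-branch is given, so the second flag mainly makes the intent explicit.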
wikigen resolves the GitHub token from the following sources (from highest to lowest priority):
- CLI flag: -token command-line argument
- Environment variable: GITHUB_TOKEN system environment variable
- .env file: GITHUB_TOKEN in .env or .env.local in the current working directory
- Default: Empty string (falls back to SSH authentication)
The .env file is automatically loaded at startup (main.go:719-739) before CLI flags are parsed.
Sources: wikigen/main.go:908, wikigen/main.go:719-739
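The precedence order above amounts to a first-non-empty lookup. A minimal sketch, assuming a hypothetical resolveToken function that receives the three candidate values (how wikigen actually merges .env values into the environment is not shown in the source):

```go
package main

import "fmt"

// resolveToken is a hypothetical helper: it returns the first non-empty value
// in precedence order (CLI flag, environment variable, .env file).
// An empty result means no token was configured, so SSH is used.
func resolveToken(flagValue, envValue, dotEnvValue string) string {
	for _, v := range []string{flagValue, envValue, dotEnvValue} {
		if v != "" {
			return v
		}
	}
	return ""
}

func main() {
	// Placeholder values, not real credentials.
	fmt.Println(resolveToken("", "ghp_from_env", "ghp_from_dotenv")) // ghp_from_env
}
```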
# GitHub Personal Access Token (empty = use SSH)
GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxx
# Optional: Can also be set via CLI
# ./wikigen -token ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxx owner/repo
Sources: .env.example:1-2
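A file in this format can be read with a few lines of Go. The sketch below is an assumption about what such a loader might do (skip comments and blank lines, split on the first =); the function parseDotEnv is hypothetical and wikigen's own loader may handle quoting or other cases differently.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseDotEnv is a hypothetical parser for KEY=VALUE lines.
// Blank lines, comment lines starting with "#", and lines without "="
// are ignored.
func parseDotEnv(content string) map[string]string {
	vars := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue
		}
		vars[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return vars
}

func main() {
	content := "# GitHub Personal Access Token (empty = use SSH)\nGITHUB_TOKEN=ghp_example\n"
	fmt.Println(parseDotEnv(content)["GITHUB_TOKEN"]) // ghp_example
}
```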
Risks with Token Handling:
- Plaintext Storage: Tokens stored in .env files are in plaintext. Never commit .env files to version control.
- Environment Variable Exposure: Environment variables may be visible in process listings or CI/CD logs if not properly masked.
- URL Logging: Git clone URLs containing tokens may appear in logs.
Mitigation Strategies:
- CI/CD Environments: Use secrets management provided by your CI/CD platform

      # GitHub Actions example
      env:
        GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

  Sources: README.md:259-265
- Local Development: Restrict file permissions

      chmod 600 .env

- Token Scope: Grant minimum required scopes when creating PAT
  - public_repo for public repositories only
  - repo for private repositories
- Token Rotation: Periodically regenerate tokens
  - GitHub recommends rotating every 90 days
  - OAuth tokens expire after approximately 1 year

  Sources: README.md:278
- SSH Preferred: Use SSH authentication in environments where SSH keys can be managed securely
  - SSH keys are not transmitted in URLs
  - No secrets in environment variables or .env files
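One further way to limit the URL logging risk listed above is to redact credentials before a clone URL is printed. The sketch below assumes a hypothetical redactURL helper; the source does not say whether wikigen does anything like this.

```go
package main

import (
	"fmt"
	"net/url"
)

// redactURL is a hypothetical helper: it replaces any userinfo in a URL
// (such as an embedded PAT) with a fixed placeholder so the result is safe
// to log. Strings that do not parse as URLs, including scp-style SSH
// addresses like git@github.com:owner/repo.git, are returned unchanged.
func redactURL(raw string) string {
	u, err := url.Parse(raw)
	if err != nil || u.User == nil {
		return raw
	}
	u.User = url.User("REDACTED")
	return u.String()
}

func main() {
	fmt.Println(redactURL("https://ghp_example@github.com/owner/repo.git"))
	// https://REDACTED@github.com/owner/repo.git
}
```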
wikigen validates all repository inputs before cloning to prevent various attack vectors.
Validation Rules:
- Format Validation: Repository must match owner/repo pattern
- Path Traversal Prevention: Rejects paths containing ..
- Shell Injection Prevention: Rejects special shell characters (;, &, |, `, $, (, ), {, }, [, ], !, ~)
The validation regex pattern (main.go:79) strictly enforces alphanumeric characters, dots, underscores, and hyphens:
^[a-zA-Z0-9._-]+/[a-zA-Z0-9._-]+$
Sources: wikigen/main.go:79-92
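The three rules can be sketched as a single function. The regex is the one quoted above; the function name validateRepo and the error messages are assumptions for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// repoPattern is the owner/repo pattern quoted in the text.
var repoPattern = regexp.MustCompile(`^[a-zA-Z0-9._-]+/[a-zA-Z0-9._-]+$`)

// validateRepo is a hypothetical validator applying the documented checks
// in order: format, path traversal, shell metacharacters.
func validateRepo(repo string) error {
	if !repoPattern.MatchString(repo) {
		return errors.New("invalid format: expected owner/repo")
	}
	if strings.Contains(repo, "..") {
		return errors.New("path traversal: '..' is not allowed")
	}
	// The regex already excludes these characters, so this check is
	// defense in depth rather than a separately reachable branch.
	if strings.ContainsAny(repo, ";&|`$(){}[]!~") {
		return errors.New("shell metacharacters are not allowed")
	}
	return nil
}

func main() {
	for _, in := range []string{"owner/repo", "owner/repo; rm -rf /", "owner/.."} {
		fmt.Printf("%q -> %v\n", in, validateRepo(in))
	}
}
```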
flowchart TD
A["User Input:<br/>owner/repo"] --> B{"Matches<br/>regex?"}
B -->|No| C["❌ Reject<br/>Invalid format"]
B -->|Yes| D{"Contains<br/>'..'?"}
D -->|Yes| E["❌ Reject<br/>Path traversal"]
D -->|No| F{"Contains<br/>shell chars?"}
F -->|Yes| G["❌ Reject<br/>Shell injection"]
F -->|No| H["✅ Accept<br/>Proceed with clone"]
C --> Z["Return error"]
E --> Z
G --> Z
H --> I["Safe to execute"]
Prevented Attack Vectors:
| Attack Type | Example | Prevention |
|---|---|---|
| Path Traversal | ../../../etc/passwd | Rejects .. in input |
| Shell Injection | owner/repo; rm -rf / | Rejects ;, &, \|, etc. |
| Command Substitution | owner/$(whoami) | Rejects $, (, ) |
| Glob Expansion | owner/repo* | Rejects *, ?, [, ] |
When using wikigen in GitHub Actions, follow secure token handling practices:
# ✅ CORRECT: Use secrets management
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: ./wikigen -token "$GITHUB_TOKEN" owner/repo
# ❌ WRONG: Token visible in logs
run: ./wikigen -token ghp_xxxxxxxxxxxxxxxxxxxxxxxxxxxx owner/repo
GitHub Actions automatically masks secrets in job logs, but tokens should not be written to logs explicitly.
Sources: README.md:259-273
For SSH authentication to function, an SSH key must be configured:
- Generate SSH key (if not already present):

      ssh-keygen -t ed25519 -C "your_email@example.com"
      # Or for older systems:
      ssh-keygen -t rsa -b 4096

- Start SSH agent:

      eval "$(ssh-agent -s)"
      ssh-add ~/.ssh/id_ed25519

- Register public key with GitHub:
  - Copy public key: cat ~/.ssh/id_ed25519.pub
  - GitHub → Settings → SSH and GPG keys → New SSH key
  - Paste key and save
- Verify SSH access:

      ssh -T git@github.com
      # Expected: "Hi username! You've successfully authenticated..."

For GitHub Actions, prefer the built-in GITHUB_TOKEN over SSH keys:
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: ./wikigen -token "$GITHUB_TOKEN" owner/repo
Alternatively, configure SSH in Actions via webfactory/ssh-agent@v0.5.4:
- uses: webfactory/ssh-agent@v0.5.4
with:
ssh-private-key: ${{ secrets.SSH_PRIVATE_KEY }}
# Then use SSH authentication (omit -token flag)
The --depth=1 flag performs a shallow clone, downloading only the most recent commit instead of full history:
Benefits:
- Reduces bandwidth usage by 50-90% for large repositories
- Reduces disk space usage proportionally
- Faster clone time
Trade-offs:
- Cannot access git history before the latest commit
- Some git operations (e.g., git log --all) are limited
For wikigen's use case (analyzing current source code), this is ideal since documentation generation only requires the latest state.
Impact: A repository with 10,000+ commits clones in seconds instead of minutes.
Sources: wikigen/main.go:167, README.md:18
The --single-branch flag downloads only the default branch (typically main or master):
Benefits:
- Reduces bandwidth by eliminating other branches
- Reduces disk space
- Faster clone overall
Trade-offs:
- Other branches are not available locally
- Cannot analyze branch-specific code
For multi-branch repositories, this flag reduces clone time by 10-40% depending on branch count.
Sources: wikigen/main.go:167
The complete authentication and clone flow:
sequenceDiagram
participant User
participant CLI as wikigen CLI
participant Env as Environment
participant Git as git command
participant GitHub as GitHub API
User->>CLI: ./wikigen -token PAT owner/repo
CLI->>Env: Load .env file
Env-->>CLI: Configuration loaded
CLI->>CLI: Validate owner/repo format
CLI->>CLI: Check: token provided?
alt Token Provided
CLI->>CLI: Construct HTTPS URL<br/>https://PAT@github.com/owner/repo.git
CLI->>Git: git clone --depth=1<br/>--single-branch HTTPS_URL
else No Token (SSH)
CLI->>CLI: Convert to SSH URL<br/>git@github.com:owner/repo.git
CLI->>Git: git clone --depth=1<br/>--single-branch SSH_URL
end
Git->>GitHub: Connect with auth
GitHub-->>Git: Authentication successful
Git-->>CLI: Clone complete
CLI-->>User: Repository ready for analysis
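The branch in the diagram, choosing between the two URL forms from a validated owner/repo string, can be condensed into one function. This is a sketch under stated assumptions: the name cloneURL is hypothetical, and the input is assumed to have already passed validation.

```go
package main

import "fmt"

// cloneURL is a hypothetical helper: it returns the HTTPS form with the
// token embedded when a token is present, and the SSH form otherwise.
// ownerRepo is assumed to be an already validated "owner/repo" string.
func cloneURL(ownerRepo, token string) string {
	if token != "" {
		return "https://" + token + "@github.com/" + ownerRepo + ".git"
	}
	return "git@github.com:" + ownerRepo + ".git"
}

func main() {
	fmt.Println(cloneURL("owner/repo", ""))            // git@github.com:owner/repo.git
	fmt.Println(cloneURL("owner/repo", "ghp_example")) // https://ghp_example@github.com/owner/repo.git
}
```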
- Installation & Setup: Prerequisites including git and SSH configuration, authentication setup
- CLI Usage & Commands: Complete command reference including -token flag documentation
- Configuration & Environment Variables: GITHUB_TOKEN environment variable configuration and precedence rules
- Input Validation & Security: Repository format validation and shell injection prevention details
- GitHub Actions Integration: Token management in CI/CD pipelines and OAuth token configuration