LeakGorilla 🦍

Advanced Web Secret Scanner for Security Professionals

LeakGorilla is a powerful reconnaissance tool designed for penetration testers and security researchers to discover exposed API keys, credentials, and sensitive information in web applications. It intelligently crawls websites and analyzes HTML, JavaScript, and inline scripts to detect leaked secrets that could compromise security.

🎯 What is LeakGorilla?

LeakGorilla automates the tedious process of hunting for exposed secrets in web applications. Developers often accidentally ship API keys, tokens, and credentials in frontend code. LeakGorilla helps you find these exposures before attackers do.

Why Use LeakGorilla?

Automated Discovery: Scans entire websites automatically, following links within the same domain
Comprehensive Detection: Identifies 20+ types of secrets including AI API keys, cloud credentials, and database strings
Smart Analysis: Scans HTML pages, external JavaScript files, and inline scripts
Concurrent Scanning: Multi-threaded JavaScript file analysis for faster results
Safe Output: Redacts sensitive data in console while saving full details to file
Flexible Export: Supports human-readable text, JSON, and HTML formats

🔍 What LeakGorilla Detects

AI & ML Services

OpenAI API Keys (GPT, DALL-E, Whisper)
Anthropic Claude API Keys
Groq API Keys
Google AI API Keys
Meta AI/Facebook Access Tokens

Cloud Providers

AWS Access Keys & Secret Keys
Google Cloud Service Account Keys
Azure Connection Strings

Development Tools

GitHub Personal Access Tokens
GitLab Tokens
Slack Bot & User Tokens
JWT Tokens

Payment & Communication

Stripe API Keys (Live & Test)
Twilio API Keys
SendGrid API Keys
Mailgun API Keys

Databases & Infrastructure

MongoDB Connection Strings
PostgreSQL Connection Strings
MySQL Connection Strings
Redis Connection Strings

Security Assets

Private Keys (RSA, EC, DSA, OpenSSH)
OAuth Tokens
Generic API Keys & Secrets

🚀 Quick Start

Setup: Python virtual environment

Create and activate a virtual environment, then install dependencies.

Linux / macOS (bash / zsh):

python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Windows (PowerShell):

python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt

Windows (cmd.exe):

python -m venv .venv
.\.venv\Scripts\activate.bat
pip install -r requirements.txt

Deactivate the virtual environment with:

deactivate

Basic Scan

Scan a website for exposed secrets:

leakgorilla https://example.com

Using Python Directly

If LeakGorilla was not installed via APT, run the scanner script with Python:

python3 leakgorilla/scanner.py https://example.com

Scan with a Custom Page Limit

Scan up to 100 pages:

leakgorilla https://example.com --max-pages 100

Export to JSON

Save results in JSON format for automation:

leakgorilla https://example.com --format json --output results.json

Full Example

Comprehensive scan with all options:

leakgorilla https://target.com --max-pages 200 --timeout 15 --output scan_results.txt --format txt

📖 Usage Guide

Command Syntax

leakgorilla <url> [options]

Options

Option	Description	Default
`--max-pages N`	Maximum number of pages to crawl	50
`--timeout N`	HTTP request timeout in seconds	10
`--delay N`	Delay between requests (rate limiting)	0
`--proxy URL`	Proxy URL (e.g., Burp Suite)	None
`--verbose, -v`	Verbose output for debugging	False
`--output FILE`	Output file path	web_secrets.txt
`--format FORMAT`	Output format: `txt`, `json`, or `html`	txt
`--wayback`	Also scan Wayback Machine archives	False
`--patterns FILE`	Custom patterns JSON file	None
`--create-patterns`	Generate example custom_patterns.json	-
`--resume FILE`	Save/load scan state for resuming	None
`--clear-resume FILE`	Clear a saved resume state file	-

Examples

1. Quick Security Audit (APT)

leakgorilla https://myapp.com

1. Quick Security Audit (Python)

python3 leakgorilla/scanner.py https://myapp.com

2. Deep Scan for Large Sites (APT)

leakgorilla https://corporate-site.com --max-pages 500 --timeout 20 --delay 1

2. Deep Scan for Large Sites (Python)

python3 leakgorilla/scanner.py https://corporate-site.com --max-pages 500 --timeout 20 --delay 1

3. Pentest with Proxy (APT)

leakgorilla https://target.com --proxy http://127.0.0.1:8080 --verbose

3. Pentest with Proxy (Python)

python3 leakgorilla/scanner.py https://target.com --proxy http://127.0.0.1:8080 --verbose

4. JSON Output for Automation (APT)

leakgorilla https://api.example.com --format json --output api_secrets.json

4. JSON Output for Automation (Python)

python3 leakgorilla/scanner.py https://api.example.com --format json --output api_secrets.json

5. HTML Report (APT)

leakgorilla https://example.com --format html --output report.html

5. HTML Report (Python)

python3 leakgorilla/scanner.py https://example.com --format html --output report.html

6. Wayback Machine Scan (APT)

leakgorilla https://example.com --wayback

6. Wayback Machine Scan (Python)

python3 leakgorilla/scanner.py https://example.com --wayback

7. Resume Interrupted Scan (APT)

leakgorilla https://example.com --resume scan_state.json

7. Resume Interrupted Scan (Python)

python3 leakgorilla/scanner.py https://example.com --resume scan_state.json

8. Custom Patterns (APT)

# Generate example patterns file
leakgorilla https://example.com --create-patterns

# Use custom patterns
leakgorilla https://example.com --patterns custom_patterns.json

8. Custom Patterns (Python)

# Generate example patterns file
python3 leakgorilla/scanner.py https://example.com --create-patterns

# Use custom patterns
python3 leakgorilla/scanner.py https://example.com --patterns custom_patterns.json

📊 Understanding Results

Console Output

LeakGorilla displays progress in real-time:

[1/50] Scanning: https://example.com
  ✓ Found 3 potential secret(s)
[2/50] Scanning: https://example.com/about
[3/50] Scanning: https://example.com/api/config.js
  ✓ Found 1 potential secret(s)

Summary Report

After scanning, you'll see a categorized summary:

================================================================================
SCAN SUMMARY
================================================================================
Total secrets found: 12

[OpenAI API Key] - 2 found
--------------------------------------------------------------------------------
  URL: https://example.com/js/app.js
  Source: JavaScript file
  Value: sk-pr...FJ2a

[AWS Access Key] - 1 found
--------------------------------------------------------------------------------
  URL: https://example.com/config
  Source: HTML content
  Value: AKIA...Z7Q9
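The redacted values in the summary keep only a few leading and trailing characters. A minimal sketch of that kind of masking follows. The `redact` helper and its `keep_start` and `keep_end` parameters are hypothetical and not taken from LeakGorilla's source, which may choose the visible lengths differently.

```python
def redact(value: str, keep_start: int = 4, keep_end: int = 4) -> str:
    """Mask the middle of a secret, keeping a short prefix and suffix.

    Hypothetical helper for illustration; not LeakGorilla's actual code.
    """
    # Short values would be fully revealed by prefix + suffix,
    # so mask them entirely instead.
    if len(value) <= keep_start + keep_end:
        return "*" * len(value)
    # value[-0:] would return the whole string, so handle keep_end == 0.
    suffix = value[-keep_end:] if keep_end else ""
    return f"{value[:keep_start]}...{suffix}"


print(redact("AKIAEXAMPLEKEY00Z7Q9"))  # AKIA...Z7Q9
```

Masking short values completely is a deliberate choice here: showing four characters at each end of an eight-character token would print the whole secret.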

Output File

Full unredacted results are saved to your specified output file:

Text Format: Human-readable with context snippets
JSON Format: Machine-parsable for automation and integration
HTML Format: Report file selected with --format html

🎯 Use Cases

1. Pre-Deployment Security Check

Scan your staging environment before going live:

leakgorilla https://staging.myapp.com --max-pages 200

2. Bug Bounty Reconnaissance

Discover exposed secrets in target applications:

leakgorilla https://target.com --format json --output bounty_findings.json

3. Security Audit

Comprehensive scan of client websites:

leakgorilla https://client-site.com --max-pages 500 --timeout 20 --output audit_report.txt

4. Continuous Monitoring

Integrate into CI/CD pipelines:

leakgorilla https://production.app.com --format json --output results.json
jq '.[] | select(.type=="OpenAI API Key")' results.json
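For a pipeline step, a small script can read the JSON results and fail the job when anything is found. The sketch below assumes the results file is a JSON array of objects with a `type` field, as the jq filter above implies. The real schema may differ, and `summarize` and `exit_code` are hypothetical names.

```python
import json
from collections import Counter
from pathlib import Path


def summarize(results_path: str) -> Counter:
    """Count findings per secret type in a JSON results file.

    Assumes a top-level JSON array of objects with a "type" key;
    this schema is an assumption, not confirmed by the project docs.
    """
    findings = json.loads(Path(results_path).read_text(encoding="utf-8"))
    return Counter(item.get("type", "unknown") for item in findings)


def exit_code(counts: Counter) -> int:
    """Return 1 when any finding exists so a CI step is marked as failed."""
    return 1 if counts else 0
```

In a pipeline, run the scan first, then call `sys.exit(exit_code(summarize("results.json")))` from a wrapper script so that any finding stops the deployment.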

5. Competitor Analysis

Ethical reconnaissance (with permission):

leakgorilla https://competitor.com --max-pages 100

🛡️ Best Practices

For Security Professionals

Always get written permission before scanning
Respect rate limits and server resources
Use appropriate --timeout values
Save results securely (they contain sensitive data)
Report findings responsibly

For Developers

Run LeakGorilla on your own sites regularly
Scan before each deployment
Integrate into CI/CD pipelines
Use .env files and environment variables instead of hardcoding secrets
Implement secret scanning in pre-commit hooks

⚠️ Legal Disclaimer

IMPORTANT: Use LeakGorilla only on:

Websites you own
Systems you have explicit written permission to test
Bug bounty programs that allow automated scanning

Unauthorized scanning may violate:

Computer Fraud and Abuse Act (CFAA, United States)
Computer Misuse Act 1990 (United Kingdom)
Terms of Service agreements
Local and international laws

The developers of LeakGorilla are not responsible for misuse of this tool.

🔧 How It Works

Crawling: Starts at the target URL and discovers links within the same domain
Content Extraction: Downloads HTML pages and external JavaScript files
Pattern Matching: Uses advanced regex patterns to identify 20+ secret types
Context Analysis: Extracts surrounding code for better understanding
Concurrent Processing: Scans multiple JavaScript files simultaneously
Smart Filtering: Avoids binary files, images, and non-content URLs
Safe Reporting: Redacts secrets in console, saves full data to file
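The crawling and filtering steps above can be sketched with the standard library alone. This is an illustration under stated assumptions, not LeakGorilla's implementation: `LinkCollector`, `same_domain_links`, and the `SKIPPED_EXTENSIONS` list are hypothetical names and values.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Extensions treated as non-content; this list is illustrative only.
SKIPPED_EXTENSIONS = (".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico",
                      ".woff", ".woff2", ".pdf", ".zip", ".mp4")


class LinkCollector(HTMLParser):
    """Collect href/src targets from <a> and <script> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        wanted = {"a": "href", "script": "src"}.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                self.links.append(value)


def same_domain_links(base_url, html):
    """Return absolute, de-duplicated links that stay on base_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(base_url).netloc
    seen = set()
    result = []
    for raw in collector.links:
        # Resolve relative links and drop the fragment part.
        absolute = urljoin(base_url, raw).split("#", 1)[0]
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https") or parsed.netloc != host:
            continue
        if parsed.path.lower().endswith(SKIPPED_EXTENSIONS):
            continue
        if absolute not in seen:
            seen.add(absolute)
            result.append(absolute)
    return result


page = ('<a href="/about">About</a><a href="https://other.org/">x</a>'
        '<script src="/js/app.js"></script><a href="/logo.png">logo</a>')
print(same_domain_links("https://example.com/", page))
# ['https://example.com/about', 'https://example.com/js/app.js']
```

Comparing `netloc` exactly keeps the crawl on one host. Whether subdomains should count as the same domain is a design decision this sketch leaves out.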

📈 Performance Tips

Start Small: Use --max-pages 10 for initial testing
Adjust Timeout: Increase --timeout for slow servers
Monitor Progress: Watch console output for real-time feedback
Use JSON: Export to JSON for easier parsing and automation
Respect Servers: Don't set --max-pages too high on small sites

🎯 Detection Accuracy

LeakGorilla uses regex patterns to detect secrets. Accuracy varies by secret type:

High Accuracy (90-95%)

✅ OpenAI API Keys
✅ Anthropic Claude Keys
✅ Groq API Keys
✅ GitHub Tokens
✅ SendGrid API Keys
✅ AWS Access Keys

Good Accuracy (80-90%)

✅ Stripe API Keys
✅ Slack Tokens
✅ Database Connection Strings
✅ Twilio API Keys

Medium Accuracy (70-80%)

⚠️ Google API Keys
⚠️ Meta/Facebook Tokens
⚠️ JWT Tokens
⚠️ Private Keys

Lower Accuracy (60-70%)

⚠️ Generic API Keys
⚠️ Generic Secrets

Overall Accuracy: ~75-85%

Note: False positives may occur with:

Base64-encoded fonts/images
Minified JavaScript
Random strings in CSS files

Recommendation: Focus on CRITICAL and HIGH severity findings for best accuracy (85-95%).
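Because detection is regex-based, accuracy tends to follow how specific each pattern is: formats with a fixed prefix and length match cleanly, while generic patterns catch more noise. The sketch below shows the general technique with a context snippet per match. The patterns are based on publicly documented key formats and are not LeakGorilla's actual rule set; `PATTERNS` and `find_secrets` are hypothetical names.

```python
import re

# Illustrative patterns based on publicly documented key formats.
# They are NOT LeakGorilla's actual rule set.
PATTERNS = {
    "AWS Access Key": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "GitHub Personal Access Token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "Private Key": re.compile(
        r"-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"
    ),
}


def find_secrets(text, context_chars=30):
    """Return one dict per match with its type, value, and nearby text."""
    findings = []
    for secret_type, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            # Clamp the context window to the bounds of the text.
            start = max(0, match.start() - context_chars)
            end = min(len(text), match.end() + context_chars)
            findings.append({
                "type": secret_type,
                "value": match.group(0),
                "context": text[start:end],
            })
    return findings


# AKIAIOSFODNN7EXAMPLE is the placeholder key used in AWS documentation.
sample = 'const cfg = {key: "AKIAIOSFODNN7EXAMPLE"};'
for finding in find_secrets(sample):
    print(finding["type"], finding["value"])
# AWS Access Key AKIAIOSFODNN7EXAMPLE
```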

🤝 Contributing

Found a bug or want to add detection for new secret types? Contributions welcome!

Repository: https://github.com/jeffryhawchab/leakgorilla

📄 License

🆘 Support

Issues: https://github.com/jeffryhawchab/leakgorilla/issues
Documentation: https://github.com/jeffryhawchab/leakgorilla/wiki

Remember: With great power comes great responsibility. Use LeakGorilla ethically and legally. 🦍

New Features (2026-02-27)

Rotating proxies with optional validation and background re-validation (--proxies-file, --stream-proxies, --validate-proxies, --revalidate-minutes).
Randomized user agents: the list is loaded once and rotated per request to reduce fingerprinting.
Configurable delay range (--delay-min, --delay-max) and a lower default --timeout of 4s for faster scans; the Options table above still lists the earlier default of 10.
Threaded, concurrent crawler with increased --max-workers for higher throughput.
Search-engine seeding from dorks.conf (supports searx, DuckDuckGo, Yandex) via --use-search to collect initial targets.
CSS files are skipped by default to reduce false positives; only HTML and JavaScript are scanned.

See examples in the CLI section for usage of the new options.
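The delay range and user-agent rotation can be pictured as picking fresh settings before every request. The following is a rough sketch only: `next_request_settings` is a hypothetical helper, the user-agent strings are placeholders, and the way LeakGorilla actually loads and applies them is not shown in this document.

```python
import random
from typing import Optional

# Placeholder user-agent strings for illustration only.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/2.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_0) ExampleBrowser/3.0",
]


def next_request_settings(delay_min: float, delay_max: float,
                          rng: Optional[random.Random] = None):
    """Pick a delay in [delay_min, delay_max] and a random User-Agent header."""
    if delay_min > delay_max:
        raise ValueError("delay_min must not exceed delay_max")
    # Fall back to the module-level generator when no seeded one is given.
    rng = rng or random
    delay = rng.uniform(delay_min, delay_max)
    headers = {"User-Agent": rng.choice(USER_AGENTS)}
    return delay, headers


delay, headers = next_request_settings(0.5, 1.5)
print(f"sleep {delay:.2f}s, then send with {headers['User-Agent']}")
```

Passing a seeded `random.Random` makes the sequence reproducible, which is handy when debugging a scan.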

Async Crawler

Added an aiohttp-based async crawler for higher throughput and lower overhead on I/O-bound scans. Use the --async flag to enable it. Example:

python3 leakgorilla/scanner.py https://example.com --async --max-workers 50
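The idea behind an async crawler with a worker limit can be sketched with `asyncio` from the standard library. This is not the aiohttp-based code the project ships: `crawl_concurrently` is a hypothetical function, and `fake_fetch` stands in for a real HTTP request so the example runs without a network or third-party packages.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


async def crawl_concurrently(
    urls: List[str],
    fetch: Callable[[str], Awaitable[str]],
    max_workers: int = 50,
) -> Dict[str, str]:
    """Fetch all urls with at most max_workers requests in flight."""
    semaphore = asyncio.Semaphore(max_workers)
    results: Dict[str, str] = {}

    async def worker(url: str) -> None:
        async with semaphore:
            try:
                results[url] = await fetch(url)
            except Exception as exc:  # keep one failure from stopping the crawl
                results[url] = f"ERROR: {exc}"

    await asyncio.gather(*(worker(u) for u in urls))
    return results


async def fake_fetch(url: str) -> str:
    """Stand-in for a real HTTP call (for example an aiohttp GET)."""
    await asyncio.sleep(0.01)
    return f"<html>{url}</html>"


pages = asyncio.run(crawl_concurrently(
    ["https://example.com/", "https://example.com/about"],
    fake_fetch,
    max_workers=2,
))
print(len(pages))  # 2
```

The semaphore plays the role that `--max-workers` suggests: it caps how many requests run at once, which matters more for the target server than for the scanner.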

False Positive Reductions

Introduced WHITELIST rules to ignore long base64/blob-like matches and common image/assets, reducing false positives.
Tightened generic secret patterns to require longer token lengths by default.
