Advanced Web Secret Scanner for Security Professionals
LeakGorilla is a powerful reconnaissance tool designed for penetration testers and security researchers to discover exposed API keys, credentials, and sensitive information in web applications. It intelligently crawls websites and analyzes HTML, JavaScript, and inline scripts to detect leaked secrets that could compromise security.
LeakGorilla automates the tedious process of hunting for exposed secrets in web applications. Developers often accidentally ship API keys, tokens, and credentials in frontend code. LeakGorilla finds these vulnerabilities before attackers do.
- Automated Discovery: Scans entire websites automatically, following links within the same domain
- Comprehensive Detection: Identifies 20+ types of secrets including AI API keys, cloud credentials, and database strings
- Smart Analysis: Scans HTML pages, external JavaScript files, and inline scripts
- Concurrent Scanning: Multi-threaded JavaScript file analysis for faster results
- Safe Output: Redacts sensitive data in console while saving full details to file
- Flexible Export: Supports both human-readable text and JSON formats
- OpenAI API Keys (GPT, DALL-E, Whisper)
- Anthropic Claude API Keys
- Groq API Keys
- Google AI API Keys
- Meta AI/Facebook Access Tokens
- AWS Access Keys & Secret Keys
- Google Cloud Service Account Keys
- Azure Connection Strings
- GitHub Personal Access Tokens
- GitLab Tokens
- Slack Bot & User Tokens
- JWT Tokens
- Stripe API Keys (Live & Test)
- Twilio API Keys
- SendGrid API Keys
- Mailgun API Keys
- MongoDB Connection Strings
- PostgreSQL Connection Strings
- MySQL Connection Strings
- Redis Connection Strings
- Private Keys (RSA, EC, DSA, OpenSSH)
- OAuth Tokens
- Generic API Keys & Secrets
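Under the hood, detection of these types comes down to format-specific regular expressions. A minimal sketch of the idea, using illustrative patterns (not LeakGorilla's actual rules, which are stricter and cover all of the types above):

```python
import re

# Illustrative patterns only -- the shipped detectors use stricter,
# vendor-documented key formats.
PATTERNS = {
    "OpenAI API Key": re.compile(r"sk-[A-Za-z0-9]{20,}"),
    "AWS Access Key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "GitHub Token": re.compile(r"ghp_[A-Za-z0-9]{36}"),
    "Private Key": re.compile(
        r"-----BEGIN (?:RSA |EC |DSA |OPENSSH )?PRIVATE KEY-----"
    ),
}

def find_secrets(text):
    """Return (type, matched_string) pairs for every pattern hit in text."""
    hits = []
    for name, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((name, match.group()))
    return hits
```

Anchoring each pattern to a vendor-specific prefix (sk-, AKIA, ghp_) is what keeps precision high for those types; generic keys without such prefixes are inherently noisier.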
Create and activate a virtual environment, then install dependencies.
Linux / macOS (bash / zsh):
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Windows (PowerShell):
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt

Windows (cmd.exe):
python -m venv .venv
.\.venv\Scripts\activate.bat
pip install -r requirements.txt

Deactivate the virtual environment with:

deactivate

Scan a website for exposed secrets:

leakgorilla https://example.com

If not installed via APT, run with Python:

python3 leakgorilla/scanner.py https://example.com

Scan up to 100 pages:

leakgorilla https://example.com --max-pages 100

Save results in JSON format for automation:

leakgorilla https://example.com --format json --output results.json

Comprehensive scan with all options:
leakgorilla https://target.com --max-pages 200 --timeout 15 --output scan_results.txt --format txt

leakgorilla <url> [options]

| Option | Description | Default |
|---|---|---|
| --max-pages N | Maximum number of pages to crawl | 50 |
| --timeout N | HTTP request timeout in seconds | 10 |
| --delay N | Delay between requests (rate limiting) | 0 |
| --proxy URL | Proxy URL (e.g., Burp Suite) | None |
| --verbose, -v | Verbose output for debugging | False |
| --output FILE | Output file path | web_secrets.txt |
| --format FORMAT | Output format: txt, json, or html | txt |
| --wayback | Also scan Wayback Machine archives | False |
| --patterns FILE | Custom patterns JSON file | None |
| --create-patterns | Generate example custom_patterns.json | - |
| --resume FILE | Save/load scan state for resuming | None |
| --clear-resume FILE | Clear a saved resume state file | - |
1. Quick Security Audit (APT)

leakgorilla https://myapp.com

1. Quick Security Audit (Python)

python3 leakgorilla/scanner.py https://myapp.com

2. Deep Scan for Large Sites (APT)

leakgorilla https://corporate-site.com --max-pages 500 --timeout 20 --delay 1

2. Deep Scan for Large Sites (Python)

python3 leakgorilla/scanner.py https://corporate-site.com --max-pages 500 --timeout 20 --delay 1

3. Pentest with Proxy (APT)

leakgorilla https://target.com --proxy http://127.0.0.1:8080 --verbose

3. Pentest with Proxy (Python)

python3 leakgorilla/scanner.py https://target.com --proxy http://127.0.0.1:8080 --verbose

4. JSON Output for Automation (APT)

leakgorilla https://api.example.com --format json --output api_secrets.json

4. JSON Output for Automation (Python)

python3 leakgorilla/scanner.py https://api.example.com --format json --output api_secrets.json

5. HTML Report (APT)

leakgorilla https://example.com --format html --output report.html

5. HTML Report (Python)

python3 leakgorilla/scanner.py https://example.com --format html --output report.html

6. Wayback Machine Scan (APT)

leakgorilla https://example.com --wayback

6. Wayback Machine Scan (Python)

python3 leakgorilla/scanner.py https://example.com --wayback

7. Resume Interrupted Scan (APT)

leakgorilla https://example.com --resume scan_state.json

7. Resume Interrupted Scan (Python)

python3 leakgorilla/scanner.py https://example.com --resume scan_state.json

8. Custom Patterns (APT)

# Generate example patterns file
leakgorilla https://example.com --create-patterns
# Use custom patterns
leakgorilla https://example.com --patterns custom_patterns.json

8. Custom Patterns (Python)

# Generate example patterns file
python3 leakgorilla/scanner.py https://example.com --create-patterns
# Use custom patterns
python3 leakgorilla/scanner.py https://example.com --patterns custom_patterns.json

LeakGorilla displays progress in real-time:
[1/50] Scanning: https://example.com
✓ Found 3 potential secret(s)
[2/50] Scanning: https://example.com/about
[3/50] Scanning: https://example.com/api/config.js
✓ Found 1 potential secret(s)
After scanning, you'll see a categorized summary:
================================================================================
SCAN SUMMARY
================================================================================
Total secrets found: 12
[OpenAI API Key] - 2 found
--------------------------------------------------------------------------------
URL: https://example.com/js/app.js
Source: JavaScript file
Value: sk-pr...FJ2a
[AWS Access Key] - 1 found
--------------------------------------------------------------------------------
URL: https://example.com/config
Source: HTML content
Value: AKIA...Z7Q9
Full unredacted results are saved to your specified output file:
- Text Format: Human-readable with context snippets
- JSON Format: Machine-parsable for automation and integration
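The JSON export lends itself to quick post-processing. A minimal sketch, assuming the export is a JSON array whose findings carry a "type" field; the exact schema is not documented here, so verify field names against your own output file first:

```python
import json
from collections import Counter

def summarize(path):
    """Count findings per secret type in a LeakGorilla JSON export.

    The "type" field name is an assumption -- check your results
    file before scripting against it.
    """
    with open(path) as f:
        findings = json.load(f)
    return Counter(item["type"] for item in findings)
```

For example, `summarize("results.json").most_common()` gives a quick triage list sorted by frequency.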
Scan your staging environment before going live:
leakgorilla https://staging.myapp.com --max-pages 200

Discover exposed secrets in target applications:

leakgorilla https://target.com --format json --output bounty_findings.json

Comprehensive scan of client websites:

leakgorilla https://client-site.com --max-pages 500 --timeout 20 --output audit_report.txt

Integrate into CI/CD pipelines:

leakgorilla https://production.app.com --format json | jq '.[] | select(.type=="OpenAI API Key")'

Ethical reconnaissance (with permission):

leakgorilla https://competitor.com --max-pages 100

- Always get written permission before scanning
- Respect rate limits and server resources
- Use appropriate --timeout values
- Save results securely (they contain sensitive data)
- Report findings responsibly
- Run LeakGorilla on your own sites regularly
- Scan before each deployment
- Integrate into CI/CD pipelines
- Use .env files and environment variables instead of hardcoding secrets
- Implement secret scanning in pre-commit hooks
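The CI/CD practice above can be sketched as a simple gate script. Everything here is layered on the documented CLI as an assumption: the Python entry point, the JSON report being an array of findings, and the fail-on-any-finding policy are this sketch's choices, not guaranteed tool behaviour.

```python
import json
import subprocess

def evaluate(report_path):
    """Turn a JSON report into an exit code: 0 if clean, 1 if any findings."""
    with open(report_path) as f:
        findings = json.load(f)
    return 1 if findings else 0

def gate(url, report="ci_secrets.json"):
    """Run the scanner, then pass/fail on its JSON report.

    Assumes the Python entry point shown in the examples above;
    failing on any finding is this sketch's policy choice.
    """
    subprocess.run(
        ["python3", "leakgorilla/scanner.py", url,
         "--format", "json", "--output", report],
        check=True,
    )
    return evaluate(report)
```

A pipeline step would then call something like `sys.exit(gate("https://staging.myapp.com"))` so the build fails whenever a secret leaks into a deploy.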
IMPORTANT: Use LeakGorilla only on:
- Websites you own
- Systems you have explicit written permission to test
- Bug bounty programs that allow automated scanning
Unauthorized scanning may violate:
- Computer Fraud and Abuse Act (CFAA)
- Computer Misuse Act
- Terms of Service agreements
- Local and international laws
The developers of LeakGorilla are not responsible for misuse of this tool.
- Crawling: Starts at the target URL and discovers links within the same domain
- Content Extraction: Downloads HTML pages and external JavaScript files
- Pattern Matching: Uses advanced regex patterns to identify 20+ secret types
- Context Analysis: Extracts surrounding code for better understanding
- Concurrent Processing: Scans multiple JavaScript files simultaneously
- Smart Filtering: Avoids binary files, images, and non-content URLs
- Safe Reporting: Redacts secrets in console, saves full data to file
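The crawling and link-discovery steps above can be sketched with the standard library alone. This is a simplified outline, not the scanner's actual implementation: it omits concurrency, content-type filtering, and secret matching.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

class LinkCollector(HTMLParser):
    """Collect href values from <a> tags."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def same_domain_links(base_url, html):
    """Resolve links on a page, keeping only those on the base URL's domain."""
    parser = LinkCollector()
    parser.feed(html)
    domain = urlparse(base_url).netloc
    return [
        link for link in (urljoin(base_url, h) for h in parser.hrefs)
        if urlparse(link).netloc == domain
    ]

def crawl(start_url, max_pages=50, timeout=10):
    """Breadth-first crawl restricted to the start domain (simplified)."""
    queue, seen, pages = deque([start_url]), {start_url}, []
    while queue and len(pages) < max_pages:
        url = queue.popleft()
        try:
            with urlopen(url, timeout=timeout) as resp:
                body = resp.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # skip unreachable or failing pages
        pages.append((url, body))
        for link in same_domain_links(url, body):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

The same-domain check on netloc is what keeps the crawler scoped to the target, and the seen set prevents revisiting pages; max_pages plays the role of the --max-pages option.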
- Start Small: Use --max-pages 10 for initial testing
- Adjust Timeout: Increase --timeout for slow servers
- Monitor Progress: Watch console output for real-time feedback
- Use JSON: Export to JSON for easier parsing and automation
- Respect Servers: Don't set --max-pages too high on small sites
LeakGorilla uses regex patterns to detect secrets. Accuracy varies by secret type:
- ✅ OpenAI API Keys
- ✅ Anthropic Claude Keys
- ✅ Groq API Keys
- ✅ GitHub Tokens
- ✅ SendGrid API Keys
- ✅ AWS Access Keys
- ✅ Stripe API Keys
- ✅ Slack Tokens
- ✅ Database Connection Strings
- ✅ Twilio API Keys
- ⚠️ Google API Keys
- ⚠️ Meta/Facebook Tokens
- ⚠️ JWT Tokens
- ⚠️ Private Keys
- ⚠️ Generic API Keys
- ⚠️ Generic Secrets
Overall Accuracy: ~75-85%
Note: False positives may occur with:
- Base64-encoded fonts/images
- Minified JavaScript
- Random strings in CSS files
Recommendation: Focus on CRITICAL and HIGH severity findings for best accuracy (85-95%).
Found a bug or want to add detection for new secret types? Contributions welcome!
Repository: https://github.com/jeffryhawchab/leakgorilla
MIT License - Copyright (c) 2026 Jeffrey Hawchab
- Issues: https://github.com/jeffryhawchab/leakgorilla/issues
- Documentation: https://github.com/jeffryhawchab/leakgorilla/wiki
Remember: With great power comes great responsibility. Use LeakGorilla ethically and legally. 🦍
- Rotating proxies with optional validation and background re-validation (--proxies-file, --stream-proxies, --validate-proxies, --revalidate-minutes).
- Randomized user-agents loaded once, with per-request rotation to reduce fingerprinting.
- Configurable delay range (--delay-min, --delay-max) and a lower default --timeout (4s) for faster scans.
- Threaded, concurrent crawler with increased --max-workers for higher throughput.
- Search-engine seeding from dorks.conf (supports searx, DuckDuckGo, Yandex) via --use-search to collect initial targets.
- CSS files are skipped by default to reduce false positives; only HTML and JavaScript are scanned.
See examples in the CLI section for usage of the new options.
- Added an aiohttp-based async crawler for higher throughput and lower overhead on I/O-bound scans. Use the --async flag to enable it. Example:

python3 leakgorilla/scanner.py https://example.com --async --max-workers 50

- Introduced WHITELIST rules to ignore long base64/blob-like matches and common image/assets, reducing false positives.
- Tightened generic secret patterns to require longer token lengths by default.
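The whitelist idea can be illustrated in a few lines. These rules are hypothetical stand-ins for the scanner's actual WHITELIST, which is defined in the source:

```python
import re

# Hypothetical whitelist rules -- the real WHITELIST lives in the scanner.
WHITELIST = [
    re.compile(r"^data:image/"),            # inline image data URIs
    re.compile(r"^[A-Za-z0-9+/=]{200,}$"),  # long base64 blobs (fonts, images)
    re.compile(r"\.(png|jpe?g|gif|svg|woff2?|ttf)$", re.IGNORECASE),  # asset URLs
]

def is_whitelisted(candidate):
    """Return True if a matched string looks like a known false positive."""
    return any(rule.search(candidate) for rule in WHITELIST)
```

A finding is only reported when is_whitelisted returns False, which is how blob-like matches from bundled fonts and images get filtered out before they reach the report.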