Voice-driven web navigator for blind and low-vision users
Lighthouse.ai is a voice-controlled assistant that lets blind and low-vision users browse and operate websites hands-free. Users speak commands; the agent controls the browser and announces what's on screen after every action. Safety controls such as a domain allowlist and confirmation gates limit what the agent is allowed to do.
- Python 3.9+
- Chrome/Chromium browser
- macOS, Linux, or Windows
# Clone the repository
git clone https://github.com/lighthouse-ai/lighthouse.git
cd lighthouse
# Quick setup (recommended)
make quickstart
# Or manual setup
make install
make download-models
make check-chrome
# Run the CLI
make run-cli
# or
python cli.py
# Run the API server
make run-api
# or
uvicorn main:app --reload
Visit http://localhost:8000/docs for API documentation.
- Navigate: "Go to google.com"
- Click: "Click the search button"
- Type: "Type hello world"
- Submit: "Submit the form"
- Describe: "Describe this page"
- List: "List all buttons"
- Stop: "Stop" or "Cancel"
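As an illustration of how the spoken commands above might be mapped to actions, here is a minimal sketch using regular expressions. The `Command` type, the `parse_command` function, and the patterns themselves are hypothetical and are not taken from the project's `core/nlu.py`; a real NLU component would likely be more tolerant of phrasing.

```python
import re
from typing import NamedTuple, Optional


class Command(NamedTuple):
    intent: str              # e.g. "navigate", "click"
    argument: Optional[str]  # remainder of the utterance, if the intent takes one


# Hypothetical patterns, one per command listed above.
# Order matters: the first pattern that matches wins.
_PATTERNS = [
    ("stop", re.compile(r"^(?:stop|cancel)$", re.IGNORECASE)),
    ("navigate", re.compile(r"^go to (?P<arg>.+)$", re.IGNORECASE)),
    ("click", re.compile(r"^click (?:the )?(?P<arg>.+)$", re.IGNORECASE)),
    ("type", re.compile(r"^type (?P<arg>.+)$", re.IGNORECASE)),
    ("submit", re.compile(r"^submit(?: the form)?$", re.IGNORECASE)),
    ("describe", re.compile(r"^describe (?:this|the) page$", re.IGNORECASE)),
    ("list", re.compile(r"^list (?:all )?(?P<arg>.+)$", re.IGNORECASE)),
]


def parse_command(utterance: str) -> Optional[Command]:
    """Return the first matching command, or None if nothing matches."""
    # Drop surrounding whitespace and trailing sentence punctuation.
    text = utterance.strip().rstrip(".!?")
    for intent, pattern in _PATTERNS:
        match = pattern.match(text)
        if match:
            # Patterns without an "arg" group yield None here.
            return Command(intent, match.groupdict().get("arg"))
    return None
```

Putting "stop" first reflects one possible design choice: a cancel request is checked before any other pattern so it cannot be shadowed by a later rule.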
- Domain Allowlist: Only navigate to approved domains
- Confirmation Gates: Destructive actions require confirmation
- Local Processing: All speech processing happens locally by default
- Screen Descriptions: Clear, concise page summaries
- Element Disambiguation: Numbered lists when multiple matches
- Change Detection: Reports what changed after each action
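The Element Disambiguation feature above could work roughly as follows. This is a sketch under assumptions: the function names `disambiguation_prompt` and `resolve_choice` are hypothetical, elements are represented only by their spoken labels, and the user is assumed to answer with a digit.

```python
from typing import List, Optional


def disambiguation_prompt(matches: List[str]) -> str:
    """Build a spoken prompt that numbers each candidate element."""
    if not matches:
        return "No matching elements found."
    if len(matches) == 1:
        return f"Found {matches[0]}."
    numbered = ", ".join(
        f"{i}: {label}" for i, label in enumerate(matches, start=1)
    )
    return f"Found {len(matches)} matches. {numbered}. Say a number to choose."


def resolve_choice(matches: List[str], spoken: str) -> Optional[str]:
    """Map a spoken digit such as "2" back to the chosen element label."""
    try:
        index = int(spoken.strip())
    except ValueError:
        return None  # not a digit; the caller can ask again
    if 1 <= index <= len(matches):
        return matches[index - 1]
    return None  # number out of range
```

For example, with the candidates `["Search button", "Search by image button"]`, the prompt numbers both and `resolve_choice(candidates, "2")` selects the second one.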
Copy .env.example to .env and configure:
# Domain allowlist (comma-separated)
ALLOWED_DOMAINS=google.com,amazon.com,github.com,wikipedia.org
# Browser settings
HEADLESS_MODE=false
BROWSER_TIMEOUT=10
# Audio settings
AUDIO_DEVICE=default
VAD_AGGRESSIVENESS=2
# Privacy settings
LOCAL_PROCESSING=true
LOG_LEVEL=INFO
Edit config/domains.yaml to manage allowed domains:
allowed_domains:
- google.com
- amazon.com
- github.com
- wikipedia.org
- example.com
restricted_actions:
- delete
- purchase
- payment
- account_change
- Local Processing: Speech recognition and synthesis happen on your device
- No Audio Storage: Audio is processed in real-time and discarded
- Redacted Logs: Sensitive information is automatically redacted
- Opt-in Cloud: Cloud services only used with explicit consent
- Domain Restrictions: Only navigate to approved websites
- Action Confirmation: Destructive actions require verbal confirmation
- Sandboxed Browser: Isolated browser profile for safety
- Audit Trail: All actions are logged for review
- No Personal Data: We don't collect or store personal information
- Local Storage: All data stays on your device
- Encrypted Logs: Session logs are encrypted locally
- Transparent Processing: Open source code for full transparency
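To make the domain restriction and action confirmation controls more concrete, here is a minimal sketch of how they might be checked. The values are copied from the configuration examples in this README, but the function names (`parse_allowlist`, `is_allowed`, `needs_confirmation`) are hypothetical and do not come from the project's `core/safety.py`. Treating subdomains of an allowed domain as allowed is also an assumption.

```python
from urllib.parse import urlparse

# Values copied from the .env and domains.yaml examples in this README.
ALLOWED_DOMAINS = "google.com,amazon.com,github.com,wikipedia.org"
RESTRICTED_ACTIONS = {"delete", "purchase", "payment", "account_change"}


def parse_allowlist(raw: str) -> set:
    """Split a comma-separated domain string into a normalized set."""
    return {d.strip().lower() for d in raw.split(",") if d.strip()}


def is_allowed(url: str, allowed: set) -> bool:
    """Return True if the URL's host is an allowed domain or a subdomain of one."""
    # Accept bare domains such as "google.com" by adding a scheme first.
    if "://" not in url:
        url = "https://" + url
    host = (urlparse(url).hostname or "").lower()
    # Exact match or dot-separated suffix, so "evilgoogle.com" does not pass.
    return any(host == d or host.endswith("." + d) for d in allowed)


def needs_confirmation(action: str) -> bool:
    """Return True if the action is in the restricted list."""
    return action.lower() in RESTRICTED_ACTIONS
```

Comparing on the parsed hostname instead of doing a substring search on the full URL avoids accepting look-alike hosts such as `google.com.evil.io`.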
# Run all tests
make test
# Run specific test categories
pytest tests/test_cli.py -v
pytest tests/test_api.py -v
pytest tests/test_browser.py -v
# Test on target sites
make test-sites
# Check code quality
make lint
make format
lighthouse/
├── cli.py # CLI entry point
├── main.py # FastAPI service
├── config/ # Configuration files
├── core/ # Core functionality
│ ├── asr.py # Speech recognition
│ ├── nlu.py # Natural language understanding
│ ├── tts.py # Text-to-speech
│ ├── browser.py # Browser automation
│ ├── safety.py # Safety controls
│ └── state.py # Session management
├── api/ # REST API
├── utils/ # Utilities
└── tests/ # Test suite
- Fork the repository
- Create a feature branch:
  git checkout -b feature/amazing-feature
- Make your changes and add tests
- Run tests:
  make test
- Commit your changes:
  git commit -m 'Add amazing feature'
- Push to the branch:
  git push origin feature/amazing-feature
- Open a Pull Request
# Install development dependencies
make dev-install
# Setup pre-commit hooks
pre-commit install
# Run development server
make dev
- Basic voice commands
- Browser automation
- Screen descriptions
- Safety controls
- Local processing
- Hotword detection
- Advanced form handling
- Table navigation
- Multi-step workflows
- Cloud TTS integration
- Advanced error recovery
- Custom command training
- Mobile app
"Chrome not found"
make install-chrome
make check-chrome
"Audio device not working"
# Check available audio devices
python -c "import sounddevice; print(sounddevice.query_devices())"
"Whisper model not found"
make download-models
"Permission denied"
# On macOS, grant microphone permissions in System Preferences
# On Linux, add user to audio group
sudo usermod -a -G audio $USER
This project is licensed under the MIT License - see the LICENSE file for details.
- OpenAI Whisper for speech recognition
- Coqui TTS for text-to-speech
- Selenium for browser automation
- FastAPI for the web framework
Made with ❤️ for the accessibility community