SmartMouse is an intelligent mouse control system that uses computer vision and AI to automate mouse interactions. It can detect objects on screen and intelligently click, drag, or interact with them.
- 🖱️ AI-Powered Mouse Control - Uses computer vision to detect and interact with UI elements
- 👁️ Object Detection - Powered by OWL-ViT and DETR models for accurate object recognition
- 🧠 Smart Planning - AI planner determines the best actions to achieve goals
- 🔄 Action Automation - Automates clicks, drags, typing, and keyboard shortcuts
- 📊 Visual Feedback - Shows detection results and action plans
# Clone the repository
git clone https://github.com/YOUR_USERNAME/smartmouse-ts.git
cd smartmouse-ts
# Install dependencies
npm install
# Copy and configure your API key
cp config.json.example config.json
# Edit config.json and add your API key
# Build and run
npm run build
npm startCopy config.json.example to config.json and configure:
{
"api_key": "YOUR_API_KEY_HERE", // Your AI API key
"api_base_url": "https://your-api-provider.com/v1",
"model": "qwen3.5-plus",
...
}config.json file with real API keys!
SmartMouse Architecture:
┌─────────────────────────────────────────────────────────────┐
│ Brain.ts - Main AI Agent │
│ → Receives goals from user │
│ → Plans actions using AI │
│ → Executes mouse/keyboard actions │
├─────────────────────────────────────────────────────────────┤
│ Actions.ts - Action Execution │
│ → Click, Double-click, Drag │
│ → Type text, Keyboard shortcuts │
│ → Scroll, Wait, Screenshot │
├─────────────────────────────────────────────────────────────┤
│ Vision Models (Xenova/Transformers.js) │
│ → OWL-ViT: Object detection with text prompts │
│ → DETR: General object detection │
│ → ViT-GPT2: Image captioning │
└─────────────────────────────────────────────────────────────┘
// Example: Click a button by describing it
await smartMouse.executeGoal("Click the 'Submit' button");
// Example: Type text in a field
await smartMouse.executeGoal("Type 'Hello World' in the search box");
// Example: Complex action chain
await smartMouse.executeGoal("Find the login button, click it, then enter username 'admin'");This project uses transformers.js for local AI models:
- OWL-ViT (
owlvit-base-patch32,owlv2-base-patch16-ensemble) - Text-prompted object detection - DETR (
detr-resnet-50) - General object detection - ViT-GPT2 (
vit-gpt2-image-captioning) - Image captioning
Models are downloaded automatically on first run.
- Never share your API keys
- config.json is excluded from git via
.gitignore - Use environment variables for production deployments
We welcome contributions! Please:
- Fork the repository
- Create a feature branch
- Make your changes
- Submit a pull request
AI Contributors Welcome! This project is designed for AI-assisted development. AI agents can:
- Review code and suggest improvements
- Fix bugs and add features
- Write tests and documentation
- Optimize performance
MIT License - See LICENSE file for details.
- Transformers.js - Local AI models in JavaScript
- OWL-ViT - Open-world object detection
- Qwen AI models for planning and decision-making
Made with 🖱️ + 🧠 by SmartMouse Team
The SmartMouse project includes GitHub integration capabilities that allow agents to programmatically interact with GitHub repositories. This enables automated repository creation, file management, issue tracking, and pull request management.
For detailed information about the GitHub integration, see GITHUB_INTEGRATION.md.
To use the GitHub integration features, you'll need to set up GitHub access:
To connect VS Code to your GitHub account:
- Open VS Code
- Look for the Accounts icon in the bottom-left corner of the status bar
- Click on "Sign in with GitHub"
- Or use Command Palette (Ctrl+Shift+P) and type "GitHub: Sign In"
- Follow the browser-based authentication flow
Configure your Git credentials:
git config --global user.name "Your GitHub Username"
git config --global user.email "your-email@example.com"For the SmartMouse-specific GitHub integration:
- Run the setup script:
npx tsx setup-github-access.ts - This will open the GitHub token creation page in your browser
- Follow the instructions to create a personal access token
- Create a
github-config.jsonfile with your token - Test the connection with
npx tsx test-github.ts
To check your current GitHub access status, run:
npx tsx check-github-access.tsThis will verify if VS Code, Git, GitHub CLI, and SmartMouse GitHub integration are properly configured.