A comprehensive, privacy-focused AI assistant that runs entirely in your browser.
- Runs large language models directly in the browser using WebGPU
- No server required - everything runs client-side
- Works offline after initial model download
- WebGPU hardware acceleration for fast performance
- Llama-3.2-3B-Instruct (Context7) - Default model with 128k context (2GB VRAM)
- TinyLlama-1.1B - Fastest loading, minimal VRAM (512MB, 2k context)
- Llama-3.2-1B-Instruct - Small model with 128k context (1GB VRAM)
- Llama-3.2-3B-Instruct - Standard 128k context version (2GB VRAM)
- Llama-3.1-8B-Instruct - Highest quality responses (5GB VRAM, 128k context)
- Qwen2.5-7B-Instruct - Excellent coding capabilities (4GB VRAM, 128k context)
- Phi-3.5-mini-instruct - Microsoft's efficient model (2GB VRAM, 128k context)
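
As a rough illustration of choosing a model by device capability, the sketch below picks the most demanding model that still fits a given VRAM budget. The `ModelInfo` shape, the `MODELS` catalog, and `pickModel` are hypothetical names, not part of the app; the figures are copied from the list above, assuming 1 GB = 1024 MB. Browsers do not report free VRAM directly, so the budget is assumed to come from the user or from a conservative guess.

```typescript
// Hypothetical catalog; names and VRAM figures are copied from the list above,
// converting GB to MB with 1 GB = 1024 MB.
interface ModelInfo {
  id: string;
  vramMB: number;
  context: string;
}

const MODELS: ModelInfo[] = [
  { id: "TinyLlama-1.1B", vramMB: 512, context: "2k" },
  { id: "Llama-3.2-1B-Instruct", vramMB: 1024, context: "128k" },
  { id: "Llama-3.2-3B-Instruct", vramMB: 2048, context: "128k" },
  { id: "Phi-3.5-mini-instruct", vramMB: 2048, context: "128k" },
  { id: "Qwen2.5-7B-Instruct", vramMB: 4096, context: "128k" },
  { id: "Llama-3.1-8B-Instruct", vramMB: 5120, context: "128k" },
];

// Returns the model with the largest VRAM requirement that fits the budget.
// On ties, the model listed first in the catalog wins.
function pickModel(budgetMB: number, models: ModelInfo[] = MODELS): ModelInfo | undefined {
  let best: ModelInfo | undefined;
  for (const model of models) {
    const fits = model.vramMB <= budgetMB;
    if (fits && (best === undefined || model.vramMB > best.vramMB)) {
      best = model;
    }
  }
  return best;
}
```

A budget below 512 MB returns `undefined`, which a caller could treat as "no supported model for this device".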
- PDF - Full text extraction with metadata
- DOCX - Word documents with structure preservation
- TXT - Plain text files
- MD - Markdown files
- CSV - Structured data
- Client-side document processing for privacy
- Smart document chunking with configurable settings
- TF-IDF vector embeddings for similarity search
- Vector storage using IndexedDB
- Hybrid keyword + semantic search
- Real-time document search during conversations
- Drag-and-drop file upload
- Multiple document support
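
To make the TF-IDF search step concrete, here is a minimal sketch of ranking chunks against a query with TF-IDF weights and cosine similarity. It is not the app's actual implementation: the function names, the tokenizer, and the smoothed IDF formula are all assumptions chosen for illustration.

```typescript
type Vector = Map<string, number>;

// Assumed tokenizer: lowercase ASCII letters and digits only.
function tokenize(text: string): string[] {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1.
function buildIdf(docs: string[][]): Vector {
  const df: Vector = new Map();
  for (const doc of docs) {
    new Set(doc).forEach((term) => df.set(term, (df.get(term) || 0) + 1));
  }
  const idf: Vector = new Map();
  df.forEach((count, term) => idf.set(term, Math.log((1 + docs.length) / (1 + count)) + 1));
  return idf;
}

// Raw term count times IDF; terms unknown to the corpus are ignored.
function tfidf(tokens: string[], idf: Vector): Vector {
  const vec: Vector = new Map();
  for (const term of tokens) {
    const weight = idf.get(term);
    if (weight !== undefined) vec.set(term, (vec.get(term) || 0) + weight);
  }
  return vec;
}

function cosine(a: Vector, b: Vector): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((value, term) => {
    dot += value * (b.get(term) || 0);
    normA += value * value;
  });
  b.forEach((value) => {
    normB += value * value;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

// Scores every chunk against the query, best match first.
function rankChunks(query: string, chunks: string[]): { index: number; score: number }[] {
  const docs = chunks.map(tokenize);
  const idf = buildIdf(docs);
  const q = tfidf(tokenize(query), idf);
  return docs
    .map((doc, index) => ({ index, score: cosine(q, tfidf(doc, idf)) }))
    .sort((x, y) => y.score - x.score);
}
```

Chunks that share no terms with the query score 0, so a caller could drop them before building the RAG context.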
- Chunk Size: 50-1000 tokens (adjustable)
- Overlap Size: 0-200 tokens (adjustable)
- Search Accuracy: 0-100% scale
  - Low (0-30%): Fuzzy matching
  - Medium (40-60%): Balanced approach
  - High (70-100%): Exact matching only
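
The chunk size and overlap settings can be pictured as a sliding window over the token stream. The sketch below is an assumption about how such chunking might work, not the app's code: `chunkTokens` is a hypothetical helper, and whitespace-separated words stand in for real model tokens.

```typescript
// Splits tokens into windows of `chunkSize`, where each window repeats the
// last `overlap` tokens of the previous one.
function chunkTokens(tokens: string[], chunkSize: number, overlap: number): string[][] {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  if (!Number.isInteger(overlap) || overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must be an integer in [0, chunkSize)");
  }
  const step = chunkSize - overlap;
  const chunks: string[][] = [];
  for (let start = 0; start < tokens.length; start += step) {
    chunks.push(tokens.slice(start, start + chunkSize));
    // Stop once a window has reached the end, to avoid a trailing
    // chunk made up only of already-covered tokens.
    if (start + chunkSize >= tokens.length) break;
  }
  return chunks;
}
```

With ten tokens, a chunk size of 4, and an overlap of 1, this yields three chunks covering tokens 0-3, 3-6, and 6-9. Requiring the overlap to be smaller than the chunk size guarantees the window always advances.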
- Real-time streaming responses
- Syntax highlighting for code blocks
- Markdown rendering
- Token count display
- Message timestamps
- Chat history persistence
- Multiple chat sessions
- Clear chat functionality
- Copy message content
- `/help` - Show all available commands
- `/find [term]` - Find exact sentences containing a term
- `/chunks` - Show all RAG chunks in order
- `/debug rag` - Show RAG system information
- `/clear` - Clear current chat
- `/new` - Start new chat session
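
One way the slash commands listed above might be recognized in chat input is sketched here. `parseCommand` and the `Command` type are hypothetical; only the command names come from the list, and the app's real parser may behave differently (for example around case sensitivity).

```typescript
// Command names taken from the list above; "debug rag" comes first so the
// two-word command is matched as a whole.
const COMMANDS = ["debug rag", "help", "find", "chunks", "clear", "new"] as const;

type CommandName = (typeof COMMANDS)[number];

interface Command {
  name: CommandName;
  arg: string; // everything after the command name, trimmed
}

// Returns the parsed command, or null when the input is an ordinary message.
function parseCommand(input: string): Command | null {
  const text = input.trim();
  if (text.charAt(0) !== "/") return null;
  const body = text.slice(1);
  for (const name of COMMANDS) {
    // Require an exact match or a following space, so "/newish" is not "/new".
    if (body === name || body.indexOf(name + " ") === 0) {
      return { name, arg: body.slice(name.length).trim() };
    }
  }
  return null;
}
```

For instance, `parseCommand("/find WebGPU")` yields the `find` command with `arg` set to `"WebGPU"`, while plain text and unknown commands return `null` and can be sent to the model as a normal message.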
- Multiple theme support (Skeleton, Wintry, Modern, Crimson)
- Dark/light mode switching
- Responsive design for desktop and mobile
- Configurable app title (via `app.config.json`)
- Welcome guide for new users
- Loading progress indicators
- Error handling and user feedback
- Collapsible sidebar
- RAG context panel
- Mobile-optimized layout
- `dragDropUpload` - Enable/disable drag-drop file upload
- `clientSideRAG` - Enable/disable RAG functionality
- `documentEmbeddings` - Enable/disable document embeddings
- `vectorSearch` - Enable/disable vector search
- `ragContextDisplay` - Show/hide RAG context in responses
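
The feature flags above could be resolved by layering user overrides on top of defaults, as in this sketch. The flag names come from the list; the `FeatureFlags` interface, the all-enabled defaults, and `resolveFlags` are assumptions made for illustration.

```typescript
interface FeatureFlags {
  dragDropUpload: boolean;
  clientSideRAG: boolean;
  documentEmbeddings: boolean;
  vectorSearch: boolean;
  ragContextDisplay: boolean;
}

// Assumed defaults: every feature enabled.
const DEFAULT_FLAGS: FeatureFlags = {
  dragDropUpload: true,
  clientSideRAG: true,
  documentEmbeddings: true,
  vectorSearch: true,
  ragContextDisplay: true,
};

// Applies overrides from untrusted input (for example parsed JSON).
// Unknown keys and non-boolean values are ignored.
function resolveFlags(overrides: unknown, defaults: FeatureFlags = DEFAULT_FLAGS): FeatureFlags {
  const result: FeatureFlags = { ...defaults };
  if (typeof overrides !== "object" || overrides === null) return result;
  const source = overrides as Record<string, unknown>;
  for (const key of Object.keys(result) as (keyof FeatureFlags)[]) {
    const value = source[key];
    if (typeof value === "boolean") result[key] = value;
  }
  return result;
}
```

Validating each value this way means a corrupted or outdated saved configuration falls back to the defaults instead of breaking the UI.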
- Chat history saved in localStorage
- Documents and embeddings stored in IndexedDB
- Model cache in browser cache
- Feature flags persistence
- RAG settings persistence
- Theme preferences saved
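
As an illustration of the localStorage persistence described above, the sketch below saves and restores chat history through a minimal storage interface. The `ChatMessage` shape, the `HISTORY_KEY` value, and both helper functions are hypothetical; the app's actual keys and schema are not given here.

```typescript
// Minimal subset of the Web Storage API; the browser's localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Assumed message shape, for illustration only.
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
  timestamp: number;
}

const HISTORY_KEY = "chat-history"; // hypothetical storage key

function saveHistory(store: KeyValueStore, messages: ChatMessage[]): void {
  store.setItem(HISTORY_KEY, JSON.stringify(messages));
}

// Returns an empty history when nothing is stored or the stored value
// is not a JSON array.
function loadHistory(store: KeyValueStore): ChatMessage[] {
  const raw = store.getItem(HISTORY_KEY);
  if (raw === null) return [];
  try {
    const parsed: unknown = JSON.parse(raw);
    return Array.isArray(parsed) ? (parsed as ChatMessage[]) : [];
  } catch {
    return [];
  }
}
```

In a browser, `loadHistory(window.localStorage)` would read the saved session. Taking the store as a parameter keeps the helpers testable outside the browser.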
- 100% client-side processing
- No data sent to servers
- All conversations stay private
- Documents processed locally
- Works completely offline after setup
- Chrome/Chromium 113+
- Edge 113+
- Safari 16.4+ (experimental)
- Firefox (behind flags)
- WebGPU support required
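
Since WebGPU is required, a page can check for it before offering a model download. The sketch below relies on the standard `navigator.gpu.requestAdapter()` call; the `detectWebGPU` function, the `NavigatorLike` and `GpuLike` types, and the status strings are hypothetical and written so the check also runs outside a browser.

```typescript
// Structural stand-ins for the parts of the WebGPU API used here.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

type WebGPUStatus = "ok" | "no-api" | "no-adapter";

// "no-api": the browser does not expose navigator.gpu at all.
// "no-adapter": the API exists, but no usable GPU adapter was returned.
async function detectWebGPU(nav: NavigatorLike): Promise<WebGPUStatus> {
  if (!nav.gpu) return "no-api";
  const adapter = await nav.gpu.requestAdapter();
  return adapter ? "ok" : "no-adapter";
}
```

In a browser this would be called as `detectWebGPU(navigator as NavigatorLike)`. Distinguishing the two failure cases lets the UI suggest either a different browser or a different device.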
- Responsive mobile layout
- Touch-friendly interface
- Capacitor framework for native apps
- iOS platform support
- Progressive Web App (PWA) ready
- TypeScript support throughout
- Comprehensive test suite (Vitest + Playwright)
- ESLint and Prettier configuration
- Hot module replacement (HMR)
- Source maps for debugging
- Console logging for RAG operations
- Model caching after first download
- Lazy loading of components
- Web Worker support for background processing
- Efficient vector search algorithms
- Streaming response generation
- Token badge on messages using RAG
- Search status during document search
- Source citations in responses
- Processing status for file uploads
- Model loading progress
- File type icons (📕 PDF, 📘 DOCX, 📄 Text)
- Visit the application in a WebGPU-enabled browser
- Select an AI model based on your device capabilities
- Start chatting or upload documents for analysis
- Use advanced commands for enhanced functionality
- Customize themes and settings to your preference
All processing happens locally in your browser - your data never leaves your device!