An AI-native voice notes application powered by OpenAI's Realtime API, featuring agentic workflows, semantic search, and edge-first architecture on Cloudflare.
This is more than a notes app: it is a production-ready showcase of edge and AI web technologies working together:
- 🎤 Voice-First AI Agent — Talk naturally to create, search, update, and organize your notes using OpenAI's Realtime API with WebRTC streaming
- 🚀 Edge-Native Architecture — Runs entirely on Cloudflare's edge platform with sub-50ms global latency
- 🔐 User-Scoped SQLite Databases — Each user gets their own isolated SQLite database via Durable Objects—true data sovereignty at scale
- 🔍 Semantic Search — Vector embeddings with Cloudflare Vectorize and Workers AI (`bge-small-en-v1.5`) for intelligent note retrieval
- 🔄 Agentic Workflows — Client-side tool calling with automatic cache synchronization via React Query
- 📊 Real-Time Usage Telemetry — WebSocket-based sideband connection tracks token usage and costs live
- 🔒 Production-Grade Auth — Better Auth integration with short-lived JWT rotation and secure session management
- 📈 Analytics Database — D1-powered usage tracking with per-model cost estimation
- Next.js 15 with App Router and React 19
- OpenAI Agents SDK for realtime voice interaction and tool execution
- React Query v5 for optimistic updates and cache coherence
- Tailwind CSS 4 with shadcn/ui primitives
- Cloudflare Workers for edge computing
- Durable Objects for per-user isolated SQLite instances (`NotesDO`, `SessionsDO`, `UsageLogDO`)
- Vectorize for cosine similarity semantic search (384 dimensions)
- D1 Analytics DB for aggregated usage and cost tracking
- Workers AI for on-demand text embeddings
- WebRTC Bridge to OpenAI Realtime API with session management
User Voice → WebRTC → OpenAI Realtime API → Agent Tools →
→ Cloudflare Worker → Durable Object (SQLite) → Vectorize Index
↓
Usage WebSocket → UsageLogDO → D1 Analytics
For complete setup and deployment instructions:
- Backend Setup & Deployment — Cloudflare Workers, Durable Objects, D1, and Vectorize provisioning
- Frontend Setup & Deployment — Next.js configuration and deployment to Vercel
Unlike traditional multi-tenant databases, each user's notes live in a dedicated SQLite instance inside a Durable Object. This provides:
- Data sovereignty — Full isolation with no cross-user queries
- Predictable performance — No noisy neighbor issues
- Regulatory compliance — GDPR/CCPA friendly data boundaries
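
As a rough illustration of the per-user isolation described above, the sketch below routes each user ID to its own object. The `idFromName` and `get` calls mirror the Durable Objects namespace API, but the local interfaces, the in-memory `FakeNotesNamespace`, and the `notesForUser` helper are assumptions made so the example runs outside the Workers runtime; they are not taken from this project's code.

```typescript
// Minimal structural types mirroring the two namespace calls used here.
// Declared locally so the sketch does not depend on the Workers runtime.
interface ObjectId {
  toString(): string;
}
interface NotesStub {
  addNote(text: string): number;
  listNotes(): string[];
}
interface NotesNamespace {
  idFromName(name: string): ObjectId;
  get(id: ObjectId): NotesStub;
}

// In-memory stand-in for a namespace: one private store per object id.
// In the real app each store would be a SQLite database inside a Durable Object.
class FakeNotesNamespace implements NotesNamespace {
  private stores = new Map<string, string[]>();

  idFromName(name: string): ObjectId {
    return { toString: () => `notes:${name}` };
  }

  get(id: ObjectId): NotesStub {
    const key = id.toString();
    if (!this.stores.has(key)) this.stores.set(key, []);
    const store = this.stores.get(key)!;
    return {
      addNote: (text: string) => store.push(text),
      listNotes: () => [...store],
    };
  }
}

// Hypothetical helper: the same user ID always resolves to the same object,
// so no query can reach another user's data.
function notesForUser(ns: NotesNamespace, userId: string): NotesStub {
  return ns.get(ns.idFromName(userId));
}

const notesNamespace = new FakeNotesNamespace();
notesForUser(notesNamespace, "alice").addNote("Buy milk");
notesForUser(notesNamespace, "bob").addNote("Ship release");
const aliceNotes = notesForUser(notesNamespace, "alice").listNotes();
const bobNotes = notesForUser(notesNamespace, "bob").listNotes();
```

Because the object is looked up by user ID alone, isolation is a property of the routing itself and does not rely on a `WHERE user_id = ?` filter being present in every query.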
The OpenAI Realtime agent has access to powerful tools:
- `list_notes` — Retrieve all user notes
- `get_note_by_id` — Fetch specific note content
- `create_note` — Generate new notes from voice
- `update_note` — Modify existing notes
- `delete_note` — Remove notes
- `search_notes` — Semantic search across all notes
All tools update React Query caches optimistically for instant UI feedback.
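
The optimistic update pattern mentioned above can be sketched as follows. This is a simplified illustration, not the project's implementation: `NoteCache` is a plain in-memory stand-in for the React Query cache (where `queryClient.setQueryData` would normally be used), and `createNoteTool` and its `save` callback are hypothetical names.

```typescript
type Note = { id: string; title: string; body: string };

// Stand-in for the React Query cache, keyed by a query key string.
class NoteCache {
  private data = new Map<string, Note[]>();
  get(key: string): Note[] {
    return this.data.get(key) ?? [];
  }
  set(key: string, notes: Note[]): void {
    this.data.set(key, notes);
  }
}

// Hypothetical tool handler: update the cache first so the UI reacts
// immediately, then persist, and restore the snapshot if persisting fails.
async function createNoteTool(
  cache: NoteCache,
  save: (note: Note) => Promise<void>,
  note: Note,
): Promise<{ ok: boolean }> {
  const snapshot = cache.get("notes");
  cache.set("notes", [...snapshot, note]); // optimistic write
  try {
    await save(note);
    return { ok: true };
  } catch {
    cache.set("notes", snapshot); // roll back to the pre-call state
    return { ok: false };
  }
}
```

The snapshot is taken before the optimistic write so that a failed backend call leaves the cache exactly as it was before the tool ran.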
- User creates/updates a note
- Background task embeds content with Workers AI
- 384-dim vector stored in Vectorize with metadata
- Voice query "find my meeting notes" → embedded → cosine similarity search
- Results ranked and returned with relevance scores
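
The ranking step in the flow above can be illustrated with a small, self-contained sketch. It uses 3-dimensional toy vectors in place of the 384-dimensional embeddings and computes cosine similarity in plain TypeScript; in the actual system this comparison is handled by Vectorize, so `cosineSimilarity`, `rankNotes`, and the sample data are assumptions for illustration only.

```typescript
type IndexedNote = { id: string; vector: number[] };
type ScoredNote = { id: string; score: number };

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every note against the query vector and keep the topK best matches.
function rankNotes(query: number[], notes: IndexedNote[], topK: number): ScoredNote[] {
  return notes
    .map((n) => ({ id: n.id, score: cosineSimilarity(query, n.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}

// Toy index: real vectors would come from the embedding model.
const toyIndex: IndexedNote[] = [
  { id: "meeting", vector: [1, 0, 0] },
  { id: "groceries", vector: [0, 1, 0] },
  { id: "standup", vector: [0.8, 0.2, 0] },
];
const searchResults = rankNotes([0.9, 0.1, 0], toyIndex, 2);
```

Here the query vector points mostly along the first axis, so the "meeting" and "standup" notes rank ahead of "groceries".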
Real-time WebSocket streams:
- Text tokens (input/output)
- Audio duration (seconds)
- Model (e.g., `gpt-4o-realtime-preview-2025-06-03`)
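
A usage event carrying the fields above, and a per-model cost estimate derived from it, might look like the following sketch. The field names, the `estimateCostUsd` function, the model name, and all rates are placeholders chosen for illustration; they are not the contents of the project's pricing module and not real OpenAI prices.

```typescript
type UsageEvent = {
  model: string;
  inputTextTokens: number;
  outputTextTokens: number;
  audioSeconds: number;
};

type ModelPricing = {
  inputPerMillionTokens: number; // USD per 1M input text tokens
  outputPerMillionTokens: number; // USD per 1M output text tokens
  audioPerMinute: number; // USD per minute of audio
};

// Placeholder rates for illustration only.
const EXAMPLE_PRICING: Record<string, ModelPricing> = {
  "example-realtime-model": {
    inputPerMillionTokens: 5,
    outputPerMillionTokens: 20,
    audioPerMinute: 0.06,
  },
};

// Returns an estimated cost in USD, or null when the model has no known rates.
function estimateCostUsd(
  event: UsageEvent,
  pricing: Record<string, ModelPricing>,
): number | null {
  const rates = pricing[event.model];
  if (!rates) return null;
  const textCost =
    (event.inputTextTokens / 1_000_000) * rates.inputPerMillionTokens +
    (event.outputTextTokens / 1_000_000) * rates.outputPerMillionTokens;
  const audioCost = (event.audioSeconds / 60) * rates.audioPerMinute;
  return textCost + audioCost;
}

const exampleEvent: UsageEvent = {
  model: "example-realtime-model",
  inputTextTokens: 1_000_000,
  outputTextTokens: 500_000,
  audioSeconds: 60,
};
const exampleCost = estimateCostUsd(exampleEvent, EXAMPLE_PRICING);
```

Returning `null` for an unknown model keeps unpriced usage distinguishable from usage that genuinely cost nothing.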
The backend computes estimated costs using `src/utils/pricing.js` and surfaces them via:
`GET /api/usage/summary?start=2025-01-01T00:00:00Z&end=2025-01-31T23:59:59Z`
- API Worker Documentation — Architecture diagrams, API surface, troubleshooting
- Web App Documentation — Project structure, environment variables, deployment
- Agent Guidelines — Coding conventions and agent integration notes
| Layer | Technology |
|---|---|
| Frontend Framework | Next.js 15, React 19 |
| State Management | React Query v5, Zustand |
| Styling | Tailwind CSS 4, shadcn/ui |
| Backend Runtime | Cloudflare Workers |
| Authentication | Better Auth with JWT rotation |
| Database | SQLite (Durable Objects), D1 (Analytics) |
| Vector DB | Cloudflare Vectorize |
| AI Models | OpenAI Realtime API, Workers AI |
| Real-Time Comms | WebRTC, WebSockets |
| Deployment | Cloudflare Workers, Vercel |
This architecture is well suited to:
- Voice-first applications requiring low-latency AI responses
- Privacy-conscious apps needing user data isolation
- Global applications benefiting from edge deployment
- Cost-optimized AI leveraging Cloudflare's Workers AI for embeddings
- Real-time analytics with live usage tracking
