Stop typing. Start thinking visually.
An infinite canvas where Google Gemini AI directly navigates, organizes, and transforms your thoughts in real-time.
- Language: TypeScript
- Frontend: Next.js, React, Zustand
- Canvas: TLDraw, Excalidraw, React Flow
- AI: Google Gemini 2.5 Flash (GenAI SDK)
- Backend: Node.js, Express, Firebase Admin
- Database: Firestore
- Cloud: Google Cloud Run, Secret Manager, Artifact Registry
- Infrastructure: Terraform
- Voice: Web Speech API
- 🎯 The Problem
- ✨ What is Stun?
- 🚀 Key Features
- 🛠️ Tech Stack
- 🎓 Why Gemini 2.5 Flash?
- 🚀 Getting Started
- 📖 Usage Guide
- 🏗️ Architecture
- 🧪 Testing & QA
- ☁️ Google Cloud Deployment
- 📊 Performance
- 🔐 Security
- 📚 Documentation
- 🤝 Contributing
- 📈 Impact & Learning
- 🏆 Hackathon Category
- 🎥 Demo & Resources
👉 Open Stun Live App (Google Cloud hosted | Real-time deployment)
```bash
# Terminal 1: Firestore Emulator
cd backend && firebase emulators:start --only firestore --project stun-489205

# Terminal 2: Backend API
cd backend && bun install && bun run dev

# Terminal 3: Frontend App
cd web && bun install && bun run dev
```

Windows? Single command:

```powershell
.\scripts\start-dev.ps1
```

Traditional AI is text-in, text-out. You type. It responds. You're stuck in a chat box.
But thinking isn't linear. It's spatial. Visual. Interconnected.
Stun reimagines AI interaction — instead of receiving text responses, AI visually understands your canvas, interprets spatial relationships, and directly navigates your workspace. Every command becomes a visual transformation.
Stun is a UI Navigator that blends three synchronized canvas layers into one intelligent workspace:
🎨 Layer 1: TLDraw → Infinite pan/zoom workspace
📐 Layer 2: Excalidraw → Visual shapes & diagrams
🧠 Layer 3: React Flow → AI-readable knowledge graph
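Keeping three canvas layers consistent comes down to a single source of truth that fans every update out to each layer. A dependency-free sketch of that pattern (the app itself uses a Zustand store for this role; the names here are illustrative):

```typescript
// One store, many layer subscribers: each canvas layer re-renders from the
// same node list, so the layers can never drift apart.
type Node = { id: string; x: number; y: number };
type Listener = (nodes: Node[]) => void;

class BoardStore {
  private nodes: Node[] = [];
  private listeners: Listener[] = []; // one listener per canvas layer

  subscribe(l: Listener): void {
    this.listeners.push(l);
  }

  setNodes(nodes: Node[]): void {
    this.nodes = nodes;
    for (const l of this.listeners) l(this.nodes); // fan out to all layers
  }

  getNodes(): Node[] {
    return this.nodes;
  }
}
```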
1. You speak or type a command (e.g., "Turn this into a roadmap")
2. Gemini sees your canvas (screenshot + structured node data)
3. AI plans actions (move, create, group, connect, zoom)
4. Actions execute live on your canvas
5. Your board transforms in real-time
No chat box. No back-and-forth. Pure visual interaction.
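The flow above ends with a list of actions replayed onto the canvas. A minimal sketch of what that replay could look like (the action types and the `executePlan` helper are illustrative, not the project's actual schema):

```typescript
// Hypothetical shape of the action plan Gemini returns after planning.
type CanvasAction =
  | { type: "move"; nodeId: string; x: number; y: number }
  | { type: "create"; label: string; x: number; y: number }
  | { type: "connect"; source: string; target: string }
  | { type: "zoom"; x: number; y: number; scale: number };

// Executing a plan is just replaying each validated action on the canvas;
// `apply` stands in for whatever updates the layer state.
function executePlan(
  plan: CanvasAction[],
  apply: (a: CanvasAction) => void
): number {
  let applied = 0;
  for (const action of plan) {
    apply(action);
    applied++;
  }
  return applied;
}
```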
- Multimodal Context: Gemini analyzes both canvas screenshots AND structured node data
- Spatial Reasoning: AI understands relationships, distances, and hierarchies
- Action Planning: Generates executable, validated command sequences
- Infinite Workspace: Pan/zoom with TLDraw's operating system layer
- Visual Tools: Draw, shape, annotate with Excalidraw
- Knowledge Graph: React Flow nodes/edges for AI-readable logic
- Voice Commands: Web Speech API integration
- Text Input: Type or speak your intent
- Real-Time Execution: Watch AI transform your canvas live
- Live Presence: See who's editing (active user tracking)
- Shared Boards: Invite collaborators for joint thinking
- Instant Sync: All changes sync across users via Firestore
- Auto-Save: Every action auto-saved to Firestore (debounced 3s)
- Recovery: Resume work instantly, even after browser restart
- Conflict-Free: Last-write-wins strategy with Firestore timestamps
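The debounced auto-save described above is a small pattern worth sketching; `saveBoard` here merely stands in for the real Firestore write, and the 3s window matches the behavior described:

```typescript
// Generic debounce: rapid calls collapse into one trailing invocation,
// which keeps the Firestore write load low during active editing.
function debounce<T extends unknown[]>(fn: (...args: T) => void, ms: number) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (...args: T) => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(() => fn(...args), ms);
  };
}

// Illustrative stand-in for persisting canvas state to Firestore.
const saveBoard = debounce((state: object) => {
  console.log("saving board with", Object.keys(state).length, "top-level keys");
}, 3000);
```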
- OAuth 2.0: Google authentication (no passwords)
- JWT Tokens: Firebase ID tokens with 1-hour expiry, auto-renewal
- Access Control: Firestore rules enforce user-scoped read/write
- Secrets Management: API keys in Google Secret Manager (not in code)
- Framework: Next.js 14 (App Router, TypeScript)
- State: Zustand (lightweight, autosave-friendly)
- Canvas Engines:
- 🎨 TLDraw 2.4.6 (infinite workspace)
- 📐 Excalidraw 0.17.6 (visual editing)
- 🧠 React Flow 11.11.4 (knowledge graph)
- Voice: Web Speech API
- Screenshots: html2canvas
- Styling: SCSS
- Storage: Firebase SDK + localStorage
- Runtime: Node.js 20+
- Framework: Express.js 5 (TypeScript)
- AI Model: Google Gemini 2.5 Flash (via Google GenAI SDK)
- Database: Firestore (NoSQL, real-time listeners)
- Authentication: Firebase Admin SDK
- Validation: Zod (type-safe runtime checks)
- Logging: Winston
- Compute: Cloud Run (auto-scaling containers)
- Database: Firestore (NoSQL, real-time)
- Secrets: Secret Manager (API key storage)
- Registry: Artifact Registry (container images)
- Terraform: Infrastructure as Code
- Container: Docker (separate images for backend & frontend)
- Orchestration: Terraform (6+ modules for GCP resources)
- CI/CD: GitHub Actions → Artifact Registry → Cloud Run
- Region: us-central1 (multi-zone availability)
| Capability | Why It's Perfect |
|---|---|
| Multimodal | Understands screenshots + text context together |
| Spatial Reasoning | Interprets node positions, connections, grouping |
| Speed | 100-500ms inference (real-time response) |
| Cost | Low per-1M-token pricing makes frequent, per-command calls affordable |
| JSON Output | Native structured response (easy validation) |
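Native JSON output is only useful if it is validated before anything touches the canvas. The project uses Zod for this; a dependency-free sketch of the same idea (the `MoveAction` shape is illustrative):

```typescript
// Validate a model-produced JSON string before executing it. Anything
// malformed or mistyped is rejected rather than applied to the board.
interface MoveAction {
  type: "move";
  nodeId: string;
  x: number;
  y: number;
}

function parseMoveAction(raw: string): MoveAction | null {
  try {
    const v = JSON.parse(raw);
    if (
      v?.type === "move" &&
      typeof v.nodeId === "string" &&
      Number.isFinite(v.x) &&
      Number.isFinite(v.y)
    ) {
      return v as MoveAction;
    }
  } catch {
    // malformed JSON falls through to null
  }
  return null;
}
```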
- Node.js 20+ (download)
- Bun (install) or npm/yarn
- Google Cloud Account (free tier eligible)
- Gemini API Key (get free here)
- Firebase Project (create one)
Step 1: Clone Repository

```bash
git clone https://github.com/Invariants0/Stun.git
cd Stun
```

Step 2: Backend Setup

```bash
cd backend
cp .env.example .env.local
# Edit .env.local and add:
# GEMINI_API_KEY=your_key_here
# GCP_PROJECT_ID=stun-489205
# FIREBASE_SERVICE_ACCOUNT_KEY=<JSON from Firebase>
bun install
bun run dev
# Backend runs on http://localhost:8080
```

Step 3: Frontend Setup

```bash
cd ../web
cp .env.example .env.local
# Edit .env.local and add Firebase config:
# NEXT_PUBLIC_FIREBASE_API_KEY=...
# NEXT_PUBLIC_FIREBASE_PROJECT_ID=...
bun install
bun run dev
# Frontend runs on http://localhost:3000 → /board/demo-board
```

Step 4: Firestore Emulator (in separate terminal)

```bash
cd backend
firebase emulators:start --only firestore --project stun-489205
```

✅ You're ready! Open http://localhost:3000/board/demo-board
- Use Excalidraw tools to draw shapes
- Click to create React Flow nodes
- Connect nodes with edges
- Voice: Click the mic 🎤 button, speak your intent
- Text: Type in the floating command bar (Ctrl+K to focus)
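The Ctrl+K focus shortcut amounts to one keydown check; a sketch of that wiring (assumed implementation detail, not the project's exact code):

```typescript
// Returns true when Ctrl+K was pressed, after suppressing the browser's
// default handling; the caller then focuses the command-bar input.
function isCommandBarShortcut(e: {
  ctrlKey: boolean;
  key: string;
  preventDefault: () => void;
}): boolean {
  if (e.ctrlKey && e.key.toLowerCase() === "k") {
    e.preventDefault();
    return true;
  }
  return false;
}
```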
Gemini analyzes your canvas + command, then executes actions live:
- Move nodes
- Create new nodes
- Group related elements
- Connect nodes with edges
- Zoom to focus areas
- Invite collaborators via share button
- See active users in real-time
- All changes sync instantly
USER BROWSER
↓
NEXT.JS FRONTEND (Hybrid Canvas)
↓ HTTP REST + Firebase JWT
CLOUD RUN BACKEND (Express.js)
├─ Intent Parser (command type detection)
├─ Orchestrator (spatial context builder)
├─ Gemini Service (AI coordination)
├─ Board Service (CRUD)
├─ Presence Service (collaboration)
└─ Auth Middleware (JWT validation)
↓
GOOGLE GEMINI 2.5 FLASH (AI Planning)
↓ JSON Action Plan
FIRESTORE DATABASE
├─ boards (canvas state)
└─ board_presence (active users)
Full Architecture Diagram: See ARCHITECTURE.md
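The backend stages above behave like a chain of handlers, each enriching the request context or rejecting it. A toy, dependency-free sketch of that pipeline (not the actual Express code; stage names and the `"user-from-jwt"` placeholder are illustrative):

```typescript
// Each stage takes the request context and returns an enriched copy,
// or throws to reject the request.
type Ctx = { token?: string; userId?: string; intent?: string; command: string };
type Stage = (ctx: Ctx) => Ctx;

// Mirrors the Auth Middleware: real code verifies a Firebase JWT here.
const authMiddleware: Stage = (ctx) => {
  if (!ctx.token) throw new Error("401: missing token");
  return { ...ctx, userId: "user-from-jwt" };
};

// Mirrors the Intent Parser: classify the command before calling Gemini.
const intentParser: Stage = (ctx) => ({
  ...ctx,
  intent: ctx.command.startsWith("zoom") ? "navigate" : "transform",
});

function runPipeline(ctx: Ctx, stages: Stage[]): Ctx {
  return stages.reduce((c, stage) => stage(c), ctx);
}
```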
Gemini integration tests:

```bash
cd backend
bun test tests/gemini/gemini-connectivity.test.ts
bun test tests/gemini/gemini-actions.test.ts
```

Firestore tests:

```bash
cd backend
bun test tests/firestore.test.ts
```

Backend health check:

```bash
curl http://localhost:8080/health
```

AI service tests:

```bash
cd backend
bun test tests/ai.test.ts
```

Deployment prerequisites:

```bash
gcloud auth login
gcloud config set project stun-489205
terraform --version  # >= 1.5.0
```

Provision infrastructure:

```bash
cd infra/environments/dev
terraform init
terraform plan
terraform apply
```

Deploy:

```bash
cd infra
./scripts/deploy.ps1  # Windows
./scripts/deploy.sh   # macOS/Linux
```

Live URL: https://stun-frontend-dev-279596491182.us-central1.run.app

View logs:

```bash
gcloud run logs read stun-backend-dev --limit=100
gcloud run logs read stun-frontend-dev --limit=100
```

Proof of GCP Deployment:
- ✅ Live app: stun-frontend-dev
- ✅ IaC code: infra/modules/
- ✅ Terraform configs: Cloud Run, Firestore, Secret Manager, Artifact Registry
| Metric | Value | Notes |
|---|---|---|
| Canvas Interaction | <16ms | 60fps rendering |
| Screenshot Capture | 100-300ms | html2canvas |
| Gemini API Call | 200-800ms | LLM inference |
| Firestore Write | 50-200ms | Network I/O |
| Full AI Cycle | 500-1500ms | End-to-end command execution |
Optimizations:
- Debounced auto-save (3s) reduces write load
- Optimistic UI updates before persistence
- 3-layer canvas render optimization with requestAnimationFrame
- Firestore real-time listeners for sub-second collaboration sync
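The optimistic-update optimization is worth a sketch: apply the change locally first, and roll back only if the Firestore write fails. A minimal, hedged version (the helper name and signature are illustrative):

```typescript
// Apply `next` to the UI immediately, persist in the background, and
// restore `current` if persistence fails. Returns whether the write stuck.
async function optimisticUpdate<S>(
  current: S,
  next: S,
  setState: (s: S) => void,
  persist: (s: S) => Promise<void>
): Promise<boolean> {
  setState(next); // UI updates instantly, before any network round-trip
  try {
    await persist(next); // e.g. the Firestore write
    return true;
  } catch {
    setState(current); // roll back on failure
    return false;
  }
}
```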
- ✅ Google OAuth 2.0 for login
- ✅ Firebase JWT tokens (1-hour TTL, auto-refresh)
- ✅ Backend validates every request token
- ✅ Firestore rules scoped to user ID
- ✅ Secrets in Google Secret Manager (not in code)
- ✅ HTTPS enforced (Cloud Run default)
- ✅ httpOnly cookies (XSS-resistant)
- ✅ CORS whitelist for frontend domain
- ✅ Zod schemas validate all API requests
- ✅ Zod validates Gemini JSON responses (prevents hallucinations)
- ✅ Position sanitization prevents out-of-bounds node placement
- ✅ Rate limiting (express-rate-limit)
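Position sanitization, mentioned above, is essentially clamping AI-proposed coordinates into the board's bounds before applying them. A sketch under assumed bounds (the ±5000 values are illustrative, not the project's actual limits):

```typescript
// Clamp AI-proposed coordinates into board bounds; non-finite values
// (NaN, Infinity) fall back to the origin rather than escaping the board.
const BOUNDS = { minX: -5000, maxX: 5000, minY: -5000, maxY: 5000 };

function clamp(v: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, v));
}

function sanitizePosition(p: { x: number; y: number }) {
  return {
    x: clamp(Number.isFinite(p.x) ? p.x : 0, BOUNDS.minX, BOUNDS.maxX),
    y: clamp(Number.isFinite(p.y) ? p.y : 0, BOUNDS.minY, BOUNDS.maxY),
  };
}
```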
| 📄 Document | 📝 Purpose | 🔗 Link |
|---|---|---|
| Architecture Overview | Complete system design & data flow | docs/ARCHITECTURE.md |
| Canvas System | 3-layer hybrid canvas synchronization | docs/Canvas-system.md |
| Product Requirements | Feature specifications & roadmap | docs/PRD.md |
| Deployment Runbook | GCP deployment procedures | DEPLOY.md |
| Local Testing Guide | Development setup & troubleshooting | LOCAL_TESTING_GUIDE.md |
We welcome contributions! To contribute:
- Fork the repository
- Create a feature branch: `git checkout -b feature/your-feature`
- Commit changes: `git commit -m "Add your feature"`
- Push branch: `git push origin feature/your-feature`
- Open a Pull Request
Code Standards:
- TypeScript (strict mode)
- ESLint + Prettier for formatting
- Zod for runtime validation
- Tests for new features (Bun test)
A production-grade spatial AI thinking environment that proves AI can go beyond chat. Instead of responding in text, our AI visually navigates your workspace.
- Gemini's Multimodal Power: Screenshots + structured text data = richer AI understanding
- Real-Time Interaction UX: Users expect <1s response times for AI actions
- Hybrid Architecture Complexity: Syncing 3 canvas layers requires careful state management
- Firestore at Scale: 1MB document limits force creative data structuring
- Spatial Reasoning Challenge: Teaching AI to understand coordinates & layouts is non-trivial
- 📋 Project Management: Visual task boards with AI auto-organization
- 🧠 Brainstorming: Mind maps that AI helps structure
- 🎨 Design Thinking: Collaboration boards with AI layout assistance
- 📊 Data Visualization: Charts that AI reorganizes based on insights
- 🧑‍🎓 Education: Interactive learning spaces with AI mentoring
Category: UI Navigator ☸️
Challenge: Build an agent that visually understands UI and performs actions based on intent
How Stun Qualifies:
- ✅ Visual UI Understanding: Gemini analyzes canvas screenshots
- ✅ Multimodal Input: Images (screenshots) + text (commands) + structured data (nodes)
- ✅ Executable Actions: AI outputs validated, sanitized actions that execute on canvas
- ✅ Real-Time Interaction: Sub-2-second command-to-execution cycle
- ✅ Live Deployment: Production-grade app running on Google Cloud
Need help? Check these resources:
| 💬 Channel | 🔗 Link | 📌 For |
|---|---|---|
| 🐛 Issues | github.com/.../Stun/issues | Bug reports & feature requests |
| 💡 Discussions | github.com/.../Stun/discussions | Questions & ideas |
| 📖 Code | github.com/Invariants0/Stun | Source + PRs |
| 🏆 Hackathon | devpost.com/.../stun-7ct2km | Submission details |
License: MIT — See LICENSE
Built With ❤️ by a passionate team using:
- Google Gemini — Multimodal AI powerhouse
- Google Cloud Platform — Production infrastructure
- Open Source — TLDraw, Excalidraw, React Flow, Next.js, Express, Bun