Echobox

Meeting transcription desktop app with speaker identification. Echobox watches one or more Google Calendars, captures system audio during scheduled meetings, and produces speaker-labeled transcripts. Audio capture, transcription, and diarization all run locally.

Features

  • Monitors multiple Google Calendars for meeting events
  • Speaker diarization via pyannote.audio ("who said what")
  • System audio capture via PipeWire
  • Electron desktop app with week view and transcript browser
  • Local processing: audio, transcription, and diarization stay on your machine

Requirements

System Requirements

  • Python: 3.10 or higher
  • Node.js: 18 or higher
  • PipeWire: For audio capture (standard on modern Linux distributions)
  • NVIDIA GPU (optional): Significantly faster diarization with CUDA support

Hardware Recommendations

  • 16GB RAM minimum (24GB+ recommended for larger meetings)
  • Multi-core CPU for transcription during meetings
  • GPU with 8GB+ VRAM for faster post-meeting diarization (optional)

Installation

Clone Repository

git clone https://github.com/mumtazm1/echobox.git
cd echobox

Install Python Backend

python -m venv .venv
source .venv/bin/activate
pip install -e .

Install Frontend Dependencies

cd frontend
npm install
cd ..

Environment File

Copy .env.example to .env and fill in the values:

cp .env.example .env

You must set HF_TOKEN (see Configuration below) for speaker diarization to work. The other variables have defaults that work for a standard local setup.

Google Calendar Credentials

For each Google account whose calendar you want to watch:

  1. Open Google Cloud Console and create or select a project.
  2. Enable the Google Calendar API.
  3. Create OAuth 2.0 credentials of type Desktop application.
  4. Download the client secret JSON.
  5. Place it in $EB_KEYS_DIR (default: ~/.config/echobox/keys). The leading part of the filename becomes the calendar slug (e.g. work_client_secret_abc.json becomes slug work).

mkdir -p ~/.config/echobox/keys
mv ~/Downloads/client_secret_*.json ~/.config/echobox/keys/

Then run eb-calendar setup to generate ~/.config/echobox/calendars.json from the client secrets, followed by eb-calendar auth for each calendar. See config/calendars.example.json for the full schema if you prefer to hand-edit the file.
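
To make the filename-to-slug convention concrete, here is a minimal sketch that scans a keys directory and builds calendars.json-style entries. It is not the eb-calendar setup implementation: the rule "slug is the text before _client_secret" is inferred from the single example above, the default color is arbitrary, and slug_from_filename and build_entries are hypothetical names.

```python
from pathlib import Path
from typing import Optional

# Assumption: the slug is whatever precedes this marker in the filename.
MARKER = "_client_secret"


def slug_from_filename(filename: str) -> Optional[str]:
    """Return the slug for a client secret filename, or None if it has no prefix."""
    stem = Path(filename).stem
    prefix, sep, _rest = stem.partition(MARKER)
    if not sep or not prefix:
        return None  # e.g. an unrenamed "client_secret_123.json"
    return prefix.lower()


def build_entries(keys_dir: Path) -> list:
    """Build one calendars.json-style entry per recognised client secret file."""
    entries = []
    for path in sorted(keys_dir.glob("*.json")):
        slug = slug_from_filename(path.name)
        if slug is None:
            continue
        entries.append({
            "calendar_id": "primary",
            "name": slug,
            "display_name": slug.title(),
            "color": "#4285f4",  # assumption: placeholder color
            "output_dir": f"./transcripts/{slug}",
            "client_secret_path": str(path),
            "enabled": True,
        })
    return entries
```

Under this assumed rule, a file downloaded as client_secret_*.json yields no slug until it is renamed with a prefix such as work_.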

Running

Development Mode

Start the backend and frontend in separate terminals:

# Terminal 1: Start Python backend
eb-server

# Terminal 2: Start Electron frontend
cd frontend && npm run dev

The app will open at http://localhost:5173 (dev server) with hot reload enabled.

Production Build

Build and install the desktop application:

# Build packages
./scripts/build.sh

# Install desktop entry and icon
./scripts/install.sh

The built packages will be in frontend/dist/:

  • AppImage: Portable executable
  • .deb: Debian/Ubuntu package

Configuration

Environment Variables

| Variable | Description | Default |
|---|---|---|
| EB_KEYS_DIR | Directory containing Google OAuth client secret JSON files | ~/.config/echobox/keys |
| EB_CONFIG_DIR | Runtime config directory (stores calendars.json, cached tokens) | ~/.config/echobox |
| EB_TRANSCRIPTS_DIR | Root directory for transcript storage | ./transcripts |
| HF_TOKEN | HuggingFace token for pyannote speaker diarization | required |
| ECHOBOX_SKIP_BACKEND | Skip spawning backend from Electron (for development) | unset |
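
The sketch below shows one way these variables and their defaults could be resolved in Python. It illustrates the table rather than Echobox's actual loader: resolve_settings is a hypothetical helper, and treating any non-empty ECHOBOX_SKIP_BACKEND value as "on" is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping


def resolve_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Resolve Echobox-style settings from an environment mapping."""
    hf_token = env.get("HF_TOKEN", "").strip()
    if not hf_token:
        # HF_TOKEN has no default; diarization cannot run without it.
        raise RuntimeError("HF_TOKEN is not set; speaker diarization needs it")
    return {
        "keys_dir": Path(env.get("EB_KEYS_DIR", "~/.config/echobox/keys")).expanduser(),
        "config_dir": Path(env.get("EB_CONFIG_DIR", "~/.config/echobox")).expanduser(),
        "transcripts_dir": Path(env.get("EB_TRANSCRIPTS_DIR", "./transcripts")),
        "hf_token": hf_token,
        # Assumption: any non-empty value means "skip spawning the backend".
        "skip_backend": bool(env.get("ECHOBOX_SKIP_BACKEND")),
    }
```

Passing a plain dict instead of os.environ makes the helper easy to exercise in tests without touching the real environment.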

calendars.json

Echobox reads its calendar list from $EB_CONFIG_DIR/calendars.json. Each entry maps a Google Calendar to a local slug, display name, color, and output directory. A template lives at config/calendars.example.json:

{
  "calendars": [
    {
      "calendar_id": "primary",
      "name": "work",
      "display_name": "Work",
      "color": "#4285f4",
      "output_dir": "./transcripts/work",
      "client_secret_path": "~/.config/echobox/keys/work_client_secret.json",
      "enabled": true
    }
  ]
}

  • name is the slug used for URL filters, token filenames, and the transcript subdirectory. It must be lowercase with no spaces.
  • display_name is what the UI shows on tabs and badges. It falls back to a title-cased slug if omitted.
  • color is any hex code. Tab borders, event block backgrounds, and CSS custom properties (--calendar-<slug>) are generated from this at runtime, so you can add as many calendars as you want.
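
The field rules above can be sketched as a small loader. This is an illustration under stated assumptions, not the code Echobox ships: load_calendars and normalize_calendar are hypothetical names, skipping entries with enabled set to false is an assumption, and the css_var key is added only to show how a --calendar-<slug> property name would be derived.

```python
import json


def normalize_calendar(entry: dict) -> dict:
    """Validate the slug and fill in the display_name fallback."""
    slug = entry["name"]
    # Rule from the text: lowercase, no spaces.
    if not slug or slug != slug.lower() or any(ch.isspace() for ch in slug):
        raise ValueError(f"invalid calendar slug: {slug!r}")
    display = entry.get("display_name") or slug.title()
    return {**entry, "display_name": display, "css_var": f"--calendar-{slug}"}


def load_calendars(text: str) -> list:
    """Parse calendars.json content and return the enabled, normalized entries."""
    data = json.loads(text)
    # Assumption: entries with "enabled": false are skipped.
    return [normalize_calendar(c) for c in data["calendars"] if c.get("enabled", True)]
```

For the template shown earlier, this yields one entry whose css_var is --calendar-work.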

Transcript Storage

Transcripts are written under $EB_TRANSCRIPTS_DIR/<slug>/, one subdirectory per calendar defined in calendars.json.
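
As a small illustration of this layout, the sketch below resolves a calendar's transcript directory and groups existing files by slug. Both helper names are hypothetical, and since transcript filenames and formats are not specified here, the code simply lists whatever files it finds.

```python
import os
from pathlib import Path
from typing import Optional


def transcript_dir(slug: str, root: Optional[str] = None) -> Path:
    """Return <root>/<slug>, defaulting root to $EB_TRANSCRIPTS_DIR or ./transcripts."""
    base = Path(root or os.environ.get("EB_TRANSCRIPTS_DIR", "./transcripts"))
    return base / slug


def transcripts_by_calendar(root: Path) -> dict:
    """Map each calendar subdirectory name to a sorted list of the files inside it."""
    result = {}
    for sub in sorted(p for p in root.iterdir() if p.is_dir()):
        result[sub.name] = sorted(f.name for f in sub.iterdir() if f.is_file())
    return result
```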

Usage

Web Interface

The Echobox UI provides two main views:

Calendar View

  • Week-at-a-glance calendar showing meetings from all connected calendars
  • Color-coded meeting blocks by calendar source
  • Click on any meeting to view its transcript

Transcript Browser

  • List of all transcripts with search and filtering
  • Filter by calendar tab
  • Full transcript reader with speaker-labeled segments

Recording Controls

  • Start/Stop Recording: Manual recording controls for non-calendar meetings
  • Status Indicator: Shows current recording state and backend connection
  • Real-time Updates: Live status updates via WebSocket connection

CLI Commands

Echobox provides CLI commands for automation and debugging:

# Start the API server
eb-server

# Capture system audio
eb-capture

# Transcribe audio file
eb-transcribe recording.wav

# Run speaker diarization
eb-diarize recording.wav

# Process and merge transcript with diarization
eb-process recording.wav

# Calendar operations (auth, list, events, next, setup, config, watch)
eb-calendar --help

# Enroll a speaker's voice fingerprint for automatic labeling
eb-enroll alice alice-sample.wav

# Inspect enrolled speakers
eb-speakers list

# Run transcript correction via Claude API
eb-correct transcript.json

# Full meeting pipeline
eb-meeting
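
For automation, the commands above can be driven from a script. The following is a minimal sketch that uses only the invocations shown in this section. The ordering is an assumption: this README does not say whether eb-process expects eb-transcribe and eb-diarize to have run first. pipeline_commands and run_all are hypothetical helper names.

```python
import shutil
import subprocess


def pipeline_commands(wav_path: str) -> list:
    """Return the argv lists for the per-file commands, in an assumed order."""
    return [
        ["eb-transcribe", wav_path],
        ["eb-diarize", wav_path],
        ["eb-process", wav_path],
    ]


def run_all(wav_path: str) -> None:
    """Run each command in turn, stopping at the first failure."""
    for argv in pipeline_commands(wav_path):
        if shutil.which(argv[0]) is None:
            raise FileNotFoundError(f"{argv[0]} not found on PATH; is the virtualenv active?")
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(argv, check=True)
```

Calling run_all("recording.wav") requires the Echobox CLI entry points to be installed and on PATH.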

Architecture

Backend (Python/FastAPI)

  • Audio capture via PipeWire/PulseAudio
  • Transcription using faster-whisper (CTranslate2)
  • Speaker diarization using pyannote.audio
  • REST API + WebSocket for real-time updates
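
To show how transcription and diarization output can be combined into "who said what", here is a sketch of a common approach: give each transcript segment the speaker whose diarization turns overlap it the most. This is a generic technique, not necessarily how Echobox merges results. Segment, Turn, and label_segments are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float  # seconds
    end: float
    text: str


@dataclass
class Turn:
    start: float  # seconds
    end: float
    speaker: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments, turns, unknown="UNKNOWN"):
    """Return (speaker, text) pairs, choosing the speaker with the largest total overlap."""
    labeled = []
    for seg in segments:
        totals = {}
        for turn in turns:
            dur = overlap(seg.start, seg.end, turn.start, turn.end)
            if dur > 0:
                totals[turn.speaker] = totals.get(turn.speaker, 0.0) + dur
        speaker = max(totals, key=totals.get) if totals else unknown
        labeled.append((speaker, seg.text))
    return labeled
```

Segments that overlap no diarization turn, for example speech in a region the diarizer treated as silence, fall back to the unknown label instead of being dropped.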

Frontend (Electron/React)

  • Electron 40+ with Vite build system
  • React 19 with TanStack Query
  • CSS custom properties for theming
  • Backend lifecycle management in production

Development

Project Structure

echobox/
  src/               # Python backend source
    api/             # FastAPI server and routes
    capture/         # Audio capture modules
    transcription/   # Whisper transcription
    diarization/     # Speaker diarization
    calendar_integration/  # Google Calendar sync
    correction/      # LLM transcript correction
    pipeline/        # Full processing pipeline
  frontend/          # Electron/React frontend
    src/main/        # Electron main process
    src/renderer/    # React app
    src/preload/     # Electron preload scripts
  config/            # Example config templates
  scripts/           # Build and install scripts

Running Tests

# Backend tests
pytest

# Frontend type checking
cd frontend && npm run typecheck

Building for Distribution

# Full build (AppImage + deb)
./scripts/build.sh

# Just create unpacked directory (faster for testing)
cd frontend && npm run pack

License

MIT. See LICENSE.

Contributing

Issues and pull requests are welcome. Please open an issue to discuss substantial changes before sending a PR.
