
Bumblebee 🐝

Bumblebee is a fully client-side chat app: React, Vite, and @huggingface/transformers run ONNX language models in the browser. There is no app backend, account system, or server-side inference; Hugging Face serves only as the CDN for model weights. Chat history stays in memory for the current tab.

Use it as a small reference for local-first browser AI, Web Workers, streaming UI, and lightweight on-device generation.


Live demo: bumblebee.joshxfi.com

Source: Open source on GitHub · Author: @joshxfi

What It Does

  • Runs text generation in a Web Worker so inference does not block the UI thread
  • Loads ONNX checkpoints from Hugging Face at runtime; the browser cache speeds repeat visits
  • Model picker with models grouped by provider family
  • Picks a device profile (standard vs constrained): constrained users get a lighter default, and models marked desktop-only are disabled in the UI to avoid unstable loads
  • Streams assistant output as markdown (Streamdown)
  • Keeps the transcript ephemeral (in-memory only for the session)
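The worker/UI split in the first bullet comes down to a small message protocol: the worker posts token chunks and completion events, and the UI folds them into the visible transcript. The message shape and reducer below are an illustrative sketch, not Bumblebee's actual types.

```typescript
// Hypothetical worker → UI message protocol (not Bumblebee's actual types).
type WorkerMessage =
  | { type: "token"; text: string }   // one streamed chunk of assistant output
  | { type: "done" }                  // generation finished
  | { type: "error"; message: string };

interface ChatState {
  reply: string;       // assistant text streamed so far
  streaming: boolean;  // true while tokens are still arriving
  error?: string;
}

// Pure reducer: fold each worker message into the chat state.
// Keeping this pure makes the streaming UI testable without a real Worker.
function applyWorkerMessage(state: ChatState, msg: WorkerMessage): ChatState {
  switch (msg.type) {
    case "token":
      return { ...state, reply: state.reply + msg.text, streaming: true };
    case "done":
      return { ...state, streaming: false };
    case "error":
      return { ...state, streaming: false, error: msg.message };
  }
}

// On the UI thread, the wiring is roughly:
//   const worker = new Worker(new URL("./llm.worker.ts", import.meta.url), { type: "module" });
//   worker.onmessage = (e) => setState((s) => applyWorkerMessage(s, e.data));
```

Because inference runs entirely in the worker, the main thread only ever handles these small messages, which is why the UI stays responsive during generation.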

How It Works

  • React 19 and Vite power the shell and chat UI
  • Zustand holds chat and runtime state; Tailwind CSS styles the UI
  • @huggingface/transformers loads the selected repo via pipeline("text-generation", …) inside the worker

Weights and tokenizers are not bundled; they download on demand from Hugging Face, then reuse the browser cache when possible.

Defaults

Bumblebee picks the starting model from getRecommendedModelId and getDeviceProfile. Constrained mode is used for typical mobile user agents, for touch-capable Macs (treated as touch-first devices), or when navigator.deviceMemory is available and ≤ 4 GB.

  • Standard (desktop-class) default: LFM2.5 350M – onnx-community/LFM2.5-350M-ONNX
  • Constrained default: Falcon H1 Tiny 90M Instruct – onnx-community/Falcon-H1-Tiny-90M-Instruct-ONNX
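The profile check described above can be approximated as a pure function over the three signals the README names (user agent, touch capability on Macs, and navigator.deviceMemory). This is a hedged reconstruction; Bumblebee's actual getDeviceProfile may weigh these signals differently.

```typescript
// Hypothetical reconstruction of the device-profile heuristic
// (Bumblebee's real getDeviceProfile may differ in detail).
interface DeviceSignals {
  userAgent: string;
  maxTouchPoints: number;  // navigator.maxTouchPoints
  deviceMemory?: number;   // navigator.deviceMemory in GB; undefined where unsupported
}

type DeviceProfile = "standard" | "constrained";

function pickDeviceProfile(s: DeviceSignals): DeviceProfile {
  const mobileUA = /Android|iPhone|iPad|Mobile/i.test(s.userAgent);
  // Touch-capable Macs report a desktop UA but behave like touch-first devices.
  const touchMac = /Macintosh/.test(s.userAgent) && s.maxTouchPoints > 1;
  const lowMemory = s.deviceMemory !== undefined && s.deviceMemory <= 4;
  return mobileUA || touchMac || lowMemory ? "constrained" : "standard";
}

// In the browser:
//   pickDeviceProfile({
//     userAgent: navigator.userAgent,
//     maxTouchPoints: navigator.maxTouchPoints,
//     deviceMemory: (navigator as any).deviceMemory,
//   })
```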

Model catalog

All checkpoints below are q4 ONNX builds from the onnx-community org. Desktop only means supportsMobile: false in config; those entries are disabled when the device profile is constrained.

SmolLM

  • SmolLM2 135M – onnx-community/SmolLM2-135M-Instruct-ONNX-MHA – mobile + desktop
  • SmolLM2 360M – onnx-community/SmolLM2-360M-ONNX – mobile + desktop

Gemma

  • Gemma 3 270M – onnx-community/gemma-3-270m-it-ONNX – mobile + desktop
  • Gemma 3 1B – onnx-community/gemma-3-1b-it-ONNX – desktop only

Qwen

  • Qwen2.5 0.5B – onnx-community/Qwen2.5-0.5B-Instruct-ONNX-MHA – mobile + desktop
  • Qwen3 0.6B – onnx-community/Qwen3-0.6B-ONNX – mobile + desktop

Falcon

  • Falcon H1 Tiny 90M – onnx-community/Falcon-H1-Tiny-90M-Instruct-ONNX – mobile + desktop
  • Falcon H1 Tiny Multilingual 100M – onnx-community/Falcon-H1-Tiny-Multilingual-100M-Instruct-ONNX – mobile + desktop

LFM (Liquid)

  • LFM2.5 350M – onnx-community/LFM2.5-350M-ONNX – mobile + desktop
  • LFM2 350M – onnx-community/LFM2-350M-ONNX – mobile + desktop
  • LFM2 700M – onnx-community/LFM2-700M-ONNX – desktop only
  • LFM2 1.2B – onnx-community/LFM2-1.2B-ONNX – desktop only

Llama

  • Llama 3.2 1B – onnx-community/Llama-3.2-1B-Instruct-ONNX – desktop only

TinySwallow

  • TinySwallow 1.5B – onnx-community/TinySwallow-1.5B-Instruct-ONNX – desktop only

Bonsai

  • Bonsai 1.7B – onnx-community/Bonsai-1.7B-ONNX – desktop only
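The catalog and its supportsMobile flag reduce to a simple filter: entries marked desktop only are excluded from selection when the profile is constrained. The entry shape below is an assumption inferred from the README, not the repo's actual config type.

```typescript
// Hypothetical catalog entry shape inferred from the README.
interface CatalogEntry {
  id: string;               // short label, e.g. "Gemma 3 270M"
  repo: string;             // Hugging Face repo, e.g. "onnx-community/gemma-3-270m-it-ONNX"
  supportsMobile: boolean;  // false => shown as "desktop only" and disabled on constrained devices
}

// Which entries should be selectable for the current device profile?
function selectableModels(
  catalog: CatalogEntry[],
  profile: "standard" | "constrained",
): CatalogEntry[] {
  return profile === "constrained"
    ? catalog.filter((m) => m.supportsMobile)
    : catalog;
}
```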

Limitations

  • Educational and experimental; not a production AI platform
  • First run downloads tokenizer and weights; later visits depend on browser cache behavior
  • Very low-memory hardware can still struggle even with small models
  • Browser-based inference is not the same as a fully offline native desktop runtime
  • Quality and coherence are limited by model size and quantization

Local development

Prerequisites

  • Bun (all commands below use the bun CLI)

Setup

bun install
bun run dev

Useful commands

bun run build
bun run test
bun run lint
bun run typecheck
bun run preview   # local preview of production build

Project structure

License

This project is licensed under the MIT License.
