AndrewMn123/sonicfind

SonicFind

Audio similarity search that runs entirely in your browser. Drop in a sound file or type a description like "dark punchy kick", and SonicFind finds the most similar sounds from your sample library. No server, no uploads — all processing happens locally on your machine.

Try the Demo

How It Works

SonicFind uses a 3-step pipeline: Extract → Embed → Compare.

  1. Extract: Audio files are decoded via the Web Audio API, resampled to 16 kHz mono, and stored as a normalized Float32Array.
  2. Embed: The CLAP neural network (Contrastive Language-Audio Pretraining) converts audio into a 512-dimensional vector that captures timbral properties, that is, what the sound sounds like rather than what it's called.
  3. Compare: Cosine similarity scores every indexed sample against the query, and results are sorted from most to least similar.
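
The Extract step can be sketched with a few pure functions. This is a minimal illustration, not SonicFind's actual code: the function names are hypothetical, linear interpolation stands in for whatever resampler the app really uses, and "normalized" is assumed here to mean peak normalization.

```typescript
// Hypothetical helpers for illustration only.

/** Average all channels into a single mono channel. */
function downmixToMono(channels: Float32Array[]): Float32Array {
  if (channels.length === 0) return new Float32Array(0);
  const length = channels[0].length;
  const mono = new Float32Array(length);
  for (const channel of channels) {
    for (let i = 0; i < length; i++) mono[i] += channel[i] / channels.length;
  }
  return mono;
}

/** Resample by linear interpolation (assumption: a browser build could use OfflineAudioContext instead). */
function resampleLinear(input: Float32Array, fromRate: number, toRate: number): Float32Array {
  if (fromRate === toRate || input.length === 0) return input.slice();
  const outLength = Math.max(1, Math.round((input.length * toRate) / fromRate));
  const out = new Float32Array(outLength);
  const step = fromRate / toRate;
  for (let i = 0; i < outLength; i++) {
    const pos = i * step;
    const left = Math.min(Math.floor(pos), input.length - 1);
    const right = Math.min(left + 1, input.length - 1);
    const frac = pos - left;
    out[i] = input[left] * (1 - frac) + input[right] * frac;
  }
  return out;
}

/** Scale so the loudest sample has magnitude 1; silence is returned unchanged. */
function peakNormalize(samples: Float32Array): Float32Array {
  let peak = 0;
  for (let i = 0; i < samples.length; i++) peak = Math.max(peak, Math.abs(samples[i]));
  if (peak === 0) return samples.slice();
  return samples.map((s) => s / peak);
}

/** Downmix, resample to the target rate (16 kHz by default), then normalize. */
function preprocess(channels: Float32Array[], sourceRate: number, targetRate = 16000): Float32Array {
  return peakNormalize(resampleLinear(downmixToMono(channels), sourceRate, targetRate));
}
```

In a browser, the `channels` input would typically come from calling `getChannelData` on a decoded `AudioBuffer`, with `sourceRate` taken from its `sampleRate`.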

CLAP also supports text-to-audio search: type "dark punchy kick" to find matching sounds without supplying a reference audio file.

Tech Stack

Component    | Technology
Framework    | React 19 + Vite 7
ML Runtime   | Transformers.js (ONNX Runtime WebAssembly)
Audio Model  | CLAP (Xenova/clap-htsat-unfused, 161 MB)
Audio Decode | Web Audio API
Folder Access | File System Access API
Styling      | Tailwind CSS v4
Threading    | Web Workers

Performance

Metric                        | Value
Single file embed (cold)      | ~537 ms
Single file embed (warm)      | ~133 ms
Batch indexing rate           | ~450 samples/min
Audio decode                  | ~11 ms
Cosine search (2000+ vectors) | < 10 ms
Model init (cached)           | ~4.4 s
Model init (first download)   | ~31 s

Benchmarked with 2188 WAV samples on a standard desktop. Indexing speed varies with file length and hardware.

Browser Compatibility

Feature                  | Chrome 86+ | Edge 86+ | Firefox | Safari
Audio search (drop file) | Yes        | Yes      | Yes     | Yes
Text search              | Yes        | Yes      | Yes     | Yes
Demo library             | Yes        | Yes      | Yes     | Yes
Folder indexing          | Yes        | Yes      | No      | No

Folder indexing requires the File System Access API (Chrome/Edge only). All other features work in any modern browser.
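
A feature check along these lines could decide whether to offer folder indexing. It is a sketch under assumptions: the helper name and the `PickerScope` type are hypothetical, and the check relies only on the `showDirectoryPicker` method that the File System Access API adds to `window`.

```typescript
// Hypothetical helper; in a browser, `scope` would be the global `window` object.
type PickerScope = { showDirectoryPicker?: unknown };

/** True when the File System Access API's directory picker is available. */
function supportsFolderIndexing(scope: PickerScope): boolean {
  return typeof scope.showDirectoryPicker === "function";
}

// Example (browser): supportsFolderIndexing(globalThis as PickerScope)
```

Passing the scope in as a parameter, rather than reading `window` directly, keeps the check usable in workers and in tests where no `window` exists.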

Known Limitations

  • Text search accuracy — Abstract adjectives ("soft", "warm") produce correct rankings but low confidence scores. Concrete terms ("kick", "hihat") work well.
  • First-load model download — 161 MB on first visit. Cached after that.
  • Folder indexing — Chrome/Edge only (File System Access API).
  • No persistent storage — Indexed library is lost on page refresh. IndexedDB persistence planned for Phase 2.
  • VectorStore scaling — Brute-force cosine similarity, fine up to ~5000 samples. HNSW indexing planned for Phase 2.
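
For reference, a brute-force cosine search can be as small as the sketch below. The `IndexedSample` and `SearchHit` shapes and the function names are assumptions made for illustration; they are not SonicFind's actual VectorStore API.

```typescript
// Hypothetical types and names for illustration only.
interface IndexedSample {
  name: string;
  embedding: Float32Array;
}

interface SearchHit {
  name: string;
  score: number;
}

/** Cosine similarity of two equal-length vectors; returns 0 if either is all zeros. */
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

/** Score every indexed sample against the query and return the topK best matches. */
function search(query: Float32Array, index: IndexedSample[], topK = 10): SearchHit[] {
  return index
    .map((sample) => ({ name: sample.name, score: cosineSimilarity(query, sample.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

Every query touches every stored vector, so cost grows linearly with library size. That is the property an approximate index such as HNSW is meant to avoid.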

Roadmap

  • Hybrid DSP+CLAP search — Re-rank CLAP results using spectral centroid, attack time, and zero crossing rate for better adjective-based queries
  • HNSW indexing — Approximate nearest neighbor for 10,000+ sample libraries
  • Persistent storage — Save/load indexed libraries via IndexedDB
  • Binary serialization — Replace JSON-based VectorStore with ArrayBuffer format
  • Rust/WASM — Move VectorStore and audio preprocessing to WebAssembly for SIMD acceleration
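
As one possible shape for the binary serialization item, the sketch below packs embeddings into a single ArrayBuffer. The layout (an 8-byte header holding vector count and dimension, followed by contiguous float32 data) and the function names are assumptions, not a format the project has committed to. It also uses the platform's native byte order, so a portable format would need to fix endianness explicitly.

```typescript
// Hypothetical layout: [uint32 count][uint32 dim][count * dim float32 values].
const HEADER_BYTES = 8;

/** Pack equal-length vectors into one contiguous buffer. */
function packVectors(vectors: Float32Array[]): ArrayBuffer {
  const count = vectors.length;
  const dim = count > 0 ? vectors[0].length : 0;
  const buffer = new ArrayBuffer(HEADER_BYTES + count * dim * 4);
  const header = new Uint32Array(buffer, 0, 2);
  header[0] = count;
  header[1] = dim;
  const body = new Float32Array(buffer, HEADER_BYTES, count * dim);
  vectors.forEach((vector, i) => {
    if (vector.length !== dim) throw new Error("all vectors must share one dimension");
    body.set(vector, i * dim);
  });
  return buffer;
}

/** Read the header, then copy each vector back out of the buffer. */
function unpackVectors(buffer: ArrayBuffer): Float32Array[] {
  const header = new Uint32Array(buffer, 0, 2);
  const count = header[0];
  const dim = header[1];
  const body = new Float32Array(buffer, HEADER_BYTES, count * dim);
  const vectors: Float32Array[] = [];
  for (let i = 0; i < count; i++) {
    vectors.push(body.slice(i * dim, (i + 1) * dim));
  }
  return vectors;
}
```

For scale, 2,188 vectors of 512 dimensions occupy 8 + 2188 × 512 × 4 = 4,481,032 bytes in this layout, roughly 4.5 MB.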

License

Source code is not publicly available. The live demo is free to use.
