Zero-dependency vector similarity library. Pure Go.
pkg/vector/ ← library code
vector.go Vector type, Dot, Norm, Normalize, Add, Sub, Scale, Equal, EqualEps, Clone, Dims
similarity.go Metric enum, Cosine, CosineDist, Euclidean, Manhattan, Distance
store.go Store: NN search + Gob/JSON persistence (Save/Load/SaveJSON/LoadJSON)
embedder.go Embedder interface
random_projections.go RandomProjections: sparse JL projection + tokenizer
cmd/go-vector/ ← minimal CLI demo
docs/ ← GitHub Pages landing page
index.html Dark-themed single-page site
.nojekyll GitHub Pages raw HTML flag
- Zero dependencies — never add to go.mod. stdlib only:
math,sort,encoding/gob,encoding/json,os,strings,unicode,math/rand. - Vector = []float32 — no struct, no interface, just a slice.
- Mismatched lengths → zero — return zero/nil rather than panicking.
- Clone on output — Get() and Search() return copies. Store.Add() clones on insert.
- Zero-alloc distance — Cosine and Euclidean compute in a single pass.
- Gob persistence —
storeDatainternal struct bridges unexported fields to encoder. - Deterministic embeddings — RandomProjections uses fixed seed 42.
- Tests in
_test.gofiles — packagevector.
Sparse random projection (Achlioptas 2003):
- Entries: {-1, 0, +1} × sqrt(3/D) with probabilities {1/6, 2/3, 1/6}
- Vocabulary built from corpus via
Fit() - Tokenizer: split on non-letter/digit, lowercase, min 2 chars
- Output always L2-normalized
// storeData bridges unexported Store fields to encoder
type storeData struct {
Vectors []Vector
IDs []string
Metric Metric
}init() registers gob.Register(Vector{}) so []float32 serializes correctly.
docker exec -i projects-dev bash -c 'export PATH=$PATH:/usr/local/go/bin && cd /workspace/go-vector && make ci'
# Benchmark
docker exec -i projects-dev bash -c 'export PATH=$PATH:/usr/local/go/bin && cd /workspace/go-vector && go test ./pkg/vector/ -bench=. -benchmem'Target: >95% coverage. 40 tests, 96.8% currently.
- Implement
Embedderinterface (Embed(text) (Vector, error),Dims() int) - Add constructor function
- Add test file with Fit/Embed/determinism/similarity tests
- Document in README under Text Embedding section
- Add constant to
Metricenum insimilarity.go - Add case to
Distance()switch, updateAscending()if needed - Add direct function — zero-alloc, single-pass
- Add test + benchmark
- No panics. Return zero/nil for invalid input.
- No unsafe. Pure Go, no CGo, no syscalls.
- Clone hygiene. Internal state never exposed.
- Persistence:
Loadreplaces all data. Caller handles atomicity.