Commit 0946455
API documentation + quant.h sync (closes #7)
docs/api.md (730 lines):
- Single-header API reference (6 functions + config)
- Full library API (model, inference, tokenizer, KV)
- Build instructions for all platforms (macOS/Linux/Windows/WASM/mobile)
quant.h sync:
- Add IQ3_XXS dequantization (iq3xxs_grid[256] codebook)
- Add Gemma 4 MoE support (dual-FFN, QK-norm, GeGLU)
- Add Llama 3 EOS tokens
- Add thought token filtering
- Verified: compiles with cc -std=c11
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

1 parent d81c975 commit 0946455
2 files changed
Lines changed: 991 additions & 2 deletions