forked from ggml-org/llama.cpp
Pull requests: TheTom/llama-cpp-turboquant
vulkan: TQ4_1s support for model weights
ggml
testing
Vulkan
#69
opened Apr 11, 2026 by
Titaniumtown
fix: inverse WHT in test-turbo-quant.c round-trip (#59)
ggml
testing
#63
opened Apr 9, 2026 by
seanrasch
perf: turbo VEC flash attention — +9% decode on CUDA via autoresearch
ggml
Nvidia GPU
script
#53
opened Apr 4, 2026 by
signalnine
7 tasks done
metal: MUL_MAT kernels for turbo3/turbo4 + dual-LUT dequant (port of PR #22)
Apple Metal
ggml
#49
opened Apr 4, 2026 by
TheTom
Owner
fix: HIP/ROCm compatibility — check cudaMemcpyToSymbol errors, guard …
ggml
Nvidia GPU
#41
opened Apr 1, 2026 by
terrysimons
Draft
feat: ROCm/HIP support for turbo3 KV cache (gfx1100/RDNA3)
documentation
ggml
Nvidia GPU
#5
opened Mar 27, 2026 by
apollosenvy
5 of 7 tasks