
UPSTREAM PR #21283: [SYCL] fix llama_kv_cache hang when kv_cache is huge: 5GB #1326

Open
loci-dev wants to merge 1 commit into main from loci/pr-21283-fix_buffer_clear
Conversation


@loci-dev loci-dev commented Apr 2, 2026

Note

Source pull request: ggml-org/llama.cpp#21283

In llama_kv_cache, when the cache size is huge (e.g. 5 GB), the code hangs.
The root cause is that memset() cannot handle more than 4 GB in a single call.
Verified on Arc770.


loci-review Bot commented Apr 2, 2026

No meaningful performance changes were detected across 123165 analyzed functions in the following binaries: build.bin.libllama.so, build.bin.llama-bench, build.bin.libmtmd.so, build.bin.llama-cvector-generator, build.bin.llama-tts, build.bin.llama-quantize, build.bin.llama-qwen2vl-cli, build.bin.llama-tokenize, build.bin.llama-gemma3-cli, build.bin.llama-gguf-split, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli, build.bin.libggml-base.so, build.bin.libggml-cpu.so, build.bin.libggml.so.

🔎 Full breakdown: Loci Inspector
💬 Questions? Tag @loci-dev

@loci-dev loci-dev force-pushed the main branch 9 times, most recently from a8215be to 34734bc Compare April 9, 2026 02:17
@loci-dev loci-dev force-pushed the main branch 9 times, most recently from 245e873 to d101579 Compare April 17, 2026 02:18
@loci-dev loci-dev force-pushed the main branch 3 times, most recently from 7638ab4 to f1b46d5 Compare April 20, 2026 02:19

2 participants