
UPSTREAM PR #21452: metal : add GATED_LINEAR_ATTN op #1334

Open
loci-dev wants to merge 1 commit into main from loci/pr-21452-metal-gla-op

Conversation


@loci-dev loci-dev commented Apr 5, 2026

Note

Source pull request: ggml-org/llama.cpp#21452

Add Metal backend support for GGML_OP_GATED_LINEAR_ATTN (GLA). Supports head_size 64 and 128, f32 only.

Tested with test-backend-ops (7/7 passed):

  • M5 Max 128GB (Apple10)
  • M2 Mac Mini (Apple8)

Overview

Adds Metal backend support for GGML_OP_GATED_LINEAR_ATTN, which previously fell back to the CPU on Apple devices.

The kernel follows the RWKV WKV6 Metal structure: one thread per head element, threadgroup shared memory for k/q/gate, and a float4-vectorized inner loop.
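For context, the recurrence such a kernel evaluates can be written as a scalar CPU reference. The sketch below is illustrative only: the function name, buffer layout, and the placement of the scale factor are assumptions, not the actual ggml implementation, which fuses the readout into the state-update loop and vectorizes it with float4.

```c
#include <stddef.h>

// Scalar sketch of gated linear attention for one head (hypothetical
// reference, not the ggml kernel). Per token t:
//   S[i][j] <- g_t[i] * S[i][j] + k_t[i] * v_t[j]   (decayed outer-product state)
//   o_t[j]  <- scale * sum_i q_t[i] * S[i][j]       (query readout)
// T = sequence length, D = head size; q/k/v/g are row-major [T][D],
// state is [D][D] and carried across calls, out is [T][D].
static void gla_ref(int T, int D,
                    const float *q, const float *k, const float *v,
                    const float *g, float scale,
                    float *state, float *out) {
    for (int t = 0; t < T; t++) {
        // Decay the state along the key dimension, then add the k v^T update.
        for (int i = 0; i < D; i++) {
            for (int j = 0; j < D; j++) {
                state[i*D + j] = g[t*D + i] * state[i*D + j]
                               + k[t*D + i] * v[t*D + j];
            }
        }
        // Read out the state with the query.
        for (int j = 0; j < D; j++) {
            float acc = 0.0f;
            for (int i = 0; i < D; i++) {
                acc += q[t*D + i] * state[i*D + j];
            }
            out[t*D + j] = scale * acc;
        }
    }
}
```

With D = 1, T = 2, q = k = 1, v = (2, 3), a constant decay g = 0.5, scale = 1, and a zero initial state, the outputs are 2 and 4. The Metal kernel parallelizes the j loop (one thread per head element) and stages k, q, and the gate in threadgroup memory, but the arithmetic is the same.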

supports_op restricts execution to F32 and head sizes 64 or 128.
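That guard amounts to a simple predicate. The function below is an illustrative sketch of the condition described, not the actual ggml-metal supports_op code, which is structured as a switch over op types.

```c
#include <stdbool.h>

// Hypothetical sketch of the support check described above: the Metal
// GLA path is taken only for f32 tensors with head size 64 or 128;
// anything else falls back to the CPU backend.
static bool gla_metal_supported(bool is_f32, int head_size) {
    return is_f32 && (head_size == 64 || head_size == 128);
}
```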

No dedicated performance benchmarks are currently configured for this op.

Additional information

docs/ops.md and docs/ops/Metal.csv were regenerated using the required workflow (test-backend-ops support --output csv and ./scripts/create_ops_docs.py). The resulting diff includes updates to multiple ops due to recent changes in main; the only functional change in this PR is support for GATED_LINEAR_ATTN.

Mentions #14909

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: YES - AI used for research and guidance, all code manually reviewed and understood.


loci-review Bot commented Apr 5, 2026

No meaningful performance changes were detected across 125473 analyzed functions in the following binaries: build.bin.llama-tts, build.bin.llama-cvector-generator, build.bin.libllama.so, build.bin.libmtmd.so, build.bin.llama-bench, build.bin.libggml-cpu.so, build.bin.libggml-base.so, build.bin.libggml.so, build.bin.llama-tokenize, build.bin.llama-gemma3-cli, build.bin.llama-gguf-split, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli, build.bin.llama-quantize, build.bin.llama-qwen2vl-cli.

💬 Questions? Tag @loci-dev

loci-dev force-pushed the main branch 8 times, most recently from 55afbee to ef0eff4 on April 12, 2026 at 02:18
loci-dev force-pushed the main branch 9 times, most recently from 63ab8d1 to 7638ab4 on April 19, 2026 at 02:19
Add Metal backend support for GGML_OP_GATED_LINEAR_ATTN (GLA).
Supports head_size 64 and 128, f32 only.

Tested with test-backend-ops (7/7 passed):
- M5 Max 128GB (Apple10)
- M2 Mac Mini (Apple8)

loci-review Bot commented Apr 20, 2026

No meaningful performance changes were detected across 46867 analyzed functions in the following binaries: build.bin.libllama.so, build.bin.libmtmd.so, build.bin.libggml-cpu.so, build.bin.libggml.so, build.bin.libggml-base.so, build.bin.llama-tts, build.bin.llama-bench, build.bin.llama-cvector-generator, build.bin.llama-gemma3-cli, build.bin.llama-gguf-split, build.bin.llama-llava-cli, build.bin.llama-minicpmv-cli, build.bin.llama-quantize, build.bin.llama-qwen2vl-cli, build.bin.llama-tokenize.

💬 Questions? Tag @loci-dev
