Pull requests: flashinfer-ai/flashinfer
fix(dcp_alltoall): require MNNVL workspace, drop broken plain-memory path
op: comm
v0.6.11
release blocker label for 0.6.11
#3210
opened Apr 30, 2026 by
davidjpyu
Contributor
3 tasks done
fix: Update top-k generation not to produce duplicate expert ids
op: comm
op: moe
#3208
opened Apr 30, 2026 by
djns99
Contributor
3 of 5 tasks
fix: guard NVFP4 KV cache on legacy decode kernel
op: attention
#3207
opened Apr 30, 2026 by
leonardHONG
5 tasks done
ci: diagnose AOT build/import memory usage
#3204
opened Apr 29, 2026 by
dierksen
Collaborator
Include TinyGEMM into BF16 autotuner
arch: DGX Spark
op: gemm
run-ci
v0.6.11
release blocker label for 0.6.11
#3203
opened Apr 29, 2026 by
askliar
Contributor
Enable TRT-LLM Gen sparse MLA block-sparse path
#3199
opened Apr 28, 2026 by
saltyminty
Collaborator
fix(cute_dsl/moe): correct off-by-one in get_max_num_tiles to match TRT-LLM
op: moe
v0.6.11
release blocker label for 0.6.11
#3198
opened Apr 28, 2026 by
leejnau
Contributor
5 tasks done
perf(moe): optimize SM120 b12x MoE short decode
op: moe
run-ci
#3193
opened Apr 27, 2026 by
lukealonso
5 tasks done
Yanqinz/fix cudnn sm120 nan
op: gemm
run-ci
#3192
opened Apr 27, 2026 by
yanqinz2
Collaborator
3 of 5 tasks
fix(sm12x): fix micro-kernel workspace sizing when routed_rows > num_local_experts
op: moe
#3191
opened Apr 27, 2026 by
meena-at-work
Contributor
1 task
Add set_autotune_process_group to synchronize tactic choice across ranks
#3187
opened Apr 27, 2026 by
thanhhao98
feat: enable glm5 router gemm
op: gemm
#3185
opened Apr 26, 2026 by
b8zhong
Contributor
5 tasks done
test: enable bmm_mxfp8 cutlass backend coverage on SM12x
arch: DGX Spark
op: gemm
run-ci
#3183
opened Apr 26, 2026 by
leonardHONG
4 of 5 tasks
test: align test_fmha_v2_prefill SM gating with is_sm12x_supported
arch: DGX Spark
op: attention
run-ci
#3182
opened Apr 26, 2026 by
leonardHONG
4 of 5 tasks
cute-dsl fmha prefill (cubin integration): remove front-padding, add attention_sink, and pdl support
op: attention
#3181
opened Apr 26, 2026 by
limin2021
Contributor
5 tasks done
Fix/3170 dense blockscaled sm12x
arch: DGX Spark
op: gemm
run-ci
#3180
opened Apr 26, 2026 by
leonardHONG
3 of 5 tasks
feat: fmha fwd cutlass supports fp8 output
op: attention
#3177
opened Apr 25, 2026 by
carlyou
3 of 4 tasks
fix: align is_sm120f_supported with SM12x family semantics
arch: DGX Spark
run-ci
#3175
opened Apr 25, 2026 by
leonardHONG
2 of 5 tasks
doc: align user-facing SM120 messages with SM12x dispatch
arch: DGX Spark
op: attention
run-ci
#3174
opened Apr 25, 2026 by
leonardHONG
3 of 5 tasks
fix: add sm_121 to TMEM column fallback map
arch: DGX Spark
run-ci
#3173
opened Apr 25, 2026 by
leonardHONG
3 of 5 tasks