Pull requests: lightseekorg/tokenspeed

Pull requests list

perf: fuse kv write for decode only
#281 opened May 27, 2026 by borontion (Contributor)
Replace sampling FlashInfer backend with TokenSpeed Triton
#280 opened May 27, 2026 by FlamingoPg (Contributor)
feat(memory-saver): optional CPU staging for round-trip weight preservation
#275 opened May 27, 2026 by qywu Collaborator Draft
3 of 5 tasks
feat(memory-saver): wrap CUDA graphs and attention workspaces in saver.region()
#274 opened May 27, 2026 by qywu Collaborator Draft
3 of 5 tasks
feat: expose POST /release_memory_occupation and /resume_memory_occupation
#272 opened May 27, 2026 by qywu (Collaborator)
6 tasks done
feat(engine): add pause_generation / continue_generation API
#270 opened May 26, 2026 by qywu (Collaborator)
5 tasks
feat(scheduler): per-adapter KV prefix-cache namespace + max_loras batch cap
#268 opened May 26, 2026 by qywu (Collaborator)
5 tasks
fix(vlm): substitute mm pad ids in drafter before draft embed_tokens
#265 opened May 26, 2026 by chenht2022 (Contributor)
2 tasks
[WIP] ci(perf): add 1m perf bench for qwen3.5
#264 opened May 26, 2026 by minedec (Contributor)
feat(spec-decode): add native DFlash support
#263 opened May 26, 2026 by mesaleh
perf(sampling): fuse logits fp32 cast to argmax or softmax
#262 opened May 26, 2026 by syuoni (Member)
fix(deepseek): guard missing quant weight_block_size
#261 opened May 26, 2026 by mesaleh
ci(perf): add qwen3.5 agentic perf ci bs16 case
#257 opened May 26, 2026 by minedec (Contributor)
perf: TokenSpeed MLA decode kernel optimization for num_heads=16
#255 opened May 26, 2026 by dishengbin (Contributor)
ci(eval): add Kimi-K2.5-NVFP4 ocr_bench task
#253 opened May 26, 2026 by zhyncs Member Draft
feat(eplb): eplb support high priority
#251 opened May 26, 2026 by XucSh (Contributor)
refactor(scheduler): unify hybrid cache model boundaries.
#243 opened May 25, 2026 by SimonCqk (Contributor)
Support DP sampling for spec decode
#232 opened May 24, 2026 by yubofredwang (Contributor)
Use operand format signatures for kernel selection
#230 opened May 23, 2026 by antiagainst (Member)