Pull requests: lightseekorg/tokenspeed
Update tokenspeed-smg* pins to 1.4.1.post20260527 / 0.4.8.post20260527 / 0.5.3.post20260527
#283
opened May 27, 2026 by
lightseek-bot
Contributor
Replace sampling FlashInfer backend with TokenSpeed Triton
#280
opened May 27, 2026 by
FlamingoPg
Contributor
feat: expose POST /release_memory_occupation and /resume_memory_occupation
#272
opened May 27, 2026 by
qywu
Collaborator
6 tasks done
feat(engine): add pause_generation / continue_generation API
#270
opened May 26, 2026 by
qywu
Collaborator
5 tasks
feat(scheduler): per-adapter KV prefix-cache namespace + max_loras batch cap
#268
opened May 26, 2026 by
qywu
Collaborator
5 tasks
fix(vlm): clamp gpu_memory_utilization conditionally and add --language-model-only
#266
opened May 26, 2026 by
chenht2022
Contributor
4 tasks
fix(vlm): substitute mm pad ids in drafter before draft embed_tokens
#265
opened May 26, 2026 by
chenht2022
Contributor
2 tasks
[WIP] ci(perf): add 1m perf bench for qwen3.5
#264
opened May 26, 2026 by
minedec
Contributor
perf(sampling): fuse logits fp32 cast to argmax or softmax
#262
opened May 26, 2026 by
syuoni
Member
fix(trtllm-mla): make spec-decode CUDA graph capture causal
#260
opened May 26, 2026 by
mesaleh
ci(perf): add qwen3.5 agentic perf ci bs16 case
#257
opened May 26, 2026 by
minedec
Contributor
chore(logging): suppress noisy third-party server-launch warnings
#256
opened May 26, 2026 by
zhyncs
Member
perf: TokenSpeed MLA decode kernel optimization for num_heads=16
#255
opened May 26, 2026 by
dishengbin
Contributor
refactor(scheduler): unify hybrid cache model boundaries.
#243
opened May 25, 2026 by
SimonCqk
Contributor
perf(deepseek-v4): vectorize read_deepseek_v4_indexer_fp8_cache
#238
opened May 24, 2026 by
yuanqingz
Use operand format signatures for kernel selection
#230
opened May 23, 2026 by
antiagainst
Member