Pull requests: NVIDIA/TensorRT-LLM
[None][perf] Optimize DeepSeek-V4 compressor BF16 input
deepseek-v4
#13761
opened May 5, 2026 by
mingyangHao
Collaborator
1 task
Add log for raw model weights memory consumption
#13760
opened May 5, 2026 by
HuiGao-NV
Collaborator
1 task
[https://nvbugs/5805494] Limit maximum warmup token count to prevent crash in autotuner
#13758
opened May 5, 2026 by
dbari
Collaborator
1 task
[None][perf] optimize trtllm-gen fused attention preprocessing
#13757
opened May 5, 2026 by
yihwang-nv
Collaborator
Draft
[TRTLLM-12457][test] split _torch/speculative into hw-agnostic subdir
#13756
opened May 5, 2026 by
QiJune
Collaborator
1 task done
[https://nvbugs/6115562][fix] wait for disagg worker preparation
#13755
opened May 5, 2026 by
reasonsolo
Collaborator
1 task done
[TRTLLM-12453][fix] Accommodate chunked prefill in Nemotron's EVS merging logic
#13754
opened May 5, 2026 by
moraxu
Collaborator
1 task done
[#13320][fix] Propagate FlashMLA tokens_per_block override onto kv_cache_config
#13752
opened May 5, 2026 by
eopXD
Collaborator
6 of 8 tasks
[TRTLLM-12089][test] split thop/parallel into hw-agnostic siblings
#13751
opened May 5, 2026 by
QiJune
Collaborator
1 task done
[https://nvbugs/5615248][fix] Reduce beam-search prefill->decode handoff cost
#13748
opened May 5, 2026 by
brb-nv
Collaborator
1 task done
[None][chore] Update flashinfer-python from 0.6.9 to 0.6.10
#13746
opened May 5, 2026 by
yihwang-nv
Collaborator
3 of 4 tasks
[None][feat] AutoDeploy: Support Gemma4 mixed-shape pools in KVCacheManager
#13745
opened May 5, 2026 by
eopXD
Collaborator
3 tasks done
[None][fix] Gate cudaProfilerStart/Stop on iter_counter, not loop counter
#13744
opened May 5, 2026 by
Tabrizian
Member
3 of 4 tasks
[https://nvbugs/6115290][fix] Fix GPT OSS 120B GB200 Test Regression
#13743
opened May 5, 2026 by
yijingl-nvidia
Collaborator
1 task
Disagg cancellation + fail-closed (PR #13728 rebased onto v1.3.0rc13)
Community want to contribute
PRs initiated from Community
[https://nvbugs/6108841][fix] add hidden_dim=6144 router GEMM instantiation for GLM-5
#13740
opened May 4, 2026 by
yijingl-nvidia
Collaborator
1 task done
[None][test] Add Wan 2.2 5B TI2V pipeline test in CI
#13739
opened May 4, 2026 by
chang-l
Collaborator
6 tasks done
[None][fix] rename torch -> pytorch in tests
Community want to contribute
PRs initiated from Community
#13738
opened May 4, 2026 by
tfogal
[None][infra] Allow MPI to truly be disabled.
Community want to contribute
PRs initiated from Community
#13737
opened May 4, 2026 by
tfogal
[None][chore] Import torchvision when used, not at module-level.
Community want to contribute
PRs initiated from Community
#13736
opened May 4, 2026 by
tfogal
[None][fix] Drop allInputShapesSpecified asserts.
Community want to contribute
PRs initiated from Community
#13735
opened May 4, 2026 by
tfogal
[None][infra] Allow BUILD_DEEP_EP=OFF for wheels
Community want to contribute
PRs initiated from Community
#13734
opened May 4, 2026 by
tfogal