Pull requests: NVIDIA/TensorRT-LLM

Pull requests list

[None][test] Waive 3 failed cases for main in QA CI
#13762 opened May 5, 2026 by xinhe-nv Collaborator Draft
[None][perf] Optimize DeepSeek-V4 compressor BF16 input (label: deepseek-v4)
#13761 opened May 5, 2026 by mingyangHao Collaborator
1 task
Add log for raw model weights memory consumption
#13760 opened May 5, 2026 by HuiGao-NV Collaborator
1 task
[None][test] Waive 24 failed cases for main in QA CI
#13759 opened May 5, 2026 by xinhe-nv Collaborator Draft
[TRTLLM-12457][test] split _torch/speculative into hw-agnostic subdir
#13756 opened May 5, 2026 by QiJune Collaborator
1 task done
[https://nvbugs/6115562][fix] wait for disagg worker preparation
#13755 opened May 5, 2026 by reasonsolo Collaborator
1 task done
[TRTLLM-12453][fix] Accommodate chunked prefill in Nemotron's EVS merging logic
#13754 opened May 5, 2026 by moraxu Collaborator
1 task done
[#13320][fix] Propagate FlashMLA tokens_per_block override onto kv_cache_config
#13752 opened May 5, 2026 by eopXD Collaborator
6 of 8 tasks
[TRTLLM-12089][test] split thop/parallel into hw-agnostic siblings
#13751 opened May 5, 2026 by QiJune Collaborator
1 task done
[TRTLLM-12429][tests] Add audio E2E test for nano v3 omni
#13750 opened May 5, 2026 by 2ez4bz Collaborator Draft
1 task
[https://nvbugs/5615248][fix] Reduce beam-search prefill->decode handoff cost
#13748 opened May 5, 2026 by brb-nv Collaborator
1 task done
[None][chore] Update flashinfer-python from 0.6.9 to 0.6.10
#13746 opened May 5, 2026 by yihwang-nv Collaborator
3 of 4 tasks
[None][feat] AutoDeploy: Support Gemma4 mixed-shape pools in KVCacheManager
#13745 opened May 5, 2026 by eopXD Collaborator
3 tasks done
[None][fix] Gate cudaProfilerStart/Stop on iter_counter, not loop counter
#13744 opened May 5, 2026 by Tabrizian Member
3 of 4 tasks
[https://nvbugs/6115290][fix] Fix GPT OSS 120B GB200 Test Regression
#13743 opened May 5, 2026 by yijingl-nvidia Collaborator
1 task
[None][test] Add Wan 2.2 5B TI2V pipeline test in CI
#13739 opened May 4, 2026 by chang-l Collaborator
6 tasks done
[None][fix] rename torch -> pytorch in tests (label: Community want to contribute)
#13738 opened May 4, 2026 by tfogal
[None][infra] Allow MPI to truly be disabled. (label: Community want to contribute)
#13737 opened May 4, 2026 by tfogal
[None][fix] Drop allInputShapesSpecified asserts. (label: Community want to contribute)
#13735 opened May 4, 2026 by tfogal
[None][infra] Allow BUILD_DEEP_EP=OFF for wheels (label: Community want to contribute)
#13734 opened May 4, 2026 by tfogal