Matthias Jouanneaux
|
eda1467061
|
[TRTLLM-5966][feat] Helix: add alltoall op (#6815)
Signed-off-by: Matthias Jouanneaux <mjoux@nvidia.com>
|
2025-09-25 07:18:29 -07:00 |
|
PeganovAnton
|
396c0ea677
|
[None][chore] relax version constraints on fastapi (#7935)
Signed-off-by: Anton Peganov <apeganov@nvidia.com>
Co-authored-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
|
2025-09-25 21:58:53 +08:00 |
|
Yueh-Ting (eop) Chen
|
c5012423f5
|
[None][chore] Remove developer name in comment (#7981)
Signed-off-by: eopXD <yuehtingc@nvidia.com>
|
2025-09-25 06:43:38 -07:00 |
|
Yan Chunwei
|
40c6103ef8
|
[None][doc] add Llama PP known issue to release note (#7959)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
|
2025-09-25 21:02:35 +08:00 |
|
Guoming Zhang
|
663ce3a4de
|
[None][doc] fix invalid links in perf benchmarking. (#7933)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-25 21:02:35 +08:00 |
|
Guoming Zhang
|
202bed4574
|
[None][chore] Rename TensorRT-LLM to TensorRT LLM for source code. (#7851)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-25 21:02:35 +08:00 |
|
QI JUN
|
961418908c
|
[https://nvbugs/5531963][fix] cherry pick #7725 (#7907)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-25 21:02:35 +08:00 |
|
Yan Chunwei
|
5999fab146
|
[https://nvbugs/5427043][fix] cherrypick: request length exceeds max_num_tokens (#7718)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-25 21:02:35 +08:00 |
|
Yan Chunwei
|
cb466a846d
|
[None][fix] api stability bug in status label (#7861)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-25 21:02:35 +08:00 |
|
Yan Chunwei
|
9d48898def
|
[None][doc] add stable label to all the un-labelled arguments in LLM class (#7863)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-25 21:02:35 +08:00 |
|
Zac Patel
|
c38d4cf6a6
|
[None][doc] Update Perf-Overview.md for release/1.0 (#7848)
Signed-off-by: zpatel <22306219+zbpatel@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-25 21:02:35 +08:00 |
|
Yan Chunwei
|
57c098956e
|
[None][doc] add a guide for modifying APIs (#7866)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-25 21:02:35 +08:00 |
|
Guoming Zhang
|
9f0f52249e
|
[None][doc] Rename TensorRT-LLM to TensorRT LLM for homepage and the … (#7850)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-25 21:02:35 +08:00 |
|
Guoming Zhang
|
5ecc8d0ee2
|
[None][doc] Replace the main in the examples' link with commit id. (#7837)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-25 21:02:35 +08:00 |
|
Yan Chunwei
|
5342c607cd
|
[https://nvbugs/5516710][fix] fix Llama 3.3 TP PP case (#7717)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-25 21:02:35 +08:00 |
|
Tao Li @ NVIDIA
|
44d7c3b245
|
[https://nvbugs/1234567][fix] Revert https://github.com/NVIDIA/TensorRT-LLM/pull/7768/files (#7813)
Signed-off-by: Tao Li
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-25 21:02:35 +08:00 |
|
Guoming Zhang
|
4a09be40f0
|
[None][doc] Update docker cmd in quick start guide and trtllm-serve … (#7787)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-25 21:02:35 +08:00 |
|
xinhe-nv
|
e30d9aced9
|
[https://nvbugs/4955671][fix] update test list (#7980)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-09-25 02:58:09 -07:00 |
|
Chuang Zhu
|
791e73edf6
|
[https://nvbugs/5536141][fix] fix_disagg_single_gpu_test (#7990)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-09-25 02:07:22 -07:00 |
|
Jinyang Yuan
|
b622cde5d5
|
[None][perf] Fix the tactic sorting in TrtllmGenBatchedGemmRunner::getValidConfigIndices (#7419)
Signed-off-by: Jinyang Yuan <154768711+jinyangyuan-nvidia@users.noreply.github.com>
|
2025-09-25 10:27:57 +02:00 |
|
Emma Qiao
|
cb53261aaf
|
[None][infra] Unwaive some tests since dev already have a PR to collect more info (#7984)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-09-25 01:03:13 -07:00 |
|
Wanli Jiang
|
22b45ff9c7
|
[TRTLLM-7758][feat] Phi4-mm image modality inference optimization (#7918)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2025-09-25 15:58:29 +08:00 |
|
WeiHaocheng
|
259cc66c34
|
[None][doc] scaffolding tech blog part one (#7835)
Signed-off-by: Fred Wei <20514172+WeiHaocheng@users.noreply.github.com>
Signed-off-by: zheyuf <zheyuf@NVIDIA.com>
Co-authored-by: zheyuf <zheyuf@NVIDIA.com>
|
2025-09-25 14:41:59 +08:00 |
|
fredricz-20070104
|
0945403174
|
[TRTLLM-6541][test] Add NIM perf test cases (#7924)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
|
2025-09-25 13:15:26 +08:00 |
|
Guoming Zhang
|
bb6067176f
|
[None][chore] Update the cuda and tensorrt version in homepage icons. (#7963)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-09-24 19:20:04 -07:00 |
|
Aurelien Chartier
|
98726a3bed
|
[None][chore] Update trtllm-bench documentation on setting FP8 KV cache (#7885)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
|
2025-09-25 09:28:53 +08:00 |
|
Void
|
336c2ef540
|
[None][feat] DeepEP LL fp8 dispatch/combine (#7927)
Signed-off-by: Yilin Zhang <18275976+yilin-void@users.noreply.github.com>
|
2025-09-25 09:20:24 +08:00 |
|
Iman Tabrizian
|
be7e51727e
|
[https://nvbugs/5456485][bug] unwaive triton test (#7966)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-09-24 17:02:55 -07:00 |
|
Leslie Fang
|
342014069e
|
[None][chore] Validate features combination (#7630)
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
|
2025-09-25 08:01:13 +08:00 |
|
Iman Tabrizian
|
da30d496b0
|
[None][fix] Revert "[None][feat] Return topk logprobs in torch backend (#7756)" (#7969)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-09-24 15:36:38 -07:00 |
|
sychen52
|
5a65af24cd
|
[OMNIML-2336][feat] Add NVFP4 x FP8 moe kernels (#7821)
Signed-off-by: Shiyang Chen <shiychen@nvidia.com>
|
2025-09-24 12:14:35 -07:00 |
|
Iman Tabrizian
|
6d45cd163e
|
[None][bug] Fix transformers version for Triton backend (#7964)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-09-24 12:55:52 -04:00 |
|
Mike Iovine
|
42c2ec3239
|
[https://nvbugs/5473781][fix] Fix llama 4 FP8 for PP>1 (#7220)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2025-09-24 12:16:27 -04:00 |
|
Pamela Peng
|
b1dc84b4a3
|
[TRTLLM-7399][test] Add DS-R1/Qwen3 test cases for RTX 6000 (#7662)
Signed-off-by: Pamela <179191831+pamelap-nvidia@users.noreply.github.com>
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
|
2025-09-24 11:40:26 -04:00 |
|
Yuxian Qiu
|
48fda86c56
|
[None][fix] Fix dummy load format for DeepSeek. (#7874)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
|
2025-09-24 23:03:16 +08:00 |
|
Macrocell
|
6e5e8b8a3b
|
[None][fix] fix get_iteration_stats IndexError (#7216)
Signed-off-by: yuhongwei <yumiao.yhw@antgroup.com>
Co-authored-by: yuhongwei <yumiao.yhw@antgroup.com>
|
2025-09-24 22:43:03 +08:00 |
|
Eran Geva
|
603517f72a
|
[#7675][feat] CapturedGraph to support max_batch_size > max(cuda_graph_batch_sizes) (#7888)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
|
2025-09-24 10:11:44 -04:00 |
|
Yuan Tong
|
51bef1beb0
|
[None][chore] cleanup build script (#7865)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
|
2025-09-24 21:15:01 +08:00 |
|
Perkz Zheng
|
60101eb8a5
|
[None][fix] trtllm-gen cubins compiled with wrong arch. (#7953)
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
|
2025-09-24 04:13:36 -07:00 |
|
HuiGao-NV
|
c8bda4b3a9
|
[None][ci] Waive some intermittent failures (#7955)
Signed-off-by: Hui Gao <huig@nvidia.com>
|
2025-09-24 19:00:38 +08:00 |
|
Necofish
|
cfbcf9b9e8
|
[None][feat] Support Seed-OSS model in pytorch backend (#7496)
Signed-off-by: Nekofish-L <liuxiangyang@mail.ustc.edu.cn>
|
2025-09-24 03:57:12 -07:00 |
|
Enwei Zhu
|
a1a57e83b8
|
[TRTLLM-5235][feat] Enable regex and EBNF grammar in trtllm-serve (#7925)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-09-24 18:30:23 +08:00 |
|
xinhe-nv
|
b8bfa63197
|
[None][chore] add test_w4_1gpu[True-True-cutlass-fp8] & TestKimiK2::test_fp8_blocks… (#7944)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-09-24 03:25:17 -07:00 |
|
QI JUN
|
18ff1e31b8
|
[None][ci] remove duplicate test cases (#7956)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-09-24 17:47:22 +08:00 |
|
yufeiwu-nv
|
f323b74d42
|
[None][test] Update llm_models_root to improve path handling on BareMetal environment (#7876)
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: ruodil <200874449+ruodil@users.noreply.github.com>
|
2025-09-24 17:35:57 +08:00 |
|
HuiGao-NV
|
29e63d3bc2
|
[https://nvbugs/5532248][fix] Fix fused_moe OOM (#7931)
Signed-off-by: Hui Gao <huig@nvidia.com>
|
2025-09-24 02:22:38 -07:00 |
|
JunyiXu-nv
|
6654b78c94
|
[https://nvbugs/5521799][fix] Trim incorrectly generated harmony messages (#7849)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2025-09-24 16:38:43 +08:00 |
|
Li Min
|
0252cee4c3
|
[None][chore] Recover cutlass-dsl pkg install and dsl op testing. (#7945)
Signed-off-by: Mindy Li <11663212+limin2021@users.noreply.github.com>
|
2025-09-24 15:45:18 +08:00 |
|
QI JUN
|
946ffcd2eb
|
[None][ci] optimize test cases of dgx b200 (#7948)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-09-24 00:39:45 -07:00 |
|
Cao Dong
|
2f8dc6feb0
|
[None][feat] Return topk logprobs in torch backend (#7756)
Signed-off-by: Dong Cao <docao@nvidia.com>
|
2025-09-24 15:30:39 +08:00 |
|