Commit Graph

4081 Commits

Author SHA1 Message Date
zhanghaotong
36c9e7cfe6
[None][chore] Add unittest for otlp tracing (#8716)
Signed-off-by: zhanghaotong <zhanghaotong.zht@antgroup.com>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
2025-12-09 18:34:08 -08:00
dhansen-nvidia
2d33ae94d5
[https://nvbugs/5508301][feat] Move D->H copies to a worker thread whe… (#8463)
Signed-off-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>
Signed-off-by: dhansen-nvidia <218031328+dhansen-nvidia@users.noreply.github.com>
Co-authored-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>
2025-12-09 18:51:31 -05:00
Patrice Castonguay
414448bb37
[https://nvbugs/5719561][chore] Unwaive tests for nvbug 5719561 (#9801)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-09 18:21:50 -05:00
Patrice Castonguay
ff0ef19ee9
[https://nvbugs/5688388][chore] Unwaiving fixed disagg test (#9800)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-09 16:51:46 -05:00
Matt Lefebvre
5de4e3f621
[TRTINFRA-7328][infra] Consume SlurmCluster scratchPath and cleanup mounts (#9600)
Signed-off-by: Matt Lefebvre <mlefebvre@nvidia.com>
2025-12-09 13:34:09 -08:00
Eran Geva
4da3121363
[#8921][chore] AutoDeploy NanoV3 to use SYMM_MEM allreduce strategy (#9797)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2025-12-09 13:05:38 -08:00
Patrice Castonguay
7d7d05d8db
[None][chore] Adding flaky auto scaling test to waives (#9851)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-09 15:05:19 -05:00
Mike Iovine
07c76a5fac
[None][feat] Make 2-model spec dec use the 1-model kernels (Hopper) (#8810)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-12-09 11:06:31 -05:00
Dom Brown
3156f2e852
[https://nvbugs/5575841] [fix] Nvbug 5575841: Remove additional test waivers for TestMoEFP4 (#9788)
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
2025-12-09 13:37:55 +00:00
Emma Qiao
75bc386b65
[None][infra] Waive failed cases for main branch on 12/09 (#9839)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-09 19:39:29 +08:00
QI JUN
58c29957d9
[TRTLLM-9794][ci] move qwen3-next test cases to gb200 (#9827)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-09 01:58:25 -08:00
Stefan Niebler
d600b9f851
[TRTLLM-6756][feat] Update BeamSearch for TorchSampler (#9660)
Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>
2025-12-09 10:44:01 +01:00
Robin Kobus
76f49c903b
[None][fix] Additional model outputs for pipeline parallelism (#9794)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-12-09 10:41:22 +01:00
Yiqing Yan
2ddcb45b2a
[None][chore] Generate lock file for release/1.2.0rc4.post1 branch automatically (#9829)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-12-09 16:34:17 +08:00
yufeiwu-nv
fbcf03040f
[None][test] Refactor qa/llm_perf_nim.yml test list (#9700)
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-08 22:00:43 -08:00
QI JUN
252769c930
[TRTLLM-9794][ci] remove duplicated test cases in DGX B200 (#9817)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-08 21:51:30 -08:00
Zhanrui Sun
309f92ec09
[None][infra] Use artifactory pypi mirror for Cython install (#9774)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-12-09 13:49:41 +08:00
Shi Xiaowei
b050804b63
[TRTLLM-6537][infra] extend multi-gpu tests related file list (#9614)
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-12-09 12:54:53 +08:00
JunyiXu-nv
90890785eb
[https://nvbugs/5722653][fix] Fix config file used by disagg_client (#9783)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: JunyiXu-nv <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-08 20:34:55 -08:00
Balaram Buddharaju
bafb60c1bc
[None][chore] Fix tests failing on pre-merge 12/08 (#9819)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-08 20:08:52 -08:00
Bo Li
f2006a1f74
[https://nvbugs/5726066][infra] Waive timeout disaggregated/test_auto_scaling tests. (#9815)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2025-12-08 19:51:43 -08:00
TensorRT LLM
c7a2568872 [None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2025-12-09 03:19:48 +00:00
JunyiXu-nv
f521f6d910
[None][fix] Fix unterminated process issue for RemoteOpenAIServer (#9490)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-09 11:15:40 +08:00
Jiagan Cheng
4a3a66b124
[https://nvbugs/5677746][fix] Use first PP rank's schedule result in other PP ranks to fix PP hang (#9659)
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
2025-12-08 18:43:52 -08:00
bhsueh_NV
d6f961d3fe
[None][feat] Add llama4 scaling (#9771)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-12-09 10:27:39 +08:00
Tri Dao
1c4dacb19a
[None][fix] Fix PDL in TRTLLM MOE for dsv3 (#9799)
Signed-off-by: Tri Dao <daominhtri0503@gmail.com>
2025-12-09 10:16:29 +08:00
yuanjingx87
390391ebf1
[None][infra] Correct the waived test names due to a merge conflict (#9803)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
2025-12-09 09:48:21 +08:00
Chenghao Zhang
75f5446d67
[#9753][feat] AutoDeploy: Implement add rms_norm fusion (#9754)
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2025-12-08 14:24:27 -08:00
Jhao-Ting Chen
da074be037
[None][fix] Fix #8383 introduced TRTLLM backend python error (#9804)
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
2025-12-08 13:31:37 -08:00
Eran Geva
23cf72b0f8
[#8921][feat] Added symetric memory AllReduce strategy (#8919)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2025-12-08 13:12:56 -08:00
Thor Johnsen
f9380581c5
[https://nvbugs/5508267][fix] Proper handling of inactive canceled requests (#9280)
Signed-off-by: thorjohnsen <41591019+thorjohnsen@users.noreply.github.com>
2025-12-08 13:11:44 -08:00
Yibin Li
faabc1a387
[TRTLLM-7967][chore] Add more tests (#9415)
Signed-off-by: Yibin Li <109242046+yibinl-nvidia@users.noreply.github.com>
2025-12-08 11:57:32 -08:00
Jhao-Ting Chen
0a09465089
[https://nvbugs/5567586][feat] Ampere xqa swa specdec for GPT-OSS Eagle3-one-model (#8383)
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
2025-12-08 11:16:05 -08:00
Frank
f6df9eb2a6
[TRTLLM-9089][chore] Port prepare_dataset into trtllm-bench (#9250) 2025-12-08 10:37:40 -08:00
sunnyqgg
1c7b7cdd47
[TRTLLM-9506][fix] Fix AR for DeepSeek-R1 2 model path (#9661)
Signed-off-by: qgai <qgai@nvidia.com>
2025-12-08 10:12:32 -05:00
Eran Geva
98db262a67
[None][fix] Switch AutoDeploy's default allreduce strategy to NCCL (#9666)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2025-12-08 03:26:21 -08:00
Lizhi Zhou
52f78e4000
[http://nvbugs/5649010][fix] fix test_auto_scaling.py::test_worker_restart timeout (#9775)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-08 03:26:01 -08:00
fredricz-20070104
96d9b67d65
[https://nvbugs/5527655][test] Add test case for RCCA 5527655 (#9511)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2025-12-08 01:27:13 -08:00
fredricz-20070104
ededeecb0f
[None][test] Add Kimi k2 WIDEEP perf and accuracy cases (#9686)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-08 01:25:07 -08:00
Zheng Duan
e7395c6607
[None][infra] update mooncake in docker images (#9584)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
Signed-off-by: Zheng Duan <200704041+zhengd-nv@users.noreply.github.com>
2025-12-08 16:56:40 +08:00
xinhe-nv
3f55c07223
[None][chore] Remove closed bugs (#9770)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-12-07 22:51:55 -08:00
Guoming Zhang
448bb1a44f
[TRTLLM-9431][perf] Enable multistream for Linear Attention in Qwen3-… (#9696)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-12-08 13:39:12 +08:00
Li Min
a422d70be6
[None][chore] Enable tvm_ffi for cute dsl nvfp4_gemm to reduce host overhead. (#9690)
Signed-off-by: Mindy Li <11663212+limin2021@users.noreply.github.com>
2025-12-08 13:28:11 +08:00
Fanrong Li
2f526583fb
[None][chore] Move the rocketkv e2e test to post-merge (#9768)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-12-08 13:22:16 +08:00
Emma Qiao
137713a869
[None][infra] Waive failed cases for main on 12/08 (#9773)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-07 20:18:29 -08:00
ruodil
d232709568
[https://nvbugs/5666804][test] only adding sampler config for limited models (#9512)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
2025-12-07 19:40:29 -08:00
Kaiyu Xie
069b05cf3d
[TRTLLM-9706] [doc] Update wide EP documents (#9724)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-08 11:21:11 +08:00
TensorRT LLM
03f89d7aa4 [None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2025-12-08 03:07:49 +00:00
Yukun He
8b9ab9a701
[None][fix] Fix two tuning cache miss issues. (#9743)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-12-08 10:47:21 +08:00
fredricz-20070104
9bfb6179ec
[https://nvbugs/5422621][test] Add GB 200 WIDEEP test case for RCCA 5422621 (#9506)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2025-12-08 10:41:40 +08:00