Commit Graph

4120 Commits

Author SHA1 Message Date
dominicshanshan
093465ed29
[https://nvbugs/5599176][fix] Unwaive fixed test for Ray (#9861)
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-12-12 11:24:05 +08:00
TensorRT LLM
0132769c22
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2025-12-12 03:20:43 +00:00
Yiqing Yan
5065b60cd1
[None][infra] Fix mergeWaiveList stage (#9892)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-12-12 11:19:42 +08:00
xinhe-nv
e8efeb765d
[TRTLLM-9717][fix] fix multi nodes tests cases (#9736)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-12 10:14:23 +08:00
Chuang Zhu
4670e0c297
[None][infra] update ucx to 1.20 (#9786)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-12-12 09:49:46 +08:00
JunyiXu-nv
710c592d7c
[https://nvbugs/5727517][fix] Preserve ip:port for disagg (#9859)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-12 09:45:34 +08:00
Kanghwan
98c68c195b
[None][infra] Ignore comments from bots and CI accounts (#9929)
Signed-off-by: Kanghwan Jang <861393+karljang@users.noreply.github.com>
2025-12-12 09:20:51 +08:00
jthomson04
4f6d4da035
[None][perf] Fix TPOT when min_tokens set (#9862)
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
2025-12-11 13:55:31 -08:00
Kanghwan
95d928f071
[None][infra] Add workflow to auto-label 'waiting for feedback' on team comments (#9886)
Signed-off-by: Kanghwan Jang <861393+karljang@users.noreply.github.com>
2025-12-12 05:43:30 +08:00
Venky
fd1270b9ab
[TRTC-43] [feat] Add config db and docs (#9420)
Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
Co-authored-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
2025-12-12 04:00:03 +08:00
Simeng Liu
24f92721f2
[https://nvbugs/5597647][ci] Unwaive fixed tests. (#9812)
Signed-off-by: SimengLiu-nv <simengl@nvidia.com>
2025-12-12 02:29:30 +08:00
Erin
89dabf5aa1
[TRTLLM-9736][feat] AsyncLLM and verl integ (#9353)
Signed-off-by: Liwei Ma <liweim@nvidia.com>
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Liwei Ma <liweim@nvidia.com>
Co-authored-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Co-authored-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-12-11 09:33:25 -08:00
JadoTu
02edb19f43
[None] [feat] add eos_token_id in generation_config to sampling params (#9514)
Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>
2025-12-12 00:52:03 +08:00
xxi
488d38f88d
[TRTLLM-8959][feat] ConfigurableMoE support CUTLASS (#9772)
2025-12-12 00:22:13 +08:00
Fanrong Li
af2849cc7a
[None][doc] Add DeepSeek-V3.2 to the supported models (#9893)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-12-11 18:04:48 +08:00
Yan Chunwei
04a39a4e2b
[None][chore] enable test_ipc.py (#9865)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-12-11 17:47:14 +08:00
Zongfei Jing
c76b428e2e
[TRTLLM-9685] [feat] Add gather fc1 kernel by cuteDSL (#9618)
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
2025-12-11 16:21:32 +08:00
ChristinaZ
b8a5159fad
[None][feat] Enable PDL for indexer topK (#9843)
Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com>
2025-12-11 14:31:23 +08:00
Kanghwan
d147ad053e
[#2730][fix] Fix circular import bug in medusa/weight.py (#9866)
Signed-off-by: Kanghwan Jang <861393+karljang@users.noreply.github.com>
2025-12-11 13:51:08 +08:00
JunyiXu-nv
454e7e59e5
[https://nvbugs/5718004][fix] Add warmup for cancellation test (#9860)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-11 12:20:33 +08:00
Ziyi Xiong
81222c3670
[None] Fix warning when capturing CUDA graph (#9746)
Signed-off-by: ziyixiong-nv <219238287+ziyixiong-nv@users.noreply.github.com>
2025-12-10 19:22:38 -08:00
Bo Deng
c1d53ee43d
[https://nvbugs/5582258][fix] unwaive (#9650)
Signed-off-by: Bo Deng <deemod@nvidia.com>
2025-12-10 19:18:30 -08:00
fredricz-20070104
341cb1a12c
[None][chore] Add GB300 support since it does not support segment (#9731)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2025-12-10 18:36:55 -08:00
Patrice Castonguay
2c0293c612
[https://nvbugs/5601682][fix] Unwaiving disagg test (#9627)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-10 13:42:26 -05:00
Tian Zheng
ece3a8748f
[None][doc] Update doc for NVFP4 KV cache (#9475)
Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
2025-12-10 06:20:12 -08:00
cheshirekow
2f030312a8
[TRTLLM-9228][infra] Verify thirdparty C++ process (#9367)
Signed-off-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com>
Co-authored-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com>
2025-12-10 21:01:19 +08:00
Yiqing Yan
1c11cae54d
[None][chore] bump version to 1.2.0rc6 (#9874)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-12-10 04:53:26 -08:00
Yukun He
072f236002
[None][fix] Fully resolve the tactic recovery issues in AutoTuner serialized cache (#9835)
Restrict tactic types to those compatible with AutoTuner cache serialization and deserialization.

Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-12-10 20:41:04 +08:00
Matt Lefebvre
df1adfbb50
[TRTINFRA-7328][infra] - Move half B200 tests to lbd (#9853)
Signed-off-by: Matt Lefebvre <mlefebvre@nvidia.com>
2025-12-10 04:24:30 -08:00
Brian K. Ryu
8cec2da375
[None][feat] Port fp4 quantization kernel optimization from FlashInfer (#9854)
Signed-off-by: Brian Ryu <bryu@nvidia.com>
Co-authored-by: Nikita Korobov <14355239+nekorobov@users.noreply.github.com>
2025-12-10 13:13:48 +01:00
Matt Lefebvre
8fefa2c9d1
[None][infra] Fail fast if SLURM entrypoint fails (#9744)
Signed-off-by: Matt Lefebvre <mlefebvre@nvidia.com>
2025-12-10 02:31:29 -08:00
Perkz Zheng
e34302986d
[https://nvbugs/5727952][fix] PDL bugs with trtllm-gen fmha kernels (#9863)
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
2025-12-10 01:47:03 -08:00
Guoming Zhang
12693a526b
[None][chore] Enable L0 multi-gpus testing for Qwen3-next (#9789)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-12-10 17:11:32 +08:00
Zhanrui Sun
49fe089470
[TRTLLM-9811][infra] Update urllib3 version >= 2.6.0 to fix high vulnerability issue (#9823)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-12-10 00:18:11 -08:00
dominicshanshan
0e78a4b244
[https://nvbugs/5702791][fix] Unwaive fixed test (#9844)
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-12-10 14:01:44 +08:00
Yukun He
979f37e443
[None][fix] Fix nvfp4 gemm allowed backends arg passing (#9837)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-12-09 20:09:53 -08:00
QI JUN
2c46126a93
[TRTLLM-9794][ci] move some deepseek test cases to gb200 (#9841)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-09 19:54:51 -08:00
Bo Li
9d3c675a0b
[None][chore] Support larger topK for NVLinkOneSided AlltoAll. (#9816)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2025-12-10 11:10:55 +08:00
TensorRT LLM
6a39bb983c
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2025-12-10 03:07:34 +00:00
zhanghaotong
36c9e7cfe6
[None][chore] Add unittest for otlp tracing (#8716)
Signed-off-by: zhanghaotong <zhanghaotong.zht@antgroup.com>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
2025-12-09 18:34:08 -08:00
dhansen-nvidia
2d33ae94d5
[https://nvbugs/5508301][feat] Move D->H copies to a worker thread whe… (#8463)
Signed-off-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>
Signed-off-by: dhansen-nvidia <218031328+dhansen-nvidia@users.noreply.github.com>
Co-authored-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>
2025-12-09 18:51:31 -05:00
Patrice Castonguay
414448bb37
[https://nvbugs/5719561][chore] Unwaive tests for nvbug 5719561 (#9801)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-09 18:21:50 -05:00
Patrice Castonguay
ff0ef19ee9
[https://nvbugs/5688388][chore] Unwaiving fixed disagg test (#9800)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-09 16:51:46 -05:00
Matt Lefebvre
5de4e3f621
[TRTINFRA-7328][infra] Consume SlurmCluster scratchPath and cleanup mounts (#9600)
Signed-off-by: Matt Lefebvre <mlefebvre@nvidia.com>
2025-12-09 13:34:09 -08:00
Eran Geva
4da3121363
[#8921][chore] AutoDeploy NanoV3 to use SYMM_MEM allreduce strategy (#9797)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2025-12-09 13:05:38 -08:00
Patrice Castonguay
7d7d05d8db
[None][chore] Adding flaky auto scaling test to waives (#9851)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-12-09 15:05:19 -05:00
Mike Iovine
07c76a5fac
[None][feat] Make 2-model spec dec use the 1-model kernels (Hopper) (#8810)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-12-09 11:06:31 -05:00
Dom Brown
3156f2e852
[https://nvbugs/5575841] [fix] Nvbug 5575841: Remove additional test waivers for TestMoEFP4 (#9788)
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
2025-12-09 13:37:55 +00:00
Emma Qiao
75bc386b65
[None][infra] Waive failed cases for main branch on 12/09 (#9839)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-09 19:39:29 +08:00
QI JUN
58c29957d9
[TRTLLM-9794][ci] move qwen3-next test cases to gb200 (#9827)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-09 01:58:25 -08:00