Emma Qiao
653aa6b6dc
[None][infra] Waive failed tests for main 10/21 ( #8524 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-21 06:24:15 -04:00
Yan Chunwei
9ba5959e8e
[None][fix] the api_stability unify default values of None and inspect._empty ( #8496 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-10-21 16:57:40 +08:00
xinhe-nv
c566890624
[TRTLLM-8638][fix] Remove closed bugs ( #8478 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-21 03:48:58 -04:00
Pengyun Lin
a4227cf1b0
[None][feat] Support Qwen3 reasoning parser ( #8000 )
...
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
2025-10-21 14:08:39 +08:00
xinhe-nv
3264d605fb
[TRTLLM-8638][fix] Add failed cases into waives.txt ( #8486 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-21 01:20:29 -04:00
ruodil
ab4b9966b2
[TRTLLM-7287][test] add multimodal chunked_prefill cases ( #8011 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
2025-10-20 22:43:47 -04:00
mpikulski
87eb5086fb
[None][fix] restore list[list[list[int]]] in add_token ( #8502 )
...
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-10-20 22:34:57 -04:00
Suyog Gupta
7050b1ea49
[ #8272 ][feat] Enable chunked prefill for SSMs in AutoDeploy ( #8477 )
...
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
2025-10-20 15:31:52 -07:00
Venky
3e681e2a80
[None] [chore] Add architecture-specific ATTRIBUTIONS files ( #8468 )
...
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2025-10-20 16:29:15 -04:00
Lucas Liebenwein
55c468b218
[ #8461 ][feat] AutoDeploy: trtllm-serve bug fix + unit test ( #8462 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-10-20 16:06:39 -04:00
dongfengy
9b289d5230
[ https://nvbugs/5568676 ][fix] Remove test waive ( #8437 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
2025-10-20 12:03:50 -07:00
HuiGao-NV
d0663e16e0
[ https://nvbugs/5492250 ][fix] Remove isolated cases and unwaive cases ( #8492 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-10-20 07:40:07 -04:00
Pamela Peng
b818a912d7
[ https://nvbugs/5540752 ][fix] Support quantized Phi4 MM models ( #8190 )
...
Signed-off-by: Pamela <179191831+pamelap-nvidia@users.noreply.github.com>
2025-10-20 06:36:09 -04:00
mpikulski
97ce0ecefe
[TRTLLM-8436][feat] batched sampling and top-k logprobs improvements ( #8398 )
...
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-10-20 11:15:41 +02:00
QI JUN
d05079ba4b
[None][ci] move some test cases from H100 to A10 ( #8449 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-10-20 01:58:34 -04:00
Yi Zhang
3c2b3bd4d4
[TRTLLM-7255][feat] Add iteration log parser script for benchmark log ( #6942 )
...
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
2025-10-20 01:34:52 -04:00
ChristinaZ
c8b9998acb
[TRTLLM-8637][feat] Optimize the routing kernel for DeepseekV3 (MoE CUTLASS backend); Add support for KimiK2 and Qwen-next (MoE TRTLLM backend) ( #7761 )
...
Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com>
2025-10-20 10:08:31 +08:00
xiweny
f7722e2b65
[TRTLLM-4866] [test] Support waiving unit tests by waives.txt ( #8359 )
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-10-20 09:52:51 +08:00
xinhe-nv
9aa086d3bb
[None][chore] update test duration ( #8377 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-19 20:45:51 -04:00
Emma Qiao
796891ba2a
[None][infra] Skip a failed case in pre-merge for main on 10/19 ( #8479 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-19 22:19:00 +08:00
Bo Deng
dd25595ae8
[TRTLLM-7964][infra] Set nixl to default cache transceiver backend ( #7926 )
...
Signed-off-by: Bo Deng <deemod@nvidia.com>
2025-10-19 19:24:43 +08:00
Emma Qiao
e185173240
[None][infra] Waive test for main branch on 10/18 ( #8472 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-19 04:36:42 -04:00
brb-nv
7cc65a6296
[None][chore] Waive failing transceiver test ( #8473 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-10-18 17:22:10 -04:00
Lucas Liebenwein
41169fb20c
[None][feat] AutoDeploy: chunked prefill support ( #8158 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-10-18 00:47:35 -07:00
Kyle McGill
136e0e6882
[None][feat] Enable CUDA graph support for KvConnectorWorker API ( #8275 )
...
Signed-off-by: Kyle McGill <kmcgill@nvidia.com>
Signed-off-by: Kyle McGill <101670481+nv-kmcgill53@users.noreply.github.com>
2025-10-17 18:09:03 -04:00
Anish Shanbhag
5ff4f88be6
[TRTLLM-8683][chore] Migrate PluginConfig to Pydantic ( #8277 )
...
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2025-10-17 16:13:22 -04:00
h-guo18
55fed1873c
[None][chore] AutoDeploy: cleanup old inference optimizer configs ( #8039 )
...
Signed-off-by: h-guo18 <67671475+h-guo18@users.noreply.github.com>
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
Co-authored-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-10-17 15:55:57 -04:00
xinhe-nv
bc833d3de3
[TRTLLM-8638][fix] add waives tests ( #8445 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-17 03:37:53 -07:00
zhhuang-nv
7a2bab93f0
[None][test] Add post merge test for Seed-OSS-36B-Instruct ( #8321 )
...
Signed-off-by: Zhen Huang <145532724+zhhuang-nv@users.noreply.github.com>
2025-10-17 02:30:33 -07:00
yufeiwu-nv
1e1f430163
[None][test] Filter out all fp8 test case for A100. ( #8420 )
...
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-10-16 20:42:50 -07:00
Ivy Zhang
70a0f5beb6
[TRTLLM-8580][test] save runtime report periodically ( #8312 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-10-17 10:47:26 +08:00
John Calderon
46ee7acb33
[TRTLLM-6780][fix] Add multimodal data to dummy requests during memory profiling ( #7539 )
...
Signed-off-by: John Calderon <johncalesp@gmail.com>
Signed-off-by: John Calderon <jcalderon@nvidia.com>
Signed-off-by: john calderon <jcalderon@nvidia.com>
Signed-off-by: John Calderon <jcalderon@nvidia>
2025-10-16 17:49:22 +02:00
Yiqing Yan
05dd437084
[ https://nvbugs/5565541 ][fix] Add timeout threshold for H100 FHMA test ( #8354 )
...
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
bhsueh_NV
69325e1aa3
[ https://nvbugs/5574556 ][fix] fix bug of Qwen3_235B_A22B::test_fp8 CI ( #8351 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Lizhi Zhou
982d4b65e8
[ https://nvbugs/5550671 ][fix] fix disagg-serving multinodes test failure ( #8307 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Chuang Zhu
18a534d2b4
[ https://nvbugs/5465642 ][fix] Increase server timeout to wait weight loading ( #8297 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Enwei Zhu
526cad37d7
[ https://nvbugs/5568951 ][fix] Fix guided decoding disagg tests ( #8311 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Yechan Kim
4230639370
[ https://nvbugs/5550722 ][fix] Fix image load ( #8093 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Ivy Zhang
1b559ba91d
[None][chore] Update test configs for release ( #8224 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Ivy Zhang
4789c1e588
[TRTLLM-8246][test] add multimodal kvcache+chunked_prefill cases into QA test list ( #8212 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Ivy Zhang
be2ab98233
[None][chore] Update constraint for release ( #8211 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Yan Chunwei
4e51148088
[ https://nvbugs/5532023 ][fix] unwaive GenerationExecutor tests ( #8251 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Yukun He
179c7dc501
[ https://nvbugs/5536131 ][fix] Fix illegal access issue when scale is not provided in Llama3/4. ( #7960 )
...
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
sunnyqgg
dd61454d5f
[ https://nvbugs/5461761 ][fix] Unwaive eagle3 test ( #8363 )
...
Signed-off-by: qgai <qgai@nvidia.com>
2025-10-16 09:51:48 -04:00
Wangjue Yao
9865d3d770
[None][feat] Support cached tokens for Openai server ( #7637 )
...
Signed-off-by: wjueyao <wyao123@terpmail.umd.edu>
Co-authored-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
2025-10-16 20:51:37 +08:00
xinhe-nv
f70eff30b3
[TRTLLM-8638][fix] waive llama4 tests on H20 ( #8416 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-16 03:14:56 -07:00
HuiGao-NV
4e6a492aa3
[None][chore] Isolate several intermittent cases ( #8408 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-10-15 23:48:31 -07:00
Yan Chunwei
42ab473bb0
[ https://nvbugs/5583261 ][ci] waive test_fetch_responses_streaming_sync ( #8407 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-10-15 23:19:31 -07:00
Min Yu
0a0159fdd8
[ https://nvbugs/5378031 ] [feat] W4A8 AWQ MoE supports Per Expert Pre-quant Scale Factor for PyT backend ( #7286 )
...
Signed-off-by: Min Yu <171526537+yumin066@users.noreply.github.com>
2025-10-16 11:07:48 +08:00
xiweny
4143887370
[ https://nvbugs/5541494 ] [fix] Remove waivers ( #8353 )
...
Signed-off-by: xiweny <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-10-15 19:10:35 -07:00