heyuhhh
e3f27e06c7
[None][chore] Waive star attention unittests ( #10439 )
...
Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>
2026-01-16 10:12:32 +08:00
Anish Shanbhag
faa80e73fd
[None][feat] Auto download speculative models from HF for pytorch backend, add speculative_model field alias ( #10099 )
...
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2026-01-14 21:06:07 -08:00
mpikulski
052c36ddd2
[TRTLLM-9522][feat] support image_embeds in OpenAI API ( #9715 )
...
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2026-01-14 10:31:03 +01:00
JunyiXu-nv
e291a834db
[TRTLLM-8462][feat] Support GET/DELETE v1/responses/{response_id} ( #9937 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2026-01-13 03:57:14 -05:00
JennyLiu
2967d299fb
[TRTLLM-10271][test] Add Spark QA functional and performance cases ( #10564 )
...
Signed-off-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
Co-authored-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
2026-01-13 13:20:15 +08:00
William Zhang
ff7eb93f31
[ https://nvbugs/5669097 ][tests] Add MMMU test for mistral small ( #10530 )
...
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2026-01-09 16:09:28 -08:00
Mike Iovine
db2614ef10
[ https://nvbugs/5772414 ][fix] Fix draft token tree depth=1 corner case ( #10385 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2026-01-05 17:20:14 +01:00
xinhe-nv
827d12caaf
[ https://nvbugs/5558516 ][test] add disaggregated stress test ( #9354 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-31 16:47:36 +08:00
Venky
dfa11d810e
[TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests ( #10005 )
2025-12-19 13:48:43 -05:00
JunyiXu-nv
356ad4fe3a
[ https://nvbugs/5722653 ][fix] Address port conflict by assigning different port section in the same node. ( #10035 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-19 15:34:04 +08:00
xinhe-nv
c1cfb61b1b
[TRTLLM-9381][feat] Add kimi k2 fp4 tests ( #9906 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-17 18:15:27 -08:00
Lizhi Zhou
bd13957e70
[TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic ( #9726 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-16 05:16:32 -08:00
Balaram Buddharaju
dfc8799352
[ https://nvbugs/5669114 ][fix] Switch to MMMU benchmark for Gemma3 27B ( #9966 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-14 21:23:59 -08:00
xinhe-nv
e8efeb765d
[TRTLLM-9717][fix] fix multi nodes tests cases ( #9736 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-12 10:14:23 +08:00
Frank
f6df9eb2a6
[TRTLLM-9089][chore] Port prepare_dataset into trtllm-bench ( #9250 )
2025-12-08 10:37:40 -08:00
JunyiXu-nv
6d2daec5d0
[TRTLLM-8274][feat] Check if executor is shutdown in /health entrypoint ( #9057 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-04 13:49:40 +08:00
dominicshanshan
6345074686
[None][chore] Weekly mass integration of release/1.1 -- rebase ( #9522 )
...
Signed-off-by: yunruis <205571022+yunruis@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
Signed-off-by: qgai <qgai@nvidia.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: Simeng Liu <simengl@nvidia.com>
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Vincent Zhang <vinczhang@nvidia.com>
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
Signed-off-by: Michal Guzek <moraxu@users.noreply.github.com>
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Co-authored-by: yunruis <205571022+yunruis@users.noreply.github.com>
Co-authored-by: sunnyqgg <159101675+sunnyqgg@users.noreply.github.com>
Co-authored-by: brb-nv <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: JunyiXu-nv <219237550+JunyiXu-nv@users.noreply.github.com>
Co-authored-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>
Co-authored-by: Guoming Zhang <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Vincent Zhang <vcheungyi@163.com>
Co-authored-by: peaceh-nv <103117813+peaceh-nv@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
Co-authored-by: Leslie Fang <leslief@nvidia.com>
Co-authored-by: Shunkangz <182541032+Shunkangz@users.noreply.github.com>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-11-29 21:48:48 +08:00
JunyiXu-nv
b7308a4000
[ https://nvbugs/5580099 ][fix] Cherry pick IMA issue fix from release/1.1 ( #9032 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-11-26 13:09:06 +08:00
Wanli Jiang
d100599ea7
[TRTLLM-9264][fix] Add accuracy/unit tests/doc for phi4mm ( #9246 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-11-26 11:12:35 +08:00
YueWeng
cc336c4abd
[TRTLLM-8160][feat] Add draft token tree runtime on CDL ( #8586 )
...
Signed-off-by: Yue Weng <25103990+yweng0828@users.noreply.github.com>
2025-11-25 09:40:55 -05:00
Barry Kang
a3433dd54e
[ https://nvbugs/5325296 ][fix] Enable relaxed acceptance test on Blackwell ( #8709 )
...
Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Ivy Zhang
25c0624750
[None][test] Clean cache for certain easily hang cases ( #8619 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Ivy Zhang
160b361588
[TRTLLM-8949][test] Add rcca test case for eagle3 consistency check ( #9088 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-11-18 05:55:00 -08:00
Wanli Jiang
ebdd1cc8e0
[TRTLLM-8119][feat] Update doc/tests/chat_template for nano-v2-vlm ( #8840 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-11-11 07:48:23 -08:00
Yechan Kim
0938a3ad2a
[ https://nvbugs/5644187 ][fix] Llava-Next MMMU bugfix and Phi4 test bugfix ( #9034 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-11 10:24:31 +09:00
Stanley Sun
def9c0004d
[TRTLLM-8113][test] Add pytorch workflow e2e tests with pp enabled ( #8357 )
...
Signed-off-by: Stanley Sun <stsun@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-04 16:42:31 +08:00
Yechan Kim
f48968b6cc
[TRTLLM-6928][fix] Refactor multimodal unittest ( #8453 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-03 06:01:07 -08:00
Pengyun Lin
2aade46d18
[TRTLLM-8214][feat] Support Qwen3 tool parser ( #8216 )
...
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
2025-10-29 15:48:29 +08:00
Yechan Kim
a6017f6266
[ https://nvbugs/5608723 ][fix] Use local data on multimodal tests and unwaive tests ( #8673 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-10-28 09:20:02 +09:00
Simeng Liu
2b27810198
[ https://nvbugs/5494718 ][fix] Fix Single GPU Multi-node issue and OOM on DGX Spark ( #8514 )
...
Signed-off-by: Simeng Liu <simengl@nvidia.com>
2025-10-24 19:09:07 -07:00
xinhe-nv
2aaedd08cd
[TRTLLM-8638][fix] fix test issues ( #8557 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-24 02:16:55 -04:00
xinhe-nv
04e2b2752a
[None][feat] add Nemotron-Ultra multi nodes eval tests ( #8577 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-23 02:44:26 -04:00
Pamela Peng
b818a912d7
[ https://nvbugs/5540752 ][fix] Support quantized Phi4 MM models ( #8190 )
...
Signed-off-by: Pamela <179191831+pamelap-nvidia@users.noreply.github.com>
2025-10-20 06:36:09 -04:00
Lizhi Zhou
982d4b65e8
[ https://nvbugs/5550671 ][fix] fix disagg-serving multinodes test failure ( #8307 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Ivy Zhang
1b559ba91d
[None][chore] Update test configs for release ( #8224 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Jin Li
206a9930df
[ https://nvbugs/5547435 ][fix] Fix a merge conflict ( #8365 )
...
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2025-10-15 10:43:10 +08:00
xinhe-nv
371fcb0338
[TRTLLM-8366][feat] add kimi multi nodes case ( #8025 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-13 21:36:03 -07:00
xinhe-nv
b555f1ff98
[None][chore] Add failed cases into waives.txt ( #8229 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-09 23:45:28 -07:00
Mike Iovine
7facac077b
[None][fix] Fix MTP illegal memory access ( #8161 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-07 14:02:55 -04:00
Faraz
27a5091fcb
[None][feat] GPT-OSS Sm120/Sm121 Support ( #7937 )
...
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Signed-off-by: list <58580514+farazkh80@users.noreply.github.com>
Signed-off-by: Vincent Huang <vincenth@nvidia.com>
Co-authored-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Co-authored-by: Vincent Huang <vincenth@nvidia.com>
2025-10-06 16:59:06 -04:00
Ivy Zhang
0ecafd84da
[None][chore] Update chunked prefill test case configs ( #7868 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-09-29 10:37:34 +08:00
Pamela Peng
b1dc84b4a3
[TRTLLM-7399][test] Add DS-R1/Qwen3 test cases for RTX 6000 ( #7662 )
...
Signed-off-by: Pamela <179191831+pamelap-nvidia@users.noreply.github.com>
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-09-24 11:40:26 -04:00
Enwei Zhu
a1a57e83b8
[TRTLLM-5235][feat] Enable regex and EBNF grammar in trtllm-serve ( #7925 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-09-24 18:30:23 +08:00
Lizhi Zhou
7550251988
[TRTLLM-7182][test] add multi-nodes test for disagg-serving ( #7470 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-09-24 08:31:56 +08:00
Wanli Jiang
f5bfd68a50
[ https://nvbugs/5509024 ][fix] Print full parsed outputs and update keywords for multimodal model ( #7670 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-22 14:28:38 +08:00
xinhe-nv
efb763402f
[None][chore] Add failed cases into waives.txt ( #7841 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-09-19 17:59:47 +08:00
sunnyqgg
80dd8fe197
[TRTLLM-6746][feat] Enable two-model spec dec for MTP Eagle ( #7001 )
...
Signed-off-by: qgai <qgai@nvidia.com>
2025-09-18 12:05:36 -04:00
Wanli Jiang
fe104dc20d
[TRTLLM-7918][feat] Support kvcache reuse and chunk prefill for phi4mm ( #7723 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-09-18 17:37:16 +08:00
Wanli Jiang
a7ca0fff54
[TRTLLM-6577][feat] Support nano_v2_vlm in pytorch backend ( #7207 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-09-18 16:26:20 +08:00
Ivy Zhang
26d50eb539
[TRTLLM-8070][test] add generation logits case for llama3 ( #7759 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-09-18 13:33:16 +08:00