Commit Graph

179 Commits

Author SHA1 Message Date
heyuhhh
e3f27e06c7
[None][chore] Waive star attention unittests (#10439)
Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>
2026-01-16 10:12:32 +08:00
Anish Shanbhag
faa80e73fd
[None][feat] Auto download speculative models from HF for pytorch backend, add speculative_model field alias (#10099)
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2026-01-14 21:06:07 -08:00
mpikulski
052c36ddd2
[TRTLLM-9522][feat] support image_embeds in OpenAI API (#9715)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2026-01-14 10:31:03 +01:00
JunyiXu-nv
e291a834db
[TRTLLM-8462][feat] Support GET/DELETE v1/responses/{response_id} (#9937)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2026-01-13 03:57:14 -05:00
JennyLiu
2967d299fb
[TRTLLM-10271][test] Add Spark QA functional and performance cases (#10564)
Signed-off-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
Co-authored-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
2026-01-13 13:20:15 +08:00
William Zhang
ff7eb93f31
[https://nvbugs/5669097][tests] Add MMMU test for mistral small (#10530)
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2026-01-09 16:09:28 -08:00
Mike Iovine
db2614ef10
[https://nvbugs/5772414][fix] Fix draft token tree depth=1 corner case (#10385)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2026-01-05 17:20:14 +01:00
xinhe-nv
827d12caaf
[https://nvbugs/5558516][test] add disaggregated stress test (#9354)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-31 16:47:36 +08:00
Venky
dfa11d810e
[TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests (#10005)
2025-12-19 13:48:43 -05:00
JunyiXu-nv
356ad4fe3a
[https://nvbugs/5722653][fix] Address port conflict by assigning different port section in the same node. (#10035)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-19 15:34:04 +08:00
xinhe-nv
c1cfb61b1b
[TRTLLM-9381][feat] Add kimi k2 fp4 tests (#9906)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-17 18:15:27 -08:00
Lizhi Zhou
bd13957e70
[TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic (#9726)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-16 05:16:32 -08:00
Balaram Buddharaju
dfc8799352
[https://nvbugs/5669114][fix] Switch to MMMU benchmark for Gemma3 27B (#9966)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-14 21:23:59 -08:00
xinhe-nv
e8efeb765d
[TRTLLM-9717][fix] fix multi nodes tests cases (#9736)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-12-12 10:14:23 +08:00
Frank
f6df9eb2a6
[TRTLLM-9089][chore] Port prepare_dataset into trtllm-bench (#9250)
2025-12-08 10:37:40 -08:00
JunyiXu-nv
6d2daec5d0
[TRTLLM-8274][feat] Check if executor is shutdown in /health entrypoint (#9057)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-12-04 13:49:40 +08:00
dominicshanshan
6345074686
[None][chore] Weekly mass integration of release/1.1 -- rebase (#9522)
Signed-off-by: yunruis <205571022+yunruis@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
Signed-off-by: qgai <qgai@nvidia.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: Simeng Liu <simengl@nvidia.com>
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Vincent Zhang <vinczhang@nvidia.com>
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
Signed-off-by: Michal Guzek <moraxu@users.noreply.github.com>
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Co-authored-by: yunruis <205571022+yunruis@users.noreply.github.com>
Co-authored-by: sunnyqgg <159101675+sunnyqgg@users.noreply.github.com>
Co-authored-by: brb-nv <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: JunyiXu-nv <219237550+JunyiXu-nv@users.noreply.github.com>
Co-authored-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>
Co-authored-by: Guoming Zhang <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Vincent Zhang <vcheungyi@163.com>
Co-authored-by: peaceh-nv <103117813+peaceh-nv@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
Co-authored-by: Leslie Fang <leslief@nvidia.com>
Co-authored-by: Shunkangz <182541032+Shunkangz@users.noreply.github.com>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-11-29 21:48:48 +08:00
JunyiXu-nv
b7308a4000
[https://nvbugs/5580099][fix] Cherry pick IMA issue fix from release/1.1 (#9032)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2025-11-26 13:09:06 +08:00
Wanli Jiang
d100599ea7
[TRTLLM-9264][fix] Add accuracy/unit tests/doc for phi4mm (#9246)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-11-26 11:12:35 +08:00
YueWeng
cc336c4abd
[TRTLLM-8160][feat] Add draft token tree runtime on CDL (#8586)
Signed-off-by: Yue Weng <25103990+yweng0828@users.noreply.github.com>
2025-11-25 09:40:55 -05:00
Barry Kang
a3433dd54e
[https://nvbugs/5325296][fix] Enable relaxed acceptance test on Blackwell (#8709)
Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Ivy Zhang
25c0624750
[None][test] Clean cache for certain easily hang cases (#8619)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Ivy Zhang
160b361588
[TRTLLM-8949][test] Add rcca test case for eagle3 consistency check (#9088)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-11-18 05:55:00 -08:00
Wanli Jiang
ebdd1cc8e0
[TRTLLM-8119][feat] Update doc/tests/chat_template for nano-v2-vlm (#8840)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-11-11 07:48:23 -08:00
Yechan Kim
0938a3ad2a
[https://nvbugs/5644187][fix] Llava-Next MMMU bugfix and Phi4 test bugfix (#9034)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-11 10:24:31 +09:00
Stanley Sun
def9c0004d
[TRTLLM-8113][test] Add pytorch workflow e2e tests with pp enabled (#8357)
Signed-off-by: Stanley Sun <stsun@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-11-04 16:42:31 +08:00
Yechan Kim
f48968b6cc
[TRTLLM-6928][fix] Refactor multimodal unittest (#8453)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-03 06:01:07 -08:00
Pengyun Lin
2aade46d18
[TRTLLM-8214][feat] Support Qwen3 tool parser (#8216)
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
2025-10-29 15:48:29 +08:00
Yechan Kim
a6017f6266
[https://nvbugs/5608723][fix] Use local data on multimodal tests and unwaive tests (#8673)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-10-28 09:20:02 +09:00
Simeng Liu
2b27810198
[https://nvbugs/5494718][fix] Fix Single GPU Multi-node issue and OOM on DGX Spark (#8514)
Signed-off-by: Simeng Liu <simengl@nvidia.com>
2025-10-24 19:09:07 -07:00
xinhe-nv
2aaedd08cd
[TRTLLM-8638][fix] fix test issues (#8557)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-24 02:16:55 -04:00
xinhe-nv
04e2b2752a
[None][feat] add Nemotron-Ultra multi nodes eval tests (#8577)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-23 02:44:26 -04:00
Pamela Peng
b818a912d7
[https://nvbugs/5540752][fix] Support quantized Phi4 MM models (#8190)
Signed-off-by: Pamela <179191831+pamelap-nvidia@users.noreply.github.com>
2025-10-20 06:36:09 -04:00
Lizhi Zhou
982d4b65e8
[https://nvbugs/5550671][fix] fix disagg-serving multinodes test failure (#8307)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Ivy Zhang
1b559ba91d
[None][chore] Update test configs for release (#8224)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-16 22:46:19 +08:00
Jin Li
206a9930df
[https://nvbugs/5547435][fix] Fix a merge conflict (#8365)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2025-10-15 10:43:10 +08:00
xinhe-nv
371fcb0338
[TRTLLM-8366][feat] add kimi multi nodes case (#8025)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-13 21:36:03 -07:00
xinhe-nv
b555f1ff98
[None][chore] Add failed cases into waives.txt (#8229)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-09 23:45:28 -07:00
Mike Iovine
7facac077b
[None][fix] Fix MTP illegal memory access (#8161)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-07 14:02:55 -04:00
Faraz
27a5091fcb
[None][feat] GPT-OSS Sm120/Sm121 Support (#7937)
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Signed-off-by: list <58580514+farazkh80@users.noreply.github.com>
Signed-off-by: Vincent Huang <vincenth@nvidia.com>
Co-authored-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Co-authored-by: Vincent Huang <vincenth@nvidia.com>
2025-10-06 16:59:06 -04:00
Ivy Zhang
0ecafd84da
[None][chore] Update chunked prefill test case configs (#7868)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-09-29 10:37:34 +08:00
Pamela Peng
b1dc84b4a3
[TRTLLM-7399][test] Add DS-R1/Qwen3 test cases for RTX 6000 (#7662)
Signed-off-by: Pamela <179191831+pamelap-nvidia@users.noreply.github.com>
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-09-24 11:40:26 -04:00
Enwei Zhu
a1a57e83b8
[TRTLLM-5235][feat] Enable regex and EBNF grammar in trtllm-serve (#7925)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-09-24 18:30:23 +08:00
Lizhi Zhou
7550251988
[TRTLLM-7182][test] add multi-nodes test for disagg-serving (#7470)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-09-24 08:31:56 +08:00
Wanli Jiang
f5bfd68a50
[https://nvbugs/5509024][fix] Print full parsed outputs and update keywords for multimodal model (#7670)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-22 14:28:38 +08:00
xinhe-nv
efb763402f
[None][chore] Add failed cases into waives.txt (#7841)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-09-19 17:59:47 +08:00
sunnyqgg
80dd8fe197
[TRTLLM-6746][feat] Enable two-model spec dec for MTP Eagle (#7001)
Signed-off-by: qgai <qgai@nvidia.com>
2025-09-18 12:05:36 -04:00
Wanli Jiang
fe104dc20d
[TRTLLM-7918][feat] Support kvcache reuse and chunk prefill for phi4mm (#7723)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-09-18 17:37:16 +08:00
Wanli Jiang
a7ca0fff54
[TRTLLM-6577][feat] Support nano_v2_vlm in pytorch backend (#7207)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-09-18 16:26:20 +08:00
Ivy Zhang
26d50eb539
[TRTLLM-8070][test] add generation logits case for llama3 (#7759)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-09-18 13:33:16 +08:00