TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-16 15:55:08 +08:00

Author	SHA1	Message	Date
yuanjingx87	ca499d600d	[None][infra] Waive failed test in Post-Merge (#11491 ) Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>	2026-02-12 22:57:17 -08:00
Balaram Buddharaju	db35119c7c	[None][chore] Waive test blocking pre-merge (#11498 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2026-02-12 20:08:14 -08:00
xxi	2565f0f4e4	[TRTLLM-9108][feat] refactor MoE unit tests: add unified ConfigurableMoE test framework (#11437 ) Signed-off-by: xxi <xxi@nvidia.com>	2026-02-13 11:05:38 +08:00
Yukun He	cb1d8d130f	[TRTLLM-10791][feat] TorchSampler general host time optimization (#11141 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2026-02-12 18:05:58 +01:00
Pamela Peng	4b2b1d146b	[https://nvbugs/5810935 ][test] unwaive RTX 6000 pro tests (#11452 ) Signed-off-by: Pamela <179191831+pamelap-nvidia@users.noreply.github.com>	2026-02-12 11:17:45 -05:00
Wanli Jiang	421eb9e39c	[None][feat] Optimize NemotronH model with elementwise and nvfp4 fusion (#11273 ) Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>	2026-02-12 09:25:31 -05:00
xinhe-nv	ef7830d137	[None][chore] Add failed cases into waives.txt (#11447 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2026-02-12 07:47:25 -05:00
JennyLiu	11d79aa875	[https://nvbugs/5832481 ][test] Add gpt-oss-120b-Eagle3-throughput case on DGX-Spark (#11419 ) Signed-off-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com> Co-authored-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>	2026-02-12 05:33:39 -05:00
Tailing Yuan	31cdbdfd72	[https://nvbugs/5808500 ][chore] Move DeepEPLowLatency tests to machines that support IBGDA with GPU handles (#11178 ) Signed-off-by: Tailing Yuan <yuantailing@gmail.com>	2026-02-12 03:58:01 -05:00
mpikulski	d0f3c412ff	[TRTLLM-10030][chore] refactor finish reasons tests (#11445 ) Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>	2026-02-12 08:32:50 +01:00
xinhe-nv	3c1323442b	[None][chore] Add failed cases into waives.txt (#11451 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2026-02-12 02:31:34 -05:00
Simeng Liu	12085536df	[TRTLLM-10487][feat] Add user-provided UUID support for multimodal KV cache identification. (#11075 ) Signed-off-by: SimengLiu-nv <simengl@nvidia.com>	2026-02-12 00:48:47 -05:00
Perkz Zheng	e0b11d6ea0	[https://nvbugs/5804923 ][none] unwaive test (#11005 ) Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>	2026-02-12 13:26:28 +08:00
William Zhang	ca9537e17c	[TRTLLM-10858][feat] Multi-image support for EPD disagg (#11264 ) * Why? Prior to this commit, we only supported a single multimodal input for E/P/D disaggregated serving. * What? This commit does a minor refactor of the multimodal embedding handles that cross process boundaries to enable this. Existing unit tests are updated accordingly to test this. The `RequestOutput` has its `mm_embedding_handle` replaced in favor of `disaggregated_params`, addressing a previous TODO. Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>	2026-02-11 20:50:00 -08:00
xinhe-nv	42648734b8	[None][chore] Add failed cases into waives.txt (#11392 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com> Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>	2026-02-11 21:52:29 -05:00
Liao Lanyu	58165d5394	[None][chore] Introducing an abstract WaitingQueue interface to decouple the request scheduling logic from specific queue implementations (#11330 ) Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com> Signed-off-by: Lance Liao <108499334+lancelly@users.noreply.github.com> Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>	2026-02-12 09:18:24 +08:00
Emma Qiao	8ebd6056fa	[None][infra] Waive failed cases for main on 2/11 (#11441 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2026-02-11 15:25:52 +08:00
Bo Li	5ea6888dda	[https://nvbugs/5810940 ][fix] Update lm_eval to 4.9.10 and re-enable Skip Softmax Attention tests on CI. (#11176 ) Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com> Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com> Co-authored-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>	2026-02-11 00:54:40 -05:00
peihengh	a982554190	[https://nvbugs/5868038 ][fix] Gracefully terminate disagg serving servers to prevent leftover subprocess warnings (#11395 ) Signed-off-by: peihu-nv <259410613+peihu-nv@users.noreply.github.com>	2026-02-10 22:41:37 -05:00
Iman Tabrizian	7d992972b2	[TRTLLM-10273][feat] Move MambaCacheManager from Python to C++ (#10540 ) Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>	2026-02-10 07:20:56 -08:00
Yiqing Yan	cf02456613	[TRTLLM-9711][infra] Fix the testcase name in timeout xml (#9781 ) Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>	2026-02-10 18:50:42 +08:00
xinhe-nv	c7689df152	[None][chore] Add failed cases into waives.txt (#11396 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com> Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>	2026-02-10 05:50:16 -05:00
xinhe-nv	6e0659dc4d	[None][chore] Add failed cases into waives.txt (#11363 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com> Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>	2026-02-10 05:48:33 -05:00
dominicshanshan	2a4e70b4a9	[None][chore] Unwaive tests after last MI (#11400 ) Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2026-02-10 17:12:39 +08:00
Emma Qiao	8a74ccc57e	[None][infra] Waive failed cases for main branch on 02/10 (#11413 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2026-02-10 03:21:59 -05:00
Yuxian Qiu	5f4df89109	[None][feat] Fully non-blocking pipeline parallelism executor loop. (#10349 ) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2026-02-10 15:43:28 +08:00
shuyixiong	c3cdc93211	[TRTLLM-9771][feat] Make update_weights compatible with CUDA Graph (#11267 ) Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>	2026-02-10 01:12:49 -05:00
Lucas Liebenwein	a2fb5afecf	[#11032 ][feat] MLA revisited and GLM 4.7 Flash support (#11324 )	2026-02-09 23:26:51 -05:00
JennyLiu	b5508ed75b	[None][test] Add DGX-Spark multinode perf cases including eagle3 (#11184 ) Signed-off-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com> Co-authored-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>	2026-02-10 10:44:41 +08:00
Mike Iovine	f33086914f	[https://nvbugs/5843112 ][chore] Unwaive ngram test (#11320 ) Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2026-02-09 21:31:29 -05:00
Lucas Liebenwein	fe4c690b6c	[https://nvbugs/5855540 ][fix] AutoDeploy: thread cleanup of eagle test (#11289 ) Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>	2026-02-09 18:01:12 -05:00
Ziyi Xiong	e76b634251	[TRTLLM-10321][feat] Support different KV cache layout for one-model spec dec (#10502 ) Signed-off-by: ziyixiong-nv <219238287+ziyixiong-nv@users.noreply.github.com>	2026-02-10 05:16:02 +08:00
Mike Iovine	092f4ce774	[https://nvbugs/5853997 ][chore] Unwaive gpt-oss test (#11287 ) Signed-off-by: Mike Iovine <miovine@nvidia.com>	2026-02-09 16:04:41 -05:00
Patrice Castonguay	c68d916b6f	[None][chore] Unit test for disagg gen cancellation (#11108 ) Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>	2026-02-09 14:39:02 -05:00
Lizhi Zhou	e719721a60	[TRTLLM-10866][feat] implement disaggregated harmony chat (#11336 ) Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>	2026-02-09 12:09:03 -05:00
Ivy Zhang	9384cf8458	[https://nvbugs/5839569 ][test] update test constraint (#11054 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com> Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2026-02-09 23:53:40 +08:00
Emma Qiao	03b635bb08	[None][infra] Waive failed case for release on 1/28 (#11055 ) Signed-off-by: qqiao <qqiao@nvidia.com> Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2026-02-09 23:53:40 +08:00
Lizhi Zhou	1524c172a4	[https://nvbugs/5821433 ][fix] WAR for popen in QA env (#10989 ) Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com> Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2026-02-09 23:53:40 +08:00
Balaram Buddharaju	5f8b1b8cbb	[https://nvbugs/5811087 ][chore] Unwaive Gemma3 27B multimodal test (#11049 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com> Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2026-02-09 23:53:40 +08:00
Enwei Zhu	1ba039f044	[https://nvbugs/5819452 ][ci] Unwaive LLaMA2 7B FP8 case (#10997 ) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com> Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2026-02-09 23:53:40 +08:00
William Zhang	abb8106c01	[https://nvbugs/5835925 ][fix] Add EPD disagg support for Qwen3 VL MoE (#10962 ) * Why? Trying to instantiate a `MultimodalEncoder` for a Qwen3 VL MoE model would fail during weight loading. * What? This commit fixes the bug, alongside: - explicit, intentional support for EPD for Qwen3 VL MoE. - extends EPD unit tests for Qwen3 VL MoE, albeit with dummy weights. - unit tests for the weight mapper fixes. Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com> Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2026-02-09 23:53:40 +08:00
Jin Li	0ead17bb85	[https://nvbugs/5800646 ][fix] Fix hang issue by avoid exposing UB buf… (#10842 ) Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com> Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2026-02-09 23:53:40 +08:00
yingguo-trt	d348dd95a7	[None][feat] support Lyris GB200 and increase disagg test timeout (#11019 ) Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com> Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Co-authored-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2026-02-09 23:53:40 +08:00
yufeiwu-nv	fd4e6132e5	[None][test] Fix missing test cases (#10881 ) Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com> Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com> Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2026-02-09 23:53:40 +08:00
Stefan Niebler	d50010cd1f	[https://nvbugs/5769815 ][fix] Fix offset calculation in _are_stop_words when using speculative decoding (#10854 ) Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com> Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2026-02-09 23:53:40 +08:00
Lizhi Zhou	6c4e0c3dbe	[https://nvbugs/5826689 ][fix] replace etcd3 with etcd-sdk-python (#10886 ) Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com> Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2026-02-09 23:53:40 +08:00
Emma Qiao	c659280445	[None][infra] Waive failed cases for release branch on 01/26 (#10999 ) Signed-off-by: qqiao <qqiao@nvidia.com> Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2026-02-09 23:53:40 +08:00
Pengbo Wang	59f59efb83	[https://nvbugs/5779536 ][fix] Unwaive DeepSeekR1 nvfp4 pp4 mtp test case (#10902 ) Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com> Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2026-02-09 23:53:40 +08:00
JunyiXu-nv	90ea6c1e09	[https://nvbugs/5804146 ][fix] Enable responses tests and remove ds to… (#10925 ) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com> Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2026-02-09 23:53:40 +08:00
mpikulski	196d94a419	[TRTLLM-10030][perf] avoid syncs in beam search + other improvements (#11349 ) Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>	2026-02-09 16:13:58 +01:00

2846 Commits