TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Author	SHA1	Message	Date
Bo Li	a66eeab537	[TRTLLM-9805][feat] Skip Softmax Attention. (#9821 ) Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com> Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com> Co-authored-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>	2025-12-21 02:52:42 -05:00
Balaram Buddharaju	dcd3f7b5ea	[https://nvbugs/5744427 ][fix] Fix accuracy test OOM (#10173 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-21 02:03:38 -05:00
Bo Li	77e37d9dd0	[https://nvbugs/5753250 ][infra] Further waive all tests in _test_openai_responses.py (#10176 ) Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>	2025-12-20 10:25:14 -05:00
Enwei Zhu	2ce785f39a	[https://nvbugs/5643631 ][fix] Fix hostfunc seg fault (#10028 ) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2025-12-20 07:58:43 -05:00
Yuxian Qiu	3b3069b390	[https://nvbugs/5747930 ][fix] Use offline tokenizer for whisper models. (#10121 ) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2025-12-20 09:42:07 +08:00
Anish Shanbhag	7c82605327	[None][fix] enable KV cache reuse for config database (#10094 )	2025-12-19 15:16:56 -08:00
Balaram Buddharaju	bee9051484	[None][chore] Waive timing out pre-merge test (#10167 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-19 17:56:33 -05:00
Gal Hubara-Agam	20b69a982a	[#10056 ][test] AutoDeploy: Add accuracy test for Nemotron SuperV3 (#10131 ) Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com> Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com> Co-authored-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>	2025-12-19 13:28:42 -08:00
Chang Liu	5489d188a4	[None][fix] Revert the change and remove device count guard for DSv32 (#9631 ) Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>	2025-12-19 15:00:55 -05:00
longcheng-nv	b882393d69	[https://nvbugs/5720357 ][fix] Fix indice offset overflow in custom Top-K kernel and corresponding UT case (#10027 ) Signed-off-by: longcheng-nv <243710427+longcheng-nv@users.noreply.github.com> Co-authored-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>	2025-12-19 14:58:01 -05:00
Venky	dfa11d810e	[TRTC-102][docs] `--extra_llm_api_options`->`--config` in docs/examples/tests (#10005 )	2025-12-19 13:48:43 -05:00
JunyiXu-nv	7b71ff6b8a	[https://nvbugs/5722653 ][fix] Unwaive fixed test (#10157 ) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>	2025-12-19 11:19:20 -05:00
xxi	27e49e2904	[None][fix] waive the failed test test_service_discovery[etcd-load_ba… (#10161 ) Signed-off-by: xxi <xxi@nvidia.com>	2025-12-19 06:14:26 -08:00
xinhe-nv	7b51e3cedb	[TRTLLM-8638][fix] Add failed cases into waives.txt (#10129 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2025-12-19 17:55:17 +08:00
Emma Qiao	dd8ce68c94	[None][infra] Update waive and waive failed tests for main branch on 12/19 (#10151 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-19 01:20:42 -08:00
Pengyun Lin	ac03915dc3	[TRTLLM-9604][feat] DS R1 & V3.1 tool parser (#10010 ) Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>	2025-12-19 17:20:03 +08:00
Chang Liu	31bc14b350	[TRTLLM-9654][feat] Support DeepSeek-V32 chat template (#9814 ) Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>	2025-12-19 17:05:38 +08:00
yufeiwu-nv	52cee573ad	[TRTLLM-8830][test] Overlap scheduler enhancement perf test: Add qwen3_0,8b and llama3.1 test cases (#10114 ) Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>	2025-12-19 17:01:52 +08:00
xinhe-nv	cb0444b1b5	[TRTLLM-8638][fix] Add failed cases into waives.txt (#10132 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>	2025-12-19 16:07:56 +08:00
JunyiXu-nv	356ad4fe3a	[https://nvbugs/5722653 ][fix] Address port conflict by assigning different port section in the same node. (#10035 ) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>	2025-12-19 15:34:04 +08:00
William Zhang	478b6b20a1	[#9230][refactor] Replace nemotron patches with custom model implementation (#9751) Why? Patching for nemotron H models was growing out of hand, and made certain optimizations more complex than they needed to be. What? This commit finally gets rid of them, and replaces them with the custom model implementation in `modeling_nemotron_h.py`. Closes #9230 Closes NvBug 5747867 Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>	2025-12-18 19:36:27 -08:00
Balaram Buddharaju	72c5480dfb	[None][chore] Waive test blocking pre-merge 12/18 (#10145 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-18 19:12:05 -08:00
Ivy Zhang	9aa40871c2	[TRTLLM-9840][test] switch ucx backend to default backend (#10101 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2025-12-18 18:54:15 -08:00
Wangjue Yao	9f283f330b	[None][feat] Support Mooncake transfer engine as a cache transceiver backend (#8309 ) Signed-off-by: wjueyao <wyao123@terpmail.umd.edu> Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co> Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>	2025-12-19 10:09:51 +08:00
Chuang Zhu	e0b2a94309	[None][fix] Fix ready signal in NIXL backend (#10000 ) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>	2025-12-19 09:43:40 +08:00
Yukun He	bd5b3c2ac0	[https://nvbugs/5721912 ][chore] Unwaive the test (#10108 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2025-12-19 09:12:25 +08:00
Anish Shanbhag	91a9ae42d2	[TRTC-71][feat] Add regression testing for config database (#9832 ) Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>	2025-12-18 16:15:38 -08:00
Balaram Buddharaju	799a2ae311	[https://nvbugs/5741331 ][fix] Fix helix accuracy test (#10021 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-18 15:27:53 -08:00
Chang Liu	a97e411b44	[https://nvbugs/5747911 ][fix] Use offline data path for the unit test of mmencoder server (#10135 ) Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>	2025-12-18 15:19:23 -08:00
Lizhi Zhou	f02782a6f2	[https://nvbugs/5726066 ][fix] fix auto-scaling related failures (#9845 ) Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com> Co-authored-by: Emma Qiao <qqiao@nvidia.com>	2025-12-18 16:37:48 -05:00
CarstyYou	0b279f4ad4	[https://nvbugs/5456493 ][feat] Add fp8 bmm on sm120 (#9687 ) Signed-off-by: CarstyYou <186021327+CarstyYou@users.noreply.github.com>	2025-12-18 22:57:20 +08:00
ZhichenJiang	4e55b83101	[None][perf] Add more optimization options for MOE CuteDSL finalized kernel (#10042 ) Signed-off-by: zhichen jiang <zhichenj@NVIDIA.com>	2025-12-18 22:49:28 +08:00
Bo Li	9d7e038bcb	[https://nvbugs/5753250 ][infra] Waive _test_openai_responses. (#10110 ) Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>	2025-12-18 00:15:06 -08:00
Emma Qiao	33a90f2dd2	[None][infra] Waive failed cases for main branch on 12/18 (#10105 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-17 21:35:45 -08:00
Yuxian Qiu	bec864a78c	[None][fix] avoid ID conversion for non enable_configurable_moe cases. (#10003 ) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2025-12-18 13:29:52 +08:00
Wanli Jiang	601c29ca73	[https://nvbugs/5721644 ][fix] Update tests for nemotron_h (#9993 ) Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>	2025-12-18 12:38:02 +08:00
Lucas Liebenwein	76ec820465	[#7532 ][feat] AutoDeploy: gather logits before lm head (#9962 ) Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com> Co-authored-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>	2025-12-17 19:50:13 -08:00
xinhe-nv	4a98f190a8	[None][chore] Add failed cases into waives.txt (#10025 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-12-17 19:13:52 -08:00
xinhe-nv	c1cfb61b1b	[TRTLLM-9381][feat] Add kimi k2 fp4 tests (#9906 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-12-17 18:15:27 -08:00
Chenghao Zhang	22c6e8a424	[None][fix] Autodeploy: fix some legacy flashinfer attention test errors (#9928 ) Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>	2025-12-17 12:27:22 -08:00
yufeiwu-nv	5d71f662c3	[https://nvbugs/5698434 ][test] Add Qwen3-4B-Eagle3 One-model perf test (#10041 ) Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>	2025-12-17 13:37:25 +08:00
Emma Qiao	0dbf3948cc	[None][infra] Waive failed tests due to llm model files (#10068 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-16 20:12:57 -08:00
JunyiXu-nv	6649c3743c	[https://nvbugs/5635153 ][chore] Remove responses tests from waive list (#10026 ) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>	2025-12-17 11:22:02 +08:00
shuyixiong	26fb063076	[https://nvbugs/5741060 ][fix] Fix pg op test (#9989 ) Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>	2025-12-17 09:44:25 +08:00
Aurelien Chartier	7175d89b48	[None][fix] Fix iteration stats for spec-dec (#9855 ) Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>	2025-12-16 14:11:38 -08:00
Lizhi Zhou	bd13957e70	[TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic (#9726 ) Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>	2025-12-16 05:16:32 -08:00
Enwei Zhu	609d1d0383	[None][fix] Fix Illegal Memory Access for CuteDSL Grouped GEMM (#10008 ) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2025-12-16 04:06:49 -08:00
Emma Qiao	12727ebd7f	[None][infra] Waive failed test for main branch on 12/16 (#10029 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-16 02:54:32 -08:00
Wanli Jiang	8af51211c1	[FMDL-1222][feat] Support weight and weight_scale padding for NVFP4 MoE cutlass (#9358 ) Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>	2025-12-16 12:41:17 +08:00
Eran Geva	ce7a42f4cf	[https://nvbugs/5731717 ][fix] fixed flashinfer build race condition during test (#9983 ) Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>	2025-12-15 20:30:24 -08:00
Yechan Kim	8ba8699f66	[TRTLLM-8310][feat] Add Qwen3-VL-MoE (#9689 ) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>	2025-12-15 20:05:20 -08:00
ChristinaZ	dff77efa2a	[None][feat] Add routing support for the new model for both cutlass and trtllm moe backend (#9792 ) Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com>	2025-12-15 19:59:08 -08:00
xinhe-nv	cdf56c278f	[TRTLLM-8638][fix] Add failed cases into waives.txt New activity. (#9979 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2025-12-15 18:59:13 -08:00
Michal Guzek	e6187d8109	[https://nvbugs/5708810 ][fix] Fix TRTLLMSampler (#9710 ) Signed-off-by: Michal Guzek <mguzek@nvidia.com>	2025-12-15 23:26:52 +01:00
Patrice Castonguay	9ba14263db	[https://nvbugs/5673559 ][fix] Unwaiving disagg test for nvbug 5673559 (#9957 ) Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>	2025-12-15 12:32:15 -05:00
Emma Qiao	d5d15c06df	[None][infra] Waive failed tests for main branch on 12/15 (#10001 ) Signed-off-by: qqiao <qqiao@nvidia.com> Signed-off-by: Yanchao Lu <yanchaol@nvidia.com> Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>	2025-12-16 01:29:43 +08:00
Kaiyu Xie	44b0f8c3ed	[None] [fix] Revert "[None] [feat] add eos_token_id in generation_config to sampling params" (#10002 )	2025-12-15 08:52:52 -08:00
Wanli Jiang	3230fbe79a	[None][feat] Update reasoning parser for nano-v3 (#9944 ) Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>	2025-12-15 05:39:37 -08:00
Yukun He	9e7182b603	[TRTLLM-9615][feat] Implement a distributed tuning system (#9621) Four distinct strategies are implemented to accommodate different distributed tuning scenarios: BROADCAST, INDEPENDENT, MERGE, and PARALLEL. Distributed tuning is disabled by default, with the INDEPENDENT strategy as the fallback. This conservative approach prevents unexpected behavior in standard use cases. Only operations with significant tuning time overhead have been assigned the PARALLEL strategy, which allows the same tensor parallelism (TP) rank to tune tactics concurrently across different ranks. This targeted approach balances performance gains with stability. Operations with nested tuning structures, such as NVFP4GemmUnifiedRunner, currently support only the INDEPENDENT strategy. This restriction exists because the synchronization mechanism is optimized only for leaf operations and doesn't yet handle nested hierarchies. Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2025-12-15 21:08:53 +08:00
Bo Li	9eb5a229dd	[None][infra] Fully waive test_worker_restart test_disagg_server_restart. (#9988 ) Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>	2025-12-15 01:26:18 -08:00
Grzegorz Kwasniewski	83885c69e7	[TRTLLM-9136][feat] 2D parallel EP TP support (#9459 ) Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>	2025-12-15 09:52:29 +01:00
xinhe-nv	3c98b25005	[None][chore] Add failed cases into waives.txt (#9941 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-12-14 23:14:24 -08:00
JunyiXu-nv	af899d2fe7	[TRTLLM-9860][doc] Add docs and examples for Responses API (#9946 ) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>	2025-12-14 21:46:13 -08:00
Ziyi Xiong	f2aee0db03	[TRTLLM-9854][feat] Optimize the host overhead of _sample_async (#9935 ) Signed-off-by: ziyixiong-nv <219238287+ziyixiong-nv@users.noreply.github.com>	2025-12-15 13:28:54 +08:00
shuyixiong	25db9e7b3e	[https://nvbugs/5741060 ][chore] Waive all pg operator tests (#9991 ) Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>	2025-12-14 21:24:43 -08:00
Balaram Buddharaju	dfc8799352	[https://nvbugs/5669114 ][fix] Switch to MMMU benchmark for Gemma3 27B (#9966 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-14 21:23:59 -08:00
Fanrong Li	8f144d9282	[TRTLLM-9416][feat] Skip DS-v3.2 indexer MQA and Top-K for short sequences. (#9524 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2025-12-15 12:42:25 +08:00
QI JUN	b57650f1e6	[TRTLLM-9794][ci] move test cases of gpt-oss to gb200 (#9934 ) Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>	2025-12-14 19:21:54 -08:00
xxi	f5696df285	[TRTLLM-8961][feat] ConfigurableMoE support DeepGemm (#9858 )	2025-12-15 10:47:15 +08:00
dominicshanshan	4bf42f8fa8	[https://nvbugs/5580297 ][fix] Skip capture request error test from Ray stage (#9947 ) Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2025-12-15 10:03:16 +08:00
Simeng Liu	f21e2b3329	[TRTLLM-9601][feat] Expose mmKeys for multimodal to integrate with dynamo. (#9604 ) Signed-off-by: SimengLiu-nv <simengl@nvidia.com>	2025-12-15 08:42:30 +08:00
Emma Qiao	e0a4b72279	[None][infra] Waive failed tests for main branch on 12/14 (#9982 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-14 22:48:34 +08:00
Mike Iovine	96d654029d	[https://nvbugs/5666816 ][fix] Unwaive llama3 eagle3 test (#9964 ) Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2025-12-14 15:07:35 +08:00
nvxuanyuc	a5a37227d6	[None][feat] Fused kernels (qknormrope + moe routing) and two-model MTP support for glm4moe (#9852 ) Signed-off-by: Xuanyu Chen <xuanyuc@nvidia.com>	2025-12-14 10:47:24 +08:00
Mike Iovine	383b13e0e5	[None][feat] Implement sampling on 1-model EAGLE3 (#9885 ) Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com> Signed-off-by: Mike Iovine <miovine@nvidia.com>	2025-12-13 07:38:22 -08:00
Yan Chunwei	85406f9dda	[https://nvbugs/5720482 ][fix] Fix test rpc streaming (#9902 ) Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>	2025-12-13 01:14:43 -08:00
shuyixiong	8cbf2d958c	[TRTLLM-9738][chore] Guard accuracy with nccl allreduce strategy (#9793 ) Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>	2025-12-13 01:02:11 -08:00
Balaram Buddharaju	6a6e41f802	[TRTLLM-9468][chore] Update disagg benchmarking scripts to support context parallelism (#9720 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-12 22:29:41 -08:00
bhsueh_NV	e49c70f6df	[None][feat] Support Mistral Large3 LLM part (#9820 ) Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>	2025-12-13 11:44:27 +08:00
Balaram Buddharaju	461446045e	[TRTLLM-9493][feat] Add helixPostProcessNative kernel for cp_dim=2 (#9924 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-12 16:49:25 -08:00
tburt-nv	6147452158	[https://nvbugs/4141427 ][chore] Add more details to LICENSE file (#9881 ) Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>	2025-12-13 08:35:31 +08:00
Chuang Zhu	4cc4cbe926	[https://nvbugs/5716787 ][fix] terminate nixl running when exiting (#9785 ) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com> Co-authored-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>	2025-12-12 11:15:02 -05:00
Chuang Zhu	9c59c9f920	[https://nvbugs/5643787 ][fix] remove the war path for notify to itself (#9834 ) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>	2025-12-12 11:10:05 -05:00
JunyiXu-nv	2fec53dfa5	[TRTLLM-9637][feat] Support tool parser for Kimi K2 (#9830 ) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>	2025-12-12 23:32:39 +08:00
Yihan Wang	9df4dad3b6	[None][fix] Introduce inline namespace to avoid symbol collision (#9541 ) Signed-off-by: Yihan Wang <yihwang@nvidia.com>	2025-12-12 23:32:15 +08:00
Balaram Buddharaju	af315d8ef1	[TRTLLM-5972][chore] Load balance decode token KV cache with helix parallelism (#9757 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-12 22:29:05 +08:00
Lucas Liebenwein	e767fc649a	[None][feat] AutoDeploy: prepare_metadata revisited (#9764 ) Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>	2025-12-12 20:14:14 +08:00
ruodil	9b3e5e90ee	[None][test] fix a typo in model name in script (#9867 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2025-12-12 17:35:55 +08:00
chenfeiz0326	61745f034a	[https://nvbugs/5727481 ][ci] Fix Port Conflict in Perf-Sanity CI Test (#9896 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2025-12-12 17:16:50 +08:00
kris1025	2fc94e5dd7	[None][chore] unwaive qwen3 accuracy test (#9895 ) Signed-off-by: linquanh <linquanh@nvidia.com>	2025-12-12 16:30:09 +08:00
Yihan Wang	711016c799	[https://nvbugs/5736923 ][infra] Waive timeout disaggregated/test_auto_scaling[http-round_robin] test (#9942 ) Signed-off-by: Yihan Wang <yihwang@nvidia.com>	2025-12-12 15:15:13 +08:00
Ivy Zhang	fded6c393d	[TRTLLM-9262][test] add groupgemm ada case for rcca (#9833 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2025-12-12 13:23:33 +08:00
dominicshanshan	093465ed29	[https://nvbugs/5599176 ][fix] Unwaive fixed test for Ray (#9861 ) Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2025-12-12 11:24:05 +08:00
xinhe-nv	e8efeb765d	[TRTLLM-9717][fix] fix multi nodes tests cases (#9736 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-12-12 10:14:23 +08:00
Venky	fd1270b9ab	[TRTC-43] [feat] Add config db and docs (#9420 ) Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com> Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com> Co-authored-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>	2025-12-12 04:00:03 +08:00
Simeng Liu	24f92721f2	[https://nvbugs/5597647 ][ci] Unwaive fixed tests. (#9812 ) Signed-off-by: SimengLiu-nv <simengl@nvidia.com>	2025-12-12 02:29:30 +08:00
Erin	89dabf5aa1	[TRTLLM-9736][feat] AsyncLLM and verl integ (#9353 ) Signed-off-by: Liwei Ma <liweim@nvidia.com> Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com> Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com> Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com> Co-authored-by: Liwei Ma <liweim@nvidia.com> Co-authored-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com> Co-authored-by: Superjomn <328693+Superjomn@users.noreply.github.com>	2025-12-11 09:33:25 -08:00
JadoTu	02edb19f43	[None] [feat] add eos_token_id in generation_config to sampling params (#9514 ) Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>	2025-12-12 00:52:03 +08:00
xxi	488d38f88d	[TRTLLM-8959][feat] ConfigurableMoE support CUTLASS (#9772 )	2025-12-12 00:22:13 +08:00
Yan Chunwei	04a39a4e2b	[None][chore] enable test_ipc.py (#9865 ) Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>	2025-12-11 17:47:14 +08:00

2387 Commits