TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Author	SHA1	Message	Date
nvxuanyuc	a79c0dfb43	[None][fix] Update GLM model accuracy test (#9286 ) Signed-off-by: Xuanyu Chen <xuanyuc@nvidia.com>	2025-11-18 21:59:01 -08:00
Ivy Zhang	782dfca7e8	[TRTLLM-9050][test] add llama4 disagg case to cover kv cache overflow error (#9172 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2025-11-18 18:26:32 -08:00
xinhe-nv	35658eab55	[None][chore] Add failed cases into waives.txt (#9193 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-11-18 17:47:55 -08:00
Enwei Zhu	7c4777a571	[TRTLLM-9286][feat] Integration of CuteDSL NVFP4 grouped GEMM (#8880 ) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2025-11-18 17:40:12 -08:00
Lizhi Zhou	c789000a62	[https://nvbugs/5649010 ][fix] increase status-checking interval to avoid instability (#9203 ) Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>	2025-11-19 08:55:42 +08:00
Bo Deng	34f845bf69	[TRTLLM-9287][infra] Use NIXL backend for accuracy tests (#9247 ) Signed-off-by: Bo Deng <deemod@nvidia.com>	2025-11-18 14:46:20 -08:00
Ajinkya Rasane	8d7cda2318	[None][chore] Update the Flux autodeploy example (#8434 ) Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com> Co-authored-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com>	2025-11-18 14:16:04 -08:00
Ivy Zhang	160b361588	[TRTLLM-8949][test] Add rcca test case for eagle3 consistency check (#9088 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2025-11-18 05:55:00 -08:00
Ivy Zhang	ca41a71f92	[TRTLLM-8948][test] Add long bench case (#9165 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2025-11-18 04:41:48 -08:00
Tri Dao	fc088e642c	[None][feat] Support Glm4MoeForCausalLM (#8256 ) Signed-off-by: Tri Dao <daominhtri0503@gmail.com> Co-authored-by: Xuanyu Chen <xuanyuc@nvidia.com>	2025-11-18 09:43:21 +08:00
Robin Kobus	df41f220a2	[TRTLLM-8831][feat] Enable early exit with overlap scheduler (#8587 ) Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>	2025-11-17 18:07:13 +01:00
Chang Liu	bed4e95e9f	[https://nvbugs/5629887 ][fix] Add missing device count guard for DSv32 multiGPU tests (#9159 )	2025-11-14 07:52:23 -08:00
Erin	44d1c75701	[TRTLLM-8988][feat] Unify MPI & Ray's req/response handling with RPC Client/Server (#8765 ) Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>	2025-11-13 17:21:24 -08:00
xinhe-nv	548f5ce4bc	[None][fix] waive failed tests (#9090 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-11-12 23:40:00 -08:00
ruodil	c86e36fe38	[None][test] add deepseek and qwen cases for rtx series (#8839 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2025-11-12 22:28:02 -08:00
Zhenhuan Chen	943b05e2d3	[TRTLLM-9179][feat] add pp_partition to customize each rank's layer number (#9003 ) Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com>	2025-11-13 10:34:17 +08:00
dongxuy04	9241ccaf27	[None][feat] Enable EPLB for trtllm-gen and cutlass backend (#8886 ) Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>	2025-11-12 12:30:27 -08:00
Fanrong Li	780d4f9dc5	[None][feat] Add MTP>1 support for DS-v3.2 (#9045 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2025-11-12 09:56:12 -08:00
Iman Tabrizian	cdde15b275	[TRTLLM-8540][feat] Add support for disagg in DSv3.2 (#8735 ) Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>	2025-11-12 08:21:11 -08:00
yufeiwu-nv	b7a2574c60	[https://nvbugs/5568991 ][test] Remove Phi-3 models (#9066 ) Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>	2025-11-12 03:16:36 -08:00
QI JUN	fd703fbb7b	[None][ci] run speculative unit tests serially (#9080 ) Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>	2025-11-11 19:06:44 -08:00
Lucas Liebenwein	aca56097cb	[None][fix] AutoDeploy: update nano3 accuracy test (#9061 ) Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>	2025-11-11 12:26:31 -08:00
Wanli Jiang	ebdd1cc8e0	[TRTLLM-8119][feat] Update doc/tests/chat_template for nano-v2-vlm (#8840 ) Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>	2025-11-11 07:48:23 -08:00
Yechan Kim	0938a3ad2a	[https://nvbugs/5644187 ][fix] Llava-Next MMMU bugfix and Phi4 test bugfix (#9034 ) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>	2025-11-11 10:24:31 +09:00
Fanrong Li	a7033a9193	[TRTLLM-9001][feat] add TP support for DeepSeek-V3.2 (#8943 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2025-11-10 12:16:01 +08:00
QI JUN	1c6e490894	[TRTLLM-9065][chore] remove PyTorchConfig completely (#8856 ) Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>	2025-11-06 22:37:03 -08:00
Lizhi Zhou	b26e1617f2	[https://nvbugs/5633340 ][fix] kill processes properly after test (#8970 ) Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>	2025-11-06 21:45:38 -08:00
jthomson04	fcae852cef	[None][fix] Fix KV cache clearing with KV Connector API (#8750 ) Signed-off-by: jthomson04 <jwillthomson19@gmail.com>	2025-11-06 14:28:27 -08:00
shuyixiong	c73efe12e7	[None][chore] Use cached model in all ray tests (#8962 ) Signed-off-by: shuyix <219646547+shuyixiong@users.noreply.github.com>	2025-11-06 15:14:15 +01:00
Fanrong Li	d246f62868	[https://nvbugs/5630345 ] [chore] skip deepseek-v3.2 fp8 kv tests on pre-Blackwell architectures (#8973 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2025-11-06 03:41:37 -08:00
xinhe-nv	e822184cd7	[None][feat] add waive by sm version (#8928 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-11-05 19:20:43 -08:00
Fanrong Li	c2feed798a	[https://nvbugs/5630345 ][chore] unwaive DS-v32 nvfp4 and fp8 tests (#8887 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2025-11-05 03:49:23 -08:00
Chuang Zhu	595f78078c	[https://nvbugs/5624367 ][fix] Fix disagg GPT-OSS test (#8870 ) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>	2025-11-05 01:47:09 -08:00
Patrice Castonguay	782824533e	[https://nvbugs/5587574 ][fix] Increase server timeout to wait for weight loading (#8806 ) Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>	2025-11-04 12:11:08 -08:00
Robin Kobus	7e4b87b17c	[None][ci] Remove outdated test entries (#8909 ) Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>	2025-11-04 05:32:46 -08:00
xiweny	cae468cc8e	[https://nvbugs/5596343 ] [test] Waive flaky GPT-OSS cases (#8904 ) Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>	2025-11-04 03:00:00 -08:00
Ivy Zhang	23717cdb3f	[TRTLLM-8580][test] save runtime report periodically (#8312 ) (#8455 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2025-11-04 16:42:31 +08:00
Yukun He	6c8ba3be27	[None][chore] Remove duplicate log outputs in test_perf.py (#8418 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2025-11-04 16:42:31 +08:00
ruodil	102e556863	[None][test] cherry-pick: add test-model-suites in integration conftest.py (#8388 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2025-11-04 16:42:31 +08:00
Patrice Castonguay	65c138108e	[https://nvbugs/5552889 ][fix] fix: Prevent empty batch when using attention DP with disagg (#8372 ) Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2025-11-04 16:42:31 +08:00
Stanley Sun	def9c0004d	[TRTLLM-8113][test] Add pytorch workflow e2e tests with pp enabled (#8357 ) Signed-off-by: Stanley Sun <stsun@nvidia.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2025-11-04 16:42:31 +08:00
xiweny	fcac2022e2	[https://nvbugs/5565565 ] [fix] fp8 wideep support sm103 (#8228 ) Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com> Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2025-11-04 16:42:31 +08:00
Yueh-Ting (eop) Chen	bd1c9c0af4	[https://nvbugs/5625990 ][chore] Add test coverage for current incapability of the KV cache manager (#8829 ) Signed-off-by: eopXD <yuehtingc@nvidia.com>	2025-11-04 16:35:45 +08:00
Mike Iovine	5e6f1bcd24	[TRTLLM-8979][test] Improve qwen3 spec dec test coverage (#8767 ) Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2025-11-03 10:12:10 -08:00
Yechan Kim	f48968b6cc	[TRTLLM-6928][fix] Refactor multimodal unittest (#8453 ) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>	2025-11-03 06:01:07 -08:00
Tailing Yuan	8303cfa477	[None][fix] Fix import issues in layer-wise benchmarks (#8827 ) Signed-off-by: Tailing Yuan <yuantailing@gmail.com>	2025-11-03 02:32:48 -08:00
Fanrong Li	e9f78c687a	[https://nvbugs/5625962 ][chore] unwaive DS-v32-fp4 tests (#8853 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2025-11-03 00:34:52 -08:00
chenfeiz0326	cc4ab8d9d1	[TRTLLM-8825][feat] Support Pytest Perf Results uploading to Database (#8653 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2025-11-03 16:23:13 +08:00
yufeiwu-nv	b4d17d1a4c	[TRTLLM-8991][test] Add Llama 3.3 70B model with different performance config (#8753 ) Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com> Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>	2025-11-03 13:34:06 +08:00
dongfengy	6d6797c792	[None][test] Enhance GPT-OSS CI with GPQA Diamond and additional Spec Decoding Test (#8661 ) Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com> Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>	2025-11-02 16:44:02 -08:00

786 Commits