TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Author	SHA1	Message	Date
Pengbo Wang @ NVIDIA	ef0d06df58	[None][chore] Fix kernel launch param and add TRTLLM MoE backend test (#7524 ) Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>	2025-09-09 23:45:35 +08:00
Yanchao Lu	bc90a34a0e	[None][ci] Fix a typo in the Slurm command Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>	2025-09-08 17:15:15 +08:00
Yanchao Lu	2d5f0e1038	[None][ci] Block some nodes to avoid unstable network access (#7593 ) Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>	2025-09-08 00:34:20 +08:00
Yanchao Lu	2b02dd7891	[None][ci] Improve SSH connection stability (#7567 ) Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>	2025-09-06 17:12:39 +08:00
Yanchao Lu	d1b0c87d41	[None][fix] Fix a typo in the Slurm CI codes (#7485 ) (#7538 ) Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>	2025-09-04 21:49:18 +08:00
Yanchao Lu	c3f23462ab	[None][ci] Cherry-pick some improvements for Slurm CI setup from main branch (#7479 ) Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>	2025-09-03 18:42:28 -04:00
Pengbo Wang @ NVIDIA	62459d533d	[None][chore] Update pre-merge test to add DeepSeek/LLaMA and gpt-oss (#7192 ) Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com> Signed-off-by: Pengbo Wang @ NVIDIA <221450789+pengbowang-nv@users.noreply.github.com> Co-authored-by: Tao Li @ NVIDIA <tali@nvidia.com>	2025-08-29 17:03:46 +08:00
Yanchao Lu	460a34c671	[None][chore] Some improvements for CI stability (#7199 ) Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>	2025-08-28 16:19:20 -04:00
QI JUN	baef70e67e	[None][ci] move qwen3 tests from b200 to gb200 (#7257 ) Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>	2025-08-26 11:50:53 -04:00
Emma Qiao	a142c0c4de	[None][infra] Add retry 3 times if ssh cluster failed (#6859 ) Signed-off-by: qqiao <qqiao@nvidia.com> Signed-off-by: Emma Qiao <qqiao@nvidia.com> Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>	2025-08-26 05:11:50 -04:00
Yiqing Yan	486bc763c3	[None][infra] Split DGX_B200 stage into multiple parts and pre-/post-merge (#7074 ) Signed-off-by: Yiqing Yan <yiqingy@nvidia.com> Signed-off-by: Yanchao Lu <yanchaol@nvidia.com> Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>	2025-08-24 21:09:04 -04:00
Yanchao Lu	ec35481b0a	[None][infra] Prepare for single GPU GB200 test pipeline (#7073 ) Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>	2025-08-24 21:46:39 +08:00
QI JUN	1388e84793	[None][ci] move all B200 TensorRT test cases to post merge (#7165 ) Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>	2025-08-22 06:47:23 -04:00
Linda	898f37faa0	[None][feat] Enable nanobind as the default binding library (#6608 ) Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>	2025-08-22 09:48:41 +02:00
Emma Qiao	a49cf684f8	[TRTLLM-5801][infra] Add more RTX Pro 6000 test stages (#5126 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-08-22 03:12:02 -04:00
Yuan Tong	90bfc8cc29	[https://nvbugs/5453827 ][fix] Fix RPATH of th_common shared library to find pip-installed NCCL (#6984 ) Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>	2025-08-21 17:58:30 +08:00
QI JUN	a918de710a	[None][ci] move some tests of b200 to post merge (#7093 ) Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>	2025-08-20 19:43:40 -04:00
Fanrong Li	816a120af6	[TRTLLM-6991][chore] add DeepSeek-R1 FP8 accuracy tests on Blackwell (#6710 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2025-08-19 00:03:03 -04:00
Yanchao Lu	d1d17dbeba	[None][infra] Cherry-pick #6836 from main branch and improve SSH connection (#6971 ) (#7005 ) Signed-off-by: Yanchao Lu <yanchaol@nvidia.com> Co-authored-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>	2025-08-19 01:35:30 +08:00
Yanchao Lu	3a987891d8	[TRTLLM-7141][infra] Use repo mirrors to avoid intermittent network failures (#6836 ) Signed-off-by: Yanchao Lu <yanchaol@nvidia.com> Co-authored-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>	2025-08-15 11:16:07 +08:00
Yanchao Lu	b7347ce7d1	[https://nvbugs/5433581 ][fix] Revert deep_gemm installation workaround for SBSA (#6666 ) Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>	2025-08-06 18:50:53 +08:00
Emma Qiao	78a75c2990	[None][Infra] - Split gb200 stages for each test (#6594 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-08-05 07:10:00 -04:00
Yanchao Lu	d53cc2374b	[https://nvbugs/5433581 ][infra] Update install docs and CI script for SBSA deep_gemm workaround (#6607 ) Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>	2025-08-04 23:36:38 -04:00
Yiqing Yan	4763e94156	[TRTLLM-5563][infra] Move test_rerun.py to script folder (#6571 ) Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>	2025-08-04 13:26:04 +08:00
Yiqing Yan	3f7abf87bc	[TRTLLM-6224][infra] Upgrade dependencies to DLFW 25.06 and CUDA 12.9.1 (#5678 ) Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>	2025-08-03 11:18:59 +08:00
Yiqing Yan	0cf2f6f154	[TRTLLM-5633] - Merge current waive list with the TOT waive list (#5198 ) Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>	2025-07-30 17:50:05 +08:00
Zhanrui Sun	64ba483656	infra: [TRTLLM-6499] Split L0_Test into two pipeline by single GPU and multi GPU(For SBSA) (#6132 ) Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>	2025-07-28 22:54:37 -04:00
yuanjingx87	608ed89f96	[None][infra]Update slurm config keys (#6370 ) Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>	2025-07-28 11:56:37 -07:00
Yiqing Yan	d97419805b	[TRTLLM-5312] - Add bot run rules for triton tests (#4988 ) Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>	2025-07-25 10:31:12 +08:00
yuanjingx87	ef4878db05	set NVIDIA_IMEX_CHANNELS for dlcluster slurm job only (#6234 ) Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>	2025-07-22 11:27:54 -07:00
Yi Zhang	f9b0a911fb	test: Enable GB200 torch compile multi gpu tests (#6145 ) Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>	2025-07-21 22:17:13 +08:00
Zhanrui Sun	3cbc23f783	infra: [TRTLLM-5250] Add sanity check stage for ngc-release images (Build wheels for devel image) (#4656 ) Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com> Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com> Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>	2025-07-21 16:06:43 +08:00
Linda	3efad2e58c	feat: nanobind bindings (#6185 ) Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>	2025-07-21 08:56:57 +01:00
Venky	22d4a8c48a	enh: Add script to map tests <-> jenkins stages & vice-versa (#5177 ) Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com> Signed-off-by: Yanchao Lu <yanchaol@nvidia.com> Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>	2025-07-19 00:50:40 +08:00
Iman Tabrizian	b75e53ab69	Revert "feat: nanobind bindings (#5961 )" (#6160 ) Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>	2025-07-18 10:12:54 +08:00
Linda	5bff317abf	feat: nanobind bindings (#5961 ) Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>	2025-07-17 22:42:52 +08:00
Emma Qiao	1cc49494fe	[Infra] - Add waive list for pytest when using slurm (#6130 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-07-17 16:53:15 +08:00
Zhanrui Sun	4c364b9a73	infra: fix SBSA test stage (#6113 ) Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>	2025-07-17 11:56:03 +08:00
Zhanrui Sun	e42f5a9581	infra: [TRTLLM-5879] Split single GPU test and multi GPU test into 2 pipelines (#5199 ) Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com> Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com> Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>	2025-07-16 18:04:04 +08:00
Zhanrui Sun	d811843a08	infra: [TRTLLM-6313] Fix the package sanity stage 'Host Node Name' in… (#5945 ) Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>	2025-07-15 15:39:31 +09:00
Yiqing Yan	6b35afaf1b	[Infra][TRTLLM-6013] - Fix stage name in single stage test rerun report (#5672 ) Signed-off-by: Yiqing Yan <yiqingy@nvidia.com> Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>	2025-07-15 12:27:21 +09:00
Zhanrui Sun	01b2def5ef	infra: [TRTLLM-6331] Support show all stage name list when stage name check failed (#5946 ) Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com> Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>	2025-07-15 12:06:03 +09:00
Alex Zhang	6c30d78b78	[TRTLLM-5653][infra] Run docs build only if PR contains only doc changes (#5184 ) Signed-off-by: Alex Zhang <13271672+zhanga5@users.noreply.github.com> Signed-off-by: Yanchao Lu <yanchaol@nvidia.com> Co-authored-by: Alex Zhang <13271672+zhanga5@users.noreply.github.com> Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>	2025-07-14 21:40:33 +08:00
Zhanrui Sun	3a0ef73414	infra: [TRTLLM-6242] install cuda-toolkit to fix sanity check (#5709 ) Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>	2025-07-14 18:52:13 +09:00
Yi Zhang	e5e87ecf34	test: Move some of the test from post merge to pre-merge, update dgx b200 test case (#5640 ) Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>	2025-07-14 17:17:30 +08:00
xavier-nvidia	b6013da198	Fix GEMM+AR fusion on blackwell (#5563 ) Signed-off-by: xsimmons <xsimmons@nvidia.com>	2025-07-09 08:48:47 +08:00
Tailing Yuan	85b4a6808d	Refactor: move DeepEP from Docker images to wheel building (#5534 ) Signed-off-by: Tailing Yuan <yuantailing@gmail.com>	2025-07-07 22:57:03 +09:00
Yanchao Lu	2013034948	[Test] - Waive or fix few known test failures (#5769 ) Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>	2025-07-06 21:14:16 +08:00
Yuan Tong	32b244af38	feat: reduce unnecessary kernel generation (#5476 ) Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>	2025-07-04 14:37:49 +08:00
Yi Zhang	73d30a23c7	test: add more tests for GB200 with 8 GPUs/2 nodes in L0 tests (#5397 ) Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>	2025-07-04 13:14:13 +08:00

120 Commits