TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-06 19:21:52 +08:00

Author	SHA1	Message	Date
xinhe-nv	272688c663	[None][fix] fix L0 issues (#10670 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2026-01-14 18:09:40 +08:00
jmydurant	e7882d5c74	[None][feat] MiniMax M2 support (#10532 ) Signed-off-by: Mingyang Jiang <13463932+jmydurant@users.noreply.github.com>	2026-01-14 17:38:58 +08:00
mpikulski	052c36ddd2	[TRTLLM-9522][feat] support image_embeds in OpenAI API (#9715 ) Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>	2026-01-14 10:31:03 +01:00
Bo Li	487287a412	[None][chore] Update test name MNNVL->NVLinkTwoSided. (#9672 ) Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>	2026-01-14 04:29:57 -05:00
QI JUN	c4da4fd462	[https://nvbugs/5637220 ][ci] unwaive TestQwen3_235B_A22B::test_nvfp4[latency_moe_trtllm_attention_dp] (#9870 ) Signed-off-by: junq <22017000+QiJune@users.noreply.github.com> Signed-off-by: QI JUN <22017000+QiJune@users.noreply.github.com>	2026-01-14 15:41:14 +08:00
xxi	f841b43cde	[None][chore] waive the CI failure (#10655 ) Signed-off-by: xxi <xxi@nvidia.com>	2026-01-14 13:59:15 +08:00
JennyLiu	92ae490410	[None][test] Spark - Change testlist name and perf yml format (#10626 ) Signed-off-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com> Co-authored-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>	2026-01-13 23:07:11 -05:00
xinhe-nv	07d9390e9b	[None][test] add test into qa test list (#10627 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2026-01-13 22:43:00 -05:00
xinhe-nv	7305c61fc9	[TRTLLM-8638][fix] Add failed cases into waives.txt (#10589 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2026-01-13 22:00:20 -05:00
Balaram Buddharaju	ccdfa43a6e	[https://nvbugs/5791900 ][fix] Fix HelixCpMnnvlMemory init with PP (#10533 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2026-01-13 15:48:42 -05:00
dongfengy	6ee8dbfe0b	[https://nvbugs/5772396 ][fix] WAR: Disable TinyGEMM PDL due to accuracy issues (#10619 ) Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>	2026-01-13 12:40:11 -05:00
Guoming Zhang	c1b0b7350f	[None][test] Unwaive qwen3 next test case. (#9877 ) Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>	2026-01-13 20:42:31 +08:00
Tailing Yuan	38296a472b	[None][feat] Layer-wise benchmarks: make model init more general and support weights loading (#10562 ) Signed-off-by: Tailing Yuan <yuantailing@gmail.com>	2026-01-13 19:17:03 +08:00
Erin	55580f8ec1	[NVBUG-5670458][chore] Unwaive lp tests (#10524 ) Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com> Signed-off-by: Erin <14718778+hchings@users.noreply.github.com>	2026-01-13 04:31:27 -05:00
Guoming Zhang	bdaee87895	[TRTLLM-10060][feat] Enable attention dp for Nemotron Super v3. (#10347 ) Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>	2026-01-13 17:13:55 +08:00
JunyiXu-nv	e291a834db	[TRTLLM-8462][feat] Support GET/DELETE v1/responses/{response_id} (#9937 ) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>	2026-01-13 03:57:14 -05:00
JennyLiu	2967d299fb	[TRTLLM-10271][test] Add Spark QA functional and performance cases (#10564 ) Signed-off-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com> Co-authored-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>	2026-01-13 13:20:15 +08:00
fredricz-20070104	bbe535fddf	[None][chore] Fix disagg assert (#10596 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>	2026-01-12 21:39:57 -05:00
Iman Tabrizian	48b09e5a25	[https://nvbugs/5689235 ][fix] Fix cancellation+chunked prefill+disagg (#10111 ) Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>	2026-01-12 18:23:26 -05:00
Anish Shanbhag	dacc881993	[https://nvbugs/5761391 ][fix] Use correct model names for config database regression tests (#10192 ) Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>	2026-01-12 10:55:07 -08:00
Suyog Gupta	a1385243e1	[#10580 ][fix] re-enable NemotronH MOE MMLU test (#10594 ) Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>	2026-01-12 09:26:07 -08:00
Emma Qiao	9f044b9dd9	[None][infra] Waive failed tests for main 01/12 (#10604 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2026-01-12 10:24:54 -05:00
Wanli Jiang	11da7e3605	[None][fix] Solve pillow version conflict (#10537 ) Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>	2026-01-12 04:05:54 -05:00
Zhenhuan Chen	3bd319dc8e	[https://nvbugs/5794796 ][chore] waive test blocking premerge (#10593 ) Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com>	2026-01-12 15:39:07 +08:00
yufeiwu-nv	8e806abac3	[None][test] Remove most TRT-backend test cases in llm_perf_nim.yml (#10572 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com> Co-authored-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2026-01-12 15:34:55 +08:00
yingguo-trt	c5914f9085	[None][chore] update deepseekv3.2 test parameter (#10595 ) Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>	2026-01-12 01:43:22 -05:00
chenfeiz0326	54459377d2	[TRTLLM-10248][feat] Support Bot to Send Perf Regression Msg to Slack Channel (#10489 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2026-01-12 14:23:23 +08:00
Jie Li	5e0dbba0c9	[None][chore]: update waive list (#10577 ) Signed-off-by: Jie Li <lijie@nvidia.com>	2026-01-11 22:18:04 -05:00
Eran Geva	c5d5af9e7f	[#8391 ][chore] removed llama and added deepseek to AutoDeploy's L0 perf test (#10585 ) Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>	2026-01-11 16:31:24 -05:00
Ivy Zhang	7f018c89e9	[None][test] update core test list (#10538 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2026-01-11 14:08:20 -05:00
Yechan Kim	8e0d20d901	[TRTLLM-10195][feat] K-EXAONE support (#10355 ) Signed-off-by: Jaedeok Kim <jaedeokk@nvidia.com> Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com> Co-authored-by: Jaedeok Kim <jaedeokk@nvidia.com>	2026-01-12 00:29:51 +09:00
HuiGao-NV	3c65ec3c55	[None][chore] waive test case (#10581 ) Signed-off-by: Hui Gao <huig@nvidia.com>	2026-01-10 18:53:36 -05:00
fredricz-20070104	f6045fac09	[None][chore] Fix Gitlab CI termination issues (#10576 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com> Co-authored-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>	2026-01-10 07:51:18 -05:00
William Zhang	ff7eb93f31	[https://nvbugs/5669097 ][tests] Add MMMU test for mistral small (#10530 ) Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>	2026-01-09 16:09:28 -08:00
yingguo-trt	d80f01d205	[None][feat] Add support for DeepSeek v3.2 tests (#10561 ) Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>	2026-01-09 10:20:29 -05:00
Yechan Kim	7295af68ba	[None][fix] Enable AttentionDP on Qwen3-VL and fix test (#10435 ) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>	2026-01-10 00:13:26 +09:00
Iman Tabrizian	ced88424ef	[https://nvbugs/5756008 ][fix] unwaive test (#10523 ) Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>	2026-01-09 09:40:07 -05:00
Jie Li	627d306df9	[None][chore] remove some model support; add device constraint (#10563 ) Signed-off-by: Jie Li <lijie@nvidia.com>	2026-01-09 09:36:23 -05:00
ruodil	2b72d33fdc	[TRTLLM-9932][test] add kimi_k2 single node perf test (#10436 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2026-01-09 05:36:50 -05:00
bhsueh_NV	4a09acd012	[https://nvbugs/5785206 ][infra] unwaive the accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B (#10560 ) Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>	2026-01-09 03:13:29 -05:00
JadoTu	4c498bfe58	[TRTLLM-9676][fix] Fix mamba_cache_manager when enabling cuda_graph_padding and let test cover this case (#9873 ) Signed-off-by: JadoTu <107457950+JadoTu@users.noreply.github.com>	2026-01-09 14:50:16 +08:00
Jie Li	6fcd4e7099	[None][chore] Add failed cases into waives.txt (#10541 ) Signed-off-by: Jie Li <lijie@nvidia.com>	2026-01-09 01:03:47 -05:00
ruodil	d707286ca8	[None][test] restrict max_num_tokens in disagg mtp config (#10442 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2026-01-08 21:53:24 -05:00
Balaram Buddharaju	56e779d09f	[None][chore] Waive tests blocking premerge 01/08 (#10555 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2026-01-08 20:22:28 -05:00
Mike Iovine	4092a87b6f	[https://nvbugs/5740075 ][fix] Fix sm120 speculation (#10049 ) Signed-off-by: Mike Iovine <miovine@nvidia.com>	2026-01-08 19:55:43 -05:00
bhsueh_NV	bea61bb17d	[None][fix] Mistral large 3 few code refine (#10405 ) Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>	2026-01-08 06:38:49 -05:00
Emma Qiao	43839c7d9b	[TRTLLM-9642][infra] Increase pytest verbosity for failed tests (#9657 ) Signed-off-by: qqiao <qqiao@nvidia.com> Signed-off-by: Emma Qiao <qqiao@nvidia.com>	2026-01-08 02:33:48 -05:00
HuiGao-NV	22c81cb5fa	[None][chore] Enable seg fault cases since one race condition is fixed (#10398 ) Signed-off-by: Hui Gao <huig@nvidia.com>	2026-01-08 02:15:30 -05:00
Barry Kang	f57aab5255	[https://nvbugs/5775402 ][fix] Fix concurrency list in Wide-EP perf tests (#10529 ) Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>	2026-01-08 01:58:55 -05:00
Lucas Liebenwein	30f8455d29	[https://nvbugs/5747878 ][fix] unwaive llama4 scout tests (#10468 ) Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>	2026-01-07 23:33:45 -05:00
yingguo-trt	f8b2a8fd30	[None][chore] Support multiple job submission at the same time (#10492 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com> Co-authored-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>	2026-01-07 21:51:36 -05:00
xxi	81f878c279	[https://nvbugs/5707392 ][fix] unwaive test_fused_moe_fp8_blockwise_wide_ep[NotEnabled] (#10428 ) Signed-off-by: xxi <xxi@nvidia.com>	2026-01-08 09:17:59 +08:00
yufeiwu-nv	b130d58c88	[None][test] Remove most TRT-backend test cases in llm_perf_nim.yml (#10487 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com> Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com> Co-authored-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2026-01-07 17:18:43 +08:00
xinhe-nv	872210468b	[TRTLLM-8638][fix] Add failed cases into waives.txt (#10474 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2026-01-07 03:23:43 -05:00
yingguo-trt	cbf8357e5f	[https://nvbugs/5726086 ][fix] update kimi-k2-1k1k dataset (#10473 ) Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>	2026-01-07 01:24:08 -05:00
xinhe-nv	be5579633e	[TRTLLM-8638][fix] Add failed cases into waives.txt (#10457 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2026-01-07 00:57:03 -05:00
Fanrong Li	a34aa63685	[https://nvbugs/5767223 ][feat] add pp support for DeepSeek-v3.2 (#10449 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2026-01-07 12:29:51 +08:00
xinhe-nv	1fbadd2dde	[None][chore] Add failed cases into waives.txt (#10365 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: Jie Li <lijie@nvidia.com> Signed-off-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com> Co-authored-by: Jie Li <lijie@nvidia.com> Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>	2026-01-06 22:08:06 -05:00
Ivy Zhang	4a1b2e23b3	[https://nvbugs/5698434 ][test] add qwen3-4b accuracy test case (#10382 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2026-01-06 21:56:34 -05:00
Lucas Liebenwein	6095c80e56	[https://nvbugs/5721907 ][fix] AutoDeploy: improve numerical stability of flashinfer attention test (#10467 ) Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>	2026-01-06 21:11:06 -05:00
Mike Iovine	77be1b7572	[https://nvbugs/5749988 ][fix] Remove redundant qwen3 spec dec test (#10387 ) Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2026-01-06 11:46:34 -05:00
Enwei Zhu	037753f65b	[https://nvbugs/5748600 ][ci] Unwaive disagg guided decoding test (#10409 ) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2026-01-06 11:38:12 -05:00
JunyiXu-nv	7d62773c6c	[https://nvbugs/5760726 ][fix] Use random port in container port section (#10432 ) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>	2026-01-06 23:25:46 +08:00
xinhe-nv	704f58dfbe	[TRTLLM-8638][fix] Add failed cases into waives.txt (#10427 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2026-01-06 04:47:54 -05:00
Emma Qiao	6507087c3f	[None][infra] Waive failed cases on 1/6 (#10440 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2026-01-06 16:54:54 +08:00
Bo Li	df0b976b99	[https://nvbugs/5785206 ][infra] Waive TestQwen3_30B_A3B::test_fp8[latency-torch_compile=False]. (#10441 ) Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>	2026-01-06 03:32:19 -05:00
William Zhang	ab58d7cac1	[https://nvbugs/5772361 ][ci] Unwaive tests that have been fixed (#10424 ) These tests were all failing due to the same issue, and were fixed in #10394. Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>	2026-01-05 23:49:54 -08:00
Ivy Zhang	1e828587e5	[TRTLLM-9896][test] add vswa test cases coverage (#10146 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2026-01-06 02:02:29 -05:00
Yiqing Yan	5108a69fc0	[TRTLLM-9622][infra] Enable DGX_B300 multi-gpu testing in pre-merge pipeline (#9699 ) Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>	2026-01-06 14:39:55 +08:00
xinhe-nv	998527724c	[TRTLLM-8638][fix] Add failed cases into waives.txt (#10367 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2026-01-06 01:09:21 -05:00
Ivy Zhang	22a1d31a27	[None][test] update test case constraint (#10381 ) Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>	2026-01-06 12:28:59 +08:00
xinhe-nv	1b1058279c	[TRTLLM-8638][fix] Add failed cases into waives.txt (#10384 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2026-01-05 23:02:27 -05:00
kris1025	3e98265682	[None][chore] unwaive qwen3 30b test (#10115 ) Signed-off-by: linquanh <linquanh@nvidia.com>	2026-01-06 11:17:08 +08:00
chenfeiz0326	8a04c05079	[None][fix] Only Use Throughput Metrics to Check Regression (#10404 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2026-01-06 09:21:15 +08:00
Simeng Liu	3b56548fcf	[https://nvbugs/5777044 ][chore] Remove solved bugs from waives.txt (#10422 ) Signed-off-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>	2026-01-05 16:56:58 -05:00
Mike Iovine	91ff46d418	[https://nvbugs/5745152 ][fix] Unwaive gpt oss spec decode test (#10370 ) Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2026-01-05 16:06:58 -05:00
Mike Iovine	7a2dab8e85	[https://nvbugs/5695984 ][fix] Unwaive llama3 eagle test (#10092 ) Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2026-01-05 16:03:35 -05:00
Yan Chunwei	6b71b03947	[TRTLLM-9551][infra] Partition test_llm_pytorch.py for parallel execution (#10400 ) Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>	2026-01-05 13:58:03 -05:00
Mike Iovine	db2614ef10	[https://nvbugs/5772414 ][fix] Fix draft token tree depth=1 corner case (#10385 ) Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>	2026-01-05 17:20:14 +01:00
Gal Hubara-Agam	e98c27ee4f	[TRTLLM-10053][feat] AutoDeploy: Add Super v3 config file, improve test runtime (#10397 ) Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>	2026-01-05 18:17:27 +02:00
Balaram Buddharaju	a792c23dcf	[TRTLLM-9465][fix] Swap TP-CP grouping order (#10350 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2026-01-05 20:08:03 +08:00
xinhe-nv	b1733d56f6	[TRTLLM-9381][test] add disag-serving kimi k2 thinking tests (#10357 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2026-01-05 05:15:52 -05:00
Fanrong Li	4931c5eb3a	[None][feat] update deepgemm to the DeepGEMM/nv_dev branch (#9898 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2026-01-05 16:43:42 +08:00
HuiGao-NV	2f768b76f8	[https://nvbugs/5715568 ][fix] Force release torch memory when LLM is destroyed (#10314 ) Signed-off-by: Hui Gao <huig@nvidia.com>	2026-01-05 15:30:18 +08:00
Emma Qiao	c63fad7d96	[None][infra] Waive failed cases again on 1/5 (#10403 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2026-01-05 02:12:16 -05:00
Yihan Wang	e7a4486294	[https://nvbugs/5752521 ][fix] Unwaive test_trtllm_flashinfer_symbol_collision.py (#10227 ) Signed-off-by: Yihan Wang <yihwang@nvidia.com>	2026-01-05 14:37:05 +08:00
Yukun He	0937df2c68	[TRTLLM-10185][feat] AutoTuner Cache: Support cache file lock and merge all ranks into one (#10336 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2026-01-05 13:44:09 +08:00
Emma Qiao	5a8bfcbb50	[None][infra]Waive failed cases in post-merge on 1/5 (#10399 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2026-01-05 12:30:10 +08:00
Yuxian Qiu	5773a4d775	[https://nvbugs/5701425 ][chore] Unwaive tests. (#10269 ) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2026-01-05 09:54:26 +08:00
Fanrong Li	b5a1e10bc0	[https://nvbugs/5779534 ][fix] fix buffer reuse for CUDA graph attention metadata (#10393 ) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>	2026-01-05 09:43:44 +08:00
Wanli Jiang	da0830670a	[TRTLLM-10065][feat] Add accuracy tests for super-v3 with multiple-gpus (#10234 ) Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>	2026-01-05 09:41:49 +08:00
Lizhi Zhou	82c1ba84a7	[https://nvbugs/5649010 ][fix] use 0 port as arbitrary port when disagg service discovery is enabled (#10383 ) Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>	2026-01-05 09:40:40 +08:00
Eran Geva	e2f5455533	[#8391 ][chore] added deepseek_r1_distill_qwen_32b AutoDeploy perf test to L0 (#10377 ) Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>	2026-01-04 20:35:52 +02:00
chenfeiz0326	a65b0d4efa	[None][fix] Decrease Pre Merge Perf Tests (#10390 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com> Signed-off-by: Yanchao Lu <yanchaol@nvidia.com> Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>	2026-01-04 12:21:34 -05:00
Yanchao Lu	c4f27fa4c0	[None][ci] Some tweaks for the CI pipeline (#10359 ) Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>	2026-01-04 11:10:47 -05:00
dongfengy	afc533193d	[None][feat] Support nvfp4 for gptoss (#8956 ) Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>	2026-01-04 08:57:44 -05:00
Jaedeok Kim	a4dcc6a711	[TRTLLM-10171][fix] Correct attention handling in ModelConfig and KVCacheManager (#10330 ) Signed-off-by: Jaedeok Kim <jaedeokk@nvidia.com>	2026-01-04 06:07:30 -05:00
Yuxian Qiu	6ba04eba06	[https://nvbugs/5748683 ][fix] Use get_free_port_in_ci to avoid port conflict. (#10392 ) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2026-01-04 19:04:58 +08:00
Yanchao Lu	c0b3c2b919	[None][ci] Remove an invalid test waive Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>	2026-01-03 23:34:13 +08:00
Emma Qiao	865992b86b	[None][infra] Waive failed cases on 1/3 (#10391 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2026-01-03 05:54:09 -05:00
Gal Hubara-Agam	f3dd6da080	[#10056 ][chore] AutoDeploy: Enable Nemo SuperV3 accuracy test (#10308 ) Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>	2026-01-02 11:20:19 +02:00
chenfeiz0326	5e0e48144f	[None][fix] Minor updates on Perf Test System (#10375 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2026-01-02 17:17:42 +08:00
fredricz-20070104	f631b25c85	[None][test] Unified slurm extra args management and session collection logic (#10332 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com> Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com> Co-authored-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>	2026-01-01 21:10:51 -05:00
Balaram Buddharaju	4a1b742aa0	[TRTLLM-9467][fix] Fix PP+CP combination with helix parallelism (#10312 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2026-01-01 13:42:53 -05:00
Balaram Buddharaju	9f5b750a93	[None][chore] Waive tests blocking pre-merge 12/31 (#10373 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2026-01-01 03:00:24 -05:00
Balaram Buddharaju	0b75340223	[https://nvbugs/5744427 ][fix] Make Gemma3 multimodal test fp8 (#10368 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2026-01-01 01:11:34 -05:00
Yuxian Qiu	ff836d4f41	[https://nvbugs/5740359 ][chore] Unwaive tests. (#10260 ) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2026-01-01 09:53:34 +08:00
Lucas Liebenwein	1bbe71b3ed	[#10244 ][feat] AutoDeploy: separate prefill/decode in flashinfer (#10252 ) Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>	2025-12-31 17:01:24 -05:00
Simeng Liu	84d107b2f0	[https://nvbugs/5717993 ][fix] Add execution_stream across PyExecutor, KVCacheManager, PeftCacheManager to ensure proper CUDA stream synchronization between KV cache transfer operations and model forward kernels. (#10060 ) Signed-off-by: SimengLiu-nv <simengl@nvidia.com>	2025-12-31 09:22:54 -08:00
xinhe-nv	0d2e2718ce	[None][chore] Add failed cases into waives.txt (#10354 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2025-12-31 09:30:22 -05:00
chenfeiz0326	a23c6f1092	[TRTLLM-9834][feat] Transfer to TRTLLM-INFRA Database and Fail post-merge tests if regression (#10282 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2025-12-31 21:44:59 +08:00
Jin Li	ef1d4a40b5	[https://nvbugs/5727475 ][fix] Avoid use property with setter in nn.Mo… (#10212 ) Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>	2025-12-31 06:21:36 -05:00
Emma Qiao	d944430f96	[None][infra] Waive failed cases on 12/31 (#10353 ) Signed-off-by: qqiao <qqiao@nvidia.com> Signed-off-by: Yanchao Lu <yanchaol@nvidia.com> Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>	2025-12-31 17:39:49 +08:00
xinhe-nv	827d12caaf	[https://nvbugs/5558516 ][test] add disaggregated stress test (#9354 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-12-31 16:47:36 +08:00
Yuxian Qiu	910a633066	[https://nvbugs/5774869 ][chore] waive tests. (#10356 ) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2025-12-31 03:00:52 -05:00
xinhe-nv	1e9c153b4c	[None][fix] disable thread leak check for kimi (#10337 ) Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-12-31 01:31:37 -05:00
xinhe-nv	6c1abf2d45	[None][chore] Add failed cases into waives.txt (#10344 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2025-12-31 00:11:54 -05:00
Jin Li	34c2fd50a9	[https://nvbugs/5707359 ][fix] Unwaive OOM case that should be fixed by #9446 (#10334 ) Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>	2025-12-31 10:41:39 +08:00
Yuxian Qiu	ec8a388c25	[https://nvbugs/5769890 ][fix] Import get_free_port. (#10341 ) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2025-12-31 09:47:27 +08:00
Eran Geva	74832a1895	[https://nvbugs/5766986 ][fix] fixed the shard_all_unprocessed default value to align with the default.yml (#10271 ) Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>	2025-12-30 08:54:13 -05:00
Bo Li	1f0365da36	[None][infra] Add LongBenchV1 to trtllm-eval. (#10265 ) Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>	2025-12-30 21:39:34 +08:00
Emma Qiao	6732c76414	[None][infra] Waive failed cases for main on 12/30 (#10338 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-30 05:17:43 -05:00
Emma Qiao	fb05cd769a	[None][infra] Enable single-gpu CI on spark (#9304 ) Signed-off-by: qqiao <qqiao@nvidia.com> Signed-off-by: Emma Qiao <qqiao@nvidia.com> Signed-off-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com> Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>	2025-12-30 17:22:14 +08:00
Emma Qiao	cce7247815	[https://nvbugs/5594703 ][infra] Unwaive the failed case to test (#10275 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-30 16:38:54 +08:00
xinhe-nv	6accdbc6a6	[None][chore] Add failed cases into waives.txt (#10302 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2025-12-30 03:11:52 -05:00
ruodil	0f4ed90560	[TRTLLM-9965][test] add long-context disagg test for GB300/GB200 and remove config_index in yaml (#10225 ) Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com> Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>	2025-12-30 02:39:50 -05:00
xinhe-nv	3e0344a53d	[None][chore] Add failed cases into waives.txt (#10301 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>	2025-12-30 14:04:28 +08:00
xinhe-nv	48fee8d0f6	[None][chore] Add failed cases into waives.txt (#10321 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2025-12-30 00:11:49 -05:00
Emma Qiao	f396ad83b0	[None][infra] Remove duplicates in waives.txt (#10333 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-29 22:32:52 -05:00
Balaram Buddharaju	4944192eae	[None][chore] Waive tests failing in pre-merge 12/28 (#10311 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-29 20:53:49 -05:00
Yueh-Ting (eop) Chen	9cee32ab39	[https://nvbugs/5625990 ][fix] Respect VSWA scheme when doing block store for reuse and load block for reuse in KV cache manager (#10183 ) Signed-off-by: eopXD <yuehtingc@nvidia.com>	2025-12-29 14:29:14 +08:00
Yanchao Lu	2f8d6d25a8	[None][ci] Waive an intermittent test hang case (#10324 ) Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>	2025-12-29 13:04:31 +08:00
Yanchao Lu	270be801aa	[None][ci] Move remaining DGX-B200 tests to LBD (#9876 ) Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>	2025-12-28 13:55:39 +08:00
Jin Li	c04563657e	[TRTLLM-7735][feat] Attention NVFP4 out support for torch compile (#9740 ) Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>	2025-12-27 00:07:20 +08:00
chenfeiz0326	d70aeddc7f	[TRTLLM-8952][feat] Support Multi-Node Disagg Perf Test in CI (#9138 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2025-12-26 22:50:53 +08:00
Pengyun Lin	c5b0f9e436	[https://nvbugs/5633700 ][fix] Cache tiktoken vocab for gpt-oss (#10219 ) Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>	2025-12-26 18:39:03 +08:00
dongfengy	bfc591994c	[https://nvbugs/5745152 ][fix] Fix some GPTOSS test setups (#10085 ) Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>	2025-12-26 17:52:40 +08:00
bhsueh_NV	db3430f589	[None][feat] Support VLM part for Mistral Large 3 (#10188 ) Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>	2025-12-25 11:20:58 -05:00
ZhichenJiang	46e4af5688	[TRTLLM-9831][perf] Enable 2CTA with autotune for CuteDSL MoE and Grouped GEMM optimizations (#10201 ) Signed-off-by: zhichen jiang <zhichenj@NVIDIA.com> Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com> Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2025-12-25 09:04:20 -05:00
Lizhi Zhou	fe12faef81	[https://nvbugs/5752516 ][chore] unwaive test; fix port conflicts in CI (#10152 ) Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>	2025-12-25 08:16:09 -05:00
Emma Qiao	0ecdb69b93	[None][infra] Waive failed tests for main on 12/25 (#10298 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-25 05:22:39 -05:00
Jie Li	83e02ee335	[None][chore] Remove NIM TRT-Backend Test Lists (#10232 ) Signed-off-by: Jie Li <lijie@nvidia.com>	2025-12-25 04:01:51 -05:00
Enwei Zhu	182b3eb633	[None][ci] Waive TestLlama3_1_8B::test_auto_dtype[False-2] for timeout (#10293 ) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2025-12-25 02:35:18 -05:00
xinhe-nv	4ae6f6a46c	[None][chore] Add failed cases into waives.txt (#10249 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2025-12-25 01:26:21 -05:00
gramnarayan	a9eb5afc9f	[#9241 ][feat] AutoDeploy: Support Eagle3 Speculative Decoding (#9869 ) Support two model flow with no overlap scheduler or chain drafter. Drafting model is in PyTorch backend. Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>	2025-12-24 23:30:42 -05:00
Emma Qiao	16fd781e42	[TRTLLM-9862][infra] Move single-gpu tests on rtxpro6000d to pre-merge (#9897 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-24 21:45:33 -05:00
Stanley Sun	ddac4d7379	[None][test] Add disag-serving auto scaling qa test (#10262 ) Signed-off-by: Stanley Sun <stsun@nvidia.com>	2025-12-24 08:43:47 -05:00
shuyixiong	f4f0fe85e9	[TRTLLM-9737][chore] Add rl perf reproduce script and enhance the robustness of Ray tests (#9939 ) Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>	2025-12-24 15:27:01 +08:00
xinhe-nv	534700ecd9	[None][chore] Add failed cases into waives.txt (#10240 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2025-12-24 02:21:50 -05:00
Emma Qiao	7b84e48e0f	[None][infra] Waive failed cases on 12/24 (#10257 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-23 22:49:57 -05:00
xinhe-nv	fc1f77eafc	[None][chore] Add failed cases into waives.txt (#10204 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com> Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>	2025-12-24 10:37:23 +08:00
Balaram Buddharaju	8c1cfc872b	[TRTLLM-9493][feat] Custom AllToAll for helix parallelism (#9986 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-23 18:14:30 -08:00
Jhao-Ting Chen	92d90fa29a	[None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS (#10018 ) Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>	2025-12-23 11:41:31 -06:00
Grzegorz Kwasniewski	0027a01ad5	[https://nvbugs/5680312 ][fix] Updated test waiving (#9630 ) Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>	2025-12-23 09:38:12 -08:00
Emma Qiao	984c20e0b2	[None][infra] Waive failed cases on 12/23 (#10236 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-23 08:48:54 -05:00
dongfengy	e284d0bf80	[None][infra] Waive flaky unittest/executor/test_rpc_proxy.py and unittest/executor/test_rpc_worker.py tests (#10209 ) Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com> Signed-off-by: Yanchao Lu <yanchaol@nvidia.com> Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>	2025-12-23 07:43:13 -05:00
Yukun He	522f1d2bc3	[https://nvbugs/5764627 ][chore] waive the time-out test (#10222 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2025-12-23 16:36:06 +08:00
Balaram Buddharaju	f2e00a75de	[None][chore] Remove helix test from rtx test list (#10224 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-23 03:07:37 -05:00
chenfeiz0326	48c875f8ea	[None][fix] Add OpenSearch URL in slurm_launch.sh for Multinode Perf Sanity Test (#9990 ) Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>	2025-12-23 16:02:38 +08:00
Chuang Zhu	53db3b2612	[https://nvbugs/5741884 ][fix] unwaive disagg sampler (#10189 ) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>	2025-12-23 14:38:07 +08:00
xinhe-nv	77b591f73b	[None][chore] Add failed cases into waives.txt (#10177 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Signed-off-by: Jie Li <lijie@nvidia.com> Signed-off-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com> Co-authored-by: Jie Li <lijie@nvidia.com> Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com> Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>	2025-12-23 13:43:50 +08:00
Harshini Komali	d691371eaf	[TRTLLM-9091] [feat] Replace GenAI-Perf with AIPerf (#9310 ) Signed-off-by: lkomali <lkomali@nvidia.com> Signed-off-by: Harshini Komali <157742537+lkomali@users.noreply.github.com> Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>	2025-12-23 13:25:55 +08:00
Pamela Peng	5bc7ffe379	[None][test] Add qa tests for RTX 6K (#10210 ) Signed-off-by: Pamela <179191831+pamelap-nvidia@users.noreply.github.com>	2025-12-22 22:47:09 -05:00
fredricz-20070104	621156ad44	[None][chore] Fix GB300 support issues (#10196 ) Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com> Signed-off-by: fredricz-20070104 <226039983+fredricz-20070104@users.noreply.github.com>	2025-12-23 10:42:41 +08:00
Emma Qiao	ba14a9308e	[None][infra] Waive failed cases on 12/22 (#10200 ) Signed-off-by: qqiao <qqiao@nvidia.com> Signed-off-by: Yanchao Lu <yanchaol@nvidia.com> Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>	2025-12-23 00:05:45 +08:00
Perkz Zheng	c87f1a6b39	[https://nvbugs/5503479 ][fix] update trtllm-gen kernels to address few bugs (#10089 ) Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>	2025-12-22 04:45:33 -05:00
xinhe-nv	d30ee8101e	[None][chore] Remove closed bugs (#10182 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2025-12-22 01:58:17 -05:00
Yuxian Qiu	237fd0eae4	[https://nvbugs/5666821 ][chore] unwaive tests. (#9958 ) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2025-12-22 11:39:45 +08:00
Jin Li	066b653940	[TRTLLM-9880][feat] Include torch compile tests in QA test list (#10149 ) Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>	2025-12-22 10:37:09 +08:00
Yuxian Qiu	2f139ee07e	[https://nvbugs/5701445 ][chore] unwaive test. (#9949 ) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2025-12-22 10:12:54 +08:00
Chuang Zhu	914dd39127	[None][fix] disable cuda ipc on device without nvlink (L40s) for disagg test (#9735 ) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>	2025-12-22 09:29:24 +08:00
dominicshanshan	d274a4c5d3	[https://nvbugs/5701457 ][fix] Unwaive ray test. (#10175 ) Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>	2025-12-22 09:25:58 +08:00
Enwei Zhu	5549067966	[None][ci] Waive GPTOSS test case (#10155 ) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2025-12-22 08:50:44 +08:00
Balaram Buddharaju	5266475014	[None][feat] Cudagraph updates for helix parallelism (#10141 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-21 15:21:52 -05:00
shuyixiong	4fc6036276	[https://nvbugs/5702793 ][fix] Fix view operation on uncontiguous tensor (#10147 ) Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>	2025-12-21 11:47:20 -05:00
bhsueh_NV	cd4b4f43fa	[None][feat] Support Eagle3 on Mistral Large3 (#9971 ) Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>	2025-12-21 10:25:45 -05:00
Emma Qiao	aa5dbb7ca5	[None][infra] Waive failed tests for main branch on 12/21 (#10184 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-21 22:23:46 +08:00
Eran Geva	b15f987972	[None][chore] removed duplicated test from l0_b200.yml (#10090 ) Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>	2025-12-21 11:34:01 +02:00
Bo Li	a66eeab537	[TRTLLM-9805][feat] Skip Softmax Attention. (#9821 ) Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com> Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com> Co-authored-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>	2025-12-21 02:52:42 -05:00
Balaram Buddharaju	dcd3f7b5ea	[https://nvbugs/5744427 ][fix] Fix accuracy test OOM (#10173 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-21 02:03:38 -05:00
Enwei Zhu	2ce785f39a	[https://nvbugs/5643631 ][fix] Fix hostfunc seg fault (#10028 ) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2025-12-20 07:58:43 -05:00
Yuxian Qiu	3b3069b390	[https://nvbugs/5747930 ][fix] Use offline tokenizer for whisper models. (#10121 ) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2025-12-20 09:42:07 +08:00
Balaram Buddharaju	bee9051484	[None][chore] Waive timing out pre-merge test (#10167 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-19 17:56:33 -05:00
Gal Hubara-Agam	20b69a982a	[#10056 ][test] AutoDeploy: Add accuracy test for Nemotron SuperV3 (#10131 ) Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com> Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com> Co-authored-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>	2025-12-19 13:28:42 -08:00
Chang Liu	5489d188a4	[None][fix] Revert the change and remove device count guard for DSv32 (#9631 ) Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>	2025-12-19 15:00:55 -05:00
Venky	dfa11d810e	[TRTC-102][docs] `--extra_llm_api_options`->`--config` in docs/examples/tests (#10005 )	2025-12-19 13:48:43 -05:00
JunyiXu-nv	7b71ff6b8a	[https://nvbugs/5722653 ][fix] Unwaive fixed test (#10157 ) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>	2025-12-19 11:19:20 -05:00
xxi	27e49e2904	[None][fix] waive the failed test test_service_discovery[etcd-load_ba… (#10161 ) Signed-off-by: xxi <xxi@nvidia.com>	2025-12-19 06:14:26 -08:00
xinhe-nv	7b51e3cedb	[TRTLLM-8638][fix] Add failed cases into waives.txt (#10129 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>	2025-12-19 17:55:17 +08:00
Emma Qiao	dd8ce68c94	[None][infra] Update waive and waive failed tests for main branch on 12/19 (#10151 ) Signed-off-by: qqiao <qqiao@nvidia.com>	2025-12-19 01:20:42 -08:00
yufeiwu-nv	52cee573ad	[TRTLLM-8830][test] Overlap scheduler enhancement perf test: Add qwen3_0,8b and llama3.1 test cases (#10114 ) Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>	2025-12-19 17:01:52 +08:00
xinhe-nv	cb0444b1b5	[TRTLLM-8638][fix] Add failed cases into waives.txt (#10132 ) Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com> Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>	2025-12-19 16:07:56 +08:00
JunyiXu-nv	356ad4fe3a	[https://nvbugs/5722653 ][fix] Address port conflict by assigning different port section in the same node. (#10035 ) Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>	2025-12-19 15:34:04 +08:00
William Zhang	478b6b20a1	[#9230 ][refactor] Replace nemotron patches with custom model implementation (#9751 ) * Why? Patching for nemotron H models was growing out of hand, and made certain optimizations more complex than they needed to be. * What? This commit finally gets rid of them, and replaces them with the custom model implementation in `modeling_nemotron_h.py`. Closes #9230 Closes NvBug 5747867 Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>	2025-12-18 19:36:27 -08:00
Balaram Buddharaju	72c5480dfb	[None][chore] Waive test blocking pre-merge 12/18 (#10145 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-18 19:12:05 -08:00
Wangjue Yao	9f283f330b	[None][feat] Support Mooncake transfer engine as a cache transceiver backend (#8309 ) Signed-off-by: wjueyao <wyao123@terpmail.umd.edu> Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co> Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>	2025-12-19 10:09:51 +08:00
Chuang Zhu	e0b2a94309	[None][fix] Fix ready signal in NIXL backend (#10000 ) Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>	2025-12-19 09:43:40 +08:00
Yukun He	bd5b3c2ac0	[https://nvbugs/5721912 ][chore] Unwaive the test (#10108 ) Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>	2025-12-19 09:12:25 +08:00
Anish Shanbhag	91a9ae42d2	[TRTC-71][feat] Add regression testing for config database (#9832 ) Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>	2025-12-18 16:15:38 -08:00
Balaram Buddharaju	799a2ae311	[https://nvbugs/5741331 ][fix] Fix helix accuracy test (#10021 ) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-12-18 15:27:53 -08:00


1875 Commits