Balaram Buddharaju
ccdfa43a6e
[ https://nvbugs/5791900 ][fix] Fix HelixCpMnnvlMemory init with PP ( #10533 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2026-01-13 15:48:42 -05:00
Frida Hou
bf16fbd86c
[ #9283 ][feat] AutoDeploy: separate rms pattern detection from fusion ( #9969 )
...
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
2026-01-13 14:57:27 -05:00
dongfengy
6ee8dbfe0b
[ https://nvbugs/5772396 ][fix] WAR: Disable TinyGEMM PDL due to accuracy issues ( #10619 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
2026-01-13 12:40:11 -05:00
benzh-2025
6df2c8a074
[None][feat] add fp4 gemm + allreduce ( #9729 )
...
Signed-off-by: benzh
Signed-off-by: benzh-2025
2026-01-13 21:11:13 +08:00
Guoming Zhang
c1b0b7350f
[None][test] Unwaive qwen3 next test case. ( #9877 )
...
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2026-01-13 20:42:31 +08:00
Tailing Yuan
38296a472b
[None][feat] Layer-wise benchmarks: make model init more general and support weights loading ( #10562 )
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2026-01-13 19:17:03 +08:00
Erin
55580f8ec1
[NVBUG-5670458][chore] Unwaive lp tests ( #10524 )
...
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Signed-off-by: Erin <14718778+hchings@users.noreply.github.com>
2026-01-13 04:31:27 -05:00
Guoming Zhang
bdaee87895
[TRTLLM-10060][feat] Enable attention dp for Nemotron Super v3. ( #10347 )
...
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2026-01-13 17:13:55 +08:00
JunyiXu-nv
e291a834db
[TRTLLM-8462][feat] Support GET/DELETE v1/responses/{response_id} ( #9937 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2026-01-13 03:57:14 -05:00
JennyLiu
2967d299fb
[TRTLLM-10271][test] Add Spark QA functional and performance cases ( #10564 )
...
Signed-off-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
Co-authored-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
2026-01-13 13:20:15 +08:00
fredricz-20070104
bbe535fddf
[None][chore] Fix disagg assert ( #10596 )
...
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2026-01-12 21:39:57 -05:00
Iman Tabrizian
48b09e5a25
[ https://nvbugs/5689235 ][fix] Fix cancellation+chunked prefill+disagg ( #10111 )
...
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2026-01-12 18:23:26 -05:00
Anish Shanbhag
dacc881993
[ https://nvbugs/5761391 ][fix] Use correct model names for config database regression tests ( #10192 )
...
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2026-01-12 10:55:07 -08:00
Suyog Gupta
a1385243e1
[ #10580 ][fix] re-enable NemotronH MOE MMLU test ( #10594 )
...
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
2026-01-12 09:26:07 -08:00
Emma Qiao
9f044b9dd9
[None][infra] Waive failed tests for main 01/12 ( #10604 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-12 10:24:54 -05:00
mpikulski
bf7998f1b8
[TRTLLM-9522][test] cover LLM API multi_modal_embeddings ( #9963 )
...
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2026-01-12 11:38:22 +01:00
Wanli Jiang
11da7e3605
[None][fix] Solve pillow version conflict ( #10537 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2026-01-12 04:05:54 -05:00
Zhenhuan Chen
3bd319dc8e
[ https://nvbugs/5794796 ][chore] waive test blocking premerge ( #10593 )
...
Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
2026-01-12 15:39:07 +08:00
yufeiwu-nv
8e806abac3
[None][test] Remove most TRT-backend test cases in llm_perf_nim.yml ( #10572 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2026-01-12 15:34:55 +08:00
yingguo-trt
c5914f9085
[None][chore] update deepseekv3.2 test parameter ( #10595 )
...
Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>
2026-01-12 01:43:22 -05:00
chenfeiz0326
54459377d2
[TRTLLM-10248][feat] Support Bot to Send Perf Regression Msg to Slack Channel ( #10489 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2026-01-12 14:23:23 +08:00
Jie Li
5e0dbba0c9
[None][chore]: update waive list ( #10577 )
...
Signed-off-by: Jie Li <lijie@nvidia.com>
2026-01-11 22:18:04 -05:00
Eran Geva
c5d5af9e7f
[ #8391 ][chore] removed llama and added deepseek to AutoDeploy's L0 perf test ( #10585 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2026-01-11 16:31:24 -05:00
Ivy Zhang
7f018c89e9
[None][test] update core test list ( #10538 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2026-01-11 14:08:20 -05:00
Yechan Kim
8e0d20d901
[TRTLLM-10195][feat] K-EXAONE support ( #10355 )
...
Signed-off-by: Jaedeok Kim <jaedeokk@nvidia.com>
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: Jaedeok Kim <jaedeokk@nvidia.com>
2026-01-12 00:29:51 +09:00
HuiGao-NV
3c65ec3c55
[None][chore] waive test case ( #10581 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2026-01-10 18:53:36 -05:00
fredricz-20070104
f6045fac09
[None][chore] Fix Gitlab CI termination issues ( #10576 )
...
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2026-01-10 07:51:18 -05:00
William Zhang
ff7eb93f31
[ https://nvbugs/5669097 ][tests] Add MMMU test for mistral small ( #10530 )
...
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2026-01-09 16:09:28 -08:00
Chenghao Zhang
38f249b479
[ https://nvbugs/5548861 ][fix] AutoDeploy: Fix the test ( #10521 )
...
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2026-01-09 13:30:24 -08:00
yingguo-trt
d80f01d205
[None][feat] Add support for DeepSeek v3.2 tests ( #10561 )
...
Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>
2026-01-09 10:20:29 -05:00
Yechan Kim
7295af68ba
[None][fix] Enable AttentionDP on Qwen3-VL and fix test ( #10435 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2026-01-10 00:13:26 +09:00
Iman Tabrizian
ced88424ef
[ https://nvbugs/5756008 ][fix] unwaive test ( #10523 )
...
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2026-01-09 09:40:07 -05:00
Jie Li
627d306df9
[None][chore] remove some model support; add device constraint ( #10563 )
...
Signed-off-by: Jie Li <lijie@nvidia.com>
2026-01-09 09:36:23 -05:00
ruodil
2b72d33fdc
[TRTLLM-9932][test] add kimi_k2 single node perf test ( #10436 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2026-01-09 05:36:50 -05:00
bhsueh_NV
4a09acd012
[ https://nvbugs/5785206 ][infra] unwaive the accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B ( #10560 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2026-01-09 03:13:29 -05:00
JadoTu
4c498bfe58
[TRTLLM-9676][fix] Fix mamba_cache_manager when enabling cuda_graph_padding and let test cover this case ( #9873 )
...
Signed-off-by: JadoTu <107457950+JadoTu@users.noreply.github.com>
2026-01-09 14:50:16 +08:00
Jie Li
6fcd4e7099
[None][chore] Add failed cases into waives.txt ( #10541 )
...
Signed-off-by: Jie Li <lijie@nvidia.com>
2026-01-09 01:03:47 -05:00
ruodil
d707286ca8
[None][test] restrict max_num_tokens in disagg mtp config ( #10442 )
...
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2026-01-08 21:53:24 -05:00
Balaram Buddharaju
56e779d09f
[None][chore] Waive tests blocking premerge 01/08 ( #10555 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2026-01-08 20:22:28 -05:00
Mike Iovine
4092a87b6f
[ https://nvbugs/5740075 ][fix] Fix sm120 speculation ( #10049 )
...
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2026-01-08 19:55:43 -05:00
William Zhang
c0ae6bbdbe
[None][feat] EPD for Qwen3 VL ( #10470 )
...
* Why?
We would like to support EPD disaggregated serving for Qwen3 VL.
* What?
This commit adds such support, and extends existing unit tests for
correctness checks.
Some minor (protected) interface changes had to be made to the
weight mapper as a side-effect.
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2026-01-08 06:45:54 -05:00
bhsueh_NV
bea61bb17d
[None][fix] Mistral large 3 few code refine ( #10405 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2026-01-08 06:38:49 -05:00
Emma Qiao
43839c7d9b
[TRTLLM-9642][infra] Increase pytest verbosity for failed tests ( #9657 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
2026-01-08 02:33:48 -05:00
HuiGao-NV
22c81cb5fa
[None][chore] Enable seg fault cases since one race condition is fixed ( #10398 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2026-01-08 02:15:30 -05:00
Barry Kang
f57aab5255
[ https://nvbugs/5775402 ][fix] Fix concurrency list in Wide-EP perf tests ( #10529 )
...
Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>
2026-01-08 01:58:55 -05:00
Lucas Liebenwein
30f8455d29
[ https://nvbugs/5747878 ][fix] unwaive llama4 scout tests ( #10468 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-07 23:33:45 -05:00
yingguo-trt
f8b2a8fd30
[None][chore] Support multiple job submission at the same time ( #10492 )
...
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Co-authored-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2026-01-07 21:51:36 -05:00
Yuxian Qiu
b85c447ceb
[ https://nvbugs/5784543 ][fix] Setup dist before using autotuner. ( #10491 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2026-01-08 10:32:50 +08:00
xxi
81f878c279
[ https://nvbugs/5707392 ][fix] unwaive test_fused_moe_fp8_blockwise_wide_ep[NotEnabled] ( #10428 )
...
Signed-off-by: xxi <xxi@nvidia.com>
2026-01-08 09:17:59 +08:00
Lucas Liebenwein
d736c7f290
[ https://nvbugs/5761665 ][fix] AutoDeploy: handle bugs for 25.12 dlfw upgrade ( #10511 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-07 20:16:53 -05:00
yufeiwu-nv
b130d58c88
[None][test] Remove most TRT-backend test cases in llm_perf_nim.yml ( #10487 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2026-01-07 17:18:43 +08:00
xinhe-nv
872210468b
[TRTLLM-8638][fix] Add failed cases into waives.txt ( #10474 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2026-01-07 03:23:43 -05:00
yingguo-trt
cbf8357e5f
[ https://nvbugs/5726086 ][fix] update kimi-k2-1k1k dataset ( #10473 )
...
Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>
2026-01-07 01:24:08 -05:00
xinhe-nv
be5579633e
[TRTLLM-8638][fix] Add failed cases into waives.txt ( #10457 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2026-01-07 00:57:03 -05:00
Fanrong Li
a34aa63685
[ https://nvbugs/5767223 ][feat] add pp support for DeepSeek-v3.2 ( #10449 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2026-01-07 12:29:51 +08:00
xinhe-nv
1fbadd2dde
[None][chore] Add failed cases into waives.txt ( #10365 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Jie Li <lijie@nvidia.com>
Signed-off-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>
Co-authored-by: Jie Li <lijie@nvidia.com>
Co-authored-by: Jie Li <76780849+jieli-matrix@users.noreply.github.com>
2026-01-06 22:08:06 -05:00
Ivy Zhang
4a1b2e23b3
[ https://nvbugs/5698434 ][test] add qwen3-4b accuracy test case ( #10382 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2026-01-06 21:56:34 -05:00
Lucas Liebenwein
6095c80e56
[ https://nvbugs/5721907 ][fix] AutoDeploy: improve numerical stability of flashinfer attention test ( #10467 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-06 21:11:06 -05:00
Zongfei Jing
bb2f883296
[None] [feat] Add test script and raster M for gather fc1 kernel ( #10429 )
...
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
2026-01-07 09:31:49 +08:00
Lucas Liebenwein
bb6a3973aa
[ https://nvbugs/5732942 ][fix] AutoDeploy: handle transformers 4.57.1 upgrade fixes ( #10466 )
...
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-06 19:55:49 -05:00
Mike Iovine
77be1b7572
[ https://nvbugs/5749988 ][fix] Remove redundant qwen3 spec dec test ( #10387 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2026-01-06 11:46:34 -05:00
Enwei Zhu
037753f65b
[ https://nvbugs/5748600 ][ci] Unwaive disagg guided decoding test ( #10409 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2026-01-06 11:38:12 -05:00
JunyiXu-nv
7d62773c6c
[ https://nvbugs/5760726 ][fix] Use random port in container port section ( #10432 )
...
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
2026-01-06 23:25:46 +08:00
xinhe-nv
704f58dfbe
[TRTLLM-8638][fix] Add failed cases into waives.txt ( #10427 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2026-01-06 04:47:54 -05:00
Emma Qiao
6507087c3f
[None][infra] Waive failed cases on 1/6 ( #10440 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-06 16:54:54 +08:00
Bo Li
df0b976b99
[ https://nvbugs/5785206 ][infra] Waive TestQwen3_30B_A3B::test_fp8[latency-torch_compile=False]. ( #10441 )
...
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2026-01-06 03:32:19 -05:00
William Zhang
ab58d7cac1
[ https://nvbugs/5772361 ][ci] Unwaive tests that have been fixed ( #10424 )
...
These tests were all failing due to the same issue, and were fixed
in #10394.
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2026-01-05 23:49:54 -08:00
Ivy Zhang
1e828587e5
[TRTLLM-9896][test] add vswa test cases coverage ( #10146 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2026-01-06 02:02:29 -05:00
Yiqing Yan
5108a69fc0
[TRTLLM-9622][infra] Enable DGX_B300 multi-gpu testing in pre-merge pipeline ( #9699 )
...
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2026-01-06 14:39:55 +08:00
xinhe-nv
998527724c
[TRTLLM-8638][fix] Add failed cases into waives.txt ( #10367 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2026-01-06 01:09:21 -05:00
Ivy Zhang
22a1d31a27
[None][test] update test case constraint ( #10381 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2026-01-06 12:28:59 +08:00
xinhe-nv
1b1058279c
[TRTLLM-8638][fix] Add failed cases into waives.txt ( #10384 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2026-01-05 23:02:27 -05:00
kris1025
3e98265682
[None][chore] unwaive qwen3 30b test ( #10115 )
...
Signed-off-by: linquanh <linquanh@nvidia.com>
2026-01-06 11:17:08 +08:00
alel
6b8ae6fa81
[None][feat] CuteDSL MOE FC1 Enhancement ( #10088 )
...
Signed-off-by: Yuhan Li <51736452+liyuhannnnn@users.noreply.github.com>
2026-01-06 09:30:43 +08:00
chenfeiz0326
8a04c05079
[None][fix] Only Use Throughput Metrics to Check Regression ( #10404 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2026-01-06 09:21:15 +08:00
Chuang Zhu
536a8f6a9c
[TRTLLM-9527][feat] Add transferAgent binding (step 1) ( #10113 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2026-01-06 08:40:38 +08:00
Simeng Liu
3b56548fcf
[ https://nvbugs/5777044 ][chore] Remove solved bugs from waives.txt ( #10422 )
...
Signed-off-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>
2026-01-05 16:56:58 -05:00
Mike Iovine
91ff46d418
[ https://nvbugs/5745152 ][fix] Unwaive gpt oss spec decode test ( #10370 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2026-01-05 16:06:58 -05:00
Mike Iovine
7a2dab8e85
[ https://nvbugs/5695984 ][fix] Unwaive llama3 eagle test ( #10092 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2026-01-05 16:03:35 -05:00
Yan Chunwei
6b71b03947
[TRTLLM-9551][infra] Partition test_llm_pytorch.py for parallel execution ( #10400 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2026-01-05 13:58:03 -05:00
Mike Iovine
db2614ef10
[ https://nvbugs/5772414 ][fix] Fix draft token tree depth=1 corner case ( #10385 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2026-01-05 17:20:14 +01:00
Gal Hubara-Agam
e98c27ee4f
[TRTLLM-10053][feat] AutoDeploy: Add Super v3 config file, improve test runtime ( #10397 )
...
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
2026-01-05 18:17:27 +02:00
Anthony Chang
225d3a9001
[None][perf] TRTLLM MoE maps to lower tuning buckets when ep>1 ( #9998 )
...
Signed-off-by: Anthony Chang <27950904+rosenrodt@users.noreply.github.com>
2026-01-05 17:16:12 +01:00
Balaram Buddharaju
a792c23dcf
[TRTLLM-9465][fix] Swap TP-CP grouping order ( #10350 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2026-01-05 20:08:03 +08:00
xinhe-nv
b1733d56f6
[TRTLLM-9381][test] add disag-serving kimi k2 thinking tests ( #10357 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2026-01-05 05:15:52 -05:00
Fanrong Li
4931c5eb3a
[None][feat] update deepgemm to the DeepGEMM/nv_dev branch ( #9898 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2026-01-05 16:43:42 +08:00
Yukun He
d272f1a9bc
[TRTLLM-8821][feat] Apply AutoTuner to AllReduce Op for strategy tuning. ( #8531 )
...
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2026-01-05 15:44:37 +08:00
HuiGao-NV
2f768b76f8
[ https://nvbugs/5715568 ][fix] Force release torch memory when LLM is destroyed ( #10314 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2026-01-05 15:30:18 +08:00
Emma Qiao
c63fad7d96
[None][infra] Waive failed cases again on 1/5 ( #10403 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-05 02:12:16 -05:00
Yihan Wang
e7a4486294
[ https://nvbugs/5752521 ][fix] Unwaive test_trtllm_flashinfer_symbol_collision.py ( #10227 )
...
Signed-off-by: Yihan Wang <yihwang@nvidia.com>
2026-01-05 14:37:05 +08:00
Yukun He
0937df2c68
[TRTLLM-10185][feat] AutoTuner Cache: Support cache file lock and merge all ranks into one ( #10336 )
...
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2026-01-05 13:44:09 +08:00
Emma Qiao
5a8bfcbb50
[None][infra]Waive failed cases in post-merge on 1/5 ( #10399 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2026-01-05 12:30:10 +08:00
Tailing Yuan
a7fe043b13
[None][feat] Layer-wise benchmarks: support TEP balance, polish slurm scripts ( #10237 )
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2026-01-05 11:23:04 +08:00
Yuxian Qiu
5773a4d775
[ https://nvbugs/5701425 ][chore] Unwaive tests. ( #10269 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2026-01-05 09:54:26 +08:00
Fanrong Li
b5a1e10bc0
[ https://nvbugs/5779534 ][fix] fix buffer reuse for CUDA graph attention metadata ( #10393 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2026-01-05 09:43:44 +08:00
Wanli Jiang
da0830670a
[TRTLLM-10065][feat] Add accuracy tests for super-v3 with multiple-gpus ( #10234 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2026-01-05 09:41:49 +08:00
Lizhi Zhou
82c1ba84a7
[ https://nvbugs/5649010 ][fix] use 0 port as arbitrary port when disagg service discovery is enabled ( #10383 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2026-01-05 09:40:40 +08:00
Eran Geva
e2f5455533
[ #8391 ][chore] added deepseek_r1_distill_qwen_32b AutoDeploy perf test to L0 ( #10377 )
...
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2026-01-04 20:35:52 +02:00
chenfeiz0326
a65b0d4efa
[None][fix] Decrease Pre Merge Perf Tests ( #10390 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2026-01-04 12:21:34 -05:00
Yanchao Lu
c4f27fa4c0
[None][ci] Some tweaks for the CI pipeline ( #10359 )
...
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2026-01-04 11:10:47 -05:00