Lucas Liebenwein
|
6095c80e56
|
[https://nvbugs/5721907][fix] AutoDeploy: improve numerical stability of flashinfer attention test (#10467)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2026-01-06 21:11:06 -05:00 |
|
Mike Iovine
|
77be1b7572
|
[https://nvbugs/5749988][fix] Remove redundant qwen3 spec dec test (#10387)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2026-01-06 11:46:34 -05:00 |
|
Enwei Zhu
|
037753f65b
|
[https://nvbugs/5748600][ci] Unwaive disagg guided decoding test (#10409)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2026-01-06 11:38:12 -05:00 |
|
JunyiXu-nv
|
7d62773c6c
|
[https://nvbugs/5760726][fix] Use random port in container port section (#10432)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2026-01-06 23:25:46 +08:00 |
|
xinhe-nv
|
704f58dfbe
|
[TRTLLM-8638][fix] Add failed cases into waives.txt (#10427)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
|
2026-01-06 04:47:54 -05:00 |
|
Emma Qiao
|
6507087c3f
|
[None][infra] Waive failed cases on 1/6 (#10440)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2026-01-06 16:54:54 +08:00 |
|
Bo Li
|
df0b976b99
|
[https://nvbugs/5785206][infra] Waive TestQwen3_30B_A3B::test_fp8[latency-torch_compile=False]. (#10441)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
|
2026-01-06 03:32:19 -05:00 |
|
William Zhang
|
ab58d7cac1
|
[https://nvbugs/5772361][ci] Unwaive tests that have been fixed (#10424)
These tests were all failing due to the same issue and were fixed in #10394.
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
|
2026-01-05 23:49:54 -08:00 |
|
Ivy Zhang
|
1e828587e5
|
[TRTLLM-9896][test] add vswa test cases coverage (#10146)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2026-01-06 02:02:29 -05:00 |
|
Yiqing Yan
|
5108a69fc0
|
[TRTLLM-9622][infra] Enable DGX_B300 multi-gpu testing in pre-merge pipeline (#9699)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
|
2026-01-06 14:39:55 +08:00 |
|
xinhe-nv
|
998527724c
|
[TRTLLM-8638][fix] Add failed cases into waives.txt (#10367)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
|
2026-01-06 01:09:21 -05:00 |
|
Ivy Zhang
|
22a1d31a27
|
[None][test] update test case constraint (#10381)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2026-01-06 12:28:59 +08:00 |
|
xinhe-nv
|
1b1058279c
|
[TRTLLM-8638][fix] Add failed cases into waives.txt (#10384)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
|
2026-01-05 23:02:27 -05:00 |
|
kris1025
|
3e98265682
|
[None][chore] unwaive qwen3 30b test (#10115)
Signed-off-by: linquanh <linquanh@nvidia.com>
|
2026-01-06 11:17:08 +08:00 |
|
chenfeiz0326
|
8a04c05079
|
[None][fix] Only Use Throughput Metrics to Check Regression (#10404)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
|
2026-01-06 09:21:15 +08:00 |
|
Simeng Liu
|
3b56548fcf
|
[https://nvbugs/5777044][chore] Remove solved bugs from waives.txt (#10422)
Signed-off-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>
|
2026-01-05 16:56:58 -05:00 |
|
Mike Iovine
|
91ff46d418
|
[https://nvbugs/5745152][fix] Unwaive gpt oss spec decode test (#10370)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2026-01-05 16:06:58 -05:00 |
|
Mike Iovine
|
7a2dab8e85
|
[https://nvbugs/5695984][fix] Unwaive llama3 eagle test (#10092)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2026-01-05 16:03:35 -05:00 |
|
Yan Chunwei
|
6b71b03947
|
[TRTLLM-9551][infra] Partition test_llm_pytorch.py for parallel execution (#10400)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
|
2026-01-05 13:58:03 -05:00 |
|
Mike Iovine
|
db2614ef10
|
[https://nvbugs/5772414][fix] Fix draft token tree depth=1 corner case (#10385)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2026-01-05 17:20:14 +01:00 |
|
Gal Hubara-Agam
|
e98c27ee4f
|
[TRTLLM-10053][feat] AutoDeploy: Add Super v3 config file, improve test runtime (#10397)
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
|
2026-01-05 18:17:27 +02:00 |
|
Balaram Buddharaju
|
a792c23dcf
|
[TRTLLM-9465][fix] Swap TP-CP grouping order (#10350)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2026-01-05 20:08:03 +08:00 |
|
xinhe-nv
|
b1733d56f6
|
[TRTLLM-9381][test] add disag-serving kimi k2 thinking tests (#10357)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2026-01-05 05:15:52 -05:00 |
|
Fanrong Li
|
4931c5eb3a
|
[None][feat] update deepgemm to the DeepGEMM/nv_dev branch (#9898)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2026-01-05 16:43:42 +08:00 |
|
HuiGao-NV
|
2f768b76f8
|
[https://nvbugs/5715568][fix] Force release torch memory when LLM is destroyed (#10314)
Signed-off-by: Hui Gao <huig@nvidia.com>
|
2026-01-05 15:30:18 +08:00 |
|
Emma Qiao
|
c63fad7d96
|
[None][infra] Waive failed cases again on 1/5 (#10403)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2026-01-05 02:12:16 -05:00 |
|
Yihan Wang
|
e7a4486294
|
[https://nvbugs/5752521][fix] Unwaive test_trtllm_flashinfer_symbol_collision.py (#10227)
Signed-off-by: Yihan Wang <yihwang@nvidia.com>
|
2026-01-05 14:37:05 +08:00 |
|
Yukun He
|
0937df2c68
|
[TRTLLM-10185][feat] AutoTuner Cache: Support cache file lock and merge all ranks into one (#10336)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
|
2026-01-05 13:44:09 +08:00 |
|
Emma Qiao
|
5a8bfcbb50
|
[None][infra] Waive failed cases in post-merge on 1/5 (#10399)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2026-01-05 12:30:10 +08:00 |
|
Yuxian Qiu
|
5773a4d775
|
[https://nvbugs/5701425][chore] Unwaive tests. (#10269)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
|
2026-01-05 09:54:26 +08:00 |
|
Fanrong Li
|
b5a1e10bc0
|
[https://nvbugs/5779534][fix] fix buffer reuse for CUDA graph attention metadata (#10393)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2026-01-05 09:43:44 +08:00 |
|
Wanli Jiang
|
da0830670a
|
[TRTLLM-10065][feat] Add accuracy tests for super-v3 with multiple-gpus (#10234)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2026-01-05 09:41:49 +08:00 |
|
Lizhi Zhou
|
82c1ba84a7
|
[https://nvbugs/5649010][fix] use 0 port as arbitrary port when disagg service discovery is enabled (#10383)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2026-01-05 09:40:40 +08:00 |
|
Eran Geva
|
e2f5455533
|
[#8391][chore] added deepseek_r1_distill_qwen_32b AutoDeploy perf test to L0 (#10377)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
|
2026-01-04 20:35:52 +02:00 |
|
chenfeiz0326
|
a65b0d4efa
|
[None][fix] Decrease Pre Merge Perf Tests (#10390)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
|
2026-01-04 12:21:34 -05:00 |
|
Yanchao Lu
|
c4f27fa4c0
|
[None][ci] Some tweaks for the CI pipeline (#10359)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
|
2026-01-04 11:10:47 -05:00 |
|
dongfengy
|
afc533193d
|
[None][feat] Support nvfp4 for gptoss (#8956)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
|
2026-01-04 08:57:44 -05:00 |
|
Jaedeok Kim
|
a4dcc6a711
|
[TRTLLM-10171][fix] Correct attention handling in ModelConfig and KVCacheManager (#10330)
Signed-off-by: Jaedeok Kim <jaedeokk@nvidia.com>
|
2026-01-04 06:07:30 -05:00 |
|
Yuxian Qiu
|
6ba04eba06
|
[https://nvbugs/5748683][fix] Use get_free_port_in_ci to avoid port conflict. (#10392)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
|
2026-01-04 19:04:58 +08:00 |
|
Yanchao Lu
|
c0b3c2b919
|
[None][ci] Remove an invalid test waive
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
|
2026-01-03 23:34:13 +08:00 |
|
Emma Qiao
|
865992b86b
|
[None][infra] Waive failed cases on 1/3 (#10391)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2026-01-03 05:54:09 -05:00 |
|
Gal Hubara-Agam
|
f3dd6da080
|
[#10056][chore] AutoDeploy: Enable Nemo SuperV3 accuracy test (#10308)
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
|
2026-01-02 11:20:19 +02:00 |
|
chenfeiz0326
|
5e0e48144f
|
[None][fix] Minor updates on Perf Test System (#10375)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
|
2026-01-02 17:17:42 +08:00 |
|
fredricz-20070104
|
f631b25c85
|
[None][test] Unified slurm extra args management and session collection logic (#10332)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>
Co-authored-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>
|
2026-01-01 21:10:51 -05:00 |
|
Balaram Buddharaju
|
4a1b742aa0
|
[TRTLLM-9467][fix] Fix PP+CP combination with helix parallelism (#10312)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2026-01-01 13:42:53 -05:00 |
|
Balaram Buddharaju
|
9f5b750a93
|
[None][chore] Waive tests blocking pre-merge 12/31 (#10373)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2026-01-01 03:00:24 -05:00 |
|
Balaram Buddharaju
|
0b75340223
|
[https://nvbugs/5744427][fix] Make Gemma3 multimodal test fp8 (#10368)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2026-01-01 01:11:34 -05:00 |
|
Yuxian Qiu
|
ff836d4f41
|
[https://nvbugs/5740359][chore] Unwaive tests. (#10260)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
|
2026-01-01 09:53:34 +08:00 |
|
Lucas Liebenwein
|
1bbe71b3ed
|
[#10244][feat] AutoDeploy: separate prefill/decode in flashinfer (#10252)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2025-12-31 17:01:24 -05:00 |
|
Simeng Liu
|
84d107b2f0
|
[https://nvbugs/5717993][fix] Add execution_stream across PyExecutor, KVCacheManager, PeftCacheManager to ensure proper CUDA stream synchronization between KV cache transfer operations and model forward kernels. (#10060)
Signed-off-by: SimengLiu-nv <simengl@nvidia.com>
|
2025-12-31 09:22:54 -08:00 |
|