Xiwen Yu
|
5f508b7d43
|
Merge remote-tracking branch 'origin/main' into feat/b300_cu13
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-10 07:46:25 +08:00 |
|
Chang Liu
|
faa2f46554
|
[TRTLLM-5059][feat] Enable KV-cache reuse and add E2E tests for llava-next (#7349)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
|
2025-09-09 14:51:36 -04:00 |
|
QI JUN
|
a0e1604898
|
[None][ci] add DGX_H100-2_GPUs-PyTorch-Others-1 pipeline (#7629)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-09-09 11:06:32 -04:00 |
|
Liao Lanyu
|
af403848d7
|
[https://nvbugs/5445466][fix] unwaive DS R1 test cases with bug already fixed (#7429)
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>
|
2025-09-09 17:25:49 +08:00 |
|
Perkz Zheng
|
da6cb541a2
|
[None][feat] Optimize MLA kernels with separate reduction kernels (#7597)
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
|
2025-09-09 16:58:44 +08:00 |
|
xinhe-nv
|
8a52015f50
|
[None][chore] Remove closed bugs (#7591)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-09-09 04:08:42 -04:00 |
|
Xiwen Yu
|
a8b630f178
|
Merge remote-tracking branch 'origin/main' into feat/b300_cu13
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-09 14:34:27 +08:00 |
|
Xiwen Yu
|
82833fa961
|
address comments
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-09 14:18:16 +08:00 |
|
Yiqing Yan
|
5c616da2fd
|
[TRTLLM-5877][infra] Add fmha tests and auto trigger rules (#6050)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
|
2025-09-09 11:33:09 +08:00 |
|
Wanli Jiang
|
1e0669d27a
|
[https://nvbugs/5453709][fix] Remove transformers version limit in Qwen2VL (#7152)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2025-09-09 10:38:20 +08:00 |
|
Iman Tabrizian
|
d96c54d8ae
|
[None][test] Skip eagle3 test (#7627)
Signed-off-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
|
2025-09-08 17:23:53 -04:00 |
|
dongfengy
|
fdd5bd49fc
|
[https://nvbugs/5481080][fix] Fix GPTOSS W4A16 reference (#7323)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
|
2025-09-08 13:59:28 -07:00 |
|
Chuang Zhu
|
77657a1c12
|
[TRTLLM-7361][feat] KV cache transfer for uneven pp (#7117)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-09-08 13:37:46 -04:00 |
|
Xiwen Yu
|
fdaf4e2985
|
Merge remote-tracking branch 'origin/main' into feat/b300_cu13
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-08 15:14:54 +08:00 |
|
dominicshanshan
|
c9dca69e1b
|
[None][chore] Mass integration of release/1.0 - 3rd (#7519)
Signed-off-by: Nave Assaf <nassaf@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Bo Deng <deemod@nvidia.com>
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Yifei Zhang <219273404+yifeizhang-c@users.noreply.github.com>
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com>
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
Signed-off-by: Pamela <179191831+pamelap-nvidia@users.noreply.github.com>
Signed-off-by: Hui Gao <huig@nvidia.com>
Signed-off-by: Alexandre Milesi <30204471+milesial@users.noreply.github.com>
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
Co-authored-by: Nave Assaf <55059536+Naveassaf@users.noreply.github.com>
Co-authored-by: Yechan Kim <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: brb-nv <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
Co-authored-by: Emma Qiao <qqiao@nvidia.com>
Co-authored-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: Bo Deng <deemod@nvidia.com>
Co-authored-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: yifeizhang-c <219273404+yifeizhang-c@users.noreply.github.com>
Co-authored-by: amitz-nv <203509407+amitz-nv@users.noreply.github.com>
Co-authored-by: Erin <14718778+hchings@users.noreply.github.com>
Co-authored-by: chenfeiz0326 <chenfeiz@nvidia.com>
Co-authored-by: ChristinaZ <83400082+ChristinaZ@users.noreply.github.com>
Co-authored-by: Venky <23023424+venkywonka@users.noreply.github.com>
Co-authored-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
Co-authored-by: HuiGao-NV <huig@nvidia.com>
Co-authored-by: milesial <milesial@users.noreply.github.com>
Co-authored-by: Shi Xiaowei <39303645+Shixiaowei02@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: peaceh-nv <103117813+peaceh-nv@users.noreply.github.com>
Co-authored-by: Guoming Zhang <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
Co-authored-by: pcastonguay <55748270+pcastonguay@users.noreply.github.com>
Co-authored-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Linda <57756729+Linda-Stadter@users.noreply.github.com>
Co-authored-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
Co-authored-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Co-authored-by: Jiagan Cheng <jiaganc@nvidia.com>
Co-authored-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
Co-authored-by: Sharan Chetlur <116769508+schetlur-nv@users.noreply.github.com>
Co-authored-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
|
2025-09-08 14:03:04 +08:00 |
|
Xiwen Yu
|
e6bb1fe8af
|
remove non-exist cases
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-07 23:24:46 +08:00 |
|
Raayan Dhar
|
bae9560e62
|
[https://nvbugs/5448767][fix] sync termination of requests across PP ranks (#7455)
Signed-off-by: raayandhar <rdhar@nvidia.com>
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Co-authored-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2025-09-07 08:45:49 -04:00 |
|
Xiwen Yu
|
322db710dc
|
Merge remote-tracking branch 'origin/main' into feat/b300_cu13
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-06 23:58:04 +08:00 |
|
dominicshanshan
|
9a97f0a3b7
|
[None][ci] Waive qwen3 test for accuracy bug in https://nvbugs/5505402 (#7585)
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-06 21:29:16 +08:00 |
|
QI JUN
|
525bb806a9
|
[None][ci] move some test cases of DGX H100 to post merge (#7569)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-09-06 01:03:38 -04:00 |
|
Xiwen Yu
|
5e7aa76bb4
|
Merge branch 'user/sm103_trtllmgen' into feat/b300_cu13
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-06 00:49:23 +08:00 |
|
Emma Qiao
|
d8ec546b73
|
[None][infra] Waive failed tests on main branch 0905 (#7564)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-09-05 22:46:46 +08:00 |
|
Xiwen Yu
|
2c3f4cbeee
|
Merge remote-tracking branch 'origin/main' into feat/b300_cu13
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-05 15:53:43 +08:00 |
|
xinhe-nv
|
8e3962d278
|
[TRTLLM-6642][feat] add gptoss 20g tests (#7361)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-09-05 02:20:28 -04:00 |
|
xinhe-nv
|
b3ba3d98d2
|
[None][chore] Remove closed bugs (#7408)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
|
2025-09-05 02:11:16 -04:00 |
|
QI JUN
|
ff3704897b
|
[None][ci] remove unnecessary test_modeling_deepseek.py (#7542)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-09-04 20:05:27 -07:00 |
|
Jin Li
|
2189a2f3ff
|
[https://nvbugs/5483615][fix] Remove unnecessary assertion to let mai… (#7441)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-09-05 10:56:21 +08:00 |
|
Ivy Zhang
|
b46e0ae5d4
|
[None][test] update nim and full test list (#7468)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2025-09-04 09:06:01 -04:00 |
|
Jin Li
|
2a2dfe273b
|
[https://nvbugs/5485102][fix] Correctly set stride for piecewise outp… (#7442)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-09-04 10:48:15 +08:00 |
|
Stanley Sun
|
db8eb0a447
|
[TRTLLM-7876][test] Test trtllm-serve with --extra_llm_api_options (#7492)
Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
|
2025-09-04 10:34:38 +08:00 |
|
Enwei Zhu
|
5ff3a65b23
|
[TRTLLM-7028][feat] Enable guided decoding with speculative decoding (part 2: one-model engine) (#6948)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-09-03 15:16:11 -07:00 |
|
Stanley Sun
|
cebbf48b74
|
[TRTLLM-7363][test] Add 8-GPU test cases for RTX6000 (#7083)
Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
|
2025-09-03 08:36:52 -04:00 |
|
Mike Iovine
|
79d93f9419
|
[https://nvbugs/5488141][fix] Unwaive llama3 test_eagle3 (#7486)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2025-09-03 14:10:40 +08:00 |
|
Wanli Jiang
|
4223a9aada
|
[TRTLLM-7261][feat] Support phi-4 model in pytorch backend (#7371)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2025-09-03 10:27:42 +08:00 |
|
Xiwen Yu
|
5bd50d477e
|
update mha cubins and support 103a
Signed-off-by: Xiwen Yu <xiweny@nvidia.com>
|
2025-09-02 19:26:24 -07:00 |
|
Simeng Liu
|
bcc55bcdf3
|
[https://nvbugs/5470782][fix] Add specific test names for test_deepseek.py (#7318)
Signed-off-by: Simeng Liu <simengl@nvidia.com>
|
2025-09-02 10:31:40 -07:00 |
|
Emma Qiao
|
aae5d22bfe
|
[None][infra] Waive failed tests on main branch 0902 (#7482)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-09-02 10:16:49 -04:00 |
|
peaceh-nv
|
90479c50fb
|
[https://nvbugs/5453992][unwaive] Unwaive llama quickstart test (#7242)
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
|
2025-09-02 20:28:32 +08:00 |
|
JunyiXu-nv
|
eefe5f2093
|
[TRTLLM-7208][feat] Implement basic functionalities for Responses API (#7341)
Signed-off-by: Junyi Xu <junyix@nvidia.com>
|
2025-09-02 07:08:22 -04:00 |
|
HuiGao-NV
|
7279297717
|
[None][infra] waive test case failed on post-merge (#7471)
Signed-off-by: Hui Gao <huig@nvidia.com>
|
2025-09-02 06:20:08 -04:00 |
|
aalanwyr
|
c3c95736a1
|
[TRTLLM-6643][feat] Add DeepSeek-v3-0324 e2e torch test (#7413)
Signed-off-by: Yaran Wu <28771492+aalanwyr@users.noreply.github.com>
|
2025-09-02 17:21:27 +08:00 |
|
Yan Chunwei
|
f90375f37c
|
[https://nvbugs/5476580][fix] unwaive test_nvfp4_4gpus (#7454)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
|
2025-09-02 04:17:14 -04:00 |
|
Xiwen Yu
|
62a78973a8
|
Merge remote-tracking branch 'origin/main' into user/xiweny/merge_0901
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-02 10:12:30 +08:00 |
|
Emma Qiao
|
01dfd3af1b
|
[None][infra] Waive failed case on main 0901 (#7447)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-09-01 23:27:24 +08:00 |
|
bhsueh_NV
|
16e9d1121c
|
[https://nvbugs/5481087][fix] fix bug of ci when we use mocker (#7332)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
|
2025-09-01 16:22:45 +08:00 |
|
nvamyt
|
efaefca2c8
|
[None][test] Update case that not support passing quantization fp8 for pytorch backend (#7302)
Signed-off-by: nvamyt <amyt@nvidia.com>
|
2025-09-01 12:59:21 +08:00 |
|
Xiwen Yu
|
38ef850552
|
Merge remote-tracking branch 'gitlab/main' into user/xiweny/merge_0901
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-01 11:46:44 +08:00 |
|
Yiqing Yan
|
21291f3d8e
|
[None][chore] Remove duplicate test waives (#6999)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-01 11:02:31 +08:00 |
|
Emma Qiao
|
09bca7ca82
|
[None][infra] Waive failed tests for release branch 0818 (#6993)
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-01 11:02:31 +08:00 |
|
Ivy Zhang
|
29cdcdb56a
|
[None][fix] update skip config (#6891)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-01 11:02:31 +08:00 |
|