xinhe-nv
|
3e0344a53d
|
[None][chore] Add failed cases into waives.txt (#10301)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-12-30 14:04:28 +08:00 |
|
Yueh-Ting (eop) Chen
|
9cee32ab39
|
[https://nvbugs/5625990][fix] Respect VSWA scheme when doing block store for reuse and load block for reuse in KV cache manager (#10183)
Signed-off-by: eopXD <yuehtingc@nvidia.com>
|
2025-12-29 14:29:14 +08:00 |
|
Jin Li
|
c04563657e
|
[TRTLLM-7735][feat] Attention NVFP4 out support for torch compile (#9740)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-12-27 00:07:20 +08:00 |
|
dongfengy
|
bfc591994c
|
[https://nvbugs/5745152][fix] Fix some GPTOSS test setups (#10085)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
|
2025-12-26 17:52:40 +08:00 |
|
bhsueh_NV
|
db3430f589
|
[None][feat] Support VLM part for Mistral Large 3 (#10188)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
|
2025-12-25 11:20:58 -05:00 |
|
Balaram Buddharaju
|
8c1cfc872b
|
[TRTLLM-9493][feat] Custom AllToAll for helix parallelism (#9986)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-23 18:14:30 -08:00 |
|
Perkz Zheng
|
c87f1a6b39
|
[https://nvbugs/5503479][fix] update trtllm-gen kernels to address few bugs (#10089)
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
|
2025-12-22 04:45:33 -05:00 |
|
Chuang Zhu
|
914dd39127
|
[None][fix] disable cuda ipc on device without nvlink (L40s) for disagg test (#9735)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-12-22 09:29:24 +08:00 |
|
Balaram Buddharaju
|
5266475014
|
[None][feat] Cudagraph updates for helix parallelism (#10141)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-21 15:21:52 -05:00 |
|
bhsueh_NV
|
cd4b4f43fa
|
[None][feat] Support Eagle3 on Mistral Large3 (#9971)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
|
2025-12-21 10:25:45 -05:00 |
|
Balaram Buddharaju
|
dcd3f7b5ea
|
[https://nvbugs/5744427][fix] Fix accuracy test OOM (#10173)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-21 02:03:38 -05:00 |
|
Gal Hubara-Agam
|
20b69a982a
|
[#10056][test] AutoDeploy: Add accuracy test for Nemotron SuperV3 (#10131)
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
Co-authored-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
|
2025-12-19 13:28:42 -08:00 |
|
Chang Liu
|
5489d188a4
|
[None][fix] Revert the change and remove device count guard for DSv32 (#9631)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
|
2025-12-19 15:00:55 -05:00 |
|
Venky
|
dfa11d810e
|
[TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests (#10005)
|
2025-12-19 13:48:43 -05:00 |
|
Balaram Buddharaju
|
799a2ae311
|
[https://nvbugs/5741331][fix] Fix helix accuracy test (#10021)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-18 15:27:53 -08:00 |
|
Wanli Jiang
|
601c29ca73
|
[https://nvbugs/5721644][fix] Update tests for nemotron_h (#9993)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2025-12-18 12:38:02 +08:00 |
|
xinhe-nv
|
c1cfb61b1b
|
[TRTLLM-9381][feat] Add kimi k2 fp4 tests (#9906)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-12-17 18:15:27 -08:00 |
|
Enwei Zhu
|
609d1d0383
|
[None][fix] Fix Illegal Memory Access for CuteDSL Grouped GEMM (#10008)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-12-16 04:06:49 -08:00 |
|
Yechan Kim
|
8ba8699f66
|
[TRTLLM-8310][feat] Add Qwen3-VL-MoE (#9689)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
|
2025-12-15 20:05:20 -08:00 |
|
Balaram Buddharaju
|
dfc8799352
|
[https://nvbugs/5669114][fix] Switch to MMMU benchmark for Gemma3 27B (#9966)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-14 21:23:59 -08:00 |
|
Fanrong Li
|
8f144d9282
|
[TRTLLM-9416][feat] Skip DS-v3.2 indexer MQA and Top-K for short sequences. (#9524)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2025-12-15 12:42:25 +08:00 |
|
xxi
|
f5696df285
|
[TRTLLM-8961][feat] ConfigurableMoE support DeepGemm (#9858)
|
2025-12-15 10:47:15 +08:00 |
|
nvxuanyuc
|
a5a37227d6
|
[None][feat] Fused kernels (qknormrope + moe routing) and two-model MTP support for glm4moe (#9852)
Signed-off-by: Xuanyu Chen <xuanyuc@nvidia.com>
|
2025-12-14 10:47:24 +08:00 |
|
Mike Iovine
|
383b13e0e5
|
[None][feat] Implement sampling on 1-model EAGLE3 (#9885)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
|
2025-12-13 07:38:22 -08:00 |
|
Balaram Buddharaju
|
6a6e41f802
|
[TRTLLM-9468][chore] Update disagg benchmarking scripts to support context parallelism (#9720)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-12 22:29:41 -08:00 |
|
bhsueh_NV
|
e49c70f6df
|
[None][feat] Support Mistral Large3 LLM part (#9820)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
|
2025-12-13 11:44:27 +08:00 |
|
Ivy Zhang
|
fded6c393d
|
[TRTLLM-9262][test] add groupgemm ada case for rcca (#9833)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2025-12-12 13:23:33 +08:00 |
|
xinhe-nv
|
e8efeb765d
|
[TRTLLM-9717][fix] fix multi nodes tests cases (#9736)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-12-12 10:14:23 +08:00 |
|
xxi
|
488d38f88d
|
[TRTLLM-8959][feat] ConfigurableMoE support CUTLASS (#9772)
|
2025-12-12 00:22:13 +08:00 |
|
dhansen-nvidia
|
2d33ae94d5
|
[https://nvbugs/5508301][feat] Move D->H copies to a worker thread whe… (#8463)
Signed-off-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>
Signed-off-by: dhansen-nvidia <218031328+dhansen-nvidia@users.noreply.github.com>
Co-authored-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>
|
2025-12-09 18:51:31 -05:00 |
|
QI JUN
|
252769c930
|
[TRTLLM-9794][ci] remove duplicated test cases in DGX B200 (#9817)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-12-08 21:51:30 -08:00 |
|
Shi Xiaowei
|
b050804b63
|
[TRTLLM-6537][infra] extend multi-gpu tests related file list (#9614)
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
|
2025-12-09 12:54:53 +08:00 |
|
Jhao-Ting Chen
|
0a09465089
|
[https://nvbugs/5567586][feat] Ampere xqa swa specdec for GPT-OSS Eagle3-one-model (#8383)
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
|
2025-12-08 11:16:05 -08:00 |
|
Fanrong Li
|
2f526583fb
|
[None][chore] Move the rocketkv e2e test to post-merge (#9768)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2025-12-08 13:22:16 +08:00 |
|
xxi
|
8e27ce7084
|
[TRTLLM-9603][feat] Enable ConfigurableMoE test in the CI (#9645)
|
2025-12-08 10:19:40 +08:00 |
|
JunyiXu-nv
|
b210f22c7e
|
[https://nvbugs/5703953][fix] Preserving ip:port for trtllm-serve before initializing llm (#9646)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2025-12-06 20:13:48 -08:00 |
|
Jin Li
|
87e0c8a749
|
[TRTLLM-7073][feat] Support torch compile for PP for Llama and DeepSeekV3 (#7838)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-12-04 13:32:11 +08:00 |
|
Guoming Zhang
|
79e872de31
|
[None][test] Update Qwen3-next accuracy testing by setting the cuda … (#9613)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-12-03 20:52:53 +08:00 |
|
xinhe-nv
|
3a748b166b
|
[None][chore] Add failed cases into waives.txt (#9593)
Signed-off-by: Jie Li <lijie@nvidia.com>
Co-authored-by: Jie Li <lijie@nvidia.com>
|
2025-12-03 16:26:06 +08:00 |
|
brb-nv
|
43f6ad7813
|
[https://nvbugs/5708475][fix] Fix e2e eval accuracy for helix parallelism (#9647)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-03 15:13:59 +08:00 |
|
heyuhhh
|
a08eb81cce
|
[None][feat] Add RocketKV usage doc and e2e accuracy test on LongBenchV2 (#9572)
Signed-off-by: yuhangh <58161490+heyuhhh@users.noreply.github.com>
|
2025-12-03 11:33:46 +08:00 |
|
Shi Xiaowei
|
227d42e492
|
[https://nvbugs/5651854][fix] Fix dist-serving perf by clearing CPU affinity (#9549)
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
|
2025-12-03 01:17:03 +08:00 |
|
Mike Iovine
|
d5b7f0c8ad
|
[TRTLLM-8980][test] Clean up spec dec tests in test_llm_api_pytorch (#8889)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
|
2025-12-02 10:32:02 -05:00 |
|
brb-nv
|
be48cdf1d1
|
[TRTLLM-9466][test] Evaluate helix parallelism with DSV3 Lite (#9597)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-02 20:10:07 +08:00 |
|
JunyiXu-nv
|
9a6df980cd
|
[https://nvbugs/5703953][fix] Use random port for disagg tests (#9582)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2025-12-02 11:40:14 +08:00 |
|
Iman Tabrizian
|
356a52edf5
|
[None][feat] Add support for KVCache reuse for DSv32 (#9383)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-12-02 11:14:30 +08:00 |
|
Shi Xiaowei
|
48b1d31895
|
[https://nvbugs/5651854][infra] Enable perf metrics during accuracy testing (#9140)
|
2025-12-01 20:15:32 +08:00 |
|
Pengbo Wang
|
aa3310f64f
|
[https://nvbugs/5503479][fix] Temporarily lower reference accuracy to stabilize CI (#9398)
Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>
|
2025-12-01 11:49:14 +08:00 |
|
JunyiXu-nv
|
3f588198dc
|
[None][fix] Fix port conflict in disagg tests (#9474)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2025-11-30 17:33:22 +08:00 |
|
dominicshanshan
|
6345074686
|
[None][chore] Weekly mass integration of release/1.1 -- rebase (#9522)
Signed-off-by: yunruis <205571022+yunruis@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
Signed-off-by: qgai <qgai@nvidia.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: Simeng Liu <simengl@nvidia.com>
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Vincent Zhang <vinczhang@nvidia.com>
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
Signed-off-by: Michal Guzek <moraxu@users.noreply.github.com>
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Co-authored-by: yunruis <205571022+yunruis@users.noreply.github.com>
Co-authored-by: sunnyqgg <159101675+sunnyqgg@users.noreply.github.com>
Co-authored-by: brb-nv <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: JunyiXu-nv <219237550+JunyiXu-nv@users.noreply.github.com>
Co-authored-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>
Co-authored-by: Guoming Zhang <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Vincent Zhang <vcheungyi@163.com>
Co-authored-by: peaceh-nv <103117813+peaceh-nv@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
Co-authored-by: Leslie Fang <leslief@nvidia.com>
Co-authored-by: Shunkangz <182541032+Shunkangz@users.noreply.github.com>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
|
2025-11-29 21:48:48 +08:00 |
|