ChristinaZ
|
13cfd70f57
|
[None][feat] Add unit tests and revision in block_level kernel for invalid input (#8718)
Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com>
|
2025-10-30 16:42:18 +08:00 |
|
WeiHaocheng
|
cc286687c4
|
[None][feat] Refactor scaffolding streaming feature and fix openai wo… (#8622)
Signed-off-by: Fred Wei <20514172+WeiHaocheng@users.noreply.github.com>
|
2025-10-30 16:02:40 +08:00 |
|
xinhe-nv
|
a4f75399b9
|
[https://nvbugs/5481206][fix] update waives (#8774)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-10-30 00:43:38 -07:00 |
|
Leslie Fang
|
2072185d76
|
[https://nvbugs/5608461][fix] exclude InductorSubproc from thread leak check (#8704)
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
|
2025-10-30 15:35:15 +08:00 |
|
Void
|
6b755fd9f8
|
[None][fix] fix runtime error that bf16 input is not quantized to nvfp4 when use bf16 dispatch (#8507)
Signed-off-by: Yilin Zhang <18275976+yilin-void@users.noreply.github.com>
|
2025-10-30 15:06:54 +08:00 |
|
yuanjingx87
|
e689a73c83
|
[None][infra] fix slurm results path (#8751)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
|
2025-10-30 13:09:46 +08:00 |
|
Emma Qiao
|
7d3cebf34e
|
[None][infra] Unwaive the tests passed in latest CI and disable a perf stage (#8775)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-10-30 12:48:23 +08:00 |
|
Yi Zhang
|
496b419791
|
[None][doc] Add doc for torch.compile & piecewise cuda graph (#8527)
Signed-off-by: yizhang-nv <187001205+yizhang-nv@users.noreply.github.com>
|
2025-10-29 21:15:46 -07:00 |
|
Emma Qiao
|
db99a936b0
|
[TRTLLM-8971][infra] Update gpu key for B300/GB300 (#8724)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-10-29 20:36:44 -07:00 |
|
Yuxian Qiu
|
3176bd3815
|
[None][fix] Fix UnboundLocalError. (#8756)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
|
2025-10-29 19:41:37 -07:00 |
|
HuiGao-NV
|
ae57738bae
|
[https://nvbugs/5547414][fix] Use cached models (#8755)
Signed-off-by: Hui Gao <huig@nvidia.com>
|
2025-10-29 19:10:10 -07:00 |
|
Sharan Chetlur
|
a2e964d9a8
|
[None][doc] Minor doc update to disagg-serving (#8768)
Signed-off-by: Sharan Chetlur <116769508+schetlur-nv@users.noreply.github.com>
|
2025-10-29 17:38:06 -07:00 |
|
Simeng Liu
|
834a780655
|
[https://nvbugs/5599086][fix] Fix FP8 Linear module for spark (#8707)
Signed-off-by: Simeng Liu <simengl@nvidia.com>
|
2025-10-29 13:58:19 -07:00 |
|
yuanjingx87
|
45b36cc069
|
[None][infra] Check in most recent lock file from nightly pipeline (#8739)
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
Co-authored-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
|
2025-10-29 12:30:36 -07:00 |
|
Iman Tabrizian
|
ae6875fe10
|
[TRTLLM-8976][feat] Move indexer-k-cache to KVCacheManager (#8699)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-10-29 08:04:26 -07:00 |
|
Emma Qiao
|
579e1067bf
|
[None][infra] Waive failed tests on main (#8759)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-10-29 21:32:23 +08:00 |
|
Leslie Fang
|
451959c60d
|
[TRTLLM-8763][chore] Deprecate pybind based GuidedDecodingConfig usage in torch backend (#8717)
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
|
2025-10-29 20:37:14 +08:00 |
|
Yan Chunwei
|
fc3b6f5331
|
[None][ci] waive test_rpc.py (#8745)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
|
2025-10-29 05:17:40 -07:00 |
|
Fanrong Li
|
a21697ead9
|
[None][fix] fix config loading for DeepSeek-V3.2 in trtllm-bench (#8729)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2025-10-29 05:17:16 -07:00 |
|
kris1025
|
e2c5a38879
|
[https://nvbugs/5534574][fix] disable spec decoding forever once the request spec decoding is disabled (#8446)
Signed-off-by: linquanh <linquanh@nvidia.com>
|
2025-10-29 19:28:43 +08:00 |
|
Chang Liu
|
81eb861df0
|
[None][chore] Enable GPQA in CI for DeepSeek V3.2 (#8712)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
|
2025-10-29 04:22:22 -07:00 |
|
Yi Zhang
|
a69bd2a6fa
|
[https://nvbugs/5550409][fix] Disable torch compile in piecewise attention part to Avoid host overhead (#8708)
Signed-off-by: yizhang-nv <187001205+yizhang-nv@users.noreply.github.com>
|
2025-10-29 18:12:58 +08:00 |
|
Zheng Duan
|
d626d13d37
|
[https://nvbugs/5607238][test] fix working dir in disagg worker test (#8648)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-10-29 16:13:52 +08:00 |
|
Pengyun Lin
|
2aade46d18
|
[TRTLLM-8214][feat] Support Qwen3 tool parser (#8216)
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
|
2025-10-29 15:48:29 +08:00 |
|
Yiteng Niu
|
741183917c
|
[None][infra] update ci allow list 2025/10/29 (#8749)
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
|
2025-10-29 15:34:44 +08:00 |
|
Faraz
|
585733f113
|
[None][fix] add readme copy to wheel stage to avoid setup.py failure (#8736)
Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>
|
2025-10-29 14:43:03 +08:00 |
|
dongxuy04
|
00eaf5f883
|
[None][feat] add flag for EPLB to force using GDRCopy (#8650)
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
|
2025-10-29 13:33:26 +08:00 |
|
Stefan Niebler
|
19ca7b15c7
|
[https://nvbugs/5593199][test] Enhance beam search tests deterministic dummy model (#8625)
Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>
|
2025-10-29 06:12:22 +01:00 |
|
Chang Liu
|
5f737b8dbe
|
[None][perf] Use fp8 quant kernel in DS3.2 indexer module (#8701)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
|
2025-10-29 12:45:09 +08:00 |
|
Cheng Hang
|
15c293a90b
|
[None][feat] Enable nvfp4 cuda core for sm120 (#8620)
Signed-off-by: Cheng Hang <chang@nvidia.com>
|
2025-10-29 12:39:03 +08:00 |
|
Yechan Kim
|
bc26f4ce7c
|
[https://nvbugs/5549829][fix] Qwen2.5-VL TP > 1 + Quantized weight load fix (#8680)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
|
2025-10-29 13:38:42 +09:00 |
|
xinhe-nv
|
7ba98a6b20
|
[None][chore] Add failed cases into waives.txt (#8684)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
|
2025-10-28 20:30:01 -07:00 |
|
Yan Chunwei
|
f2faf2809f
|
[None][ci] waive test_rpc.py temporarily (#8743)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
|
2025-10-28 19:20:27 -07:00 |
|
Zheng Duan
|
fea5bfbda7
|
[None][feat] add detailed KV cache transfer time breakdown (#8521)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-10-29 10:11:09 +08:00 |
|
ruodil
|
f444fe2deb
|
[None][test] fix a typo in perf test sampler config (#8726)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
|
2025-10-29 09:53:53 +08:00 |
|
Chuang Zhu
|
b828b6445b
|
[https://nvbugs/5612529][fix] Fix transferAgent_test (#8710)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-10-29 09:14:34 +08:00 |
|
Yechan Kim
|
cf8a1d2ef9
|
[https://nvbugs/5596377][fix] Fix mm dummy calculation (#8498)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
|
2025-10-29 09:45:21 +09:00 |
|
Lizhi Zhou
|
24167d00eb
|
[TRTLLM-8431][doc] update public doc and example, add etcd auto-scaling tests (#8602)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2025-10-28 17:04:53 -07:00 |
|
Kaiyu Xie
|
227c288441
|
[TRTLLM-8827] [feat] Enable low precision alltoall for Cutlass and TRTLLMGen backends (#8675)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
|
2025-10-29 07:56:48 +08:00 |
|
Mike Iovine
|
00161b315f
|
[https://nvbugs/5549111][fix] Fix 2-model overlap scheduler accuracy on very long prompts (#8076)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Michael Iovine <miovine@nvidia.com>
|
2025-10-28 14:55:34 -07:00 |
|
dongfengy
|
083f3637f1
|
[https://nvbugs/5596343][test] Update test waive to get back some coverage (#8702)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>
|
2025-10-28 14:05:48 -07:00 |
|
Lucas Liebenwein
|
0ee71d95ec
|
[https://nvbugs/5606166][fix] AutoDeploy: use tuples for cudagraph shape lookup (#8658)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2025-10-28 10:52:43 -07:00 |
|
Anish Shanbhag
|
a09b38a862
|
[TRTLLM-8684][chore] Migrate BuildConfig to Pydantic, add a Python wrapper for KVCacheType enum (#8330)
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
|
2025-10-28 09:17:26 -07:00 |
|
William Zhang
|
cdc9e5e645
|
[None][fix] Properly raise error for nemotron H models (#8697)
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
|
2025-10-28 08:59:42 -07:00 |
|
dongfengy
|
5a01f382c1
|
[https://nvbugs/5575913][fix] Use separate thresholds for 120b/20b gptoss (#8664)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>
|
2025-10-28 10:35:07 -04:00 |
|
Robin Kobus
|
e8e2b0697a
|
[None][chore] Revert "[TRTLLM-7835][test] add default sample config for perf test (#8523)" (#8725)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
|
2025-10-28 14:23:38 +01:00 |
|
Eran Geva
|
e051a05e6c
|
[#8694][fix] fix AutoDeploy cuda memory access failure in nvidia/NVIDIA-Nemotron-Nano-31B-A3-v3 (#8696)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
|
2025-10-28 13:21:43 +02:00 |
|
dongxuy04
|
b37a8a9a74
|
[None][fix] fix EPLB init hang (#8649)
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
|
2025-10-28 05:22:49 -04:00 |
|
ruodil
|
6b9b73ee27
|
[https://nvbugs/5564465][test] ensure deepseek_v3_lite isl + osl < max_seq_len (#8565)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
|
2025-10-28 15:25:52 +08:00 |
|
ruodil
|
bf72eb045e
|
[TRTLLM-7835][test] add default sample config for perf test (#8523)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
|
2025-10-28 02:22:47 -04:00 |
|