Zhenhuan Chen
|
603ec03fb1
|
[https://nvbugs/5575687][fix] fix moe_gemm's preexit position that cause illegal memory access (#8786)
Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
|
2025-10-31 09:08:23 +08:00 |
|
yuanjingx87
|
fe670af65f
|
[None][infra] Update allow list 20251030 (#8808)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
|
2025-10-30 16:41:52 -07:00 |
|
Mike Iovine
|
b87448b009
|
[TRTLLM-8978][test] Remove llama 4 spec dec tests (#8766)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2025-10-30 15:47:04 -04:00 |
|
Chenghao Zhang
|
71c5576a44
|
[TRTLLM-8734][feat] AutoDeploy: Enable the nvfp4 for Nemotron MOE (#8737)
Signed-off-by: nvchenghaoz <211069071+nvchenghaoz@users.noreply.github.com>
Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
|
2025-10-30 12:33:08 -07:00 |
|
Tailing Yuan
|
ec31363a86
|
[None][fix] Layer wise benchmarks: use local models, lint (#8799)
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
|
2025-10-30 09:47:46 -07:00 |
|
Emma Qiao
|
9112cffaf3
|
[None][infra] Waive failed case for main branch (#8797)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-10-30 07:57:35 -07:00 |
|
Zhanrui Sun
|
547d799111
|
[TRTLLM-8930][infra] Force Blossom perf test stages to use 'tensorrt/test_type: perf' in the K8S template (#8752)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
|
2025-10-30 06:30:10 -07:00 |
|
Tailing Yuan
|
f9c7786dc8
|
[None][feat] Add layer wise benchmarks (#8777)
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
|
2025-10-30 20:29:34 +08:00 |
|
Anthony Chang
|
f666ad2f6b
|
[None][feat] Autotuner can iterate through all tactics for test purposes (#8663)
Signed-off-by: Anthony Chang <27950904+rosenrodt@users.noreply.github.com>
|
2025-10-30 13:11:25 +01:00 |
|
Emma Qiao
|
a5cc9fe0aa
|
[TRTLLM-5453][infra] Check all steps for test name and also check the test in waives.txt also exists in l0 or qa test list. (#6256)
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
|
2025-10-30 01:56:04 -07:00 |
|
ChristinaZ
|
13cfd70f57
|
[None][feat] Add unit tests and revision in block_level kernel for invalid input (#8718)
Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com>
|
2025-10-30 16:42:18 +08:00 |
|
WeiHaocheng
|
cc286687c4
|
[None][feat] Refactor scaffolding streaming feature and fix openai wo… (#8622)
Signed-off-by: Fred Wei <20514172+WeiHaocheng@users.noreply.github.com>
|
2025-10-30 16:02:40 +08:00 |
|
xinhe-nv
|
a4f75399b9
|
[https://nvbugs/5481206][fix] update waives (#8774)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-10-30 00:43:38 -07:00 |
|
Leslie Fang
|
2072185d76
|
[https://nvbugs/5608461][fix] exclude InductorSubproc from thread leak check (#8704)
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
|
2025-10-30 15:35:15 +08:00 |
|
Void
|
6b755fd9f8
|
[None][fix] fix runtime error that bf16 input is not quantized to nvfp4 when use bf16 dispatch (#8507)
Signed-off-by: Yilin Zhang <18275976+yilin-void@users.noreply.github.com>
|
2025-10-30 15:06:54 +08:00 |
|
yuanjingx87
|
e689a73c83
|
[None][infra] fix slurm results path (#8751)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
|
2025-10-30 13:09:46 +08:00 |
|
Emma Qiao
|
7d3cebf34e
|
[None][infra] Unwaive the tests passed in latest CI and disable a perf stage (#8775)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-10-30 12:48:23 +08:00 |
|
Yi Zhang
|
496b419791
|
[None][doc] Add doc for torch.compile & piecewise cuda graph (#8527)
Signed-off-by: yizhang-nv <187001205+yizhang-nv@users.noreply.github.com>
|
2025-10-29 21:15:46 -07:00 |
|
Emma Qiao
|
db99a936b0
|
[TRTLLM-8971][infra] Update gpu key for B300/GB300 (#8724)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-10-29 20:36:44 -07:00 |
|
Yuxian Qiu
|
3176bd3815
|
[None][fix] Fix UnboundLocalError. (#8756)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
|
2025-10-29 19:41:37 -07:00 |
|
HuiGao-NV
|
ae57738bae
|
[https://nvbugs/5547414][fix] Use cached models (#8755)
Signed-off-by: Hui Gao <huig@nvidia.com>
|
2025-10-29 19:10:10 -07:00 |
|
Sharan Chetlur
|
a2e964d9a8
|
[None][doc] Minor doc update to disagg-serving (#8768)
Signed-off-by: Sharan Chetlur <116769508+schetlur-nv@users.noreply.github.com>
|
2025-10-29 17:38:06 -07:00 |
|
Simeng Liu
|
834a780655
|
[https://nvbugs/5599086][fix] Fix FP8 Linear module for spark (#8707)
Signed-off-by: Simeng Liu <simengl@nvidia.com>
|
2025-10-29 13:58:19 -07:00 |
|
yuanjingx87
|
45b36cc069
|
[None][infra] Check in most recent lock file from nightly pipeline (#8739)
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
Co-authored-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
|
2025-10-29 12:30:36 -07:00 |
|
Iman Tabrizian
|
ae6875fe10
|
[TRTLLM-8976][feat] Move indexer-k-cache to KVCacheManager (#8699)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-10-29 08:04:26 -07:00 |
|
Emma Qiao
|
579e1067bf
|
[None][infra] Waive failed tests on main (#8759)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-10-29 21:32:23 +08:00 |
|
Leslie Fang
|
451959c60d
|
[TRTLLM-8763][chore] Deprecate pybind based GuidedDecodingConfig usage in torch backend (#8717)
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
|
2025-10-29 20:37:14 +08:00 |
|
Yan Chunwei
|
fc3b6f5331
|
[None][ci] waive test_rpc.py (#8745)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
|
2025-10-29 05:17:40 -07:00 |
|
Fanrong Li
|
a21697ead9
|
[None][fix] fix config loading for DeepSeek-V3.2 in trtllm-bench (#8729)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2025-10-29 05:17:16 -07:00 |
|
kris1025
|
e2c5a38879
|
[https://nvbugs/5534574][fix] disable spec decoding forever once the request spec decoding is disabled (#8446)
Signed-off-by: linquanh <linquanh@nvidia.com>
|
2025-10-29 19:28:43 +08:00 |
|
Chang Liu
|
81eb861df0
|
[None][chore] Enable GPQA in CI for DeepSeek V3.2 (#8712)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
|
2025-10-29 04:22:22 -07:00 |
|
Yi Zhang
|
a69bd2a6fa
|
[https://nvbugs/5550409][fix] Disable torch compile in piecewise attention part to Avoid host overhead (#8708)
Signed-off-by: yizhang-nv <187001205+yizhang-nv@users.noreply.github.com>
|
2025-10-29 18:12:58 +08:00 |
|
Zheng Duan
|
d626d13d37
|
[https://nvbugs/5607238][test] fix working dir in disagg worker test (#8648)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-10-29 16:13:52 +08:00 |
|
Pengyun Lin
|
2aade46d18
|
[TRTLLM-8214][feat] Support Qwen3 tool parser (#8216)
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
|
2025-10-29 15:48:29 +08:00 |
|
Yiteng Niu
|
741183917c
|
[None][infra] update ci allow list 2025/10/29 (#8749)
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
|
2025-10-29 15:34:44 +08:00 |
|
Faraz
|
585733f113
|
[None][fix] add readme copy to wheel stage to avoid setup.py failure (#8736)
Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>
|
2025-10-29 14:43:03 +08:00 |
|
dongxuy04
|
00eaf5f883
|
[None][feat] add flag for EPLB to force using GDRCopy (#8650)
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
|
2025-10-29 13:33:26 +08:00 |
|
Stefan Niebler
|
19ca7b15c7
|
[https://nvbugs/5593199][test] Enhance beam search tests deterministic dummy model (#8625)
Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>
|
2025-10-29 06:12:22 +01:00 |
|
Chang Liu
|
5f737b8dbe
|
[None][perf] Use fp8 quant kernel in DS3.2 indexer module (#8701)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
|
2025-10-29 12:45:09 +08:00 |
|
Cheng Hang
|
15c293a90b
|
[None][feat] Enable nvfp4 cuda core for sm120 (#8620)
Signed-off-by: Cheng Hang <chang@nvidia.com>
|
2025-10-29 12:39:03 +08:00 |
|
Yechan Kim
|
bc26f4ce7c
|
[https://nvbugs/5549829][fix] Qwen2.5-VL TP > 1 + Quantized weight load fix (#8680)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
|
2025-10-29 13:38:42 +09:00 |
|
xinhe-nv
|
7ba98a6b20
|
[None][chore] Add failed cases into waives.txt (#8684)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
|
2025-10-28 20:30:01 -07:00 |
|
Yan Chunwei
|
f2faf2809f
|
[None][ci] waive test_rpc.py temporarily (#8743)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
|
2025-10-28 19:20:27 -07:00 |
|
Zheng Duan
|
fea5bfbda7
|
[None][feat] add detailed KV cache transfer time breakdown (#8521)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-10-29 10:11:09 +08:00 |
|
ruodil
|
f444fe2deb
|
[None][test] fix a typo in perf test sampler config (#8726)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
|
2025-10-29 09:53:53 +08:00 |
|
Chuang Zhu
|
b828b6445b
|
[https://nvbugs/5612529][fix] Fix transferAgent_test (#8710)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-10-29 09:14:34 +08:00 |
|
Yechan Kim
|
cf8a1d2ef9
|
[https://nvbugs/5596377][fix] Fix mm dummy calculation (#8498)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
|
2025-10-29 09:45:21 +09:00 |
|
Lizhi Zhou
|
24167d00eb
|
[TRTLLM-8431][doc] update public doc and example, add etcd auto-scaling tests (#8602)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2025-10-28 17:04:53 -07:00 |
|
Kaiyu Xie
|
227c288441
|
[TRTLLM-8827] [feat] Enable low precision alltoall for Cutlass and TRTLLMGen backends (#8675)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
|
2025-10-29 07:56:48 +08:00 |
|
Mike Iovine
|
00161b315f
|
[https://nvbugs/5549111][fix] Fix 2-model overlap scheduler accuracy on very long prompts (#8076)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Michael Iovine <miovine@nvidia.com>
|
2025-10-28 14:55:34 -07:00 |
|