Xianjie Qiao
|
d145e87f6f
|
[None][chore] Update disagg benchmark configs (#8289)
Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
Signed-off-by: Xianjie Qiao <5410381+qiaoxj07@users.noreply.github.com>
|
2025-10-13 18:15:46 +08:00 |
|
Cao Dong
|
d882c92a84
|
[None][fix] Fix EventLoopShutdownError (#8260)
Signed-off-by: Dong Cao <docao@nvidia.com>
|
2025-10-13 17:31:33 +08:00 |
|
Po-Han Huang (NVIDIA)
|
6fc6f70a68
|
[https://nvbugs/5441729][test] Fix test_modeling_llama_min_latency.py failures (#7478)
Signed-off-by: Po-Han Huang <pohanh@nvidia.com>
|
2025-10-13 15:35:02 +08:00 |
|
xinhe-nv
|
9fe63dd8db
|
[None][chore] Add failed cases into waives.txt (#8290)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
|
2025-10-13 00:07:00 -07:00 |
|
Emma Qiao
|
fe17e78f27
|
[None][infra] Add back gb200 multi-node test stage to pre-merge (#8281)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-10-12 23:56:07 -07:00 |
|
Leslie Fang
|
8d1b068b1a
|
[TRTLLM-8477][chore] Replace KvCacheConfigCpp with KvCacheConfig inside PyExecutor (#8259)
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
|
2025-10-13 14:55:36 +08:00 |
|
Yilin Fan
|
1a9044949f
|
[None][fix] Fix bench_serving import error (#8296)
Signed-off-by: nv-yilinf <206948969+nv-yilinf@users.noreply.github.com>
|
2025-10-12 22:46:31 -07:00 |
|
xiweny
|
5ce9719759
|
[https://nvbugs/5503138] [fix] Remove compile warnings (#8167)
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-10-13 13:24:23 +08:00 |
|
xinhe-nv
|
72fcff1044
|
[None][fix] add timeout for llama4 (#8254)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-10-12 21:04:20 -07:00 |
|
DylanChen-NV
|
d6e315e9ff
|
[None][feat] Add torch compile support for cuda core GEMM OP (#8261)
Signed-off-by: Dylan Chen <191843203+DylanChen-NV@users.noreply.github.com>
|
2025-10-12 20:57:17 -07:00 |
|
Guoming Zhang
|
989c25fcba
|
[None][doc] Add qwen3-next doc into deployment guide and test case into L0. (#8288)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Faradawn Yang <faradawny@gmail.com>
Co-authored-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
|
2025-10-13 10:25:45 +08:00 |
|
Guoming Zhang
|
656d73087e
|
[None][doc] Fix several invalid ref links in deployment guide sections. (#8287)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-10-13 10:22:32 +08:00 |
|
amitz-nv
|
fac47e2826
|
[https://nvbugs/5510879][fix] Fix pytorch & TRT-python flows fused LoRA adapter modules weight split with TP>1 (#8063)
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
|
2025-10-12 12:29:52 -07:00 |
|
Eran Geva
|
a1ed03fe8a
|
[None][fix] AD test_trtllm_bench to use small model config and skip loading weights (#8149)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
|
2025-10-12 18:30:20 +03:00 |
|
Emma Qiao
|
fdbeea51d3
|
[None][infra] Skip failed cases for main branch (#8293)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-10-12 08:04:09 -07:00 |
|
kris1025
|
a7ea544dbe
|
[TRTLLM-7384][feat] enable rejection sampling for CDL (#7731)
Signed-off-by: linquanh <linquanh@nvidia.com>
|
2025-10-12 20:38:48 +08:00 |
|
Zhanrui Sun
|
5798a12199
|
[None][infra] Remove WAR code for GH200 node (#8266)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
|
2025-10-11 20:33:14 -07:00 |
|
brb-nv
|
56a539cd37
|
[None][chore] Waive failing pre-merge test on main (#8282)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-10-10 23:52:05 -07:00 |
|
Ziyi Xiong
|
efd4ffa03b
|
[https://nvbugs/5534705][fix] Skip unnecessary CUDA graph capture (#8050)
Signed-off-by: ziyixiong-nv <219238287+ziyixiong-nv@users.noreply.github.com>
|
2025-10-11 13:26:55 +08:00 |
|
Zhenhuan Chen
|
84d2f12818
|
[TRTLLM-6748][feat] add PDL support for more kernels (#7977)
Signed-off-by: Zhenhuan Chen <chenzhh3671@gmail.com>
|
2025-10-11 08:32:05 +08:00 |
|
Yilin Fan
|
2695d70d42
|
[None][feat] Add request timing breakdown option in benchmark_serving (#8128)
Signed-off-by: nv-yilinf <206948969+nv-yilinf@users.noreply.github.com>
|
2025-10-10 09:24:54 -07:00 |
|
Chuang Zhu
|
85f157f389
|
[None][fix] Add Lock to protect mReqeustToSession (#8085)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
Co-authored-by: Xianjie Qiao <5410381+qiaoxj07@users.noreply.github.com>
|
2025-10-10 21:51:50 +08:00 |
|
QI JUN
|
48c15d805c
|
[https://nvbugs/5558167][fix] update canceled_req_ids correctly for canceled requests (#8207)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-10-10 18:58:26 +08:00 |
|
xinhe-nv
|
2655995a09
|
[None][fix] add gc for test fixture (#8220)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-10-10 02:50:25 -07:00 |
|
bhsueh_NV
|
d3059dbd8a
|
[https://nvbugs/5547416][fix] unwaive no_cache test (#8213)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
|
2025-10-10 01:50:13 -07:00 |
|
xinhe-nv
|
b555f1ff98
|
[None][chore] Add failed cases into waives.txt (#8229)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-10-09 23:45:28 -07:00 |
|
HuiGao-NV
|
795a051765
|
[None][chore] Print log with time for starting to load safetensor weights (#8218)
Signed-off-by: Hui Gao <huig@nvidia.com>
|
2025-10-10 13:54:54 +08:00 |
|
xinhe-nv
|
e8c9bae37e
|
[None][chore] Remove closed bugs (#8151)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-10-10 16:39:40 +11:00 |
|
Jonas Li
|
76a47c7bef
|
[None][fix] Enable FP8 ContextMLA on GB300 (#8080)
Signed-off-by: Jonas Li <6110159+longlee0622@users.noreply.github.com>
|
2025-10-10 10:20:46 +08:00 |
|
Pengbo Wang
|
7da4b05289
|
[https://nvbugs/5501820][fix] Add requirements for numba-cuda version to WAR mem corruption (#7992)
Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>
|
2025-10-10 10:18:27 +08:00 |
|
mpikulski
|
7b6803b6e9
|
[TRTLLM-7769][chore] document the role of 'd2t' (#8174)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2025-10-09 13:13:50 -04:00 |
|
Emma Qiao
|
ccd949ea5b
|
[None][infra] Waive failed tests on main 10/09 (#8230)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-10-09 22:46:07 +08:00 |
|
amitz-nv
|
d560054e1b
|
[None][chore] Restore asserts in pytorch flow LoRA tests (#8227)
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
|
2025-10-09 17:10:38 +03:00 |
|
QI JUN
|
e10121345e
|
[None][ci] pin flashinfer-python version (#8217)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-10-09 02:48:49 -07:00 |
|
Guoming Zhang
|
a193867f8f
|
[None][doc] Refine deployment guide by renaming TRT-LLM to TensorRT L… (#8214)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-10-09 17:11:24 +08:00 |
|
bhsueh_NV
|
27677a36f5
|
[https://nvbugs/5516666][fix] unwaive some Qwen3 CI tests (#8130)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
|
2025-10-09 09:44:58 +08:00 |
|
Lizhi Zhou
|
fdf29ab8fa
|
[TRTLLM-7846][feat] Http disagg-cluster management implementation (#7869)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2025-10-09 09:44:01 +08:00 |
|
QI JUN
|
6884d06aed
|
[None][ci] move some llama4 test cases to pre merge (#8189)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-10-08 18:34:08 -07:00 |
|
dongfengy
|
9f2a3ae88c
|
[None][fix] Restrict tinygemm use to certain SMs (#8182)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>
|
2025-10-08 17:55:57 -07:00 |
|
Liao Lanyu
|
ed8e00ad4a
|
[https://nvbugs/5522746][fix] unwaive tests caused by node issues after rebooting (#8193)
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>
|
2025-10-09 08:45:56 +08:00 |
|
Mike Iovine
|
c88913dc03
|
[https://nvbugs/5541545][fix] Remove test_llama4 (#8031)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2025-10-08 15:20:15 -07:00 |
|
brb-nv
|
80517b7812
|
[None][chore] Waive some tests failing on main post merge (#8186)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-10-08 06:52:30 -07:00 |
|
mpikulski
|
8298e93bd8
|
[TRTLLM-8414][chore] BREAKING CHANGE: refine sampling strategy selection (#8132)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2025-10-08 15:46:50 +02:00 |
|
xxi
|
e98616512f
|
[https://nvbugs/5550283][fix] update test case to the latest MoE API (#8165)
|
2025-10-07 22:54:34 -07:00 |
|
Liao Lanyu
|
d57b8f0951
|
[https://nvbugs/5455140][fix] unwaive tests related to GB200 OOM (#8159)
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>
|
2025-10-08 13:14:12 +08:00 |
|
ruodil
|
971610e3ff
|
[None][test] add test-model-suites option in integration conftest.py (#8016)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
|
2025-10-08 10:38:31 +08:00 |
|
Sergey Klevtsov
|
017583a949
|
[https://nvbugs/5488576][fix] Propagate disable_finalize_fusion config flag in WIDEEP MoE backend (#8141)
Signed-off-by: Sergey Klevtsov <sklevtsov@nvidia.com>
|
2025-10-07 14:44:54 -07:00 |
|
Mike Iovine
|
7facac077b
|
[None][fix] Fix MTP illegal memory access (#8161)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2025-10-07 14:02:55 -04:00 |
|
Emma Qiao
|
ca9da1f1c2
|
[None][infra] Skip failed cases for main (#8176)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-10-07 06:37:51 -07:00 |
|
xiweny
|
9298f1bdcc
|
[None] [test] Add B300 cases to CI (#8056)
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-10-06 19:23:31 -07:00 |
|