Jin Li
|
4bac6b337e
|
[https://nvbugs/5537348][fix] Use device tensor index for MTP (#8062)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-10-14 05:51:45 -07:00 |
|
Yiqing Yan
|
7b5ba7ca66
|
[https://nvbugs/5565541][fix] Add timeout threshold for H100 FHMA test (#8354)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
|
2025-10-14 01:23:08 -07:00 |
|
bhsueh_NV
|
66aa88739b
|
[https://nvbugs/5574556][fix] fix bug of Qwen3_235B_A22B::test_fp8 CI (#8351)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
|
2025-10-14 15:26:15 +08:00 |
|
Ziyi Xiong
|
9ecc6db5b4
|
[https://nvbugs/5537878][fix] Reserve an extra slot for padded batch … (#8231)
Signed-off-by: ziyixiong-nv <219238287+ziyixiong-nv@users.noreply.github.com>
|
2025-10-13 23:34:22 -07:00 |
|
Lizhi Zhou
|
553ff3402a
|
[https://nvbugs/5550671][fix] fix disagg-serving multinodes test failure (#8307)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2025-10-14 08:01:00 +02:00 |
|
Chuang Zhu
|
6a73f079fe
|
[https://nvbugs/5465642][fix] Increase server timeout to wait weight loading (#8297)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-10-14 07:55:31 +02:00 |
|
Jin Li
|
3860a674d5
|
[https://nvbugs/5543770][fix] Update to Cutlass v4.2.1 (#8055)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-10-13 22:39:25 -07:00 |
|
yuanjingx87
|
e065ff21d2
|
[None][infra] cherry pick numexpr fix to release/1.1 (#8333)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
|
2025-10-13 21:20:09 -07:00 |
|
Lizhi Zhou
|
2c44e8198a
|
[https://nvbugs/5470769][chore] unwaive test for PR7338 (#8258)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2025-10-14 11:17:03 +08:00 |
|
William Zhang
|
dc052b663f
|
[https://nvbugs/5565530][fix] Unwaive test (#8273)
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
|
2025-10-13 17:59:32 +02:00 |
|
Patrice Castonguay
|
fd7a11e11d
|
[https://nvbugs/5534837][fix] Fix KV cache split on long context (#8247)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Co-authored-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-10-13 11:48:49 -04:00 |
|
Enwei Zhu
|
598e88594c
|
[https://nvbugs/5568951][fix] Fix guided decoding disagg tests (#8311)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-10-13 18:55:28 +08:00 |
|
Zhanrui Sun
|
02080e199d
|
[https://nvbugs/5563653][infra] reduce docker image layers (#8250)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
|
2025-10-13 01:38:27 -07:00 |
|
Chuang Zhu
|
ad0e91a174
|
[https://nvbugs/5546202][fix] Fix concurrent bug for NIXL cache transceiver (#8147)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-10-13 09:40:56 +02:00 |
|
xiweny
|
6545d541bb
|
[https://nvbugs/5532789][doc] Add documents about CUDA 12.9 (#8192)
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-10-13 00:35:36 -07:00 |
|
Yechan Kim
|
745cf55ff3
|
[https://nvbugs/5550722][fix] Fix image load (#8093)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
|
2025-10-13 14:12:39 +08:00 |
|
Yechan Kim
|
3d3d49434a
|
[https://nvbugs/5547434][fix] Fix Qwen2.5-VL device_path error (#8057)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
|
2025-10-13 14:12:27 +08:00 |
|
Ivy Zhang
|
6a42a9649b
|
[None][chore] Update test configs for release (#8224)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2025-10-13 14:07:33 +08:00 |
|
Liao Lanyu
|
8f2e48a981
|
[https://nvbugs/5522746][fix] unwaive tests caused by node issues after rebooting (#8268)
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>
|
2025-10-13 13:31:52 +08:00 |
|
Ivy Zhang
|
bcf9cb1f58
|
[TRTLLM-8246][test] add multimodal kvcache+chunked_prefill cases into QA test list (#8212)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2025-10-13 11:38:38 +08:00 |
|
Ivy Zhang
|
bca5e29387
|
[None][chore] Update constraint for release (#8211)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2025-10-13 11:14:24 +08:00 |
|
brb-nv
|
04bded7c40
|
[None][chore] Waive test failing on pre-merge CI (#8295)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-10-12 16:54:56 -07:00 |
|
Emma Qiao
|
d857cd47a0
|
[None][infra] Update and waive failed tests for release branch (#8291)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-10-12 21:51:54 +08:00 |
|
Zhanrui Sun
|
4c36bba2ec
|
[None][infra] Remove WAR code for GH200 node (#8267)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
|
2025-10-11 20:40:16 -07:00 |
|
Yan Chunwei
|
4ebc443fa9
|
[https://nvbugs/5565590][fix] test_request_perf_metrics_draft (#8257)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
|
2025-10-12 10:01:20 +08:00 |
|
Yan Chunwei
|
7771669651
|
[https://nvbugs/5532023][fix] unwaive GenerationExecutor tests (#8251)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
|
2025-10-11 10:43:04 +08:00 |
|
Patrice Castonguay
|
2e787d73ea
|
[https://nvbugs/5538098][fix] Checking connection to etcd server in unit test (#8269)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
|
2025-10-10 14:31:36 -07:00 |
|
Zhanrui Sun
|
f72058264f
|
[None][fix] cherry-pick !8217 pin flashinfer-python version (#8217) (#8252)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
|
2025-10-09 23:48:21 -07:00 |
|
xxi
|
ea640a186b
|
[https://nvbugs/5550283][fix] update test case to call post quantization explicitly due to code refactor (#8188)
Signed-off-by: xxi <xxi@nvidia.com>
|
2025-10-09 09:41:47 +08:00 |
|
brb-nv
|
a9a0969de7
|
[None][chore] Waive tests failing on release/1.1 post merge (#8185)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-10-08 09:59:50 -07:00 |
|
Yukun He
|
1ca84e1a25
|
[https://nvbugs/5536131][fix] Fix illegal access issue when scale is not provided in Llama3/4. (#7960)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
|
2025-10-07 23:47:00 -07:00 |
|
xxi
|
647080e3d5
|
[https://nvbugs/5550283][fix] update to the latest MoE API (#8169)
Signed-off-by: xxi <xxi@nvidia.com>
|
2025-10-07 21:12:20 +08:00 |
|
xiweny
|
72144a40d2
|
[https://nvbugs/5541494][fix] Fix missing sm100f/103a kernels and add tests (#8098)
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-10-07 08:27:55 +08:00 |
|
Jin Li
|
b4e6a1648b
|
[https://nvbugs/5451280][fix] Reduce memory fraction problem by warmu… (#7999)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-10-03 18:14:13 -07:00 |
|
Jin Li
|
ef8e2173d4
|
[None][ci] Waive failing tests on release/1.1 (#8088)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-09-30 04:10:22 -04:00 |
|
Zheyu Fu
|
e87c89c03f
|
[https://nvbugs/5548098][fix] Fix flakey unit test for dynamic spec decode (#8078)
Signed-off-by: Zheyu Fu <zheyuf@NVIDIA.com>
|
2025-09-30 15:36:32 +08:00 |
|
Enwei Zhu
|
a64d9b69e5
|
[None][fix] Fix chunked prefill state of draft request (#8067)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-09-30 09:51:21 +08:00 |
|
Guoming Zhang
|
0c47925600
|
[None][doc] Refine perf overview.md and correct the error link in per… (#8036)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-09-28 16:14:31 +08:00 |
|
Yiqing Yan
|
4d5465a575
|
[None][chore] Bump version to 1.1.0 (#7942)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
|
2025-09-26 13:17:36 +08:00 |
|
sunnyqgg
|
2e5850c28a
|
[TRTLLM-7330][feat] Eagle3 cuda graph support for the first draft model inference (#7363)
Signed-off-by: qgai <qgai@nvidia.com>
|
2025-09-26 11:28:05 +08:00 |
|
Chuang Zhu
|
f98fa0cf8b
|
[None][feat] Optimize kv cache transfer TEP (#7613)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-09-25 20:20:04 -07:00 |
|
QI JUN
|
4c0f8482f1
|
[None][ci] Waive test_mm_encoder_standalone.py::test_multi_request_batch_chat[llava-v1.6-mistral-7b-hf] (#8010)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-09-26 11:07:54 +08:00 |
|
Yuan Tong
|
fae83c387b
|
[#6102][fix] support non-system python installation (#7763)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
|
2025-09-26 10:16:15 +08:00 |
|
Enwei Zhu
|
d650320de4
|
[None][infra] Improve the failure message for accuracy test suite (#7994)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-09-26 10:04:47 +08:00 |
|
Yiqing Yan
|
108248ece1
|
[TRTLLM-7999][infra] Add B300/GB300 single gpu test (#7951)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
|
2025-09-26 09:59:11 +08:00 |
|
Yanchao Lu
|
7e2521a7f0
|
[None][chore] Some clean-ups for CUDA 13.0 dependencies (#7979)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
|
2025-09-26 08:46:11 +08:00 |
|
dongfengy
|
1eb653146a
|
[https://nvbugs/5525951][fix] Clarify that PP is not supported for GPTOSS (#7911)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
|
2025-09-25 12:54:18 -07:00 |
|
QI JUN
|
1529a6f22d
|
[None][chore] extract weights loading related logic to model loader (#7579)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-09-25 10:19:22 -07:00 |
|
Emma Qiao
|
2dc93c6371
|
[None][infra] Waive failed tests on main (#8001)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-09-25 08:13:39 -07:00 |
|
WeiHaocheng
|
4b0570a0d6
|
[None][doc] Add acknowledgements in scaffolding tech blog (#7983)
Signed-off-by: Fred Wei <20514172+WeiHaocheng@users.noreply.github.com>
|
2025-09-25 08:07:13 -07:00 |
|