Commit Graph

4034 Commits

Author SHA1 Message Date
TensorRT LLM
67e487bac6 [None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2025-12-12 04:05:57 +00:00
Zhanrui Sun
1e6bd1d95e [TRTLLM-9811][infra] Update urllib3 version >= 2.6.0 to fix High vulnerability issue (#9828)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-12-12 11:12:55 +08:00
TensorRT LLM
b383692f7f [None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2025-12-10 03:47:30 +00:00
Jonas Li
b05481107c [None][fix] Update branch with the correct extra_attrs fix (#9857)
Signed-off-by: Jonas Li <6110159+longlee0622@users.noreply.github.com>
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Co-authored-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-12-10 10:42:35 +08:00
Yukun He
92997d608f [None][fix] Add environment variable to override fp4 gemm backends
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Jonas Li <6110159+longlee0622@users.noreply.github.com>
2025-12-09 16:46:36 +08:00
Jiagan Cheng
8d9baa4623 [https://nvbugs/5677746][fix] Use first PP rank's schedule result in other PP ranks to fix PP hang (#9659)
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
2025-12-09 16:46:36 +08:00
yuanjingx87
84f7a7fd3c [None][infra] Correct the waived test names due to a merge conflict (#9803)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
2025-12-09 16:46:36 +08:00
Lizhi Zhou
de63a15fb6 [http://nvbugs/5649010][fix] fix test_auto_scaling.py::test_worker_restart timeout (#9775)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-09 16:46:36 +08:00
Fanrong Li
2253d64d1b [None][chore] Move the rocketkv e2e test to post-merge (#9768)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-12-09 16:46:36 +08:00
Emma Qiao
6b48293d6d [None][infra] Waive failed cases for main on 12/08 (#9773)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-09 16:46:36 +08:00
Emma Qiao
d5257575ac [None][infra] Waive failed cases for main branch on 12/07 (#9769)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-12-09 16:46:36 +08:00
Yiqing Yan
dd908ae753 [None][chore] Bump version to 1.2.0rc4.post1 (#9826)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-12-09 15:06:12 +08:00
Yan Chunwei
e4c707845f [None][fix] enable hmac in RPC (#9745)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-12-07 08:24:46 +08:00
Jonas Li
2645a78f34 [TRTLLM-9660][feat] Convert cuteDSL GEMM to opt-in feature (#9682)
Signed-off-by: Jonas Li <6110159+longlee0622@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-06 02:24:51 -08:00
mpikulski
8d2178d321 [TRTLLM-9522][chore] implement default attach_multimodal_embeddings (#9664)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-12-05 22:12:16 -08:00
Enwei Zhu
7cd5a67e25 [TRTLLM-9372][feat] Enable CuteDSL MoE with Large EP (#9592)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-12-05 22:08:52 -08:00
xxi
c2f2add6df [None][fix] fix a bug: deepseek_fp8_block_scales in TRTLLMGEN-MoE use 2D x_sf instead of 1D (#9658)
Signed-off-by: xxi <xxi@nvidia.com>
2025-12-05 21:01:39 -08:00
shuyixiong
df5b32966d [None][fix] Fix triton moe load_weight (#9649)
Signed-off-by: shuyix <219646547+shuyixiong@users.noreply.github.com>
2025-12-06 11:17:04 +08:00
TensorRT LLM
74ed9f0468 [None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2025-12-06 03:10:18 +00:00
QI JUN
d4f68195c3 [TRTLLM-9092][doc] link to modelopt checkpoints in quick start guide (#9571)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
QI JUN
0406949f32 [TRTLLM-9093][doc] update hyper links in overview (#9568)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
Yan Chunwei
b7a255d67e [TRTLLM-9075][doc] refine the slurm examples (#9548)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
Yiqing Yan
6ebdf1c304 [None][infra] Updated Linux installation guide (#9485)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
Enwei Zhu
b46e78e263 [TRTLLM-9157][doc] Guided decoding doc improvement (#9359)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
QI JUN
0915c4e3a1 [TRTLLM-9086][doc] Clean up TODOs in documentation (#9292)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
Pengyun Lin
c6dc68a28e [None][doc] VDR 1.0 trtllm-serve doc enhancement (#9443)
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
Yan Chunwei
3e442922a3 [TRTLLM-9160][doc] add doc to llm_runtime.py (#9482)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
jthomson04
6332bf27e6 [TRTLLM-9199][docs] KV Connector Docs (#9325)
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
Iman Tabrizian
9425f7fe3a [https://nvbugs/5601682][fix] Fix cacheTransceiver hang (#9311)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
Mike Iovine
31ab367576 [None][chore] Waive flakey disagg tests (#9749)
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 13:07:05 -08:00
Chenghao Zhang
d6f95a4363 [None][feat] AutoDeploy: Perf optimization for Attention and rmsnorm (#9719)
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2025-12-05 12:59:04 -08:00
yuanjingx87
c7b5e3ea8f [None][infra] Update allowed list 20251204 (#9718)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
2025-12-05 11:55:56 -08:00
jthomson04
299601aebf [https://nvbugs/5670672][fix] Fix flaky KV connector tests (#9676)
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
2025-12-05 10:04:54 -08:00
Robin Kobus
eb0b426e5d [None][refactor] Improve request processing function in sampler (#9671)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-12-05 16:41:49 +01:00
Robin Kobus
faf682b8bc [TRTLLM-7136][feat] Update load_weights method to include mapping parameter in checkpoint loaders (#9583)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-12-05 16:07:20 +01:00
yufeiwu-nv
68253d9d29 [https://nvbugs/5518713][test] Refactor core test lists by merging with llm_perf_cluster.yml (#9714)
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
2025-12-05 01:15:37 -08:00
Kaiyu Xie
e06c582648 [None] [tests] Unwaive EPLB tests (#9625)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-05 00:13:24 -08:00
TensorRT LLM
a736226abd [None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2025-12-05 03:26:00 +00:00
gramnarayan
74df9b180b [#9602][feat] AutoDeploy: Support TRTLLM Sampler (#9641)
Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>
2025-12-04 19:24:11 -08:00
Kaiyu Xie
cb87c44912 [TRTLLM-9562] [doc] Add Deployment Guide for Kimi K2 Thinking on TensorRT LLM - Blackwell (#9711)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-04 19:20:06 -08:00
Lizhi Zhou
dc766fc126 [https://nvbugs/5633340][fix] start disagg workers and servers on free ports (#9694)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-05 10:51:29 +08:00
Lizhi Zhou
0d0a16fff4 [TRTLLM-8920][feat] decouple disagg service from fastapi (#8714)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-12-05 10:44:16 +08:00
Thor Johnsen
33224560b8 [None][doc] Added line about partial reuse (#7846)
Signed-off-by: thorjohnsen <41591019+thorjohnsen@users.noreply.github.com>
2025-12-04 18:19:32 -08:00
Yiqing Yan
e834f04238 [TRTLLM-9579][infra] Set mergeWaiveList stage UNSTABLE when there is any issue (#9692)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-12-05 10:18:31 +08:00
brb-nv
5d6edc3944 [None][doc] Add feature docs for helix parallelism (#9684)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-12-04 18:08:40 -08:00
Yiqing Yan
731b2eb4ef [TRTLLM-5312][infra] Add triton trigger rules (#6440)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-12-05 07:35:04 +08:00
pdrake-nv
cee7071e27 [None][infra] Add container notices and documentation (#9185)
Signed-off-by: Parker Drake <pdrake@nvidia.com>
2025-12-04 10:08:55 -08:00
Aurelien Chartier
041bb32151 [None][fix] Fix TLLM_SPEC_DECODE_FORCE_NUM_ACCEPTED_TOKENS for MTP/EAGLE (#9608)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-12-04 08:23:57 -08:00
xinhe-nv
530af1a98e [None][chore] Add failed cases into waives.txt (#9662)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-12-04 22:33:22 +08:00
Anthony Chang
60cdca3740 [None][fix] Recover TRTLLM MoE Perf for DEP (#9562)
Signed-off-by: Anthony Chang <27950904+rosenrodt@users.noreply.github.com>
2025-12-04 22:10:25 +08:00