xinhe-nv
|
3c98b25005
|
[None][chore] Add failed cases into waives.txt (#9941)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-12-14 23:14:24 -08:00 |
|
Kaiyu Xie
|
504ede707e
|
[None] [fix] Fix nsys_on argument for slurm scripts (#9995)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-12-14 22:41:30 -08:00 |
|
Void
|
dda7658306
|
[https://nvbugs/5655885][fix] fix invalid instruction error in 2shot ar kernel on Ampere (#9394)
Signed-off-by: Yilin Zhang <18275976+yilin-void@users.noreply.github.com>
|
2025-12-15 14:22:56 +08:00 |
|
Yuxian Qiu
|
7588029763
|
[None][feat] Async pp send for PPCommTorch. (#9976)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
|
2025-12-15 14:03:46 +08:00 |
|
JunyiXu-nv
|
af899d2fe7
|
[TRTLLM-9860][doc] Add docs and examples for Responses API (#9946)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2025-12-14 21:46:13 -08:00 |
|
Ziyi Xiong
|
f2aee0db03
|
[TRTLLM-9854][feat] Optimize the host overhead of _sample_async (#9935)
Signed-off-by: ziyixiong-nv <219238287+ziyixiong-nv@users.noreply.github.com>
|
2025-12-15 13:28:54 +08:00 |
|
shuyixiong
|
25db9e7b3e
|
[https://nvbugs/5741060][chore] Waive all pg operator tests (#9991)
Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>
|
2025-12-14 21:24:43 -08:00 |
|
Balaram Buddharaju
|
dfc8799352
|
[https://nvbugs/5669114][fix] Switch to MMMU benchmark for Gemma3 27B (#9966)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-14 21:23:59 -08:00 |
|
Fanrong Li
|
8f144d9282
|
[TRTLLM-9416][feat] Skip DS-v3.2 indexer MQA and Top-K for short sequences. (#9524)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2025-12-15 12:42:25 +08:00 |
|
Kaiyu Xie
|
0788635d6c
|
[TRTLLM-9762] [doc] Update documents for GB300 NVL72 (#9987)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-12-14 19:30:28 -08:00 |
|
QI JUN
|
b57650f1e6
|
[TRTLLM-9794][ci] move test cases of gpt-oss to gb200 (#9934)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-12-14 19:21:54 -08:00 |
|
xxi
|
f5696df285
|
[TRTLLM-8961][feat] ConfigurableMoE support DeepGemm (#9858)
|
2025-12-15 10:47:15 +08:00 |
|
Yan Chunwei
|
355e06d66d
|
[None][doc] update readme for rpc (#9972)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
|
2025-12-15 10:16:50 +08:00 |
|
dominicshanshan
|
4bf42f8fa8
|
[https://nvbugs/5580297][fix] Skip capture request error test from Ray stage (#9947)
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-12-15 10:03:16 +08:00 |
|
Anthony Chang
|
3be5f3abcf
|
[None][fix] Fix regex pattern for cubin filtering (#9914)
Signed-off-by: Anthony Chang <27950904+rosenrodt@users.noreply.github.com>
|
2025-12-15 10:02:48 +08:00 |
|
Zongfei Jing
|
bf923a1074
|
[None] [chore] Comments cleanup (#9978)
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
|
2025-12-15 09:46:37 +08:00 |
|
Simeng Liu
|
f21e2b3329
|
[TRTLLM-9601][feat] Expose mmKeys for multimodal to integrate with dynamo. (#9604)
Signed-off-by: SimengLiu-nv <simengl@nvidia.com>
|
2025-12-15 08:42:30 +08:00 |
|
Balaram Buddharaju
|
9a1750c8f9
|
[TRTLLM-9493][noop] Refactor fusedMoeCommKernels to enable code sharing (#9922)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-14 11:29:30 -08:00 |
|
Emma Qiao
|
e0a4b72279
|
[None][infra] Waive failed tests for main branch on 12/14 (#9982)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-12-14 22:48:34 +08:00 |
|
Matt Lefebvre
|
1375910f1b
|
[None][infra] Delete container before attempting import (#9967)
Signed-off-by: Matt Lefebvre <mlefebvre@nvidia.com>
|
2025-12-14 00:09:33 -08:00 |
|
Mike Iovine
|
96d654029d
|
[https://nvbugs/5666816][fix] Unwaive llama3 eagle3 test (#9964)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2025-12-14 15:07:35 +08:00 |
|
Yuxian Qiu
|
fcda1a1442
|
[None][fix] disable async pp send for ray cases. (#9959)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
|
2025-12-13 20:22:36 -08:00 |
|
TensorRT LLM
|
f6b0ddd61d
|
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
|
2025-12-14 03:29:59 +00:00 |
|
nvxuanyuc
|
a5a37227d6
|
[None][feat] Fused kernels (qknormrope + moe routing) and two-model MTP support for glm4moe (#9852)
Signed-off-by: Xuanyu Chen <xuanyuc@nvidia.com>
|
2025-12-14 10:47:24 +08:00 |
|
Faraz
|
64d7796234
|
[None][chore] Add namespace to header to fix tot failure (#9973)
|
2025-12-13 12:18:10 -05:00 |
|
Mike Iovine
|
383b13e0e5
|
[None][feat] Implement sampling on 1-model EAGLE3 (#9885)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
|
2025-12-13 07:38:22 -08:00 |
|
jellysnack
|
079ef8ae77
|
[None][feat] Graceful Error Handling for Guided Decoder (#9078)
Signed-off-by: jellysnack <oleg.jellysnack@gmail.com>
Signed-off-by: jellysnack <158609015+jellysnack@users.noreply.github.com>
Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-12-13 19:57:59 +08:00 |
|
Yan Chunwei
|
85406f9dda
|
[https://nvbugs/5720482][fix] Fix test rpc streaming (#9902)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
|
2025-12-13 01:14:43 -08:00 |
|
shuyixiong
|
8cbf2d958c
|
[TRTLLM-9738][chore] Guard accuracy with nccl allreduce strategy (#9793)
Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>
|
2025-12-13 01:02:11 -08:00 |
|
Balaram Buddharaju
|
6a6e41f802
|
[TRTLLM-9468][chore] Update disagg benchmarking scripts to support context parallelism (#9720)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-12 22:29:41 -08:00 |
|
shuyixiong
|
7fc720a397
|
[TRTLLM-9784][fix] Resolve port conflicts (#9780)
Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>
|
2025-12-12 22:10:01 -08:00 |
|
bhsueh_NV
|
e49c70f6df
|
[None][feat] Support Mistral Large3 LLM part (#9820)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
|
2025-12-13 11:44:27 +08:00 |
|
Faraz
|
98d72c7648
|
[None][feat] spark cublas LUT table for llama-8b-bf16 perf (#9811)
Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>
|
2025-12-12 22:37:56 -05:00 |
|
TensorRT LLM
|
e4e09867d1
|
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
|
2025-12-13 03:26:42 +00:00 |
|
Balaram Buddharaju
|
461446045e
|
[TRTLLM-9493][feat] Add helixPostProcessNative kernel for cp_dim=2 (#9924)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-12 16:49:25 -08:00 |
|
tburt-nv
|
6147452158
|
[https://nvbugs/4141427][chore] Add more details to LICENSE file (#9881)
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
|
2025-12-13 08:35:31 +08:00 |
|
yuanjingx87
|
246a877571
|
[None][infra] Remove generate lockfile schedule for 1.2.0rc4.post1 branch (#9945)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
|
2025-12-12 09:10:32 -08:00 |
|
Yuxian Qiu
|
cd4e639536
|
[None][feat] Async pp send. (#9952)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
|
2025-12-13 00:52:30 +08:00 |
|
Chuang Zhu
|
4cc4cbe926
|
[https://nvbugs/5716787][fix] terminate nixl running when exiting (#9785)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
Co-authored-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
|
2025-12-12 11:15:02 -05:00 |
|
Chuang Zhu
|
9c59c9f920
|
[https://nvbugs/5643787][fix] remove the war path for notify to itself (#9834)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-12-12 11:10:05 -05:00 |
|
JunyiXu-nv
|
2fec53dfa5
|
[TRTLLM-9637][feat] Support tool parser for Kimi K2 (#9830)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2025-12-12 23:32:39 +08:00 |
|
Yihan Wang
|
9df4dad3b6
|
[None][fix] Introduce inline namespace to avoid symbol collision (#9541)
Signed-off-by: Yihan Wang <yihwang@nvidia.com>
|
2025-12-12 23:32:15 +08:00 |
|
Balaram Buddharaju
|
af315d8ef1
|
[TRTLLM-5972][chore] Load balance decode token KV cache with helix parallelism (#9757)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-12-12 22:29:05 +08:00 |
|
zackyoray
|
d5b9ad91c9
|
[None][feat] Upgrade NIXL to v0.8.0 (#9707)
Signed-off-by: Yoray Zack <62789610+zackyoray@users.noreply.github.com>
Signed-off-by: zackyoray
Signed-off-by: Bo Deng
Co-authored-by: Bo Deng
|
2025-12-12 20:21:10 +08:00 |
|
Lucas Liebenwein
|
e767fc649a
|
[None][feat] AutoDeploy: prepare_metadata revisited (#9764)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2025-12-12 20:14:14 +08:00 |
|
Yukun He
|
a6263a127f
|
[None][chore] Degrade log level in cublas fp4 runner when using default configs (#9951)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
|
2025-12-12 18:53:54 +08:00 |
|
ruodil
|
9b3e5e90ee
|
[None][test] fix a typo in model name in script (#9867)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
|
2025-12-12 17:35:55 +08:00 |
|
chenfeiz0326
|
61745f034a
|
[https://nvbugs/5727481][ci] Fix Port Conflict in Perf-Sanity CI Test (#9896)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
|
2025-12-12 17:16:50 +08:00 |
|
kris1025
|
2fc94e5dd7
|
[None][chore] unwaive qwen3 accuracy test (#9895)
Signed-off-by: linquanh <linquanh@nvidia.com>
|
2025-12-12 16:30:09 +08:00 |
|
yufeiwu-nv
|
fd3d3a553d
|
[None][chore] Modify python ipc_util to align with C++ path (#9894)
Signed-off-by: yufeiwu <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: ruodil <200874449+ruodil@users.noreply.github.com>
|
2025-12-12 15:55:22 +08:00 |
|