QI JUN
|
267c850792
|
[TRTLLM-9086][doc] Clean up TODOs in documentation (#9292)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-11-27 14:13:00 +08:00 |
|
Pengyun Lin
|
41c903d6a7
|
[None][doc] VDR 1.0 trtllm-serve doc enhancement (#9443)
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
|
2025-11-27 13:08:26 +08:00 |
|
Yan Chunwei
|
eb7c6d9301
|
[TRTLLM-9160][doc] add doc to llm_runtime.py (#9482)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
|
2025-11-27 10:10:17 +08:00 |
|
Yukun He
|
816e4d73b1
|
[https://nvbugs/5676748][fix] Cherry-pick #9336: Fix mismatched nvfp4 gemm sf shape. (#9437)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
|
2025-11-26 11:57:54 +08:00 |
|
TensorRT LLM
|
dbb58bac25
|
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
|
2025-11-24 18:23:53 +00:00 |
|
jthomson04
|
b9d92380da
|
[TRTLLM-9199][docs] KV Connector Docs (#9325)
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
|
2025-11-24 18:07:50 +01:00 |
|
Jin Li
|
0339255103
|
[https://nvbugs/5545522][fix] Correct Cutlass with PDL support (#9335)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-11-22 09:05:13 -08:00 |
|
Iman Tabrizian
|
4180417b8c
|
[https://nvbugs/5601682][fix] Fix cacheTransceiver hang (#9311)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-11-20 15:19:23 -08:00 |
|
JunyiXu-nv
|
838df92e21
|
[https://nvbugs/5670793][fix] Solve trtllm-serve launch_disaggregated… (#9324)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2025-11-20 19:31:35 +08:00 |
|
dominicshanshan
|
2cde4e41da
|
[https://nvbugs/5648685][fix] Fix openAI server waiting time to avoid large model weight loading out time (#9254)
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-11-19 09:46:02 +08:00 |
|
QI JUN
|
a49fdb36df
|
[TRTLLM-9092][doc] Add a pre-quantized example in quick start guide (#9223)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-11-18 17:36:01 -08:00 |
|
sunnyqgg
|
35b176ae78
|
[https://nvbugs/5461796][fix] Unwaive and extend time for test_llmapi_speculative_decoding_mtp (#9092)
Signed-off-by: qgai <qgai@nvidia.com>
|
2025-11-18 19:20:07 +08:00 |
|
Chuang Zhu
|
1c4c737206
|
[https://nvbugs/5582133][fix] unwaive nixl test (#9244)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-11-18 13:07:30 +08:00 |
|
Wanli Jiang
|
6640aed0c2
|
[None][fix] Bypass key-word matching for multimodal tests (#9170)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2025-11-18 10:33:07 +08:00 |
|
sunnyqgg
|
55a9771ff0
|
[https://nvbugs/5649826][fix] Unwaive test test_llm_commandr_plus_4gpus_summary (#9201)
Signed-off-by: qgai <qgai@nvidia.com>
|
2025-11-16 23:11:44 -08:00 |
|
Shunkangz
|
fd0e2e4e79
|
[TRTLLM-9159][doc] Add KV Connector docs (#9043)
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
|
2025-11-17 10:44:49 +08:00 |
|
brb-nv
|
6d28e6c3a6
|
[https://nvbugs/5568836][fix] Skip keyword matching for Gemma3 e2e test (#9158)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-11-14 02:18:24 -08:00 |
|
Kaiyu Xie
|
e5c1cd41cd
|
[None] [fix] Disable UCC as WAR to MPI allgather issue before NGC PyTorch 25.12 upgrade (#9127)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-11-14 01:18:04 -08:00 |
|
Leslie Fang
|
d43036e3fd
|
[https://nvbugs/5652552][fix] Log the llm args (#9119)
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
|
2025-11-14 12:02:41 +08:00 |
|
Chang Liu
|
4661820d05
|
[TRTLLM-7971][doc] Doc update for multimodal in v1.1 (#9015)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
|
2025-11-13 14:58:14 -08:00 |
|
Michal Guzek
|
8e9409ce04
|
[https://nvbugs/5628204][fix] Stop token IDs - fast path optimization for single stop token IDs only (#9014)
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
Signed-off-by: Michal Guzek <moraxu@users.noreply.github.com>
|
2025-11-13 14:17:20 +01:00 |
|
Chuang Zhu
|
12fa81c679
|
[https://nvbugs/5628952][fix] avoid cudaFree overlap with cuda graph (#8903)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-11-12 09:08:05 +01:00 |
|
peaceh-nv
|
f1d02b5664
|
[https://nvbugs/5570575][fix] : Use less kv cache memory on SM120 (#9054)
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
|
2025-11-11 15:42:08 +08:00 |
|
Vincent Zhang
|
08f8f96cbd
|
[https://nvbugs/5284463][fix] fix ada fp8 group gemm lacks shared memory (#9044)
Signed-off-by: Vincent Zhang <vinczhang@nvidia.com>
|
2025-11-11 13:00:47 +08:00 |
|
Lizhi Zhou
|
0649b77d16
|
[https://nvbugs/5608743][chore] unwaive test (#8994)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2025-11-10 05:59:29 -08:00 |
|
Zhanrui Sun
|
7ff0b13de3
|
[TRTLLM-9080][infra] upgrade tritonserver DLFW 25.10 (#8877)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
|
2025-11-09 22:36:56 -08:00 |
|
Guoming Zhang
|
5192af14ea
|
[TRTLLM-9073][doc] Add the missing content for model support section and fix… (#9033)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-11-10 13:44:16 +08:00 |
|
Yiqing Yan
|
572f9be06f
|
[None][chore] Lock onnx version <1.20.0 and remove WAR for TRT 10.13 (#9007)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
|
2025-11-10 12:50:37 +08:00 |
|
Emma Qiao
|
a74ce266d3
|
[None][infra] Waive failed tests for release branch 11/07 (#9026)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-11-09 18:18:49 +08:00 |
|
dominicshanshan
|
def2ad5107
|
[https://nvbugs/5575920][fix] Fix cublas/cublasLt handle creation memory not sufficient error (#8900)
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-11-07 10:14:00 -08:00 |
|
Guoming Zhang
|
70e4d97c37
|
[None][doc] Replace the relative links with absolute links in README.md. (#8997)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-11-08 00:27:12 +08:00 |
|
Zhanrui Sun
|
fcfe6f86f9
|
[TRTLLM-9213][infra] Fix boost issue (#9005)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
|
2025-11-07 02:08:44 -08:00 |
|
Ivy Zhang
|
5cf3f0c981
|
[https://nvbugs/5636946][fix] Update test model (#8993)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2025-11-07 15:13:29 +08:00 |
|
Emma Qiao
|
ede230cb3a
|
[None][infra] Waive failed tests for release branch 11/06 (#8966)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-11-07 09:01:26 +08:00 |
|
Lucas Liebenwein
|
991e507e11
|
[https://nvbugs/5642736][fix] fix AutoDeploy pattern matcher for torch 2.9 (#8920) (#8958)
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
Co-authored-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com>
|
2025-11-06 10:21:32 -08:00 |
|
Shiyu Li
|
519eda29bd
|
[https://nvbugs/5597647][fix] Fix MNNVL unit test failed due to accuracy issue on Hopper (#8891)
Signed-off-by: Shiyu Li <shili@nvidia.com>
Signed-off-by: Shiyu Li <timlee0212@outlook.com>
|
2025-11-06 18:28:06 +01:00 |
|
Jin Li
|
1ef38f24f4
|
[https://nvbugs/5570599][fix] Set KVCache free_gpu_memory_fraction fo… (#8780)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-11-06 05:58:07 -08:00 |
|
shuyixiong
|
69dec201bd
|
[https://nvbugs/5630700][chore] Unwaive Qwen3_235B_A22B test (#8901)
Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com>
|
2025-11-06 15:32:39 +08:00 |
|
Jin Li
|
f040ef9ffd
|
[https://nvbugs/5467531][fix] Fix moe test and wide ep fake impl (#8883)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-11-06 11:40:50 +08:00 |
|
sunnyqgg
|
c2fe686e3e
|
[https://nvbugs/5608930][fix] Waive TestQwen3_8B::test_chunked_prefill for bug 5608930 (#8940)
Signed-off-by: qgai <qgai@nvidia.com>
|
2025-11-05 01:52:09 -08:00 |
|
Emma Qiao
|
6db74e8a0a
|
[TRTLLM-8813][infra] Reduce GB200 multi-node test stages for release (#8860)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-11-04 23:29:28 -08:00 |
|
Guoming Zhang
|
b941d7acbb
|
[https://nvbugs/5634220][fix] Add developer guide back and fix some i… (#8911)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-11-05 10:17:01 +08:00 |
|
Bo Deng
|
43843778a7
|
[https://nvbugs/5601682][fix] unwaive test_disaggregated_deepseek_v3_… (#8888)
Signed-off-by: Bo Deng <deemod@nvidia.com>
|
2025-11-05 09:33:57 +08:00 |
|
Simeng Liu
|
0206d8d0fc
|
[https://nvbugs/5606136][fix] Fix torch.onnx.export with pytorch upgrade to fallback to dynamo=False. (#8917)
Signed-off-by: Simeng Liu <simengl@nvidia.com>
|
2025-11-04 14:11:48 -08:00 |
|
JunyiXu-nv
|
c329f5f78b
|
[https://nvbugs/5569754][chore] Adjust max batch size to prevent OOM (#8876)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2025-11-04 18:34:26 +01:00 |
|
Yan Chunwei
|
cacb8a84f2
|
[https://nvbugs/5606266][test] move qwen3 multi-node test to the qa list (#8908)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
|
2025-11-04 02:12:02 -08:00 |
|
Shi Xiaowei
|
324f63f26a
|
[https://nvbugs/5451272][fix] unwaive the test (#8608)
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
|
2025-11-04 01:31:41 -08:00 |
|
xiweny
|
7d8a913406
|
[https://nvbugs/5596343] [test] Update accuracy baseline for GPT-OSS-20B (#8842)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-11-04 16:04:11 +08:00 |
|
Ivy Zhang
|
baa6ba0d69
|
[None][chore] Update test list (#8835)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2025-11-03 21:42:01 -08:00 |
|
brb-nv
|
095b7a3ad5
|
[https://nvbugs/5521253][fix] Enable Gemma3 12B & 27B on SM100 (#8666)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-11-03 14:49:36 -08:00 |
|