Guoming Zhang
64f7cca5fa
[ https://nvbugs/5519525 ][fix] fix doc invalid link for bug 5519525 ( #7753 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-09-16 16:27:04 +08:00
Shi Xiaowei
7cf1d5a518
[None][doc] Fix the link in the doc ( #7754 )
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-09-16 15:49:44 +08:00
Yi Zhang
7df515e335
[ https://nvbugs/5355219 ][fix] Fix trtllm moe backend test config and Qwen3 MoE multi node ( #7724 )
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
2025-09-16 10:33:35 +08:00
Ivy Zhang
aaa381d169
[ https://nvbugs/5512734 ][fix] Update kv cache config for maverick ( #7710 )
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-09-15 22:53:30 +08:00
bhsueh_NV
2d40adb874
[ https://nvbugs/5437405 ][fix] cherry-pick PR 7000 (qwen3 235b eagle3 ci) ( #7702 )
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-09-15 16:03:36 +08:00
Guoming Zhang
9d719dd6d2
[None][doc] Add labels description note into llm api section ( #7696 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-09-15 14:15:09 +08:00
Yanchao Lu
41a341a1dc
[None][ci] Test waives for the release/1.0 branch 09/15 ( #7700 )
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-09-15 09:24:04 +08:00
Yilin Fan
e5ba99c6de
[ https://nvbugs/5398180 ][feat] Improve Llama4 performance for small max_seqlen cases ( #7681 )
Signed-off-by: Yilin Fan <206948969+nv-yilinf@users.noreply.github.com>
2025-09-15 09:04:32 +08:00
brb-nv
c0e4fce03f
[ https://nvbugs/5501557 ][fix] Fix out-of-bounds vector access for model with multiple layer types ( #7636 )
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-09-10 20:34:25 -07:00
Guoming Zhang
541fd3ecb8
[ https://nvbugs/5474409 ][fix] Disable concurrent loading by default ( #7663 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-09-11 00:11:17 +08:00
Leslie Fang
9ca8662b26
[ https://nvbugs/5436461 ][fix] Adjust free_gpu_memory_fraction of test_eagle3 ( #7673 )
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
2025-09-10 23:16:25 +08:00
WeiHaocheng
68b7bad447
[ https://nvbugs/5477730 ][fix] Fix the alltoall case when tp_size larg… ( #7671 )
Signed-off-by: Fred Wei <20514172+WeiHaocheng@users.noreply.github.com>
2025-09-10 20:21:09 +08:00
Guoming Zhang
7c2f04ffec
[None][doc] Use hash id for external link ( #7641 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-09-09 04:34:05 -04:00
Guoming Zhang
49dcc0df53
[None][doc] Fix a invalid link and a typo. ( #7634 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-09-09 00:49:28 -04:00
Simeng Liu
f4736aec8e
[ https://nvbugs/5470782 ][chore] Remove the skip statement in 1.0 rele… ( #7573 )
Signed-off-by: Simeng Liu <simengl@nvidia.com>
2025-09-08 21:04:14 -07:00
Liao Lanyu
af3f03cbff
[ https://nvbugs/5455140 ][fix] unwaive release/1.0 DS R1 test cases with bug already fixed ( #7432 )
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>
2025-09-09 09:48:59 +08:00
peaceh-nv
7784b3327f
[ https://nvbugs/5503423 ][waive] Waive Llama3.1-70B-FP8 test on RTX PRO 6000 ( #7603 )
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
2025-09-09 09:29:07 +08:00
HuiGao-NV
5206f1ce47
[ https://nvbugs/5474169 ][fix] seq_len mismatch between kv cache manager and graph attn metadata ( #7606 )
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-09-09 08:32:31 +08:00
Guoming Zhang
f6365e654f
[None][doc] Fix a invalid link. ( #7617 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-09-08 20:33:35 +08:00
Yan Chunwei
12041338a4
[ https://nvbugs/5416501 ][doc] add known issues to llmapi doc ( #7560 )
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: Ryan McCormick <mccormick.codes@gmail.com>
2025-09-08 04:42:54 -04:00
Yiteng Niu
88d1bde4d3
[None][infra] update nspect version ( #7552 )
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
2025-09-06 18:16:55 +08:00
Yanchao Lu
2cb5b9f31b
[None][ci] Increase the number of retries in docker image generation ( #7557 )
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-09-06 18:16:36 +08:00
Yanchao Lu
275a09d0a2
Revert "[ https://nvbugs/5461761 ][fix] Remove the waiver ( #7427 )"
This reverts commit 4612906b67.
2025-09-06 18:11:34 +08:00
Guoming Zhang
01c4ece911
[None][doc] Rename TensorRT-LLM to TensorRT LLM. ( #7554 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-09-05 16:54:57 +08:00
Guoming Zhang
f9187b2fda
[None][doc] Update kvcache part ( #7549 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-09-05 03:46:13 -04:00
Yukun He
e07fa9ddc5
[ https://nvbugs/5496960 ][fix] Fix Gemma model forward. ( #7509 )
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-09-04 19:09:43 +08:00
Guoming Zhang
cabda243f1
[TRTLLM-5930][doc] 1.0 Documentation. ( #6696 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-09-04 05:29:43 -04:00
Ziyi Xiong
4612906b67
[ https://nvbugs/5461761 ][fix] Remove the waiver ( #7427 )
Signed-off-by: ziyixiong-nv <219238287+ziyixiong-nv@users.noreply.github.com>
2025-09-04 11:34:25 +08:00
Yan Chunwei
ad80819ef0
[ https://nvbugs/5351244 ][fix] test_mpi_session ( #7501 )
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-09-04 10:10:43 +08:00
dongxuy04
9eecdf2ee9
[TRTLLM-7008][fix] cherrypick fix to 1.0 Add automatic shared memory delete if already exist ( #7433 )
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
2025-09-02 11:23:53 +08:00
Guoming Zhang
95e0318647
[None][doc] add blackwell information into support matrix ( #6740 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-09-01 14:04:45 -04:00
Emma Qiao
991b83af81
[None][infra] Waive failed tests on release branch 0901 ( #7448 )
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-09-01 23:24:51 +08:00
Yuxian Qiu
559762f185
[ https://nvbugs/5448754 ][fix] Download HF model for all nodes. ( #6824 )
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-09-01 16:00:43 +08:00
HuiGao-NV
860589aa0c
[ https://nvbugs/5474169 ][fix]Adjust max seq len for kvcache for memory estimation ( #7391 )
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-09-01 14:40:58 +08:00
Chang Liu
050db0e46f
[ https://nvbugs/5445466 ][fix] Eliminate race when loading HF dynamic modules ( #7268 ) ( #7379 )
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
2025-08-30 17:44:24 +08:00
Lizhi Zhou
7e4dad4dbb
[ https://nvbugs/5448767 ][fix] disable kv cache reuse for disagg pp>1 tests ( #7354 )
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-08-29 09:33:16 +02:00
Bo Li
ef0f65b353
[ https://nvbugs/5467548 ][fix] DeepSeek illegal memory access. ( #7298 )
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2025-08-29 12:19:03 +08:00
amitz-nv
66f0657716
[TRTLLM-7346][fix] Improve performance of PyTorchModelEngine._get_lora_params_from_requests ( #7203 )
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
2025-08-28 16:06:32 +08:00
Wanli Jiang
4ae40cbacf
[ https://nvbugs/5480415 ][fix] Fix phi4mm multi-gpu test ( #7275 )
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
Co-authored-by: Sharan Chetlur <116769508+schetlur-nv@users.noreply.github.com>
2025-08-27 22:24:19 -04:00
Iman Tabrizian
91c4af3f01
[ https://nvbugs/5434320 ][bug] Fix disagg pp bug ( #7099 )
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2025-08-27 13:38:01 -04:00
brb-nv
4b1898e82e
[ https://nvbugs/5480550 ][fix] Increase timeout for Gemma3 27B test ( #7271 )
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-08-27 08:45:05 -07:00
Venky
6cc168a5d3
[ https://nvbugs/5463720 ][fix] tp-split the inferred mlp_hidden_size for nemotron-nas ( #7231 )
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2025-08-27 15:04:42 +03:00
Lizhi Zhou
0fa49c5e2b
[ https://nvbugs/5448767 ][fix] fix mpi4py deadlocks in pp event-loop ( #6976 )
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-08-27 02:01:48 -04:00
Jin Li
877e1f44d3
[ https://nvbugs/5451426 ][fix] Avoid torch compile on full eagle3 worker ( #7245 )
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2025-08-27 09:59:06 +08:00
brb-nv
201fd257cc
[ https://nvbugs/5478151 ][fix] Add missing spec for Llama-3.3 70B ( #7267 )
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-08-27 09:56:58 +08:00
William Zhang
b6eba85dfc
[ https://nvbugs/5430125 ][ci] Unwaive test case for mistral 3.1 small ( #7265 )
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-08-26 17:32:02 -04:00
William Zhang
34c1e9c341
[None][feat] Skip prefetching consolidated safetensors when appropriate ( #7225 )
* Why?
Some models (e.g. anything produced by Mistral) can have both sharded
safetensors and a consolidated safetensor in the same checkpoint
directory. In such cases, prefetching both to memory is a waste of time
and memory.
* What?
This commit skips over consolidated safetensors when they are not the
only safetensor file present in the checkpoint directory.
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-08-26 09:40:17 -07:00
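The selection rule described in commit 34c1e9c341 can be illustrated with a minimal sketch. This is not the code from the commit: the function name `select_safetensors_to_prefetch` is hypothetical, and it assumes that consolidated files can be recognized by the substring "consolidated" in the file name.

```python
from pathlib import Path


def select_safetensors_to_prefetch(checkpoint_dir: str) -> list[Path]:
    """Return the safetensors files worth prefetching from a checkpoint directory.

    Hypothetical helper for illustration only; not the project's implementation.
    """
    all_files = sorted(Path(checkpoint_dir).glob("*.safetensors"))
    # Assumption: a consolidated file has "consolidated" in its name.
    non_consolidated = [f for f in all_files if "consolidated" not in f.name]
    # Skip consolidated files only when other safetensors files exist;
    # if they are the only ones present, keep them.
    return non_consolidated if non_consolidated else all_files
```

With a directory holding both sharded files and a consolidated file, only the shards are returned; with a directory holding only a consolidated file, that file is returned.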
Jiagan Cheng
85b4ae26b7
[ https://nvbugs/5451342 ][fix] Use runtime max_batch_size when cuda_graph_config.max_batch_size is not provided in trtllm-bench ( #7031 )
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
2025-08-26 08:10:35 -04:00
Emma Qiao
7409d56053
[None][infra] Waive failed cases for release/1.0 ( #7258 )
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-08-26 19:50:28 +08:00
Yuxian Qiu
2fb16ad328
[None][fix] fix log_once usage ( #7210 )
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-08-26 19:13:03 +08:00