Xiwen Yu
|
d16d98ccdf
|
fix missing change
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-09 20:35:58 +08:00 |
|
Xiwen Yu
|
11d603bc84
|
fix
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-09 17:31:31 +08:00 |
|
Xiwen Yu
|
2c287d58b0
|
don't throw in ctor
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-09 17:21:03 +08:00 |
|
Xiwen Yu
|
a8b630f178
|
Merge remote-tracking branch 'origin/main' into feat/b300_cu13
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-09 14:34:27 +08:00 |
|
Xiwen Yu
|
8cc5ea331a
|
add comment
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-09 14:32:47 +08:00 |
|
William Tambellini
|
6ba1c8421c
|
[#6529][feat] CMake option to link statically with cublas/curand (#7178)
Close #6529.
Signed-off-by: William Tambellini <wtambellini@sdl.com>
|
2025-09-09 14:26:45 +08:00 |
|
Xiwen Yu
|
82833fa961
|
address comments
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-09 14:18:16 +08:00 |
|
Zhanrui Sun
|
7a62df5f0b
|
[TRTLLM-4366][infra] Don't call reinstall_rockylinux_cuda when the base CUDA image is up to dated (#5980)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
|
2025-09-09 02:15:39 -04:00 |
|
Tomer Shmilovich
|
ecc0e687c6
|
[None][feat] Nixl support for GDS (#5488)
Signed-off-by: Tomer Shmilovich <tshmilovich@nvidia.com>
Signed-off-by: Guy Lev <glev@nvidia.com>
Co-authored-by: Guy Lev <glev@nvidia.com>
|
2025-09-09 13:00:38 +08:00 |
|
Guoming Zhang
|
7f3f658d5f
|
[None][doc] Rename TensorRT-LLM to TensorRT LLM. (#7554)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-09 12:16:03 +08:00 |
|
Guoming Zhang
|
35dac55716
|
[None][doc] Update kvcache part (#7549)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-09 12:16:03 +08:00 |
|
Guoming Zhang
|
f53fb4c803
|
[TRTLLM-5930][doc] 1.0 Documentation. (#6696)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-09 12:16:03 +08:00 |
|
Zhanrui Sun
|
b573e07f3e
|
[None][infra] Disable CU12 build to save build time (cost > 5 hours on SBSA) (#7633)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
|
2025-09-09 11:38:34 +08:00 |
|
Yiqing Yan
|
5c616da2fd
|
[TRTLLM-5877][infra] Add fmha tests and auto trigger rules (#6050)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
|
2025-09-09 11:33:09 +08:00 |
|
Wanli Jiang
|
1e0669d27a
|
[https://nvbugs/5453709][fix] Remove transformers version limit in Qwen2VL (#7152)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2025-09-09 10:38:20 +08:00 |
|
Iman Tabrizian
|
d96c54d8ae
|
[None][test] Skip eagle3 test (#7627)
Signed-off-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
|
2025-09-08 17:23:53 -04:00 |
|
dongfengy
|
fdd5bd49fc
|
[https://nvbugs/5481080][fix] Fix GPTOSS W4A16 reference (#7323)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
|
2025-09-08 13:59:28 -07:00 |
|
zhanghaotong
|
96af324ff1
|
[None][fix] Add try-catch in stream generator (#7467)
Signed-off-by: Zhang Haotong <zhanghaotong.zht@antgroup.com>
Co-authored-by: Zhang Haotong <zhanghaotong.zht@antgroup.com>
|
2025-09-08 16:09:26 -04:00 |
|
yuanjingx87
|
1d243a8503
|
[None][infra] Try to fix docker container failed to be killed issue (#7388)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
|
2025-09-08 11:28:01 -07:00 |
|
Chuang Zhu
|
77657a1c12
|
[TRTLLM-7361][feat] KV cache transfer for uneven pp (#7117)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-09-08 13:37:46 -04:00 |
|
Leslie Fang
|
3e0073e86b
|
[None][chore] remove executor config in instantiate sampler (#7516)
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
|
2025-09-08 09:02:40 -07:00 |
|
Xiwen Yu
|
4cf9fed1e7
|
Merge commit 'ed27a72bcf71f7ab0e7137f7999988c9de82386f' into feat/b300_cu13
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-08 21:58:43 +08:00 |
|
Xiwen Yu
|
e30e0c8693
|
waive
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-08 21:02:35 +08:00 |
|
Eran Geva
|
5f2a42b3df
|
[TRTLLM-6142][feat] AutoDeploy: set torch recompile_limit based on cuda_graph_batch_sizes and refactored (#7219)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
|
2025-09-08 08:45:58 -04:00 |
|
Chang Liu
|
4a1e13897f
|
[None][feat] Update multimodal utility get_num_tokens_per_image for better generalization (#7544)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
|
2025-09-08 07:42:46 -04:00 |
|
Emma Qiao
|
dd9627d9f9
|
[None][infra] Add back rtx-pro-6000 stages since the node is available (#7601)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-09-08 05:45:11 -04:00 |
|
Yanchao Lu
|
ed27a72bcf
|
[None][ci] Fix a typo in the Slurm command
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
|
2025-09-08 17:07:09 +08:00 |
|
bhsueh_NV
|
219e95569a
|
[https://nvbugs/5506683][fix] adjust the CI (#7604)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
|
2025-09-08 15:41:41 +08:00 |
|
Xiwen Yu
|
fdaf4e2985
|
Merge remote-tracking branch 'origin/main' into feat/b300_cu13
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-08 15:14:54 +08:00 |
|
Xiwen Yu
|
019b1db438
|
fix 5505835
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-08 14:52:00 +08:00 |
|
dominicshanshan
|
c9dca69e1b
|
[None][chore] Mass integration of release/1.0 - 3rd (#7519)
Signed-off-by: Nave Assaf <nassaf@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Bo Deng <deemod@nvidia.com>
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Yifei Zhang <219273404+yifeizhang-c@users.noreply.github.com>
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com>
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
Signed-off-by: Pamela <179191831+pamelap-nvidia@users.noreply.github.com>
Signed-off-by: Hui Gao <huig@nvidia.com>
Signed-off-by: Alexandre Milesi <30204471+milesial@users.noreply.github.com>
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
Co-authored-by: Nave Assaf <55059536+Naveassaf@users.noreply.github.com>
Co-authored-by: Yechan Kim <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: brb-nv <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
Co-authored-by: Emma Qiao <qqiao@nvidia.com>
Co-authored-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: Bo Deng <deemod@nvidia.com>
Co-authored-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: yifeizhang-c <219273404+yifeizhang-c@users.noreply.github.com>
Co-authored-by: amitz-nv <203509407+amitz-nv@users.noreply.github.com>
Co-authored-by: Erin <14718778+hchings@users.noreply.github.com>
Co-authored-by: chenfeiz0326 <chenfeiz@nvidia.com>
Co-authored-by: ChristinaZ <83400082+ChristinaZ@users.noreply.github.com>
Co-authored-by: Venky <23023424+venkywonka@users.noreply.github.com>
Co-authored-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
Co-authored-by: HuiGao-NV <huig@nvidia.com>
Co-authored-by: milesial <milesial@users.noreply.github.com>
Co-authored-by: Shi Xiaowei <39303645+Shixiaowei02@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: peaceh-nv <103117813+peaceh-nv@users.noreply.github.com>
Co-authored-by: Guoming Zhang <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
Co-authored-by: pcastonguay <55748270+pcastonguay@users.noreply.github.com>
Co-authored-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Linda <57756729+Linda-Stadter@users.noreply.github.com>
Co-authored-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
Co-authored-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Co-authored-by: Jiagan Cheng <jiaganc@nvidia.com>
Co-authored-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
Co-authored-by: Sharan Chetlur <116769508+schetlur-nv@users.noreply.github.com>
Co-authored-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
|
2025-09-08 14:03:04 +08:00 |
|
Xiwen Yu
|
d4d9e778a1
|
reset build memory
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-08 12:04:30 +08:00 |
|
Xiwen Yu
|
caea58aba4
|
increase build memory
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-08 11:28:39 +08:00 |
|
JunyiXu-nv
|
504bb7ffa9
|
[TRTLLM-7779][feat] Support multiple postprocess workers for chat completions API (#7508)
Signed-off-by: Junyi Xu
Co-authored-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
|
2025-09-08 11:11:35 +08:00 |
|
Xiwen Yu
|
d42201e235
|
remove waivers and cleanup
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-08 10:24:52 +08:00 |
|
binghanc
|
14ee43e254
|
[None][docs] refine docs for accuracy evaluation of gpt-oss models (#7252)
Signed-off-by: 176802681+binghanc@users.noreply.github.com
|
2025-09-08 09:56:23 +08:00 |
|
Xiwen Yu
|
77657de972
|
fix build args
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-08 09:52:41 +08:00 |
|
Yan Chunwei
|
205c3a144c
|
[None][chore] expose tokens_per_block into KvCacheConfig (#5911)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
|
2025-09-07 21:14:10 -04:00 |
|
BatshevaBlack
|
7c76dde76d
|
[TRTLLM-7187][fix] Build wheel with NIXL (#7472)
Signed-off-by: BatshevaBlack <132911331+BatshevaBlack@users.noreply.github.com>
|
2025-09-07 19:05:37 -04:00 |
|
Raayan Dhar
|
8f3121ac81
|
[None][fix] chore: fixing the math on asymmetric tp+pp tests (#7098)
Signed-off-by: raayandhar <rdhar@nvidia.com>
|
2025-09-07 14:27:46 -04:00 |
|
Yanchao Lu
|
045d2cf761
|
[None][ci] Block some nodes to avoid unstable network access (#7593)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
|
2025-09-08 00:25:38 +08:00 |
|
Xiwen Yu
|
e6bb1fe8af
|
remove non-exist cases
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-07 23:24:46 +08:00 |
|
Netanel Haber
|
0fee8cd028
|
[TRTLLM-7153] [feat] Move stop_criteria to sample_async (#7041)
Signed-off-by: Netanel Haber <nhaber@nvidia.com>
|
2025-09-07 17:36:49 +03:00 |
|
Emma Qiao
|
5c4711fb2b
|
[None][infra] Skip RTX Pro 6000 test stages due to HW are offline (#7592)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-09-07 09:49:06 -04:00 |
|
Raayan Dhar
|
bae9560e62
|
[https://nvbugs/5448767][fix] sync termination of requests across PP ranks (#7455)
Signed-off-by: raayandhar <rdhar@nvidia.com>
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Co-authored-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2025-09-07 08:45:49 -04:00 |
|
Emma Qiao
|
aea8ac1649
|
[TRTLLM-5950][infra] Removing remaining turtle keywords from the code base (#7086)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-09-07 14:26:18 +08:00 |
|
Mike Iovine
|
45390402fc
|
[https://nvbugs/5502352][fix] Fix 2-model CDL path (#7543)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2025-09-06 23:53:27 -04:00 |
|
Chang Liu
|
99b98f1374
|
[TRTLLM-7440][fix] Split fused_input_embed to separate out host sync (#7280)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
|
2025-09-06 23:11:39 -04:00 |
|
Xiwen Yu
|
291290851a
|
Merge remote-tracking branch 'origin/main' into feat/b300_cu13
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-07 10:28:24 +08:00 |
|
Xiwen Yu
|
8f8766a6fa
|
waive
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-09-07 10:26:08 +08:00 |
|