Chuang Zhu
536a8f6a9c
[TRTLLM-9527][feat] Add transferAgent binding (step 1) ( #10113 )
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2026-01-06 08:40:38 +08:00
Tailing Yuan
a7fe043b13
[None][feat] Layer-wise benchmarks: support TEP balance, polish slurm scripts ( #10237 )
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2026-01-05 11:23:04 +08:00
Lucas Liebenwein
937f8f78a1
[None][doc] promote AutoDeploy to beta feature in docs ( #10372 )
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-02 18:46:31 -05:00
Venky
dfa11d810e
[TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests ( #10005 )
2025-12-19 13:48:43 -05:00
Yihan Wang
9df4dad3b6
[None][fix] Introduce inline namespace to avoid symbol collision ( #9541 )
Signed-off-by: Yihan Wang <yihwang@nvidia.com>
2025-12-12 23:32:15 +08:00
Venky
fd1270b9ab
[TRTC-43] [feat] Add config db and docs ( #9420 )
Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
Co-authored-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
2025-12-12 04:00:03 +08:00
Frank
f6df9eb2a6
[TRTLLM-9089][chore] Port prepare_dataset into trtllm-bench ( #9250 )
2025-12-08 10:37:40 -08:00
Tailing Yuan
51ef0379d2
[None][feat] Add a parser to layer-wise benchmarks ( #9440 )
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-11-25 05:45:16 -08:00
elvischenv
62a30bca25
[None][chore] Add tensorrt_llm/scripts to .gitignore ( #8895 )
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
2025-11-11 11:10:02 +01:00
Tailing Yuan
f9c7786dc8
[None][feat] Add layer wise benchmarks ( #8777 )
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-10-30 20:29:34 +08:00
Chang Liu
e47c787dd7
[TRTLLM-8535][feat] Support DeepSeek V3.2 with FP8 + BF16 KV cache/NVFP4 + BF16 KV cache ( #8405 )
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Signed-off-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
Signed-off-by: Tracin <10434017+Tracin@users.noreply.github.com>
2025-10-24 13:40:41 -04:00
Jonas Yang CN
88ea2c4ee9
[TRTLLM-7349][feat] Adding new orchestrator type -- ray ( #7520 )
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2025-10-04 08:12:24 +08:00
QI JUN
7f87b278bc
[None][chore] remove generated fmha_cubin.h from source tree ( #7836 )
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-09-18 20:10:04 +08:00
yuanjingx87
eeb89a167c
[None][infra] Add nightly pipeline to generate lock files ( #5798 )
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
2025-09-16 15:00:03 -07:00
v-shobhit
0652514c6d
[None][feat] Use a shell context to install dependancies ( #7383 )
Signed-off-by: Shobhit Verma <shobhitv@nvidia.com>
Signed-off-by: v-shobhit <161510941+v-shobhit@users.noreply.github.com>
Co-authored-by: Zhihan Jiang <68881590+nvzhihanj@users.noreply.github.com>
2025-09-10 09:57:37 -07:00
William Tambellini
a6ed0d17d6
[ #6798 ][fix] fix compilation error in ub_allocator in single device build ( #6874 )
Signed-off-by: William Tambellini <wtambellini@sdl.com>
2025-09-09 07:13:53 -04:00
Zongfei Jing
0ff8df95b7
[ https://nvbugs/5433581 ][fix] DeepGEMM installation on SBSA ( #6588 )
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
2025-08-06 16:44:21 +08:00
Tailing Yuan
85b4a6808d
Refactor: move DeepEP from Docker images to wheel building ( #5534 )
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-07-07 22:57:03 +09:00
ixlmar
04fa6c0cfc
[TRTLLM-6143] feat: Improve dev container tagging ( #5551 )
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-07-02 14:56:34 +02:00
qsang-nv
0fd59d64ab
infra: open source fmha v2 kernels ( #4185 )
* add fmha repo
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix format
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix code style
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix header
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix header kernel_traits.h
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* add .gitignore file
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* add SLIDING_WINDOW_ATTENTION
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix style
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* fix format
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* update setup.py
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
* update build_wheel.py
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
---------
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
Signed-off-by: qsang-nv <200703406+qsang-nv@users.noreply.github.com>
2025-05-15 10:56:34 +08:00
nv-guomingz
62cfe74f5f
chore:update .gitignore for doc building task. ( #3993 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-05-07 17:45:18 +08:00
tburt-nv
7053d0ad5a
infra: add conan ( #3744 )
This MR integrates Conan into the build system so that it can be used to fetch dependencies in future changes.
It also installs the packages from requirements-dev.txt inside a virtualenv instead of system-wide, since some of Conan's dependencies may conflict with the system packages. Virtualenv is used instead of venv because the Triton server backend container has only virtualenv installed. This also allows developers to cache the requirements-dev.txt packages between container launches.
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-30 11:53:14 -07:00
Kaiyu Xie
258ae9c58c
Revert "infra: move nvrtc_wrapper to conan ( #3282 )" ( #3573 )
This reverts commit c0dd6cbce0.
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-04-15 22:45:13 +08:00
tburt-nv
c0dd6cbce0
infra: move nvrtc_wrapper to conan ( #3282 )
* add pip scripts dir to path
* move nvrtc_wrapper to conan
* support building nvrtc wrapper from source
---------
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-15 05:31:01 +08:00
Kaiyu Xie
3aa6b11d13
Update TensorRT-LLM ( #2936 )
* Update TensorRT-LLM
---------
Co-authored-by: changcui <cuichang147@gmail.com>
2025-03-18 21:25:19 +08:00
Dan Blanaru
16d2467ea8
Update TensorRT-LLM ( #2755 )
* Update TensorRT-LLM
---------
Co-authored-by: Denis Kayshev <topenkoff@gmail.com>
Co-authored-by: akhoroshev <arthoroshev@gmail.com>
Co-authored-by: Patrick Reiter Horn <patrick.horn@gmail.com>
Update
2025-02-11 03:01:00 +00:00
石晓伟
548b5b7310
Update TensorRT-LLM ( #2532 )
* blossom-ci.yml: run vulnerability scan on blossom
* open source efb18c1256f8c9c3d47b7d0c740b83e5d5ebe0ec
---------
Co-authored-by: niukuo <6831097+niukuo@users.noreply.github.com>
Co-authored-by: pei0033 <59505847+pei0033@users.noreply.github.com>
Co-authored-by: Kyungmin Lee <30465912+lkm2835@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2024-12-04 21:16:56 +08:00
Kaiyu Xie
b7868dd1bd
Update TensorRT-LLM ( #2413 )
2024-11-05 16:27:06 +08:00
Kaiyu Xie
1730a587d8
Update TensorRT-LLM ( #2363 )
* Update TensorRT-LLM
---------
Co-authored-by: tonylek <137782967+tonylek@users.noreply.github.com>
2024-10-22 20:27:35 +08:00
Kaiyu Xie
78f5c2936b
Update TensorRT-LLM ( #2184 )
2024-09-03 12:14:23 +02:00
石晓伟
32ed92e449
Update TensorRT-LLM
Co-authored-by: Rong Zhou <130957722+ReginaZh@users.noreply.github.com>
Co-authored-by: Onur Galoglu <33498883+ogaloglu@users.noreply.github.com>
Co-authored-by: Fabian Joswig <fjosw@users.noreply.github.com>
2024-08-20 18:55:15 +08:00
Kaiyu Xie
bca9a33b02
Update TensorRT-LLM ( #2008 )
* Update TensorRT-LLM
---------
Co-authored-by: Timur Abishev <abishev.timur@gmail.com>
Co-authored-by: MahmoudAshraf97 <hassouna97.ma@gmail.com>
Co-authored-by: Saeyoon Oh <saeyoon.oh@furiosa.ai>
Co-authored-by: hattizai <hattizai@gmail.com>
2024-07-23 23:05:09 +08:00
Kaiyu Xie
2d234357c6
Update TensorRT-LLM ( #1954 )
* Update TensorRT-LLM
---------
Co-authored-by: Altair-Alpha <62340011+Altair-Alpha@users.noreply.github.com>
2024-07-16 15:30:25 +08:00
Kaiyu Xie
b777bd6475
Update TensorRT-LLM ( #1725 )
* Update TensorRT-LLM
---------
Co-authored-by: RunningLeon <mnsheng@yeah.net>
Co-authored-by: Tlntin <TlntinDeng01@Gmail.com>
Co-authored-by: ZHENG, Zhen <zhengzhen.z@qq.com>
Co-authored-by: Pham Van Ngoan <ngoanpham1196@gmail.com>
Co-authored-by: Nathan Price <nathan@abridge.com>
Co-authored-by: Tushar Goel <tushar.goel.ml@gmail.com>
Co-authored-by: Mati <132419219+matichon-vultureprime@users.noreply.github.com>
2024-06-04 20:26:32 +08:00
Kaiyu Xie
f430a4b447
Update TensorRT-LLM ( #1688 )
* Update TensorRT-LLM
---------
Co-authored-by: IbrahimAmin <ibrahimamin532@gmail.com>
Co-authored-by: Fabian Joswig <fjosw@users.noreply.github.com>
Co-authored-by: Pzzzzz <hello-cd.plus@hotmail.com>
Co-authored-by: CoderHam <hemant@cohere.com>
Co-authored-by: Konstantin Lopuhin <kostia.lopuhin@gmail.com>
2024-05-28 20:07:49 +08:00
Kaiyu Xie
bf0a5afc92
Update TensorRT-LLM ( #1598 )
* Update TensorRT-LLM
2024-05-14 16:43:41 +08:00
Kaiyu Xie
89ba1b1a67
Update TensorRT-LLM ( #1554 )
2024-05-07 23:34:28 +08:00
Kaiyu Xie
66ef1df492
Update TensorRT-LLM ( #1492 )
* Update TensorRT-LLM
---------
Co-authored-by: Loki <lokravi@amazon.com>
2024-04-24 14:44:22 +08:00
Kaiyu Xie
4bb65f216f
Update TensorRT-LLM ( #1274 )
* Update TensorRT-LLM
---------
Co-authored-by: meghagarwal <16129366+megha95@users.noreply.github.com>
Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2024-03-12 18:15:52 +08:00
Kaiyu Xie
71f60f6df0
Update TensorRT-LLM ( #524 )
2023-12-01 22:27:51 +08:00
Kaiyu Xie
7736d528a1
Add 3rd party dependency
2023-09-20 00:50:59 -07:00