Jhao-Ting Chen
|
92d90fa29a
|
[None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS (#10018)
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
|
2025-12-23 11:41:31 -06:00 |
|
xinhe-nv
|
e822184cd7
|
[None][feat] add waive by sm version (#8928)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-11-05 19:20:43 -08:00 |
|
HuiGao-NV
|
ae57738bae
|
[https://nvbugs/5547414][fix] Use cached models (#8755)
Signed-off-by: Hui Gao <huig@nvidia.com>
|
2025-10-29 19:10:10 -07:00 |
|
Iman Tabrizian
|
c510b67fa0
|
[https://nvbugs/5547414][fix] avoid downloading Tiny llama from HF (#8071)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-09-30 13:47:59 -04:00 |
|
Emma Qiao
|
aea8ac1649
|
[TRTLLM-5950][infra] Removing remaining turtle keywords from the code base (#7086)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-09-07 14:26:18 +08:00 |
|
Dimitrios Bariamis
|
f49dafe0da
|
[https://nvbugs/5394409][feat] Support Mistral Small 3.1 multimodal in Triton Backend (#6714)
Signed-off-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Signed-off-by: Dimitrios Bariamis <dbari@users.noreply.github.com>
Co-authored-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
|
2025-08-21 18:08:38 +02:00 |
|
Aurelien Chartier
|
b13a5a99b2
|
[None][chore] Add tests for non-existent and completed request cancellation (#6840)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
|
2025-08-14 15:57:01 -07:00 |
|
Aurelien Chartier
|
56bfc3a6d2
|
[None][chore] Find LLM_ROOT and LLM_BACKEND_ROOT dynamically (#6763)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
|
2025-08-11 15:18:19 -07:00 |
|
Yiqing Yan
|
3f7abf87bc
|
[TRTLLM-6224][infra] Upgrade dependencies to DLFW 25.06 and CUDA 12.9.1 (#5678)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
|
2025-08-03 11:18:59 +08:00 |
|
pcastonguay
|
310bdd9830
|
fix: Fix triton backend build [nvbug 5396469] (#6098)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
|
2025-07-22 12:48:00 +08:00 |
|
Simeng Liu
|
4a0951f85c
|
[Chore] Replace MODEL_CACHE_DIR with LLM_MODELS_ROOT and unwaive triton_server/test_triton.py::test_gpt_ib[gpt-ib] (#5859)
Signed-off-by: Simeng Liu <simengl@nvidia.com>
|
2025-07-21 15:46:37 -07:00 |
|
Aurelien Chartier
|
6a47cac981
|
feat: Add support for Triton request cancellation (#5898)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
|
2025-07-15 20:52:43 -04:00 |
|
Chang Liu
|
308776442a
|
[nvbug/5308432] fix: extend triton exit time for test_llava (#5971)
Signed-off-by: Chang Liu <9713593+chang-l@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
2025-07-12 12:56:37 +09:00 |
|
Iman Tabrizian
|
26b953e29a
|
[nvbugs/5309940] Add support for input output token counts (#5445)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-06-28 04:39:39 +08:00 |
|
Iman Tabrizian
|
49af791f66
|
Add testing for trtllm-llmapi-launch with tritonserver (#5528)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-06-27 11:19:52 +08:00 |
|
amirkl94
|
8451a87742
|
chore: Mass integration of release/0.20 (#5082)
Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
Co-authored-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Co-authored-by: Erin <14718778+hchings@users.noreply.github.com>
Co-authored-by: Frank <3429989+FrankD412@users.noreply.github.com>
Co-authored-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: Yechan Kim <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2025-06-17 14:32:02 +03:00 |
|
Omer Ullman Argov
|
8731f5f14f
|
chore: Mass integration of release/0.20 (#4898)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Hui Gao <huig@nvidia.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
Signed-off-by: Anurag Mukkara <134339030+amukkara@users.noreply.github.com>
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>
Signed-off-by: moraxu <mguzek@nvidia.com>
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Co-authored-by: HuiGao-NV <huig@nvidia.com>
Co-authored-by: brb-nv <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Co-authored-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
Co-authored-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
Co-authored-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
Co-authored-by: Anurag Mukkara <134339030+amukkara@users.noreply.github.com>
Co-authored-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: Faraz <58580514+farazkh80@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
Co-authored-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Co-authored-by: Yechan Kim <161688079+yechank-nvidia@users.noreply.github.com>
|
2025-06-08 23:26:26 +08:00 |
|
Emma Qiao
|
c945e92fdb
|
[Infra]Remove some old keyword (#4552)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-05-31 13:50:45 +08:00 |
|
Aurelien Chartier
|
6cf1e4d0a9
|
chore: add -f to pkill calls (#4711)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
|
2025-05-29 02:54:31 +08:00 |
|
amirkl94
|
fbec0c3552
|
Release 0.20 to main (#4577)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>
Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>
Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>
Signed-off-by: Simeng Liu <simengl@nvidia.com>
Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>
Signed-off-by: moraxu <mguzek@nvidia.com>
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Signed-off-by: Jinyang Yuan <154768711+jinyangyuan-nvidia@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
Co-authored-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
Co-authored-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
Co-authored-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Co-authored-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Co-authored-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Co-authored-by: Venky <23023424+venkywonka@users.noreply.github.com>
Co-authored-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: stnie <82932102+stnie@users.noreply.github.com>
Co-authored-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>
Co-authored-by: Faraz <58580514+farazkh80@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
Co-authored-by: Jinyang Yuan <154768711+jinyangyuan-nvidia@users.noreply.github.com>
|
2025-05-28 16:25:33 +08:00 |
|
Aurelien Chartier
|
f491244c84
|
feat: add dataset support for benchmark_core_model with LLMAPI (#4457)
* feat: add dataset support for benchmark_core_model with LLMAPI
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
|
2025-05-21 19:18:43 -07:00 |
|
Aurelien Chartier
|
1681e9fd1e
|
chore: remove extra PYTHONPATH (#4453)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
|
2025-05-21 17:38:01 -07:00 |
|
Iman Tabrizian
|
7de90a66bc
|
Remove vila test (#4376)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-05-19 09:02:39 +08:00 |
|
Iman Tabrizian
|
4c7191af67
|
Move Triton backend to TRT-LLM main (#3549)
* Move TRT-LLM backend repo to TRT-LLM repo
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
* Address review comments
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
* debug ci
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
* Update triton backend
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
* Fixes after update
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
---------
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-05-16 07:15:23 +08:00 |
|