Yihan Wang
|
9df4dad3b6
|
[None][fix] Introduce inline namespace to avoid symbol collision (#9541)
Signed-off-by: Yihan Wang <yihwang@nvidia.com>
|
2025-12-12 23:32:15 +08:00 |
|
yunruis
|
c4f823319b
|
[None][doc] add adp balance blog (#7213)
Signed-off-by: yunruis <205571022+yunruis@users.noreply.github.com>
Co-authored-by: Kefeng-Duan <176893526+Kefeng-Duan@users.noreply.github.com>
|
2025-08-28 11:19:34 -04:00 |
|
Dimitrios Bariamis
|
f49dafe0da
|
[https://nvbugs/5394409][feat] Support Mistral Small 3.1 multimodal in Triton Backend (#6714)
Signed-off-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Signed-off-by: Dimitrios Bariamis <dbari@users.noreply.github.com>
Co-authored-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
|
2025-08-21 18:08:38 +02:00 |
|
Tailing Yuan
|
85b4a6808d
|
Refactor: move DeepEP from Docker images to wheel building (#5534)
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
|
2025-07-07 22:57:03 +09:00 |
|
Tao Li @ NVIDIA
|
3b7120d60e
|
DeepSeek R1 throughput optimization tech blog for Blackwell GPUs (#4791)
Signed-off-by: Tao Li
|
2025-05-30 18:54:19 +08:00 |
|
Perkz Zheng
|
4d711be8f4
|
Feat: add sliding-window-attention generation-phase kernels on Blackwell (#4564)
* move cubins to LFS
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
* update cubins
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
* add sliding-window-attention generation-phase kernels on Blackwell
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
* address comments
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
---------
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
|
2025-05-26 09:06:33 +08:00 |
|
Iman Tabrizian
|
4c7191af67
|
Move Triton backend to TRT-LLM main (#3549)
* Move TRT-LLM backend repo to TRT-LLM repo
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
* Address review comments
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
* debug ci
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
* Update triton backend
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
* Fixes after update
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
---------
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-05-16 07:15:23 +08:00 |
|
Yuan Tong
|
a139eae425
|
chore: Stabilize ABI boundary for internal kernel library (#3117)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
|
2025-04-11 15:07:50 +08:00 |
|
Kaiyu Xie
|
89ba1b1a67
|
Update TensorRT-LLM (#1554)
|
2024-05-07 23:34:28 +08:00 |
|
Kaiyu Xie
|
035b99e0d0
|
Update TensorRT-LLM (#1427)
* Update TensorRT-LLM
---------
Co-authored-by: meghagarwal <16129366+megha95@users.noreply.github.com>
|
2024-04-09 17:03:34 +08:00 |
|
Kaiyu Xie
|
9b563baf10
|
Add static libraries (#2)
|
2023-09-21 11:52:23 +08:00 |
|