18 Commits

Author SHA1 Message Date
Jhao-Ting Chen
92d90fa29a
[None][feat] Expose enable_trt_overlap in Triton_backend brings 1.05x OTPS (#10018)
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
2025-12-23 11:41:31 -06:00
Simeng Liu
9f8d93f89a
[https://nvbugs/5606136][ci] Remove tests for deprecating triton multimodal models. (#8926)
Signed-off-by: Simeng Liu <simengl@nvidia.com>
2025-11-06 17:58:42 -08:00
Sai Kiran Polisetty
08134cbca0
[https://nvbugs/5556475] [fix] Fix the tensorrt_llm_bls model to correctly return the outputs for num_input_tokens and num_output_tokens (#8150)
Signed-off-by: Sai Kiran Polisetty <spolisetty@nvidia.com>
2025-10-27 21:06:28 -07:00
Guoming Zhang
9f0f52249e
[None][doc] Rename TensorRT-LLM to TensorRT LLM for homepage and the … (#7850)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-25 21:02:35 +08:00
Dimitrios Bariamis
f49dafe0da
[https://nvbugs/5394409][feat] Support Mistral Small 3.1 multimodal in Triton Backend (#6714)
Signed-off-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Signed-off-by: Dimitrios Bariamis <dbari@users.noreply.github.com>
Co-authored-by: Dimitrios Bariamis <12195802+dbari@users.noreply.github.com>
Co-authored-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
2025-08-21 18:08:38 +02:00
nv-guomingz
49044733e1
chore: delete useless gitkeep files. (#6400)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-07-28 11:38:30 -04:00
Aurelien Chartier
6a47cac981
feat: Add support for Triton request cancellation (#5898)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-07-15 20:52:43 -04:00
Aurelien Chartier
3ec3ff1d82
chore: remove support for llmapi + TRT backend in Triton (#5856)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-07-09 21:30:34 -07:00
Vivian Chen
34212e2e36
[TRTLLM-6104] feat: add request_perf_metrics to triton LLMAPI backend (#5554)
Signed-off-by: Vivian Chen <140748220+xuanzic@users.noreply.github.com>
2025-06-30 21:34:42 -07:00
nv-guomingz
578430e64c
[TRTLLM-5530][BREAKING CHANGE]: enhance the llm args pytorch config part 1(cuda_graph_config) (#5014)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-06-30 11:05:40 +08:00
Iman Tabrizian
26b953e29a
[nvbugs/5309940] Add support for input output token counts (#5445)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2025-06-28 04:39:39 +08:00
Yan Chunwei
9bd42ecf9b
[TRTLLM-5208][BREAKING CHANGE] chore: make pytorch LLM the default (#5312)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-06-20 03:01:10 +08:00
Aurelien Chartier
e1e5f725fc
fix: only set _mpi_session if world_size is > 1 (#5253)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-06-17 19:21:41 -07:00
Aurelien Chartier
82e280f6f3
feat: add multi-node support for Triton with pytorch backend (#5172)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-06-13 13:27:58 -07:00
Bo Li
f414a079ad
chore: Change the type annotations of input_ids and position_ids to int32. (#4632)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2025-06-07 16:10:47 +08:00
Yan Chunwei
5506f60037
chore [BREAKING CHANGE]: Flatten PyTorchConfig knobs into TorchLlmArgs (#4603)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-05-28 18:43:04 +08:00
Aurelien Chartier
1681e9fd1e
chore: remove extra PYTHONPATH (#4453)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-05-21 17:38:01 -07:00
Iman Tabrizian
4c7191af67
Move Triton backend to TRT-LLM main (#3549)
* Move TRT-LLM backend repo to TRT-LLM repo

Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>

* Address review comments

Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>

* debug ci

Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>

* Update triton backend

Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>

* Fixes after update

Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>

---------

Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2025-05-16 07:15:23 +08:00