17 Commits

Author SHA1 Message Date
QI JUN
d167cbd5bb
refactor: remove ParallelConfig in tensorrt_llm._torch.distributed module (#3370)
* remove tensorrt_llm._torch.distributed.ParallelConfig

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* clean

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix embedding test

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix comments

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* polish

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* rebase

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

---------

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Co-authored-by: hlu1 <14827759+hlu1@users.noreply.github.com>
2025-04-11 15:34:20 -07:00
amitz-nv
a6a2ae6cc1
chore: Rename nvsmall to nemotron nas (#3447)
* Rename nvsmall to nemotron NAS

* Revert nvsmall to nemotron_nas rename in paths in tests that access llm_models_root/nvsmall/tests

* Add NemotronNAS to pytorch supported models table

Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
2025-04-10 23:16:52 +08:00
Yechan Kim
943218b54a
feat: Add Qwen2.5-VL and refactor Qwen2-VL (#3156)
* feat: Add Qwen2.5-VL and refactor Qwen2-VL

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

* fix yapf and codespell

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

* add test

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

* fix test_e2e

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

* generalize get_rope_index

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

* fix qwen2.5-vl on README

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

* fix test

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

* fix image test

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

---------

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: Haohang Huang <31998628+symphonylyh@users.noreply.github.com>
2025-04-10 04:09:03 +08:00
Mike Iovine
5bdf997963
Add Llama 4 (#3302)
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-04-09 03:35:21 +08:00
Yan Chunwei
b21cfcfed1
chore: refactor the LlmArgs with Pydantic and migrate remaining pybinding configs to python (#3025)
* make LlmArgs Pydantic

Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>

* amending doc

fix api_stability

fix tests

Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>

* restore yaml groups

refine StackTrace

singleton

clean tests

Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>

* fix trtllm-bench

fix pytorch

Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>

* fix serve distagg

Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>

Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>

* fix

Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>

---------

Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-04-05 13:31:48 +08:00
Yechan Kim
c7533d271f
doc: add supported-models on PyTorch example (#3179)
* doc: add supported-models on PyTorch example

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

* remove vision support from Llama3.2

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

---------

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: juney-nvidia <143764042+juney-nvidia@users.noreply.github.com>
2025-04-03 21:09:25 +08:00
Mike Iovine
5416966ddb
Add initial EAGLE-3 implementation (#3035)
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-03-29 22:31:24 +08:00
Erin
c75d7cd684
move BuildConfig functional args to llmargs (#3036)
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2025-03-29 02:20:18 +08:00
QI JUN
832ea997f6
chore: Simplify quickstart of PyTorch flow (#3000)
* simplify quickstart of PyTorch flow

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* clean

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

---------

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-03-24 14:32:17 +08:00
Kaiyu Xie
2631f21089
Update (#2978)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-03-23 16:39:35 +08:00
Kaiyu Xie
3aa6b11d13
Update TensorRT-LLM (#2936)
* Update TensorRT-LLM

---------

Co-authored-by: changcui <cuichang147@gmail.com>
2025-03-18 21:25:19 +08:00
Kaiyu Xie
9b931c0f63
Update TensorRT-LLM (#2873)
2025-03-11 21:13:42 +08:00
Kaiyu Xie
77d7fe1eb2
Update TensorRT-LLM (#2849)
* Update TensorRT-LLM

---------

Co-authored-by: aotman <chenhangatm@gmail.com>
2025-03-04 18:44:00 +08:00
Kaiyu Xie
ab5b19e027
Update TensorRT-LLM (#2820)
2025-02-25 21:21:49 +08:00
Kaiyu Xie
2ea17cdad2
Update TensorRT-LLM (#2792)
* Update TensorRT-LLM

---------

Co-authored-by: jlee <jungmoolee@clika.io>
2025-02-18 21:27:39 +08:00
Kaiyu Xie
e88da961c5
Update TensorRT-LLM (#2783)
2025-02-13 18:40:22 +08:00
Dan Blanaru
16d2467ea8
Update TensorRT-LLM (#2755)
* Update TensorRT-LLM

---------

Co-authored-by: Denis Kayshev <topenkoff@gmail.com>
Co-authored-by: akhoroshev <arthoroshev@gmail.com>
Co-authored-by: Patrick Reiter Horn <patrick.horn@gmail.com>

Update
2025-02-11 03:01:00 +00:00