12 Commits

Author SHA1 Message Date
nv-guomingz
034f6f2d91
docs:update 0.19 docs (#3986)
Signed-off-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>
2025-04-30 19:25:26 +08:00
sugunav14
84fc07b011
feat: [TRTLLM-3510] DeepseekV3 support in AutoDeploy (#3281)
Signed-off-by: Suguna Velury <178320438+sugunav14@users.noreply.github.com>
2025-04-08 21:47:57 +08:00
yuxianq
7b03350527
Add thread leak check and fix thread/memory leak issues. (#3270)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-04-08 19:03:18 +08:00
tburt-nv
7a659885e3
chore: remove usernames from comments (#3291)
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-05 13:44:28 +08:00
Fridah-nv
a5f32f46fd
fix: [AutoDeploy] Update README.md (#3072)
* update support matrix and add toggle list

Signed-off-by: fridah <201670829+Fridah-nv@users.noreply.github.com>

* Update README.md

Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>

* Update README.md

Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>

---------

Signed-off-by: fridah <201670829+Fridah-nv@users.noreply.github.com>
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
2025-04-01 16:16:36 -07:00
Suyog Gupta
047f2b234d
perf: [AutoDeploy] Enable AutoDeploy as a backend in trtllm-bench (#3041)
* Enable AutoDeploy as a backend in trtllm-bench

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

* update how caches are resized

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

* fix: files permission from 100755 to 100644

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

* some comments

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

* lint

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

* lint

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

* lint

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

* lint

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

* Fix function name

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

* refactor

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

* Remove spurious change

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

* Add cursor generated doc strings

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

* re-enable ad test

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

* some perf cleanup

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

* debug ci

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

* ensure that overlap scheduler is enabled

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

* Reorder the tests

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>

---------

Signed-off-by: Suyog Gupta <suyogg@nvidia.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-03-26 14:33:14 -07:00
bhsueh_NV
7413cb555a
relax the limitation of setuptools (#2992)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-03-24 13:36:10 +08:00
Kaiyu Xie
2631f21089
Update (#2978)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-03-23 16:39:35 +08:00
Kaiyu Xie
3aa6b11d13
Update TensorRT-LLM (#2936)
* Update TensorRT-LLM

---------

Co-authored-by: changcui <cuichang147@gmail.com>
2025-03-18 21:25:19 +08:00
Kaiyu Xie
9b931c0f63
Update TensorRT-LLM (#2873)
2025-03-11 21:13:42 +08:00
Kaiyu Xie
77d7fe1eb2
Update TensorRT-LLM (#2849)
* Update TensorRT-LLM

---------

Co-authored-by: aotman <chenhangatm@gmail.com>
2025-03-04 18:44:00 +08:00
Kaiyu Xie
ab5b19e027
Update TensorRT-LLM (#2820)
2025-02-25 21:21:49 +08:00