Commit Graph

6 Commits

Author SHA1 Message Date
amitz-nv
85af62184b
[TRTLLM-6683][feat] Support LoRA reload CPU cache evicted adapter (#6510)
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
2025-08-07 09:05:36 +03:00
amitz-nv
98428f330e
[TRTLLM-5826][feat] Support pytorch LoRA adapter eviction (#5616)
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
2025-07-20 08:00:14 +03:00
Yuxian Qiu
c8e062bfd3
fix: [nvbugs/5287097] Align PP layer distribution between pytorch and TRT flow. (#4399)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
Co-authored-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-05-19 14:25:36 -07:00
Robin Kobus
d31fefde2c
[TRTLLM-5171] chore: Remove GptSession/V1 from TRT workflow (#4092)
* chore: Remove GptSession/V1 from TRT workflow

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

* chore: Remove stateful decoders

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

* chore: Remove GptSession buffers

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

* chore: Remove GptSession utils

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

* chore: Remove GptSession kernels

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

* chore: Remove V1 GPT models from tests

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

* chore: Remove gptSessionBenchmark from scripts and docs

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

* chore: Remove gptSession IO classes

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

* chore: Remove GptSession from test lists

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

* chore: Remove GptSession from docs

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

* chore: Remove useless encoder test

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

* chore: Remove mActualBatchSize from DecoderState

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

* chore: Remove static batching from ExecutorTest

- Updated `validateContextLogits` and `validateGenerationLogits` functions to remove the `batchingType` parameter.
- Adjusted related test functions to reflect the changes in parameter lists.
- Cleaned up the instantiation of test cases to eliminate unnecessary batchingType references.

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>

---------

Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-05-14 23:10:04 +02:00
shaharmor98
ee4aab72ec
feat: Support PeftCacheManager in Torch (#3186)
* Add PeftCacheManager implementation

Signed-off-by: Shahar Mor <smor@nvidia.com>
2025-04-04 12:38:08 +08:00
Kaiyu Xie
9b931c0f63
Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00