Commit Graph

5 Commits

Author SHA1 Message Date
shaharmor98
7d94c9561f
feat: support multi lora adapters and TP (#3885)
* support multi lora, tp

Signed-off-by: Shahar Mor <17088876+shaharmor98@users.noreply.github.com>
2025-05-08 23:45:45 +08:00
shaharmor98
49262a62a5
add passing E2E LoRA flow (#3788)
add passing E2E LoRA flow (#3788)

Signed-off-by: Shahar Mor <smor@nvidia.com>
2025-04-23 18:38:06 +03:00
shaharmor98
5fff8f0935
Add running E2E LoRA flow (#3648)
* add passing E2E LoRA flow

Signed-off-by: Shahar Mor <smor@nvidia.com>

* add experimental feature

Signed-off-by: Shahar Mor <smor@nvidia.com>

* fix llma_args definition

Signed-off-by: Shahar Mor <smor@nvidia.com>

* decreased manually size of max loras to address OOM

Signed-off-by: Shahar Mor <smor@nvidia.com>

---------

Signed-off-by: Shahar Mor <smor@nvidia.com>
2025-04-23 11:19:41 +08:00
danielafrimi
0f084d9566
added loraOp into lora layer + test for mlp and comparison to lora plugin (#3455)
Loraop integration into torch modules

Signed-off-by: Ubuntu <dafrimi@nvidia.com>
2025-04-17 12:48:27 +08:00
danielafrimi
47f5cf6c0d
lora_tests (#3201)
LoRA tests and layers

Signed-off-by: Ubuntu <dafrimi@nvidia.com>
Co-authored-by: Ubuntu <dafrimi@nvidia.com>
2025-04-09 18:06:52 +03:00