Linda
|
eb4ed18a63
|
[None][fix] max_num_sequences argument in nanobind (#6862)
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
|
2025-08-13 19:16:17 -04:00 |
|
Robin Kobus
|
45c7518032
|
[None][refactor] Simplify decoder state initialization (#6559)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
|
2025-08-12 21:44:41 +02:00 |
|
QI JUN
|
8845e0f065
|
[None][fix] fix ci (#6814)
|
2025-08-12 02:21:50 -07:00 |
|
Liao Lanyu
|
f7c13a4aa7
|
[TRTLLM-6906][chore] Using pybind to bind functions in thop/attentionOp (#6745)
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
|
2025-08-12 16:45:16 +08:00 |
|
Martin Marciniszyn Mehringer
|
9a8195ef88
|
fix: Ensure that Python stub generation works against libnvidia-ml stubs (#6188)
Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
|
2025-08-11 09:18:17 +02:00 |
|
Ziyi Xiong
|
de472828b9
|
[TRTLLM-6637][feat] Resolve KV cache divergence issue (#6628)
Signed-off-by: ziyixiong-nv <219238287+ziyixiong-nv@users.noreply.github.com>
|
2025-08-09 23:15:04 +08:00 |
|
pcastonguay
|
453a06e6ab
|
[TRTLLM-6881][feat] Include attention dp rank info with KV cache events (#6563)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
|
2025-08-07 14:17:07 +02:00 |
|
hlu1
|
8207d5fd39
|
[None] [feat] Add model gpt-oss (#6645)
Signed-off-by: Hao Lu <14827759+hlu1@users.noreply.github.com>
|
2025-08-07 03:04:18 -04:00 |
|
amitz-nv
|
85af62184b
|
[TRTLLM-6683][feat] Support LoRA reload CPU cache evicted adapter (#6510)
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
|
2025-08-07 09:05:36 +03:00 |
|
ixlmar
|
1ebceb790d
|
[TRTLLM-5508][feat] check input tokens + improve error handling (#5170)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2025-08-05 18:27:43 +01:00 |
|
Olya Kozlova
|
13cc1c4878
|
[TRTLLM-5271][feat] best_of/n for pytorch workflow (#5997)
Signed-off-by: Olya Kozlova <okozlova@nvidia.com>
|
2025-08-04 14:08:06 +02:00 |
|
Yuan Tong
|
a2f271c8e0
|
[TRTLLM-4406][feat] LLM sleep & wakeup Part 1: virtual device memory (#5034)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
|
2025-08-04 13:51:01 +08:00 |
|
Robin Kobus
|
918fedf952
|
[None][refactor] Simplify finish reasons handling in DecoderState (#6524)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
|
2025-08-02 07:17:43 +02:00 |
|
Robin Kobus
|
d3c14682f0
|
refactor: Remove unused buffers and bindings from sampler (#6484)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
|
2025-08-01 00:43:03 -04:00 |
|
Michal Guzek
|
b8719fe96d
|
[nvbug/5374773] chore: Update nanobind with fail_fast_on_attention_window_too_large changes (#6491)
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
|
2025-07-31 20:25:29 +01:00 |
|
Linda
|
9a99e6d6d7
|
fix: integration tests with nanobind (#6326)
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
|
2025-07-25 09:23:20 +08:00 |
|
Linda
|
60073731ca
|
fix: bindings unit tests for nanobind (#6221)
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
|
2025-07-22 14:51:43 +01:00 |
|
Linda
|
3efad2e58c
|
feat: nanobind bindings (#6185)
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
|
2025-07-21 08:56:57 +01:00 |
|
Iman Tabrizian
|
b75e53ab69
|
Revert "feat: nanobind bindings (#5961)" (#6160)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-07-18 10:12:54 +08:00 |
|
Linda
|
5bff317abf
|
feat: nanobind bindings (#5961)
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
|
2025-07-17 22:42:52 +08:00 |
|
Linda
|
4d071eb2d1
|
feat: binding type build argument (pybind, nanobind) (#5802)
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
|
2025-07-11 00:48:50 +09:00 |
|