Roey Azran
|
8408c40d8b
|
[https://nvbugs/5702786][fix] Fix race conditions in KV cache communication during unexpected termination (#10076)
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
|
2025-12-23 14:09:51 +02:00 |
|
Chuang Zhu
|
4cc4cbe926
|
[https://nvbugs/5716787][fix] terminate nixl running when exiting (#9785)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
Co-authored-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
|
2025-12-12 11:15:02 -05:00 |
|
Iman Tabrizian
|
356a52edf5
|
[None][feat] Add support for KVCache reuse for DSv32 (#9383)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-12-02 11:14:30 +08:00 |
|
Iman Tabrizian
|
cdde15b275
|
[TRTLLM-8540][feat] Add support for disagg in DSv3.2 (#8735)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-11-12 08:21:11 -08:00 |
|
Zheng Duan
|
fea5bfbda7
|
[None][feat] add detailed KV cache transfer time breakdown (#8521)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-10-29 10:11:09 +08:00 |
|
Chuang Zhu
|
2420918e5b
|
[TRTLLM-7078][chore] optimal kvcache transfer for VWSA (#7952)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-10-24 08:58:16 -04:00 |
|
Shi Xiaowei
|
a0024f4d34
|
[None][doc] Facilitates the integration of the transfer agent (#7867)
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
|
2025-10-21 20:06:24 +08:00 |
|
Chuang Zhu
|
40d129a415
|
[None][fix] Fix cache buffer size for window (#8320)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-10-16 09:01:11 +08:00 |
|
Chuang Zhu
|
8733e830fc
|
[None][fix] Add lock for request_to_session in sendReadySingal (#8310)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-10-14 04:32:37 -07:00 |
|
Chuang Zhu
|
85f157f389
|
[None][fix] Add Lock to protect mReqeustToSession (#8085)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
Co-authored-by: Xianjie Qiao <5410381+qiaoxj07@users.noreply.github.com>
|
2025-10-10 21:51:50 +08:00 |
|
Patrice Castonguay
|
fefa7d8fa3
|
[None][feat] Support for cancelling requests with disaggregation (#8114)
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
|
2025-10-02 11:04:26 -07:00 |
|
Iman Tabrizian
|
33282351a2
|
[TRTLLM-6106][feat] Add support for KVCache transfer from KVCache reuse path (#6348)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-09-27 19:29:30 -04:00 |
|
Chuang Zhu
|
f98fa0cf8b
|
[None][feat] Optimize kv cache transfer TEP (#7613)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-09-25 20:20:04 -07:00 |
|
Zheng Duan
|
e3c1a9409f
|
[TRTLLM-6549][fix] add kv cache time output back (#7798)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-09-23 14:12:42 -04:00 |
|
Iman Tabrizian
|
6ce0624208
|
[TRTLLM-8044][refactor] Rename data -> cache for cacheTransceiver (#7659)
|
2025-09-16 08:43:56 -04:00 |
|
Raayan Dhar
|
bae9560e62
|
[https://nvbugs/5448767][fix] sync termination of requests across PP ranks (#7455)
Signed-off-by: raayandhar <rdhar@nvidia.com>
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Co-authored-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2025-09-07 08:45:49 -04:00 |
|
Shunkangz
|
bddf183e15
|
[None][feat] Add Request specific exception (#6931)
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
|
2025-09-04 18:43:42 -04:00 |
|
Zheng Duan
|
ebdc43e69d
|
[None][feat] move kv cache measure into transfer session (#6633)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-08-08 17:49:22 +08:00 |
|
Zheng Duan
|
38db4bc7fb
|
feat: use session abstraction in data transceiver and cache formatter (#5611)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-07-16 13:52:44 +08:00 |
|
Chuang Zhu
|
f117d6abe9
|
Fabric Memory for KV Cache Transfer (#4717)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-05-30 15:50:21 +08:00 |
|
Chuang Zhu
|
558eaecf16
|
fix sequence data race (#4565)
stash for debug broken promise
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-05-22 23:13:48 +08:00 |
|
Chuang Zhu
|
75e13f4f88
|
chore: disable some env for disagg defaultly (#3415)
* disable some env for disagg defaultly
* doc
* remove
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-04-14 10:08:10 +08:00 |
|
Kaiyu Xie
|
9b931c0f63
|
Update TensorRT-LLM (#2873)
|
2025-03-11 21:13:42 +08:00 |
|