Zheng Duan
|
fea5bfbda7
|
[None][feat] add detailed KV cache transfer time breakdown (#8521)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-10-29 10:11:09 +08:00 |
|
Chuang Zhu
|
2420918e5b
|
[TRTLLM-7078][chore] optimal kvcache transfer for VWSA (#7952)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-10-24 08:58:16 -04:00 |
|
Patrice Castonguay
|
fefa7d8fa3
|
[None][feat] Support for cancelling requests with disaggregation (#8114)
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
|
2025-10-02 11:04:26 -07:00 |
|
Iman Tabrizian
|
33282351a2
|
[TRTLLM-6106][feat] Add support for KVCache transfer from KVCache reuse path (#6348)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-09-27 19:29:30 -04:00 |
|
Iman Tabrizian
|
6ce0624208
|
[TRTLLM-8044][refactor] Rename data -> cache for cacheTransceiver (#7659)
|
2025-09-16 08:43:56 -04:00 |
|
Shunkangz
|
bddf183e15
|
[None][feat] Add Request specific exception (#6931)
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
|
2025-09-04 18:43:42 -04:00 |
|
Zheng Duan
|
ebdc43e69d
|
[None][feat] move kv cache measure into transfer session (#6633)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-08-08 17:49:22 +08:00 |
|
Zheng Duan
|
c9ed1ab436
|
[TRTLLM-6549] chore: record delay introduced by disaggregated serving in kv cache measure (#6135)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-07-30 10:39:40 +08:00 |
|
Zheng Duan
|
38db4bc7fb
|
feat: use session abstraction in data transceiver and cache formatter (#5611)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-07-16 13:52:44 +08:00 |
|
Zheng Duan
|
ee44fa00f8
|
chore: rename IOFormatter to BaseCacheFormatter (#5068)
Signed-off-by: Zheng Duan <200704041+zhengd-nv@users.noreply.github.com>
|
2025-06-12 10:50:14 +08:00 |
|
Chuang Zhu
|
9a874760c1
|
Kv cache transfer support duplicate heads (#4929)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-06-09 14:11:19 +08:00 |
|
Chuang Zhu
|
44cfd757b2
|
Agent interface impl for NIXL (#4125)
* agentConnection
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
recv
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
agentState
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
NIXL interfaces
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
update cmakelists
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
nixl improve
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
remove cppzmq
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
fix
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
transferAgent remove register
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
work for cache Test
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
reduce sleep time
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
fix test
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
intergarte
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
nixl env
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
fix rebase error
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
cpp test
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
stash for send metaData
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
loadRemoteMD after fetchRemoteMD
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
workaround for mixed gen and context
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
test_env
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
avoid port conflict in test
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
* format
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
* use std::string
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
* typo
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
* fix transferAgentTest
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
---------
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-05-22 09:09:41 +08:00 |
|
BatshevaBlack
|
3e37531c6a
|
feat: Add BW measurement (#3070)
|
2025-03-28 10:53:00 +08:00 |
|
Shunkangz
|
8ee840159b
|
Add updateKVCacheTransfer (#2984)
Add kv cache transfer measurement
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
|
2025-03-25 21:45:35 +08:00 |
|
Kaiyu Xie
|
9b931c0f63
|
Update TensorRT-LLM (#2873)
|
2025-03-11 21:13:42 +08:00 |
|