Iman Tabrizian
|
6ce0624208
|
[TRTLLM-8044][refactor] Rename data -> cache for cacheTransceiver (#7659)
|
2025-09-16 08:43:56 -04:00 |
|
Zheng Duan
|
c9ed1ab436
|
[TRTLLM-6549] chore: record delay introduced by disaggregated serving in kv cache measure (#6135)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-07-30 10:39:40 +08:00 |
|
Zheng Duan
|
38db4bc7fb
|
feat: use session abstraction in data transceiver and cache formatter (#5611)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-07-16 13:52:44 +08:00 |
|
xiweny
|
eaf8bec88b
|
fix: Disaggregate serving with attention DP (#4993)
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-07-08 16:15:03 +08:00 |
|
Zheng Duan
|
ee44fa00f8
|
chore: rename IOFormatter to BaseCacheFormatter (#5068)
Signed-off-by: Zheng Duan <200704041+zhengd-nv@users.noreply.github.com>
|
2025-06-12 10:50:14 +08:00 |
|
Chuang Zhu
|
9a874760c1
|
Kv cache transfer support duplicate heads (#4929)
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-06-09 14:11:19 +08:00 |
|
Zheng Duan
|
ded694b1aa
|
feat: cache reuse support (selective cache transfer) in mla cache formatter (#4749)
Signed-off-by: Zheng Duan <200704041+zhengd-nv@users.noreply.github.com>
|
2025-06-04 09:56:31 +08:00 |
|
Chuang Zhu
|
e2318756ed
|
cacheTransceiver buffer manager (#3798)
* cacheTransceiver buffer manager
* fix args
* cpp kvCacheManager
* format
---------
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
|
2025-04-27 11:48:15 +08:00 |
|
Kaiyu Xie
|
9b931c0f63
|
Update TensorRT-LLM (#2873)
|
2025-03-11 21:13:42 +08:00 |
|