mirror of
https://github.com/NVIDIA/TensorRT-LLM.git
synced 2026-02-18 16:55:08 +08:00
* kv cache aware router
* add tests
* router config
* eviction test
* add test
* eviction detect in worker test
* move worker tests to single gpu
* reduce memory fraction
* fix partial block

Signed-off-by: Zheng Duan <200704041+zhengd-nv@users.noreply.github.com>

| File | | |
|---|---|---|
| disagg_config_cache_aware_balance.yaml | ||
| disagg_config_cache_reuse.yaml | ||
| disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp_attention_dp_overlap.yaml | ||
| disagg_config_ctxtp1_gentp1_deepseek_v3_lite_one_mtp.yaml | ||
| disagg_config_ctxtp1_gentp1_deepseek_v3_lite.yaml | ||
| disagg_config_ctxtp2_gentp1.yaml | ||
| disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one_mtp.yaml | ||
| disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_one.yaml | ||
| disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap_cuda_graph.yaml | ||
| disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp_overlap.yaml | ||
| disagg_config_ctxtp2_gentp2_deepseek_v3_lite_attention_dp.yaml | ||
| disagg_config_ctxtp2_gentp2_deepseek_v3_lite_overlap_cuda_graph.yaml | ||
| disagg_config_ctxtp2_gentp2_deepseek_v3_lite.yaml | ||
| disagg_config_cuda_graph_padding.yaml | ||
| disagg_config_gen_only.yaml | ||
| disagg_config_load_balance.yaml | ||
| disagg_config_mixed.yaml | ||
| disagg_config_overlap.yaml | ||