TensorRT-LLM/tests/integration/defs/llmapi
Latest commit: 0d18b2d7a4 by Yuewei Na, 2026-02-05 05:22:56 -05:00
[None][feat] Add priority-based KV cache offload filtering support (#10751)
Signed-off-by: Yuewei Na <yna@nvidia.com>
Signed-off-by: Yuewei Na <nv-yna@users.noreply.github.com>
Co-authored-by: Yuewei Na <nv-yna@users.noreply.github.com>
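
The top commit builds on the LLM API's per-request KV cache retention priorities: blocks tagged with a lower priority are the first candidates for eviction, and #10751 uses those priorities to filter which blocks get offloaded to host memory. Below is a minimal sketch of the priority mechanism, assuming the KvCacheRetentionConfig exported by tensorrt_llm.llmapi and a generate() call that accepts kv_cache_retention_config; the offload-filter knob added by the PR itself is not shown, since its exact name is specific to that change.

```python
# Hedged sketch: per-request KV cache retention priorities via the LLM API.
# Assumes KvCacheRetentionConfig is exported from tensorrt_llm.llmapi and
# that LLM.generate accepts a kv_cache_retention_config argument; the model
# path below is a placeholder.
from datetime import timedelta

from tensorrt_llm import LLM, SamplingParams
from tensorrt_llm.llmapi import KvCacheRetentionConfig

llm = LLM(model="/path/to/model")  # placeholder checkpoint

retention = KvCacheRetentionConfig(
    token_range_retention_configs=[
        # Keep blocks covering the first 64 prompt tokens at high priority
        # (the scale runs 0-100) for 30 seconds, after which they decay
        # back to the default priority.
        KvCacheRetentionConfig.TokenRangeRetentionConfig(
            token_start=0,
            token_end=64,
            priority=80,
            duration_ms=timedelta(seconds=30),
        )
    ],
    # Blocks produced during decode are stored at low priority, making them
    # early candidates for eviction (and, per #10751, for offload filtering).
    decode_retention_priority=20,
)

outputs = llm.generate(
    ["The capital of France is"],
    sampling_params=SamplingParams(max_tokens=32),
    kv_cache_retention_config=retention,
)
print(outputs[0].outputs[0].text)
```
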
__init__.py                chore: refactor llmapi e2e tests (#3803)                                                        2025-05-05 07:37:24 +08:00
_run_llmapi_llm.py         [None][chore] Mass integration of release/1.0 (#6864)                                           2025-08-22 09:25:15 +08:00
test_llm_api_connector.py  [None][feat] Add priority-based KV cache offload filtering support (#10751)                     2026-02-05 05:22:56 -05:00
test_llm_api_qa.py         [None][test] Update llm_models_root to improve path handling on BareMetal environment (#7876)   2025-09-24 17:35:57 +08:00
test_llm_e2e.py            [TRTLLM-8684][chore] Migrate BuildConfig to Pydantic, add a Python wrapper for KVCacheType enum (#8330)  2025-10-28 09:17:26 -07:00
test_llm_examples.py       [https://nvbugs/5503529][fix] Change test_llmapi_example_multilora to get adapters path from cmd line to avoid downloading from HF (#7740)  2025-09-16 16:35:13 +08:00
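
For orientation, _run_llmapi_llm.py in the listing above is the kind of helper an e2e test shells out to: a standalone script that brings up an LLM through the API and runs one generation. A rough sketch under that assumption, using the standard LLM API quickstart pattern (the argument names here are illustrative, not the helper's actual interface):

```python
# Illustrative stand-in for a helper like _run_llmapi_llm.py; the argument
# names and defaults are assumptions, not the real script's interface.
import argparse

from tensorrt_llm import LLM, SamplingParams


def main() -> None:
    parser = argparse.ArgumentParser()
    parser.add_argument("--model_dir", required=True,
                        help="Path to a supported model checkpoint")
    parser.add_argument("--prompt", default="A B C D")
    args = parser.parse_args()

    # Load the model and run a single generation end to end.
    llm = LLM(model=args.model_dir)
    outputs = llm.generate([args.prompt], SamplingParams(max_tokens=16))
    print(outputs[0].outputs[0].text)


if __name__ == "__main__":
    main()
```
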