TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Simeng Liu 84d107b2f0 [https://nvbugs/5717993][fix] Add execution_stream across PyExecutor, KVCacheManager, PeftCacheManager to ensure proper CUDA stream synchronization between KV cache transfer operations and model forward kernels. (#10060) Signed-off-by: SimengLiu-nv <simengl@nvidia.com>		2025-12-31 09:22:54 -08:00
lm_eval_tasks/gpqa/cot_zeroshot_aa
__init__.py
cnn_dailymail.py
interface.py
json_mode_eval.py
lm_eval.py	[https://nvbugs/5717993][fix] Add execution_stream across PyExecutor, KVCacheManager, PeftCacheManager to ensure proper CUDA stream synchronization between KV cache transfer operations and model forward kernels. (#10060)	2025-12-31 09:22:54 -08:00
longbench_v2.py
mmlu.py