TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

John Calderon 46ee7acb33 [TRTLLM-6780][fix] Add multimodal data to dummy requests during memory profiling (#7539) Signed-off-by: John Calderon <johncalesp@gmail.com> Signed-off-by: John Calderon <jcalderon@nvidia.com> Signed-off-by: john calderon <jcalderon@nvidia.com> Signed-off-by: John Calderon <jcalderon@nvidia>	2025-10-16 17:49:22 +02:00
__init__.py	[TRTLLM-6780][fix] Add multimodal data to dummy requests during memory profiling (#7539)	2025-10-16 17:49:22 +02:00
data.py	[TRTLLM-3925, https://nvbugs/5245262] [fix] Normalize LLM.generate API (#3985)	2025-05-07 11:06:23 +08:00
multimodal.py	[TRTLLM-7385][feat] Optimize Qwen2/2.5-VL performance (#7250)	2025-09-22 03:40:02 -07:00
registry.py	[TRTLLM-6780][fix] Add multimodal data to dummy requests during memory profiling (#7539)	2025-10-16 17:49:22 +02:00
utils.py	[TRTLLM-7398][feat] Support KV cache salting for secure KV cache reuse (#7106)	2025-09-06 17:58:32 -04:00