TensorRT-LLMs/tensorrt_llm/inputs
Yechan Kim f77aca9f2c
[TRTLLM-7385][feat] Optimize Qwen2/2.5-VL performance (#7250)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-09-22 03:40:02 -07:00
File | Last commit | Date
__init__.py | [TRTLLM-7398][feat] Support KV cache salting for secure KV cache reuse (#7106) | 2025-09-06 17:58:32 -04:00
data.py | [TRTLLM-3925, https://nvbugs/5245262] [fix] Normalize LLM.generate API (#3985) | 2025-05-07 11:06:23 +08:00
multimodal.py | [TRTLLM-7385][feat] Optimize Qwen2/2.5-VL performance (#7250) | 2025-09-22 03:40:02 -07:00
registry.py | [TRTLLM-6903][feat] Support chunked prefill for multimodal models (#6843) | 2025-09-14 20:10:10 -07:00
utils.py | [TRTLLM-7398][feat] Support KV cache salting for secure KV cache reuse (#7106) | 2025-09-06 17:58:32 -04:00