TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Yechan Kim 83c3ed128b chore: set default device to cpu on Multimodal models (#5994) Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>	2025-07-22 21:45:31 -07:00
__init__.py	feat: TRTLLM-5574 Add phi-4-multimodal pytorch-backend support (#5644)	2025-07-17 06:30:58 +08:00
data.py	[TRTLLM-3925, https://nvbugs/5245262] [fix] Normalize LLM.generate API (#3985)	2025-05-07 11:06:23 +08:00
multimodal.py	[TRTLLM-5059][feat] Add KV cache reuse support for multimodal models (#5444)	2025-07-21 16:11:58 -07:00
registry.py	perf: Use tokenizers API to optimize incremental detokenization perf (#5574)	2025-07-01 09:35:25 -04:00
utils.py	chore: set default device to cpu on Multimodal models (#5994)	2025-07-22 21:45:31 -07:00