TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-16 15:55:08 +08:00

William Zhang 4debf153d8 [#11170][fix] Fix for mm placeholder counts (#11461)	2026-02-14 09:12:03 +08:00

* Why? As reported by #11170, when a single request contains multiple messages, and only a subset of those messages include multimodal data, the previous logic incorrectly adds placeholder tokens to subsequent messages that do not contain such data.
* What? This commit fixes this issue, and adds unit tests that would have caught this.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
__init__.py	[TRTLLM-9522][feat] support image_embeds in OpenAI API (#9715)	2026-01-14 10:31:03 +01:00
data.py	[TRTLLM-10612][feat] Initial support of AIGV models in TRTLLM (#11462)	2026-02-14 06:11:11 +08:00
evs.py	[TRTLLM-8238][feat] Add EVS support for nano-v2-vlm (#8024)	2025-10-25 05:43:27 -04:00
multimodal.py	[TRTLLM-10487][feat] Add user-provided UUID support for multimodal KV cache identification. (#11075)	2026-02-12 00:48:47 -05:00
registry.py	[TRTLLM-10487][feat] Add user-provided UUID support for multimodal KV cache identification. (#11075)	2026-02-12 00:48:47 -05:00
utils.py	[#11170][fix] Fix for mm placeholder counts (#11461)	2026-02-14 09:12:03 +08:00