TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-17 00:04:57 +08:00

William Zhang 4debf153d8 [#11170][fix] Fix for mm placeholder counts (#11461)	2026-02-14 09:12:03 +08:00

* Why? As reported by #11170, when a single request contains multiple messages, and only a subset of those messages include multimodal data, the previous logic incorrectly adds placeholder tokens to subsequent messages that do not contain such data.
* What? This commit fixes this issue, and adds unit tests that would have caught this.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>

scripts	[None][feat] skip batch_tokenize_prompts in CustomDataset (#10214)	2025-12-23 17:40:57 +08:00
tool_parser	[None][chore] Unify DS tool parser names (#10239)	2025-12-31 14:40:07 +08:00
__init__.py	Update TensorRT-LLM (#2820)	2025-02-25 21:21:49 +08:00
chat_utils.py	[#11170][fix] Fix for mm placeholder counts (#11461)	2026-02-14 09:12:03 +08:00
cluster_storage.py	[https://nvbugs/5826689][fix] replace etcd3 with etcd-sdk-python (#10886)	2026-02-09 23:53:40 +08:00
disagg_auto_scaling.py	[https://nvbugs/5726066][fix] fix auto-scaling related failures (#9845)	2025-12-18 16:37:48 -05:00
harmony_adapter.py	[TRTLLM-10866][feat] implement disaggregated harmony chat (#11336)	2026-02-09 12:09:03 -05:00
media_storage.py	[TRTLLM-10612][feat] Initial support of AIGV models in TRTLLM (#11462)	2026-02-14 06:11:11 +08:00
metadata_server.py	[https://nvbugs/5826689][fix] replace etcd3 with etcd-sdk-python (#10886)	2026-02-09 23:53:40 +08:00
openai_client.py	[#10889][fix] fix pydantic deepcopy bug (#11004)	2026-01-27 02:40:13 -05:00
openai_disagg_server.py	[TRTLLM-8921][feat] implement gen-first disagg_service (#11020)	2026-02-03 15:46:11 -05:00
openai_disagg_service.py	[TRTLLM-8921][feat] implement gen-first disagg_service (#11020)	2026-02-03 15:46:11 -05:00
openai_protocol.py	[TRTLLM-10612][feat] Initial support of AIGV models in TRTLLM (#11462)	2026-02-14 06:11:11 +08:00
openai_server.py	[TRTLLM-10612][feat] Initial support of AIGV models in TRTLLM (#11462)	2026-02-14 06:11:11 +08:00
openai_service.py	[TRTLLM-8920][feat] decouple disagg service from fastapi (#8714)	2025-12-05 10:44:16 +08:00
perf_metrics.py	[TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic (#9726)	2025-12-16 05:16:32 -08:00
postprocess_handlers.py	[#10614][fix] gpt_oss first iteration streaming in trtllm-serve (#10808)	2026-01-26 20:53:11 +08:00
responses_utils.py	[TRTLLM-10154][feat] Enable guided decoding with reasoning parsers (#10890)	2026-01-22 14:14:28 +08:00
router.py	[TRTLLM-8921][feat] implement gen-first disagg_service (#11020)	2026-02-03 15:46:11 -05:00
visual_gen_utils.py	[TRTLLM-10612][feat] Initial support of AIGV models in TRTLLM (#11462)	2026-02-14 06:11:11 +08:00