TensorRT-LLM / tensorrt_llm / serve

Latest commit: Use llm.tokenizer in OpenAIServer (#3199) — Pengyun Lin (60e02a3684), 2025-04-08 14:55:02 +08:00
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
Co-authored-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
File                       Last commit                                                                        Date
__init__.py                Update TensorRT-LLM (#2820)                                                        2025-02-25 21:21:49 +08:00
openai_disagg_server.py    feat: Add option to run disaggregated serving without ctx servers,… (#3243)       2025-04-07 21:56:03 -04:00
openai_protocol.py         Update (#2978)                                                                     2025-03-23 16:39:35 +08:00
openai_server.py           Use llm.tokenizer in OpenAIServer (#3199)                                          2025-04-08 14:55:02 +08:00
postprocess_handlers.py    breaking change: perf: Make ipc_periodically the default responses_handler (#3102) 2025-04-08 10:36:39 +08:00