TensorRT-LLM / tensorrt_llm / serve

Latest commit: Use llm.tokenizer in OpenAIServer (#3199) — Pengyun Lin (60e02a3684), 2025-04-08 14:55:02 +08:00
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
Co-authored-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
File                       Last commit                                                                        Date
__init__.py                Update TensorRT-LLM (#2820)                                                        2025-02-25 21:21:49 +08:00
openai_disagg_server.py    feat: Add option to run disaggregated serving without ctx servers,… (#3243)       2025-04-07 21:56:03 -04:00
openai_protocol.py         Update (#2978)                                                                     2025-03-23 16:39:35 +08:00
openai_server.py           Use llm.tokenizer in OpenAIServer (#3199)                                          2025-04-08 14:55:02 +08:00
postprocess_handlers.py    breaking change: perf: Make ipc_periodically the default responses_handler (#3102) 2025-04-08 10:36:39 +08:00