TensorRT-LLMs/tensorrt_llm/serve
Pengyun Lin ce37e27066
[#10614][fix] gpt_oss first iteration streaming in trtllm-serve (#10808)
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
2026-01-26 20:53:11 +08:00
scripts [None] [feat] skip batch_tokenize_prompts in CustomDataset (#10214) 2025-12-23 17:40:57 +08:00
tool_parser [None][chore] Unify DS tool parser names (#10239) 2025-12-31 14:40:07 +08:00
__init__.py Update TensorRT-LLM (#2820) 2025-02-25 21:21:49 +08:00
chat_utils.py [TRTLLM-9522][feat] support image_embeds in OpenAI API (#9715) 2026-01-14 10:31:03 +01:00
cluster_storage.py [TRTLLM-9091] [feat] Replace GenAI-Perf with AIPerf (#9310) 2025-12-23 13:25:55 +08:00
disagg_auto_scaling.py [https://nvbugs/5726066][fix] fix auto-scaling related failures (#9845) 2025-12-18 16:37:48 -05:00
harmony_adapter.py [#10614][fix] gpt_oss first iteration streaming in trtllm-serve (#10808) 2026-01-26 20:53:11 +08:00
metadata_server.py feat: Add integration of etcd (#3738) 2025-06-03 20:01:44 +08:00
openai_client.py [TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic (#9726) 2025-12-16 05:16:32 -08:00
openai_disagg_server.py [TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic (#9726) 2025-12-16 05:16:32 -08:00
openai_disagg_service.py [TRTLLM-10059][feat] Use global unique id as disagg request id (#10187) 2026-01-21 22:52:34 -05:00
openai_protocol.py [TRTLLM-10388][feat] Support logprobs for Completions API (#10809) 2026-01-22 21:25:24 +08:00
openai_server.py [#10614][fix] gpt_oss first iteration streaming in trtllm-serve (#10808) 2026-01-26 20:53:11 +08:00
openai_service.py [TRTLLM-8920][feat] decouple disagg service from fastapi (#8714) 2025-12-05 10:44:16 +08:00
perf_metrics.py [TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic (#9726) 2025-12-16 05:16:32 -08:00
postprocess_handlers.py [#10614][fix] gpt_oss first iteration streaming in trtllm-serve (#10808) 2026-01-26 20:53:11 +08:00
responses_utils.py [TRTLLM-10154][feat] Enable guided decoding with reasoning parsers (#10890) 2026-01-22 14:14:28 +08:00
router.py [TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic (#9726) 2025-12-16 05:16:32 -08:00