TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

mpikulski 533add5056 [TRTLLM-8598][feat] enable n > 1 in OpenAI API with PyTorch backend (#8951) Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>	2025-11-07 17:47:35 -08:00
scripts	[https://nvbugs/5523315][fix] Fix serve benchmark test (#8255)	2025-11-03 00:30:13 -08:00
tool_parser	[TRTLLM-8214][feat] Support Qwen3 tool parser (#8216)	2025-10-29 15:48:29 +08:00
__init__.py	Update TensorRT-LLM (#2820)	2025-02-25 21:21:49 +08:00
chat_utils.py	[TRTLLM-8598][feat] enable n > 1 in OpenAI API with PyTorch backend (#8951)	2025-11-07 17:47:35 -08:00
cluster_storage.py	[TRTLLM-8431][doc] update public doc and example, add etcd auto-scaling tests (#8602)	2025-10-28 17:04:53 -07:00
disagg_auto_scaling.py	[TRTLLM-8431][doc] update public doc and example, add etcd auto-scaling tests (#8602)	2025-10-28 17:04:53 -07:00
harmony_adapter.py	[https://nvbugs/5521799][fix] Trim incorrectly generated harmony messages (#7849)	2025-09-24 16:38:43 +08:00
metadata_server.py	feat: Add integration of etcd (#3738)	2025-06-03 20:01:44 +08:00
openai_disagg_server.py	[None][feat] Add opentelemetry tracing (#5897)	2025-10-27 18:51:07 +08:00
openai_protocol.py	[None][feat] Support ignored prompt length for penalties via new sampling config parameter (#8127)	2025-10-27 13:12:31 -04:00
openai_server.py	[TRTLLM-8598][feat] enable n > 1 in OpenAI API with PyTorch backend (#8951)	2025-11-07 17:47:35 -08:00
postprocess_handlers.py	[TRTLLM-8214][feat] Support Qwen3 tool parser (#8216)	2025-10-29 15:48:29 +08:00
responses_utils.py	[None][feat] perf_metrics endpoint functionality improvement (#8005)	2025-10-02 17:43:25 -07:00
router.py	[TRTLLM-8431][doc] update public doc and example, add etcd auto-scaling tests (#8602)	2025-10-28 17:04:53 -07:00