TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Richard Huo ce580ce4f5 [None][feat] KV Cache Connector API (#7228) Signed-off-by: jthomson04 <jwillthomson19@gmail.com> Signed-off-by: richardhuo-nv <rihuo@nvidia.com> Co-authored-by: jthomson04 <jwillthomson19@gmail.com> Co-authored-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com> Co-authored-by: Sharan Chetlur <116769508+schetlur-nv@users.noreply.github.com>	2025-08-28 23:09:27 -04:00
batched_logits_processor.yaml	test: [TRTLLM-4334] Create 1.0 criteria scope from API stability references (#3069)	2025-03-26 18:14:35 +08:00
calib_config.yaml	test: [TRTLLM-4334] Create 1.0 criteria scope from API stability references (#3069)	2025-03-26 18:14:35 +08:00
completion_output.yaml	[TRTLLM-6104] feat: add request_perf_metrics to LLMAPI (#5497)	2025-06-27 17:03:05 +02:00
guided_decoding_params.yaml	feat: Support the Structural Tag in guided decoding (#4066)	2025-05-12 17:24:50 +08:00
llm.yaml	[None][feat] KV Cache Connector API (#7228)	2025-08-28 23:09:27 -04:00
logits_processor.yaml	feat: LogitsProcessor in PyTorch backend (#3145)	2025-05-01 14:15:30 -07:00
quant_config.yaml	[TRTLLM-6174][feat] Enable FP32 mamba ssm cache (#6574)	2025-08-10 16:27:51 -04:00
request_output.yaml	[None][feat] Core Metrics Implementation (#5785)	2025-08-09 02:48:53 -04:00
sampling_params.yaml	chore: Cleanup deprecated APIs from LLM-API (part 1/2) (#3732)	2025-05-07 13:20:25 +08:00