TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-30 07:33:48 +08:00

dhansen-nvidia 2d33ae94d5 [https://nvbugs/5508301][feat] Move D->H copies to a worker thread whe… (#8463) Signed-off-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com> Signed-off-by: dhansen-nvidia <218031328+dhansen-nvidia@users.noreply.github.com> Co-authored-by: Dan Hansen <1+dhansen-nvidia@users.noreply.github.com>		2025-12-09 18:51:31 -05:00
batched_logits_processor.yaml	test: [TRTLLM-4334] Create 1.0 criteria scope from API stability references (#3069)	2025-03-26 18:14:35 +08:00
calib_config.yaml	test: [TRTLLM-4334] Create 1.0 criteria scope from API stability references (#3069)	2025-03-26 18:14:35 +08:00
completion_output.yaml	[TRTLLM-4517] [feat] Additional model outputs (#7206)	2025-10-13 15:33:18 +02:00
guided_decoding_params.yaml	feat: Support the Structural Tag in guided decoding (#4066)	2025-05-12 17:24:50 +08:00
llm.yaml	[https://nvbugs/5508301][feat] Move D->H copies to a worker thread whe… (#8463)	2025-12-09 18:51:31 -05:00
logits_processor.yaml	feat: LogitsProcessor in PyTorch backend (#3145)	2025-05-01 14:15:30 -07:00
quant_config.yaml	[TRTLLM-6174][feat] Enable FP32 mamba ssm cache (#6574)	2025-08-10 16:27:51 -04:00
request_output.yaml	[None][feat] Add opentelemetry tracing (#5897)	2025-10-27 18:51:07 +08:00
sampling_params.yaml	[None][feat] Support ignored prompt length for penalties via new sampling config parameter (#8127)	2025-10-27 13:12:31 -04:00