TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Enwei Zhu 1b9781e8e7 [TRTLLM-6409][feat] Enable guided decoding with speculative decoding (part 1: two-model engine) (#6300) Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>	2025-08-07 05:53:48 -04:00
features	[TRTLLM-6409][feat] Enable guided decoding with speculative decoding (part 1: two-model engine) (#6300)	2025-08-07 05:53:48 -04:00
adding_new_model.md	chores: merge examples for v1.0 doc (#5736)	2025-07-08 21:00:42 -07:00
arch_overview.md	update broken link of PyTorchModelEngine in arch_overview (#6171)	2025-07-18 19:53:38 +08:00
attention.md	chore [BREAKING CHANGE]: Flatten PyTorchConfig knobs into TorchLlmArgs (#4603)	2025-05-28 18:43:04 +08:00
kv_cache_manager.md	Release 0.20 to main (#4577)	2025-05-28 16:25:33 +08:00
scheduler.md	Update TensorRT-LLM (#2873)	2025-03-11 21:13:42 +08:00