TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-10 04:53:38 +08:00

Ziyi Xiong 8062e0fe7c [TRTLLM-6392][feat] Support turning on/off spec decoding dynamically (#6363) Signed-off-by: ziyixiong-nv <219238287+ziyixiong-nv@users.noreply.github.com>	2025-07-31 15:31:39 -04:00
test_draft_target.py	[refactor] Simplification of Speculative decoding configs - Part 2 (#5936)	2025-07-23 09:20:27 +08:00
test_dynamic_spec_decode.py	[TRTLLM-6392][feat] Support turning on/off spec decoding dynamically (#6363)	2025-07-31 15:31:39 -04:00
test_eagle3.py	[TRTLLM-6453][feat] Support chunked prefill on spec decode 2 model (#6104)	2025-07-24 21:50:11 -04:00
test_kv_cache_reuse.py	[TRTLLM-6452][feat]: Two-model engine KV cache reuse support (#6133)	2025-07-19 13:17:15 +08:00
test_mtp.py	[refactor] Simplification of Speculative decoding configs (#5639)	2025-07-10 11:37:30 -04:00
test_ngram.py	[refactor] Simplification of Speculative decoding configs (#5639)	2025-07-10 11:37:30 -04:00
test_user_provided.py	[refactor] Simplification of Speculative decoding configs (#5639)	2025-07-10 11:37:30 -04:00