TensorRT-LLMs/tensorrt_llm/_torch/speculative
Fanrong Li bfa3b59bb6
[https://nvbugs/5277592][fix] fix cuda graph padding for spec decoding (only for 0.20) (#5058)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-06-11 02:14:14 +08:00
__init__.py Add initial EAGLE-3 implementation (#3035) 2025-03-29 22:31:24 +08:00
eagle3.py API Breaking Change + Readability: "decoder"->"sampler" (#4121) 2025-05-16 23:52:25 +08:00
interface.py API Breaking Change + Readability: "decoder"->"sampler" (#4121) 2025-05-16 23:52:25 +08:00
mtp.py [https://nvbugs/5277592][fix] fix cuda graph padding for spec decoding (only for 0.20) (#5058) 2025-06-11 02:14:14 +08:00
utils.py API Breaking Change + Readability: "decoder"->"sampler" (#4121) 2025-05-16 23:52:25 +08:00