TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-24 12:42:54 +08:00

Fanrong Li bfa3b59bb6 [https://nvbugs/5277592][fix] fix cuda graph padding for spec decoding (only for 0.20) (#5058) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>		2025-06-11 02:14:14 +08:00
dev	Update (#2978)	2025-03-23 16:39:35 +08:00
qa	test: shorten reqs in con:1 cases and add streaming cases, add l2 perf test (#4796)	2025-06-03 10:20:55 +08:00
test-db	[https://nvbugs/5277592][fix] fix cuda graph padding for spec decoding (only for 0.20) (#5058)	2025-06-11 02:14:14 +08:00
waives.txt	[5289904] chore: Unwaive test for Qwen model. (#4657)	2025-06-09 14:06:59 +08:00