TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-23 20:23:08 +08:00

gramnarayan a9eb5afc9f [#9241][feat] AutoDeploy: Support Eagle3 Speculative Decoding (#9869)	2025-12-24 23:30:42 -05:00
Support two model flow with no overlap scheduler or chain drafter. Drafting model is in PyTorch backend.
Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>
dev	Update (#2978)	2025-03-23 16:39:35 +08:00
qa	[None][test] Add disag-serving auto scaling qa test (#10262)	2025-12-24 08:43:47 -05:00
test-db	[#9241][feat] AutoDeploy: Support Eagle3 Speculative Decoding (#9869)	2025-12-24 23:30:42 -05:00
waives.txt	[None][chore] Add failed cases into waives.txt (#10240)	2025-12-24 02:21:50 -05:00