TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-28 14:44:24 +08:00

JadoTu 51bf7164d3 [None][feat] add qwen3-next CI test of accuracy on BF16 and NVFP4 (#9330) Signed-off-by: jiant <107457950+JadoTu@users.noreply.github.com>	2025-11-27 18:05:00 +08:00
dev	Update (#2978)	2025-03-23 16:39:35 +08:00
qa	[None][feat] Support MLA chunked prefill for DeepSeek V3.2 model (#9376)	2025-11-26 16:38:25 +08:00
test-db	[None][feat] add qwen3-next CI test of accuracy on BF16 and NVFP4 (#9330)	2025-11-27 18:05:00 +08:00
waives.txt	[None][chore] revert batch_size=1 to prevent timeout and lower accuracy reference by 0.12% as a WAR (#9447)	2025-11-27 14:25:44 +08:00