TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Fanrong Li 1bbc0e323b [None][fix] Pre-allocate workspaces for DeepGEMM MoE to avoid frequent cudaFree/cudaMalloc (#6811) Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com> Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com> Co-authored-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2025-08-13 10:27:57 +08:00
__init__.py	Deepseek R1 FP8 Support on Blackwell (#6486)	2025-08-01 10:26:28 +08:00
fp4_utils.py	[None] [feat] Add model gpt-oss (#6645)	2025-08-07 03:04:18 -04:00
fp8_utils.py	[None][fix] Pre-allocate workspaces for DeepGEMM MoE to avoid frequent cudaFree/cudaMalloc (#6811)	2025-08-13 10:27:57 +08:00