TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-19 01:05:12 +08:00

Grzegorz Kwasniewski	2101d46d68	[TRTLLM-6342][feat] TP Sharding read from the model config (#6972). Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>. Co-authored-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>	2025-08-25 15:41:27 -07:00
multigpu	[TRTLLM-6342][feat] TP Sharding read from the model config (#6972)	2025-08-25 15:41:27 -07:00
singlegpu	[#4403][refactor] Move fusion, kvcache, and compile to modular inference optimizer (#7057)	2025-08-21 10:30:36 -07:00