TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-23 04:03:22 +08:00

Chenghao Zhang d6f95a4363 [None][feat] AutoDeploy: Perf optimization for Attention and rmsnorm (#9719) Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>		2025-12-05 12:59:04 -08:00
compile
config	[#8733][feat] Add Llama4 MoE handling to AutoDeploy (#9556)	2025-12-04 08:03:33 +02:00
custom_ops	[None][feat] AutoDeploy: Perf optimization for Attention and rmsnorm (#9719)	2025-12-05 12:59:04 -08:00
distributed
export
models
shim	[#9602][feat] AutoDeploy: Support TRTLLM Sampler (#9641)	2025-12-04 19:24:11 -08:00
transform	[#8733][feat] Add Llama4 MoE handling to AutoDeploy (#9556)	2025-12-04 08:03:33 +02:00
utils	[#8733][feat] Add Llama4 MoE handling to AutoDeploy (#9556)	2025-12-04 08:03:33 +02:00
__init__.py
llm_args.py	[#9602][feat] AutoDeploy: Support TRTLLM Sampler (#9641)	2025-12-04 19:24:11 -08:00
llm.py