TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-25 05:02:59 +08:00

Haohang Huang 980929e1a9 [https://nvbugs/5410687][fix] Hopper w4a8 groupwise MoE interleave (#6708) Signed-off-by: Haohang Huang <31998628+symphonylyh@users.noreply.github.com>		2025-08-07 15:30:16 -07:00
__init__.py
activation.py
attention.py	feat: TRTLLM-6450 update long rope for phi3.5/phi4-mini/phi4-mm (#6353)	2025-07-30 09:20:16 -07:00
cast.py
conv.py
embedding.py	fix: #3137 speculative decoding and multimodal input support (#3276)	2025-04-09 23:40:19 +08:00
language_adapter.py
linear.py
lora.py
mlp.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
moe.py	[https://nvbugs/5410687][fix] Hopper w4a8 groupwise MoE interleave (#6708)	2025-08-07 15:30:16 -07:00
normalization.py
pooling.py
recurrent.py
ssm.py