kanshan/TensorRT-LLMs
Mirror of https://github.com/NVIDIA/TensorRT-LLM.git, synced 2026-02-06 03:01:50 +08:00.
TensorRT-LLMs/tensorrt_llm/_torch/modules at commit b64052539d
Latest commit f7de285a82 by Void:
[None][fix] add quantization check for DeepEP LL low precision combine in new moe comm api (#10072)
Signed-off-by: Yilin Zhang <18275976+yilin-void@users.noreply.github.com>
2026-01-14 22:15:29 -05:00
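The commit message describes validating the quantization configuration before DeepEP's low-latency (LL) low-precision combine path is taken. Below is a minimal hypothetical sketch of what such a guard could look like; every name in it (`MoECommConfig`, the `"deepep_ll"` backend tag, the supported quant modes) is an illustrative assumption, not TensorRT-LLM's actual API.

```python
# Hypothetical sketch of a quantization check for a low-precision combine
# path, in the spirit of commit f7de285a82. All names here are assumptions
# for illustration only; none come from the TensorRT-LLM codebase.
from dataclasses import dataclass


@dataclass
class MoECommConfig:
    """Assumed stand-in for a MoE communication configuration."""
    backend: str                  # e.g. "deepep_ll"
    low_precision_combine: bool   # whether low-precision combine is requested
    quant_mode: str               # e.g. "fp8", "nvfp4", "none"


# Quant modes assumed to be compatible with a low-precision combine step.
_SUPPORTED_COMBINE_QUANT = {"fp8", "nvfp4"}


def check_low_precision_combine(cfg: MoECommConfig) -> None:
    """Reject configs that request low-precision combine without a
    quantization mode that can support it."""
    if cfg.backend == "deepep_ll" and cfg.low_precision_combine:
        if cfg.quant_mode not in _SUPPORTED_COMBINE_QUANT:
            raise ValueError(
                f"Low-precision combine requires one of "
                f"{sorted(_SUPPORTED_COMBINE_QUANT)}, got {cfg.quant_mode!r}"
            )


if __name__ == "__main__":
    # This configuration would be rejected by the guard.
    cfg = MoECommConfig(backend="deepep_ll",
                        low_precision_combine=True,
                        quant_mode="none")
    try:
        check_low_precision_combine(cfg)
    except ValueError as err:
        print(f"rejected: {err}")
```

Failing fast at configuration time, rather than deep inside the communication kernel, is the usual motivation for a check like this.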
fla/
fused_moe/
mamba/
__init__.py
attention.py
decoder_layer.py
embedding.py
gated_mlp.py
layer_norm.py
linear.py
logits_processor.py
mlp.py
multi_stream_utils.py
qk_norm_attention.py
rms_norm.py
rotary_embedding.py
swiglu.py
triton_linear.py
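As an example of what these modules cover, rms_norm.py presumably implements root-mean-square layer normalization. The following is a minimal plain-PyTorch sketch of the standard RMSNorm computation, shown only to illustrate the technique; it is not the repository's implementation, and the class signature here is an assumption.

```python
# Minimal plain-PyTorch RMSNorm sketch: y = x / sqrt(mean(x^2) + eps) * w.
# Illustrative only; not TensorRT-LLM's actual rms_norm.py module.
import torch
from torch import nn


class RMSNorm(nn.Module):
    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize by the root mean square over the last dimension,
        # computed in float32 for numerical stability.
        variance = x.float().pow(2).mean(dim=-1, keepdim=True)
        x_normed = x.float() * torch.rsqrt(variance + self.eps)
        return self.weight * x_normed.to(x.dtype)


if __name__ == "__main__":
    # Usage: normalize a batch of hidden states.
    norm = RMSNorm(hidden_size=8)
    out = norm(torch.randn(2, 4, 8))
    print(out.shape)  # torch.Size([2, 4, 8])
```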