TensorRT-LLM/tensorrt_llm/_torch/modules
Latest commit: 97674c3114 by HuiGao-NV (2025-11-03 21:08:01 -08:00)
[TRTLLM-8690][feat] add more tensors to share buffers (#8691)
Signed-off-by: Hui Gao <huig@nvidia.com>
fla/
fused_moe/ [TRTLLM-8690][feat] add more tensors to share buffers (#8691) 2025-11-03 21:08:01 -08:00
mamba/
__init__.py
attention.py [TRTLLM-5966][feat] Helix: add full MLA support for Helix (#8104) 2025-11-04 09:06:58 +08:00
decoder_layer.py
embedding.py
gated_mlp.py
layer_norm.py [TRTLLM-8535][feat] Support DeepSeek V3.2 with FP8 + BF16 KV cache/NVFP4 + BF16 KV cache (#8405) 2025-10-24 13:40:41 -04:00
linear.py [https://nvbugs/5599086][fix] Fix FP8 Linear module for spark (#8707) 2025-10-29 13:58:19 -07:00
logits_processor.py
mlp.py
multi_stream_utils.py
qk_norm_attention.py [None][fix] Avoid unnecessary concat in attn_output_gate case. (#8094) 2025-10-13 12:59:40 -07:00
rms_norm.py
rotary_embedding.py
swiglu.py
triton_linear.py
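
For context, a minimal sketch of the standard RMSNorm computation that a module such as rms_norm.py typically implements. The class name, constructor signature, and use of plain PyTorch here are illustrative assumptions, not this repository's actual API:

    import torch
    from torch import nn

    class RMSNorm(nn.Module):
        # Hypothetical sketch: RMS normalization scales by the root mean
        # square of the last dimension, with no mean subtraction.
        def __init__(self, hidden_size: int, eps: float = 1e-6):
            super().__init__()
            self.weight = nn.Parameter(torch.ones(hidden_size))  # learned scale
            self.eps = eps

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Accumulate in fp32 for numerical stability, then cast back.
            dtype = x.dtype
            x = x.float()
            variance = x.pow(2).mean(dim=-1, keepdim=True)
            x = x * torch.rsqrt(variance + self.eps)
            return self.weight * x.to(dtype)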