kanshan/TensorRT-LLMs
Mirror of https://github.com/NVIDIA/TensorRT-LLM.git, synced 2026-02-06 03:01:50 +08:00.
TensorRT-LLMs/tensorrt_llm/_torch/modules at commit b64052539d
Latest commit f7de285a82 by Void:
[None][fix] add quantization check for DeepEP LL low precision combine in new moe comm api (#10072)
Signed-off-by: Yilin Zhang <18275976+yilin-void@users.noreply.github.com>
2026-01-14 22:15:29 -05:00
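The commit message describes validating the quantization configuration before DeepEP's low-latency (LL) low-precision combine path is taken. Below is a minimal hypothetical sketch of what such a guard could look like; every name in it (`MoECommConfig`, the `"deepep_ll"` backend tag, the supported quant modes) is an illustrative assumption, not TensorRT-LLM's actual API.

```python
# Hypothetical sketch of a quantization check for a low-precision combine
# path, in the spirit of commit f7de285a82. All names here are assumptions
# for illustration only; none come from the TensorRT-LLM codebase.
from dataclasses import dataclass


@dataclass
class MoECommConfig:
    """Assumed stand-in for a MoE communication configuration."""
    backend: str                  # e.g. "deepep_ll"
    low_precision_combine: bool   # whether low-precision combine is requested
    quant_mode: str               # e.g. "fp8", "nvfp4", "none"


# Quant modes assumed to be compatible with a low-precision combine step.
_SUPPORTED_COMBINE_QUANT = {"fp8", "nvfp4"}


def check_low_precision_combine(cfg: MoECommConfig) -> None:
    """Reject configs that request low-precision combine without a
    quantization mode that can support it."""
    if cfg.backend == "deepep_ll" and cfg.low_precision_combine:
        if cfg.quant_mode not in _SUPPORTED_COMBINE_QUANT:
            raise ValueError(
                f"Low-precision combine requires one of "
                f"{sorted(_SUPPORTED_COMBINE_QUANT)}, got {cfg.quant_mode!r}"
            )


if __name__ == "__main__":
    # This configuration would be rejected by the guard.
    cfg = MoECommConfig(backend="deepep_ll",
                        low_precision_combine=True,
                        quant_mode="none")
    try:
        check_low_precision_combine(cfg)
    except ValueError as err:
        print(f"rejected: {err}")
```

Failing fast at configuration time, rather than deep inside the communication kernel, is the usual motivation for a check like this.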
fla/
fused_moe/
mamba/
__init__.py
attention.py
decoder_layer.py
embedding.py
gated_mlp.py
layer_norm.py
linear.py
logits_processor.py
mlp.py
multi_stream_utils.py
qk_norm_attention.py
rms_norm.py
rotary_embedding.py
swiglu.py
triton_linear.py
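As an example of what these modules cover, rms_norm.py presumably implements root-mean-square layer normalization. The following is a minimal plain-PyTorch sketch of the standard RMSNorm computation, shown only to illustrate the technique; it is not the repository's implementation, and the class signature here is an assumption.

```python
# Minimal plain-PyTorch RMSNorm sketch: y = x / sqrt(mean(x^2) + eps) * w.
# Illustrative only; not TensorRT-LLM's actual rms_norm.py module.
import torch
from torch import nn


class RMSNorm(nn.Module):
    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize by the root mean square over the last dimension,
        # computed in float32 for numerical stability.
        variance = x.float().pow(2).mean(dim=-1, keepdim=True)
        x_normed = x.float() * torch.rsqrt(variance + self.eps)
        return self.weight * x_normed.to(x.dtype)


if __name__ == "__main__":
    # Usage: normalize a batch of hidden states.
    norm = RMSNorm(hidden_size=8)
    out = norm(torch.randn(2, 4, 8))
    print(out.shape)  # torch.Size([2, 4, 8])
```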