TensorRT-LLM/tensorrt_llm/_torch/modules
Latest commit: ae8f74b620 by Yanchao Lu, 2026-01-20 22:56:24 +08:00
[None][chore] Reduce tedious logs (#10847)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
fla/
fused_moe/              [TRTLLM-10296][fix] Fix the potential misaligned access due to vectorized ld/st instructions in NVLinkOneSided A2A. (#10539)  2026-01-20 11:08:04 +08:00
mamba/                  [TRTLLM-10060][feat] Enable attention dp for Nemotron Super v3. (#10347)  2026-01-13 17:13:55 +08:00
__init__.py
attention.py            [None][chore] Reduce tedious logs (#10847)  2026-01-20 22:56:24 +08:00
decoder_layer.py
embedding.py
gated_mlp.py
layer_norm.py
linear.py               [None][fix] Default-disable gemm+allreduce fusion (#10656)  2026-01-20 12:31:17 +08:00
logits_processor.py
mlp.py                  [None][fix] Enable AttentionDP on Qwen3-VL and fix test (#10435)  2026-01-10 00:13:26 +09:00
multi_stream_utils.py
qk_norm_attention.py
rms_norm.py             [None][feat] Add torch ext API for FusedAddRMSNormQuant kernel (#9905)  2026-01-15 07:29:15 +08:00
rotary_embedding.py
swiglu.py
triton_linear.py