TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Tailing Yuan 648196f8ae [TRTLLM-9432][feat] Reduce synchronization and recompilation for qwen3-next (#9691) Signed-off-by: Tailing Yuan <yuantailing@gmail.com>	2025-12-23 10:14:29 +08:00
__init__.py	[None][feat] Support Qwen3 next (#7892)	2025-09-29 21:16:07 +08:00
chunk_delta_h.py	[None][feat] Support Qwen3 next (#7892)	2025-09-29 21:16:07 +08:00
chunk_o.py	[None][feat] Support Qwen3 next (#7892)	2025-09-29 21:16:07 +08:00
chunk_scaled_dot_kkt.py	[None][feat] Support Qwen3 next (#7892)	2025-09-29 21:16:07 +08:00
chunk.py	[None][chroe] Polish qwen3-next modeling code. (#8902)	2025-12-02 11:28:35 +08:00
cumsum.py	[None][feat] Support Qwen3 next (#7892)	2025-09-29 21:16:07 +08:00
fused_recurrent.py	[None][feat] Support Qwen3 next (#7892)	2025-09-29 21:16:07 +08:00
fused_sigmoid_gating_recurrent.py	[None][feat] Support Qwen3 next (#7892)	2025-09-29 21:16:07 +08:00
index.py	[None][feat] Support Qwen3 next (#7892)	2025-09-29 21:16:07 +08:00
l2norm.py	[TRTLLM-9432][feat] Reduce synchronization and recompilation for qwen3-next (#9691)	2025-12-23 10:14:29 +08:00
layernorm_gated.py	[None][feat] Support Qwen3 next (#7892)	2025-09-29 21:16:07 +08:00
op.py	[None][feat] Support Qwen3 next (#7892)	2025-09-29 21:16:07 +08:00
solve_tril.py	[None][feat] Support Qwen3 next (#7892)	2025-09-29 21:16:07 +08:00
utils.py	[None][feat] Support Qwen3 next (#7892)	2025-09-29 21:16:07 +08:00
wy_fast.py	[None][feat] Support Qwen3 next (#7892)	2025-09-29 21:16:07 +08:00