TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-23 04:03:22 +08:00


Jinyang Yuan b618e1f55b perf: Eliminate the need for attention DP padding when possible (#3439)	2025-05-17 13:30:55 +08:00
Signed-off-by: Jinyang Yuan <154768711+jinyangyuan-nvidia@users.noreply.github.com>
Co-authored-by: raccoonliukai <raccoonliu@tencent.com>
test_deepseek.py	perf: Eliminate the need for attention DP padding when possible (#3439)	2025-05-17 13:30:55 +08:00
test_llama4.py	[infra] Improve llama4 parallelism test coverage (#3821)	2025-05-02 16:15:04 -04:00