TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Jinyang Yuan b618e1f55b perf: Eliminate the need for attention DP padding when possible (#3439) Signed-off-by: Jinyang Yuan <154768711+jinyangyuan-nvidia@users.noreply.github.com> Co-authored-by: raccoonliukai <raccoonliu@tencent.com>		2025-05-17 13:30:55 +08:00
__init__.py	chore: Fix pipeline break caused by previous PR (#4081) rebase + pipeline reuse (#4169)	2025-05-09 12:51:02 +08:00
communicator.py	feat: [nvbugs/5261055][nvbugs/5170160] non-invasive pipeline parallelism (#4034)	2025-05-16 04:16:53 +08:00
ops.py	perf: Eliminate the need for attention DP padding when possible (#3439)	2025-05-17 13:30:55 +08:00