TensorRT-LLMs/tensorrt_llm/_torch/distributed
Jinyang Yuan b618e1f55b
perf: Eliminate the need for attention DP padding when possible (#3439)
Signed-off-by: Jinyang Yuan <154768711+jinyangyuan-nvidia@users.noreply.github.com>
Co-authored-by: raccoonliukai <raccoonliu@tencent.com>
2025-05-17 13:30:55 +08:00
__init__.py      chore: Fix pipeline break caused by previous PR (#4081) rebase + pipeline reuse (#4169)  2025-05-09 12:51:02 +08:00
communicator.py  feat: [nvbugs/5261055][nvbugs/5170160] non-invasive pipeline parallelism (#4034)  2025-05-16 04:16:53 +08:00
ops.py           perf: Eliminate the need for attention DP padding when possible (#3439)  2025-05-17 13:30:55 +08:00