Yukun He
|
f001c4946d
|
[https://nvbugs/5782112][fix] Fix hanging issue for MNNVL Allreduce under PP (#10633)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
|
2026-01-16 13:03:36 +08:00 |
|
Shiyu Li
|
8bdbb48264
|
[https://nvbugs/5489015][fix] Support communicator split in MNNVL allreduce and fix the binding issues. (#7387)
Signed-off-by: Shiyu Li <shili@nvidia.com>
|
2025-09-17 07:43:20 +08:00 |
|
Shiyu Li
|
6e1aee6fd6
|
[fix] Performance Optimization for MNNVL TwoShot Kernel (#5934)
Signed-off-by: Shiyu Li <shili@nvidia.com>
Co-authored-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
|
2025-07-17 10:49:51 +08:00 |
|
Zongfei Jing
|
dbaddb3a29
|
Adding two-shot allreduce kernel and mnnvl multicasting buffer (#4216)
* Adding two-shot allreduce kernel and mnnvl multicasting buffergit gffe
Signed-off-by: Shiyu Li <shili@nvidia.com>
Adding comments
Signed-off-by: Shiyu Li <shili@nvidia.com>
Add unittest of the twoshot kernel.
Signed-off-by: Shiyu Li <shili@nvidia.com>
Update dispatch logic
Signed-off-by: Shiyu Li <shili@nvidia.com>
Use cpu barrier instead of GPU at init
Signed-off-by: Shiyu Li <shili@nvidia.com>
Merge dispatch logic fix
Signed-off-by: Shiyu Li <shili@nvidia.com>
Update the kernel to use GPU-managed buffer
Signed-off-by: Shiyu Li <shili@nvidia.com>
* Refine
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
* Clean code
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
* Fix compile error
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
* Fix issue
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
* Clean up
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
* Simplify AllReduce interface
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
* Rename
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
* Fix warning
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
* Tidy code
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
* Rename
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
* Fix compile error
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
* Refine
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
* Skip ut for no_fusion
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
* Refine
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
---------
Signed-off-by: Shiyu Li <shili@nvidia.com>
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
Co-authored-by: Shiyu Li <shili@nvidia.com>
|
2025-05-22 03:42:36 +08:00 |
|