| File | Last commit | Commit date |
| --- | --- | --- |
| allReduceFusionKernels.cu | [https://nvbugs/5788127][fix] Use uint64_t as the dtype of lamport_buffer_size to avoid overflow (#10499) | 2026-01-13 17:16:22 +08:00 |
| allReduceFusionKernels.h | [None][fix] Introduce inline namespace to avoid symbol collision (#9541) | 2025-12-12 23:32:15 +08:00 |
| allReduceWorkspace.cu | [https://nvbugs/5788127][fix] Use uint64_t as the dtype of lamport_buffer_size to avoid overflow (#10499) | 2026-01-13 17:16:22 +08:00 |
| allReduceWorkspace.h | [https://nvbugs/5788127][fix] Use uint64_t as the dtype of lamport_buffer_size to avoid overflow (#10499) | 2026-01-13 17:16:22 +08:00 |
| customLowPrecisionAllReduceKernels.cu | [None][fix] Introduce inline namespace to avoid symbol collision (#9541) | 2025-12-12 23:32:15 +08:00 |
| customLowPrecisionAllReduceKernels.h | [None][fix] Introduce inline namespace to avoid symbol collision (#9541) | 2025-12-12 23:32:15 +08:00 |
| mnnvlAllreduceKernels.cu | [https://nvbugs/5729697][fix] MNNVL Allreduce: use CUDA runtime instead of Macro to get SM version. (#10062) | 2025-12-23 16:07:07 +08:00 |
| mnnvlAllreduceKernels.h | [None][fix] Introduce inline namespace to avoid symbol collision (#9541) | 2025-12-12 23:32:15 +08:00 |
| moeAllReduceFusionKernels.cu | [https://nvbugs/5788127][fix] Use uint64_t as the dtype of lamport_buffer_size to avoid overflow (#10499) | 2026-01-13 17:16:22 +08:00 |
| moeAllReduceFusionKernels.h | [None][fix] Introduce inline namespace to avoid symbol collision (#9541) | 2025-12-12 23:32:15 +08:00 |
| moeAlltoAllKernels.cu | [TRTLLM-10126][feat] Increase topk upper limit to 22 for NVLinkOneSid… (#10229) | 2025-12-27 22:48:10 +08:00 |
| moeAlltoAllKernels.h | [TRTLLM-10126][feat] Increase topk upper limit to 22 for NVLinkOneSid… (#10229) | 2025-12-27 22:48:10 +08:00 |