TensorRT-LLMs/cpp/tensorrt_llm/kernels/userbuffers
Ludwig Schneider 41ce14ab04
[None][feat] Enable NCCL_SYMMETRIC as default fallback for AllReduce (#9314)
Signed-off-by: Ludwig Schneider <lschneider@nvidia.com>
2025-12-07 09:43:26 -08:00
CMakeLists.txt feat: reduce unnecessary kernel generation (#5476) 2025-07-04 14:37:49 +08:00
ipcsocket.cpp Update TensorRT-LLM (#2532) 2024-12-04 21:16:56 +08:00
ipcsocket.h Update TensorRT-LLM (#2532) 2024-12-04 21:16:56 +08:00
ub_allocator.cpp [None][feat] Enable NCCL_SYMMETRIC as default fallback for AllReduce (#9314) 2025-12-07 09:43:26 -08:00
ub_allocator.h [None][feat] Enable NCCL_SYMMETRIC as default fallback for AllReduce (#9314) 2025-12-07 09:43:26 -08:00
ub_interface.cpp [None][feat] Add NCCL Symmetric Integration for All Reduce (#4500) 2025-08-07 17:28:14 -07:00
ub_interface.h feat: Introduce UB allocator for pytorch flow (#3257) 2025-04-08 18:39:49 +08:00
userbuffers-host.cpp Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
userbuffers.cu [https://nvbugs/5545522][fix] move PREEXIT in UB kernels to fix accuracy issue (#8318) 2025-11-04 16:42:31 +08:00
userbuffers.h None - Add one-shot version for UB AR NORM FP16/BF16 (#2995) 2025-03-31 11:16:03 +08:00
userbuffersManager.cpp [None][feat] Enable NCCL_SYMMETRIC as default fallback for AllReduce (#9314) 2025-12-07 09:43:26 -08:00
userbuffersManager.h [None][feat] Enable NCCL_SYMMETRIC as default fallback for AllReduce (#9314) 2025-12-07 09:43:26 -08:00
utils.h Update TensorRT-LLM (#2783) 2025-02-13 18:40:22 +08:00