TensorRT-LLMs/cpp/tensorrt_llm/kernels/userbuffers
Yibin Li 32ae1564bd
update FP4 quantize layout (#3045)
Signed-off-by: Yibin Li <109242046+yibinl-nvidia@users.noreply.github.com>
2025-04-03 13:13:54 -04:00
| File | Last commit | Date |
| --- | --- | --- |
| CMakeLists.txt | Update TensorRT-LLM (#2532) | 2024-12-04 21:16:56 +08:00 |
| ipcsocket.cpp | Update TensorRT-LLM (#2532) | 2024-12-04 21:16:56 +08:00 |
| ipcsocket.h | Update TensorRT-LLM (#2532) | 2024-12-04 21:16:56 +08:00 |
| ub_allocator.cpp | Update TensorRT-LLM (#2783) | 2025-02-13 18:40:22 +08:00 |
| ub_allocator.h | Update TensorRT-LLM (#2792) | 2025-02-18 21:27:39 +08:00 |
| ub_interface.cpp | None - Add one-shot version for UB AR NORM FP16/BF16 (#2995) | 2025-03-31 11:16:03 +08:00 |
| ub_interface.h | None - Add one-shot version for UB AR NORM FP16/BF16 (#2995) | 2025-03-31 11:16:03 +08:00 |
| userbuffers-host.cpp | Update TensorRT-LLM (#2792) | 2025-02-18 21:27:39 +08:00 |
| userbuffers.cu | update FP4 quantize layout (#3045) | 2025-04-03 13:13:54 -04:00 |
| userbuffers.h | None - Add one-shot version for UB AR NORM FP16/BF16 (#2995) | 2025-03-31 11:16:03 +08:00 |
| utils.h | Update TensorRT-LLM (#2783) | 2025-02-13 18:40:22 +08:00 |