TensorRT-LLMs/deep_ep at 01423ac183008343c04cbf90de5901fb40f52054

kanshan/TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

yifeizhang-c 34d158b6da [TRTLLM-6589][feat] Support CUDA graph for DeepEP (#7514) Signed-off-by: Yifei Zhang <219273404+yifeizhang-c@users.noreply.github.com>		2025-10-02 10:13:24 -07:00
CMakeLists.txt	[TRTLLM-6589][feat] Support CUDA graph for DeepEP (#7514)	2025-10-02 10:13:24 -07:00
deep_ep_cpp_tllm.version	Refactor: move DeepEP from Docker images to wheel building (#5534)	2025-07-07 22:57:03 +09:00
nvshmem_fast_build.patch	Refactor: move DeepEP from Docker images to wheel building (#5534)	2025-07-07 22:57:03 +09:00
nvshmem_src_3.2.5-1.txz	Refactor: move DeepEP from Docker images to wheel building (#5534)	2025-07-07 22:57:03 +09:00
README.md	Refactor: move DeepEP from Docker images to wheel building (#5534)	2025-07-07 22:57:03 +09:00
strip_nvshmem_helper.py	Refactor: move DeepEP from Docker images to wheel building (#5534)	2025-07-07 22:57:03 +09:00

README.md

How to generate nvshmem_fast_build.patch?

1. Build the project without applying nvshmem_fast_build.patch.
2. Link NVSHMEM to DeepEP with one NVSHMEM object file omitted.
3. Repeat step 2 until no more object files can be omitted.
4. Remove the unused files from NVSHMEM's CMakeLists.txt, and save the differences as nvshmem_fast_build.patch.

The script strip_nvshmem_helper.py automatically performs steps 2 and 3.
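
The elimination loop in steps 2 and 3 can be pictured with the minimal Python sketch below. It is an illustration under stated assumptions, not the contents of strip_nvshmem_helper.py: the function `minimize_objects` and the `link_succeeds` callback are hypothetical names, and the real link command is replaced by a toy stand-in.

```python
from typing import Callable, Iterable, List


def minimize_objects(
    objects: Iterable[str],
    link_succeeds: Callable[[List[str]], bool],
) -> List[str]:
    """Greedily drop object files for as long as the link still succeeds.

    `link_succeeds` is a hypothetical callback: given a list of object
    files, it returns True if DeepEP still links against them.
    """
    kept = list(objects)
    changed = True
    while changed:  # step 3: repeat until a full pass omits nothing
        changed = False
        for obj in list(kept):
            # Step 2: try the link with exactly one object file omitted.
            trial = [o for o in kept if o != obj]
            if link_succeeds(trial):
                kept = trial
                changed = True
    return kept


# Toy stand-in for a real link step (assumption): the "link" succeeds
# whenever every required object file is still present.
required = {"init.o", "put.o"}
result = minimize_objects(
    ["init.o", "put.o", "unused_a.o", "unused_b.o"],
    lambda objs: required.issubset(objs),
)
print(result)  # ['init.o', 'put.o']
```

In a real run, the callback would invoke the actual link step and report whether it succeeded; the object files dropped along the way are the candidates to remove from NVSHMEM's CMakeLists.txt in step 4.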