TensorRT-LLMs/cpp/tensorrt_llm/deep_ep
yifeizhang-c 34d158b6da
[TRTLLM-6589][feat] Support CUDA graph for DeepEP (#7514)
Signed-off-by: Yifei Zhang <219273404+yifeizhang-c@users.noreply.github.com>
2025-10-02 10:13:24 -07:00
CMakeLists.txt [TRTLLM-6589][feat] Support CUDA graph for DeepEP (#7514) 2025-10-02 10:13:24 -07:00
deep_ep_cpp_tllm.version Refactor: move DeepEP from Docker images to wheel building (#5534) 2025-07-07 22:57:03 +09:00
nvshmem_fast_build.patch Refactor: move DeepEP from Docker images to wheel building (#5534) 2025-07-07 22:57:03 +09:00
nvshmem_src_3.2.5-1.txz Refactor: move DeepEP from Docker images to wheel building (#5534) 2025-07-07 22:57:03 +09:00
README.md Refactor: move DeepEP from Docker images to wheel building (#5534) 2025-07-07 22:57:03 +09:00
strip_nvshmem_helper.py Refactor: move DeepEP from Docker images to wheel building (#5534) 2025-07-07 22:57:03 +09:00

How to generate nvshmem_fast_build.patch?

  1. Build the project without applying nvshmem_fast_build.patch.
  2. Relink DeepEP against NVSHMEM with one NVSHMEM object file omitted. If the link still succeeds, DeepEP does not need that object file.
  3. Repeat step 2 until no more object files can be omitted.
  4. Remove the source files behind the omitted object files from NVSHMEM's CMakeLists.txt, and save the resulting diff as nvshmem_fast_build.patch.

The script strip_nvshmem_helper.py automatically performs steps 2 and 3.
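
As an illustration only, the elimination loop in steps 2 and 3 can be sketched as follows. This is not the code in strip_nvshmem_helper.py: the function name `prune_objects`, the `link_ok` callback, and the object file names are hypothetical. In a real run, `link_ok` would invoke the linker on DeepEP with the given object files and report whether the link succeeded.

```python
from typing import Callable, Iterable, List, Tuple


def prune_objects(
    objects: Iterable[str],
    link_ok: Callable[[List[str]], bool],
) -> Tuple[List[str], List[str]]:
    """Greedily drop object files while the link still succeeds.

    `link_ok` is a hypothetical callback: it receives the candidate list of
    object files and returns True if DeepEP still links against them.
    """
    kept = list(objects)
    removed: List[str] = []
    changed = True
    while changed:  # step 3: repeat until a full pass removes nothing
        changed = False
        for obj in list(kept):
            trial = [o for o in kept if o != obj]  # step 2: omit one file
            if link_ok(trial):
                kept = trial
                removed.append(obj)
                changed = True
    return kept, removed


# Stand-in for a real link step: the "link" succeeds only if every
# required object file (names invented for this example) is present.
required = {"init.o", "transport.o"}
kept, removed = prune_objects(
    ["init.o", "transport.o", "perftest.o", "examples.o"],
    lambda objs: required <= set(objs),
)
print("kept:", kept)
print("removed:", removed)
```

The files collected in `removed` correspond to the entries that step 4 deletes from NVSHMEM's CMakeLists.txt.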