In this change we rename 3rdparty/README.md (which contains the process playbook for C++ dependencies) to 3rdparty/cpp-thirdparty.md and add a new 3rdparty/py-thirdparty.md file which contains the process playbook for Python dependencies. We also update the main 3rdparty/README.md file to serve as a starting point referring to both of these files. Signed-off-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com> Co-authored-by: Josh Bialkowski <1309820+cheshirekow@users.noreply.github.com>
Adding new C++ Dependencies
Step 1: Make the package available to the build
First, decide if you must install the package in the container or if you may defer fetching until the build phase. In general, prefer to fetch packages during the build phase. You may be required to install packages into the container, however, if there is a runtime component (e.g. shared objects) that cannot be reasonably distributed with the wheel.
Install in the container
OS packages via the OS package manager (e.g. apt, dnf)
Add your package to one of the existing shell scripts used by the docker build under docker/common/. Find the location where the package manager is invoked, and add the name of your package there.
NOTE: Internal compliance tooling will automatically detect the installation of this package and fetch sources using the source-fetching facilities of the OS package manager.
Python Packages via pip
If it makes sense, add your package to one of the existing shell scripts used by the docker build under docker/common/. Grep for "pip3 install" to see existing invocations. If none of the existing shell scripts make sense, add a new shell script to install your package and then invoke that script in Dockerfile.multi.
NOTE: If the new python package you are adding has a compiled component (e.g. a python extension module), you must coordinate with the Security Team to ensure that the source for this component is managed correctly.
Tarball packages via HTTP/FTP
Invoke wget in a shell script which is called from the docker build file.
When it makes sense, please prefer to extend an existing script in
docker/common/ rather than creating a new one. If you are downloading a
binary package, you must also download the source package that produced that
binary.
Ensure that the source package is copied to /third-party-source and retained after all cleanup within the docker image layer.
Fetch during the build
Python Packages via pip
Add an entry to requirements-dev.txt. The package will be installed by build_wheel.py during virtual environment initialization prior to configuring the build with cmake. Include a comment indicating the intended usage of the package.
Example:
requirements-dev.txt:
# my-package is needed by <feature> where it is used for <reason>
my-package==1.2.24
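The convention shown above (a pinned version with a usage comment) can be checked mechanically. The following is a minimal sketch, not a tool that exists in the repository: the function name `check_requirements` is hypothetical, and it assumes that the usage comment sits on the line directly above the entry and that every entry is pinned with `==`.

```python
import re

# Matches a pinned requirement such as "my-package==1.2.24".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==\S+")


def check_requirements(text):
    """Return a list of (line_number, message) problems found in `text`.

    Assumed rules (hypothetical, derived from the example above):
      * every requirement is pinned with `==`
      * every requirement is directly preceded by a `#` comment line
    Blank lines, comments, and pip option lines (starting with "-")
    are not treated as requirements.
    """
    problems = []
    previous = ""
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        is_requirement = bool(line) and not line.startswith(("#", "-"))
        if is_requirement:
            if not PINNED.match(line):
                problems.append((number, f"not pinned with '==': {line}"))
            if not previous.startswith("#"):
                problems.append((number, f"missing usage comment: {line}"))
        previous = line
    return problems
```

Calling `check_requirements(open("requirements-dev.txt").read())` returns an empty list when every entry follows the convention.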
C/C++ Packages via conan
Add a new entry to conandata.yml indicating the package version for the dependency you are adding. Include a yaml comment indicating the intended usage of the package. Then add a new invocation of self.requires() within the def requirements(self) method of conanfile.py, referencing the version you added to conandata.yml.
Example:
conandata.yml:
# my_dependency is needed by <feature> where it is used for <reason>
my_dependency: 1.2.24+1
conanfile.py:
def requirements(self):
    ...
    my_dependency_version = self.conandata["my_dependency"]
    self.requires(f"my_dependency/{my_dependency_version}")
Source integration via CMake
If you have a package that you need to build from source, use CMake's FetchContent or ExternalProject to fetch the package sources and integrate them with the build. See the details in the next section.
git Submodule - Don't Use
Please avoid use of git-submodule. If, for some reason, the CMake integrations described below don't work and git-submodule is absolutely required, please add the submodule under the 3rdparty directory.
Rationale:
For a source-code dependency distributed via git, FetchContent/ExternalProject and git submodules both ultimately contain the same referential information (repository URL, commit sha) and, at the end of the day, do the same things. However, FetchContent/ExternalProject have the following advantages:
- The git operations happen during the build and are interleaved with the rest of the build processing, rather than requiring an additional step managed outside of CMake.
- The fetch, patch, and build steps for the subproject are individually named in the build, so any failures are more clearly identified.
- The build state is better contained within the build tree, where it is less prone to interference by development actions.
- For source code that is modified, FetchContent/ExternalProject can manage application of the patches, making it clear what modifications are present.
- The build does not have to make assumptions about the version control configuration of the source tree, which may be incorrect because the tree is bind-mounted in a container. For example, git submodule update --init inside a container will corrupt the git configuration outside the container if the source tree is a git worktree.
- External project references and their patches are collected under a narrower surface, rather than being spread across different tools. This makes it easier to track third-party dependencies as well as to recognize them during code review.
Example:
git submodule add https://github.com/some-organization/some-project.git 3rdparty/some-project
Step 2: Integrate the package
There are many ways to integrate a package with the build through cmake.
find_package for binary packages
For binary packages (OS-provided via apt-get or yum, or Conan-provided), prefer the use of find_package to integrate the package into the build. Conan will generate a find-script for packages that don't already come with a CMake configuration file, and the Conan-specific logic is provided through the Conan-generated toolchain already used in our build.
For any packages which do not have provided find modules (either built-in, or available from Conan), please implement one in cpp/cmake/modules. Please do not add "direct" invocations of find_library / add_library / find_file / find_path outside of a find module for the package.
Please add invocations of find_package directly in the root CMake file.
Example:
cpp/CMakeLists.txt
find_package(NIXL)
cpp/cmake/modules/FindNIXL.cmake
...
find_library(
NIXL_LIBRARY nixl
HINTS
${NIXL_ROOT}/lib/${NIXL_TARGET_ARCH}
${NIXL_ROOT}/lib64)
...
add_library(NIXL::nixl SHARED IMPORTED)
set_target_properties(
NIXL::nixl
PROPERTIES
INTERFACE_INCLUDE_DIRECTORIES ${NIXL_INCLUDE_DIR}
IMPORTED_LOCATION ${NIXL_LIBRARY}
${NIXL_BUILD_LIBRARY}
${SERDES_LIBRARY}
)
FetchContent for source packages with compatible cmake builds
For source packages that have a compatible cmake (e.g. where add_subdirectory will work correctly), please use FetchContent to download the sources and integrate them into the build. Please add new invocations of FetchContent_Declare in 3rdparty/CMakeLists.txt. Add new invocations for FetchContent_MakeAvailable wherever it makes sense in the build where you are integrating it, but prefer the root listfile for that build (cpp/CMakeLists.txt for the primary build).
CODEOWNERS for this file will consist of PLC reviewers who verify that third-party license compliance strategies are being followed.
If the dependency you are adding has modified sources, please do the following:
- Create a repository on gitlab to mirror the upstream source files. If the upstream is also in git, please use the gitlab "mirror" repository option. Otherwise, please use branches/tags to help identify the upstream source versions.
- Track NVIDIA changes in a branch. Use a linear sequence (trunk-based) development strategy. Use meaningful, concise commit message subjects and comprehensive commit messages for the changes applied.
- Use git format-patch <upstream-commit>...HEAD to create a list of patches, one file per commit.
- Add your patches under 3rdparty/patches/<package-name>.
- Use CMake's PATCH_COMMAND option to apply the patches during the build process.
Example:
3rdparty/CMakeLists.txt
FetchContent_Declare(
pybind11
GIT_REPOSITORY https://github.com/pybind/pybind11.git
GIT_TAG f99ffd7e03001810a3e722bf48ad1a9e08415d7d
)
cpp/CMakeLists.txt
FetchContent_MakeAvailable(pybind11)
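For dependencies with modified sources, the patch export step described above can be scripted. The sketch below is an illustration under stated assumptions, not a script that exists in the repository: the function names, the `repo_root` parameter, and the idea of running it from a local checkout of the mirror are all hypothetical. It uses the two-dot range `<upstream-commit>..HEAD`, which selects the same commits as the three-dot form when the upstream commit is an ancestor of HEAD, as it is in a linear branch.

```python
import subprocess
from pathlib import Path


def format_patch_command(upstream_commit, package_name, repo_root="."):
    """Build the git command that writes one patch file per commit.

    `repo_root` is assumed to be the checkout that contains the
    3rdparty/ directory. The output path is made absolute so that it
    does not depend on the directory git is run from.
    """
    out_dir = Path(repo_root).resolve() / "3rdparty" / "patches" / package_name
    return [
        "git", "format-patch",
        f"{upstream_commit}..HEAD",
        "--output-directory", str(out_dir),
    ]


def export_patches(mirror_checkout, upstream_commit, package_name, repo_root="."):
    """Run git format-patch inside the (hypothetical) mirror checkout."""
    command = format_patch_command(upstream_commit, package_name, repo_root)
    # check=True raises CalledProcessError if git exits with an error.
    subprocess.run(command, cwd=mirror_checkout, check=True)
```

The resulting patch files can then be referenced from PATCH_COMMAND in the corresponding FetchContent_Declare or ExternalProject_Add call.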
ExternalProject
If the package you are adding doesn't support FetchContent (e.g. if it's not built by CMake or if its CMake configuration doesn't nest well), then please use ExternalProject. In this case that project's build system will be invoked as a build step of the primary build system. Note that, unless both the primary and child build systems are GNU Make, they will not share a job server and will independently schedule parallelism (e.g. -j flags).
Example:
ExternalProject_Add(
nvshmem_project
URL https://developer.download.nvidia.com/compute/nvshmem/redist/libnvshmem/linux-x86_64/libnvshmem-linux-x86_64-3.2.5_cuda12-archive.tar.xz
URL_HASH ${NVSHMEM_URL_HASH}
PATCH_COMMAND patch -p1 --forward --batch -i
${DEEP_EP_SOURCE_DIR}/third-party/nvshmem.patch
...
CMAKE_CACHE_ARGS
-DCMAKE_C_COMPILER:STRING=${CMAKE_C_COMPILER}
-DCMAKE_C_COMPILER_LAUNCHER:STRING=${CMAKE_C_COMPILER_LAUNCHER}
...
BINARY_DIR ${CMAKE_CURRENT_BINARY_DIR}/nvshmem-build
BUILD_BYPRODUCTS
${CMAKE_CURRENT_BINARY_DIR}/nvshmem-build/src/lib/libnvshmem.a
)
add_library(nvshmem_project::nvshmem STATIC IMPORTED)
add_dependencies(nvshmem_project::nvshmem nvshmem_project)
...
set_target_properties(
nvshmem_project::nvshmem
PROPERTIES IMPORTED_LOCATION
${CMAKE_CURRENT_BINARY_DIR}/nvshmem-build/src/lib/libnvshmem.a
INTERFACE_INCLUDE_DIRECTORIES
${CMAKE_CURRENT_BINARY_DIR}/nvshmem-build/src/include)
Step 3: Update third-party attributions and license tracking
- Clone the dependency source code to an NVIDIA-controlled repository. The consumed commit must be stored as-received (ensure the consumed commit-sha is present in the clone). For sources available via git (or git-adaptable) SCM, mirror the repository in the oss-components gitlab project.
- Collect the license text of the consumed commit.
- If the license does not include a copyright notice, collect any copyright notices that were originally published with the dependency (these may be on individual file levels, in metadata files, or in packaging control files).
- Add the license and copyright notices to the ATTRIBUTIONS-CPP-x86_64.md and ATTRIBUTIONS-CPP-aarch64.md files.
CODEOWNERS for ATTRIBUTIONS-CPP-*.md are members of the PLC team. Modifying these files signals to reviewers that they must verify that your change follows the process in this document.
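Because there are two attribution files, it is easy to update one and forget the other. The sketch below shows one possible pre-review check. It is hypothetical: the function name is invented, the files are assumed to live directly under `repo_root`, and the check assumes that an attribution entry mentions the package name verbatim, which may not match the actual layout of these files.

```python
from pathlib import Path

# File names taken from the step above; their location is an assumption.
ATTRIBUTION_FILES = (
    "ATTRIBUTIONS-CPP-x86_64.md",
    "ATTRIBUTIONS-CPP-aarch64.md",
)


def missing_attributions(package_name, repo_root="."):
    """Return the attribution files that do not mention `package_name`.

    A missing file is reported the same way as a file without an entry.
    The match is a case-insensitive substring search.
    """
    missing = []
    for name in ATTRIBUTION_FILES:
        path = Path(repo_root) / name
        text = path.read_text(encoding="utf-8") if path.is_file() else ""
        if package_name.lower() not in text.lower():
            missing.append(name)
    return missing
```

An empty result means both files mention the package; it does not verify that the license text or copyright notices are complete.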
Step 4: File a JIRA ticket if you need help from the Security team
This step is optional; follow it only if you need assistance from the Security Team.
File a Jira ticket using the issue template TRTLLM-8383 to request inclusion of this new dependency and initiate license and/or security review. The Security Team will triage and assign the ticket.
If you don’t have access to the JIRA project, please email the Security Team.