Commit Graph

88 Commits

Author SHA1 Message Date
Linda
898f37faa0
[None][feat] Enable nanobind as the default binding library (#6608)
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
2025-08-22 09:48:41 +02:00
Jin Li
e5e417019b
[None][chore] Only check the bindings lib for current build (#7026)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2025-08-20 14:17:17 -04:00
Martin Marciniszyn Mehringer
425dad01fd
[None][fix] Clean up linking to CUDA stub libraries in build_wheel.py (#6823)
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
Co-authored-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
2025-08-18 11:20:51 -04:00
QI JUN
8845e0f065
[None][fix] fix ci (#6814)
2025-08-12 02:21:50 -07:00
Zhenhua Wang
7e33ed6d61
[None][chore] always try-catch when clear build folder in build_wheel.py (#6748)
Signed-off-by: Zhenhua Wang <zhenhuaw@nvidia.com>
2025-08-11 14:02:17 +02:00
Martin Marciniszyn Mehringer
9a8195ef88
fix: Ensure that Python stub generation works against libnvidia-ml stubs (#6188)
Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
2025-08-11 09:18:17 +02:00
Zongfei Jing
0ff8df95b7
[https://nvbugs/5433581][fix] DeepGEMM installation on SBSA (#6588)
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
2025-08-06 16:44:21 +08:00
Zhenhua Wang
59d91b8b94
[None][chore] add online help to build_wheel.py and fix a doc link (#6391)
Signed-off-by: Zhenhua Wang <zhenhuaw@nvidia.com>
2025-08-04 13:14:55 +08:00
pcastonguay
e7ae5e2824
feat: Add support for disaggregation with pp with pytorch backend (#6369)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Signed-off-by: raayandhar <rdhar@nvidia.com>
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: pcastonguay <55748270+pcastonguay@users.noreply.github.com>
Co-authored-by: raayandhar <rdhar@nvidia.com>
Co-authored-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-07-30 09:42:13 -04:00
Zhanrui Sun
c3729dbd7d
infra: [TRTLLM-5873] Use build stage wheels to speed up docker release image build (#4939)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-07-29 12:54:38 -04:00
Martin Marciniszyn Mehringer
943fd418dd
fix: Ensure mlx5 library is installed for deep_ep and remove deprecated python bindings (#6189)
Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
2025-07-20 10:38:51 +08:00
Venky
22d4a8c48a
enh: Add script to map tests <-> jenkins stages & vice-versa (#5177)
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-07-19 00:50:40 +08:00
Zhenhuan Chen
8c1c9ef7aa
fix: convert venv_prefix to str before comparison with base_prefix (#6121)
Signed-off-by: Zhenhuan Chen <chenzhh3671@gmail.com>
2025-07-17 15:04:54 +08:00
William Tambellini
fbb4cc7379
[TRTLLM-4770][feat] Enhance cpp executor cmake to listen to ENABLE_MULTI_DEVICE (#5104)
Signed-off-by: William Tambellini <wtambellini@sdl.com>
2025-07-11 10:59:44 +08:00
Linda
4d071eb2d1
feat: binding type build argument (pybind, nanobind) (#5802)
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
2025-07-11 00:48:50 +09:00
ixlmar
10e686466e
fix: use current_image_tags.properties in rename_docker_images.py (#5846)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-07-09 17:07:52 +09:00
xavier-nvidia
b6013da198
Fix GEMM+AR fusion on blackwell (#5563)
Signed-off-by: xsimmons <xsimmons@nvidia.com>
2025-07-09 08:48:47 +08:00
Tailing Yuan
85b4a6808d
Refactor: move DeepEP from Docker images to wheel building (#5534)
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-07-07 22:57:03 +09:00
Yuan Tong
32b244af38
feat: reduce unnecessary kernel generation (#5476)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-07-04 14:37:49 +08:00
Alessio Netti
7e681fbe52
[chore] Allow configuring linking of NVRTC wrapper (#5189)
Signed-off-by: Alessio Netti <netti.alessio@gmail.com>
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
Co-authored-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-06-26 07:26:10 +02:00
qsang-nv
faca19c2f0
update setup.py for special cases (#5227)
Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
2025-06-17 16:41:07 +08:00
dongxuy04
1e369658f1
feat: large-scale EP(part 6: Online EP load balancer integration for GB200 nvfp4) (#4818)
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
Signed-off-by: ShiXiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
Co-authored-by: ShiXiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-06-08 10:25:18 +08:00
Emma Qiao
202813f054
Check test names in waive list (#4292)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-06-01 14:39:30 +08:00
Emma Qiao
c945e92fdb
[Infra]Remove some old keyword (#4552)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-05-31 13:50:45 +08:00
Yuan Tong
5cb4f9be33
feat: improve build_wheel.py venv handling (#4525)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-05-27 15:17:14 +08:00
Emma Qiao
6f626af386
[TRTLLM-4535][infra]: Add marker TIMEOUT for test level (#3905)
* Add marker for TIMEOUT

Signed-off-by: qqiao <qqiao@nvidia.com>

* Remove workspace after tests

Signed-off-by: qqiao <qqiao@nvidia.com>

* Add missed property

Signed-off-by: qqiao <qqiao@nvidia.com>

* Add some debug info

Signed-off-by: qqiao <qqiao@nvidia.com>

* Fix errors

Signed-off-by: qqiao <qqiao@nvidia.com>

* Testing

Signed-off-by: qqiao <qqiao@nvidia.com>

* Special process for unittests

Signed-off-by: qqiao <qqiao@nvidia.com>

* Move special processing unittests to test generating stage

Signed-off-by: qqiao <qqiao@nvidia.com>

* Process for the whole test list

Signed-off-by: qqiao <qqiao@nvidia.com>

* Test more

Signed-off-by: qqiao <qqiao@nvidia.com>

* Add another test case

Signed-off-by: qqiao <qqiao@nvidia.com>

* Change back the setting for testing

Signed-off-by: qqiao <qqiao@nvidia.com>

* Revert another config file

Signed-off-by: qqiao <qqiao@nvidia.com>

* Add description for timeout in test readme

Signed-off-by: qqiao <qqiao@nvidia.com>

---------

Signed-off-by: qqiao <qqiao@nvidia.com>
2025-05-25 23:30:40 -07:00
Shi Xiaowei
3d62727303
test: NIXL single process test (#4486)
2025-05-21 10:41:46 +08:00
Shi Xiaowei
df2798e0c3
feat: NIXL interface integration (#3934)
NIXL interfaces

Signed-off-by: ShiXiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-05-19 18:18:22 +08:00
Yuan Tong
593f65ff6a
fix: better method to help torch find nvtx3 (#4110)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-05-15 16:42:30 +08:00
Zhanrui Sun
5dc3b539ba
infra: Down the gcc toolset version from 13 to 11 (#4114)
* Down the gcc toolset version from 13 to 11

Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>

* Update rocky8 images

Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>

---------

Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-05-15 11:08:51 +08:00
qsang-nv
0fd59d64ab
infra: open source fmha v2 kernels (#4185)
* add fmha repo

Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>

* fix format

Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>

* fix code style

Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>

* fix header

Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>

* fix header kernel_traits.h

Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>

* add .gitignore file

Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>

* add SLIDING_WINDOW_ATTENTION

Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>

* fix style

Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>

* fix format

Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>

* update setup.py

Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>

* update build_wheel.py

Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>

---------

Signed-off-by: Qidi Sang <200703406+qsang-nv@users.noreply.github.com>
Signed-off-by: qsang-nv <200703406+qsang-nv@users.noreply.github.com>
2025-05-15 10:56:34 +08:00
Yanchao Lu
504f4bf779
[Infra] - Update the upstream PyTorch dependency to 2.7.0 (#4235)
[Infra][TRTLLM-4941] - Update the upstream PyTorch dependency to 2.7.0

Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-05-14 22:28:13 +08:00
Yiqing Yan
fda8b0277a
[Infra][TRTLLM-4374] Upgrade TRT 10.10.0 GA, CUDA 12.9 GA and DLFW 25.04 (#4049)
* [TRTLLM-4374] Upgrade TRT 10.10.0 GA, CUDA 12.9 GA and DLFW 25.04

Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>

* fix review

Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>

* update images

Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>

* Update jenkins/L0_Test.groovy

Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>

* update image name

Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>

---------

Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-05-13 14:59:12 +08:00
Martin Marciniszyn Mehringer
33977dbd42
infra: [TRTLLM-325] Prepare for NGC release - multiplatform build (#4191)
* infra: [TRTLLM-325] Prepare for NGC release - prepare multiplatform build

Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
2025-05-12 00:38:45 -07:00
Martin Marciniszyn Mehringer
d0e672f96d
chore: [TRTLLM-325][infra] Prepare for NGC release - reduce size of the docker images (#3990)
* chore: reduce size of the docker images

Signed-off-by: Martin Marciniszyn Mehringer <11665257+martinmarciniszyn@users.noreply.github.com>

* Finish the renaming script and run with new images.

Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>

* Fix installation of GCC toolset for Rocky Linux

Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>

* Upgrade to new docker images

Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>

---------

Signed-off-by: Martin Marciniszyn Mehringer <11665257+martinmarciniszyn@users.noreply.github.com>
Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
2025-05-09 19:31:29 +08:00
Yuan Tong
4b6c19737b
feat: support add internal cutlass kernels as subproject (#3658)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-05-06 11:35:07 +08:00
tburt-nv
0aca05514a
build: keep using system python for dev install (#4014)
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-05-03 07:38:47 +02:00
tburt-nv
7053d0ad5a
infra: add conan (#3744)
This MR integrates Conan into the build system, so that it can be used to fetch dependencies in future changes.

Also installs all requirements-dev.txt inside a virtualenv instead of the system, since some of Conan's dependencies may conflict with the system packages. Virtualenv is used instead of venv because the triton server backend container has only virtualenv installed. This also allows developers to cache the requirements-dev.txt packages between container launches.


Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-30 11:53:14 -07:00
Ming Wei
ed887940d4
infra: open source XQA kernels (#3762)
Replace libtensorrt_llm_nvrtc_wrapper.so with its source code, which
consists of two parts:

1. NVRTC glue code
2. XQA kernel code

During TensorRT-LLM build, XQA kernel code is embedded as C++ arrays via
gen_cpp_header.py and passed to NVRTC for JIT compilation.

Signed-off-by: Ming Wei <2345434+ming-wei@users.noreply.github.com>
2025-04-30 18:05:15 +08:00
Emma Qiao
48db263d9a
infra: Add test list name check (#3097)
* Add steps to check test names

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Correct test-db command

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Switch to use a trt-llm image

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Update go path

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Correct go path

Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Move the test list check to test ci

Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Correct file path

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix path again

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix get path

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix typo

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Skip test list check for ARM

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix expression

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Change back unrelated file

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Correct qa test names

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Remove a stage

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Update jenkins/L0_Test.groovy

Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Move some steps to a python script

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix script path

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Split commands and debug

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix typo

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix typo

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Also correct case name in waives list

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Move check script to another folder

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Update qa list after rebase

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix rebase

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Remove the perf tests under QA

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Some tests already fixed after rebase to TOT

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

---------

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-04-20 23:02:16 +08:00
Zheng Duan
bce7ea8c38
test: add kv cache event tests for disagg workers (#3602)
2025-04-18 18:30:19 +08:00
Yuan Tong
0b0e6d8a0a
refactor: Clean up CMakeLists.txt (#3479)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-04-18 14:39:29 +08:00
Emma Qiao
2f48985b9c
infra: Add step to generate new duration file (#3298)
* Add step to generate new duration file

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Install python in earlier step

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Clone repo and add debug info

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Remove debug info and only generate duration for post-merge

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Test for the new duration file

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Update the duration file format

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Move generate_duration.py to scripts folder and add try-catch to avoid any breakage

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

---------

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
2025-04-18 12:56:31 +08:00
Kaiyu Xie
258ae9c58c
Revert "infra: move nvrtc_wrapper to conan (#3282)" (#3573)
This reverts commit c0dd6cbce0.

Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-04-15 22:45:13 +08:00
tburt-nv
c0dd6cbce0
infra: move nvrtc_wrapper to conan (#3282)
* add pip scripts dir to path
* move nvrtc_wrapper to conan
* support building nvrtc wrapper from source

---------

Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-15 05:31:01 +08:00
Yuan Tong
5985d362a9
fix: install RTC headers with linking when using --linking_install_binary (#3484)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-04-14 21:22:03 +08:00
tburt-nv
5616c0d232
add precommit check to github actions (#3129)
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
2025-04-11 06:40:53 +08:00
Gabriel Wu
376731013d
feat: use NVRTC for DeepGEMM JIT compilation (#3239)
* feat: use NVRTC for DeepGEMM JIT compilation

Signed-off-by: Zihua Wu 

* fix: add license

Signed-off-by: Zihua Wu

* feat: store NVRTC JIT results in memory by default

Signed-off-by: Zihua Wu


* feat: refinement

Signed-off-by: Zihua Wu

* feat: refinement

Signed-off-by: Zihua Wu

* test: set timeout to 7200

Signed-off-by: Zihua Wu

---------

Signed-off-by: Zihua Wu
2025-04-07 20:29:23 +08:00
Ming Wei
ca6615d800
Remove gen_cuda_headers_for_xqa.py (#3222)
No longer needed.
2025-04-03 07:13:22 +08:00
xiweny
6979afa6f2
test: reorganize tests folder hierarchy (#2996)
1. move TRT path tests to 'trt' folder
2. optimize some import usage
2025-03-27 12:07:53 +08:00