Grzegorz Kwasniewski
38bcee189c
[TRTLLM-10362][feat] Added Mamba and MLA layers to the sharding tests ( #10364 )
Signed-off-by: greg-kwasniewski1 <213329731+greg-kwasniewski1@users.noreply.github.com>
Signed-off-by: Grzegorz Kwasniewski <213329731+greg-kwasniewski1@users.noreply.github.com>
2026-01-28 10:34:10 +01:00
Linda
ce556290c9
[None][chore] Removing pybind11 bindings and references ( #10550 )
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
2026-01-26 08:19:12 -05:00
Bo Deng
338b29d5ae
[None][infra] trigger multi-gpu tests when install_nixl/ucx.sh is mod… ( #10624 )
Signed-off-by: Bo Deng <deemod@nvidia.com>
2026-01-20 17:55:32 +08:00
chenfeiz0326
e97af45556
[TRTLLM-10300][feat] Upload regression info to artifactory ( #10599 )
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2026-01-19 10:16:31 +08:00
Yanchao Lu
c4f27fa4c0
[None][ci] Some tweaks for the CI pipeline ( #10359 )
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2026-01-04 11:10:47 -05:00
Yanchao Lu
270be801aa
[None][ci] Move remaining DGX-B200 tests to LBD ( #9876 )
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-12-28 13:55:39 +08:00
dominicshanshan
825025b137
[None][infra] Add multi gpu Ray tests into L0 merge change request list. ( #9996 )
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-12-15 15:55:54 +08:00
Yuxian Qiu
fcda1a1442
[None][fix] disable async pp send for ray cases. ( #9959 )
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-12-13 20:22:36 -08:00
Yiqing Yan
5065b60cd1
[None][infra] Fix mergeWaiveList stage ( #9892 )
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-12-12 11:19:42 +08:00
Guoming Zhang
12693a526b
[None][chore] Enable L0 multi-gpus testing for Qwen3-next ( #9789 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-12-10 17:11:32 +08:00
Shi Xiaowei
b050804b63
[TRTLLM-6537][infra] extend multi-gpu tests related file list ( #9614 )
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-12-09 12:54:53 +08:00
Yiqing Yan
e834f04238
[TRTLLM-9579][infra] Set mergeWaiveList stage UNSTABLE when there is any issue ( #9692 )
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-12-05 10:18:31 +08:00
Yiqing Yan
731b2eb4ef
[TRTLLM-5312][infra] Add triton trigger rules ( #6440 )
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-12-05 07:35:04 +08:00
Yiqing Yan
8c88454fa5
[TRTLLM-7101][infra] Reuse passed tests ( #6894 )
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-12-03 10:07:23 +08:00
Yiqing Yan
c72919980a
[TRTLLM-6768][infra] Fix params for not updating github status ( #6747 )
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-12-01 23:51:21 +08:00
Yiqing Yan
24f5cd7493
[TRTLLM-8000][infra] Catch error in merge waive list stage ( #7289 )
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-17 13:28:50 +08:00
Kaiyu Xie
04be5a704e
[None] [fix] Fix missing ActivationType issue ( #9171 )
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Signed-off-by: Neta Zmora <96238833+nzmora-nvidia@users.noreply.github.com>
Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Co-authored-by: Neta Zmora <96238833+nzmora-nvidia@users.noreply.github.com>
2025-11-17 10:43:25 +08:00
Yiteng Niu
1ce83582f9
[None][infra] update github token name ( #8907 )
2025-11-05 00:55:28 -08:00
Yanchao Lu
da73410d3b
[None][fix] WAR for tensorrt depending on the archived nvidia-cuda-runtime-cu13 package ( #8857 )
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-02 09:57:37 +08:00
Nikita Korobov
9b3d7cc3e6
[None][feat] Update TRT-LLM Gen MoE kernels ( #7970 )
Signed-off-by: Nikita Korobov <14355239+nekorobov@users.noreply.github.com>
2025-10-03 09:22:45 +08:00
mpikulski
fc7f78c400
[TRTLLM-8269][test] do not explicitly pass temperature=0 to select greedy sampling ( #8110 )
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-10-02 10:20:32 +02:00
Tracin
1f2761e67b
[None][feat] Enable gpt oss on DGX H100. ( #6775 )
Signed-off-by: Tracin <10434017+Tracin@users.noreply.github.com>
2025-09-23 09:35:19 -07:00
Yiqing Yan
5c616da2fd
[TRTLLM-5877][infra] Add fmha tests and auto trigger rules ( #6050 )
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-09-09 11:33:09 +08:00
Zhanrui Sun
0de3f83805
[TRTLLM-6893][infra] Disable the x86 / SBSA build stage when run BuildDockerImage ( #6729 )
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-09-04 07:20:15 -04:00
Robin Kobus
31979aefac
[None] [ci] Reorganize CMake and Python integration test infrastructure for C++ tests ( #6754 )
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-08-24 20:53:17 +02:00
Yiqing Yan
62d6c98d68
[TRTLLM-5633][infra] Force set changed file diff to empty string for post-merge CI ( #6777 )
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-08-11 02:38:05 -04:00
Yiqing Yan
3e41e6c077
[TRTLLM-6892][infra] Run guardwords scan first in Release Check stage ( #6659 )
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-08-06 23:00:15 -04:00
Yiqing Yan
98424f3186
[TRTLLM-5633][infra] Change the TOT repo to default-llm-repo for merge waive list ( #6605 )
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-08-06 06:19:03 -04:00
Yiqing Yan
4763e94156
[TRTLLM-5563][infra] Move test_rerun.py to script folder ( #6571 )
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-08-04 13:26:04 +08:00
Yiqing Yan
d38c26bb78
[Infra][TRTLLM-5633] - Fix merge waive list ( #6504 )
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-07-31 14:57:51 +08:00
Yiqing Yan
0cf2f6f154
[TRTLLM-5633] - Merge current waive list with the TOT waive list ( #5198 )
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-07-30 17:50:05 +08:00
Zhanrui Sun
64ba483656
infra: [TRTLLM-6499] Split L0_Test into two pipeline by single GPU and multi GPU(For SBSA) ( #6132 )
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-07-28 22:54:37 -04:00
Yiqing Yan
d97419805b
[TRTLLM-5312] - Add bot run rules for triton tests ( #4988 )
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-07-25 10:31:12 +08:00
Lizhi Zhou
3e1a0fbac4
[TRTLLM-6537][infra] extend multi-gpu tests related file list ( #6139 )
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-07-22 16:57:06 +08:00
Zhanrui Sun
3cbc23f783
infra: [TRTLLM-5250] Add sanity check stage for ngc-release images (Build wheels for devel image) ( #4656 )
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-07-21 16:06:43 +08:00
Zhanrui Sun
8454640ee1
infra: fix single-GPU stage failed will not raise error ( #6165 )
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-07-18 22:39:32 +08:00
QI JUN
e821c68611
CI: update multi gpu test trigger file list ( #6131 )
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-07-17 14:48:23 +08:00
Zhanrui Sun
e42f5a9581
infra: [TRTLLM-5879] Spilt single GPU test and multi GPU test into 2 pipelines ( #5199 )
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-07-16 18:04:04 +08:00
Yiqing Yan
6b35afaf1b
[Infra][TRTLLM-6013] - Fix stage name in single stage test rerun report ( #5672 )
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-07-15 12:27:21 +09:00
Alex Zhang
6c30d78b78
[TRTLLM-5653][infra] Run docs build only if PR contains only doc changes ( #5184 )
Signed-off-by: Alex Zhang <13271672+zhanga5@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Alex Zhang <13271672+zhanga5@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-07-14 21:40:33 +08:00
Tailing Yuan
035155df7c
Fix: ignore nvshmem_src_*.txz from confidentiality-scan ( #5831 )
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-07-08 17:17:29 +09:00
Yanchao Lu
d95ae1378b
[Infra] - Always use x86 image for the Jenkins agent and few clean-ups ( #5753 )
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-07-06 10:25:57 +08:00
ixlmar
04fa6c0cfc
[TRTLLM-6143] feat: Improve dev container tagging ( #5551 )
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-07-02 14:56:34 +02:00
Void
7992869798
perf: better heuristic for allreduce ( #5432 )
Signed-off-by: Yilin Zhang <18275976+yilin-void@users.noreply.github.com>
2025-07-01 22:56:06 -04:00
Emma Qiao
b8a568d3c6
[Infra][main] Cherry-pick from release/0.21: Update nccl to 2.27.5 ( #5539 ) ( #5587 )
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-06-30 18:12:08 +08:00
Omer Ullman Argov
fa0ea92dfd
[fix][ci] trigger multigpu tests for deepseek changes ( #5423 )
Signed-off-by: Omer Ullman Argov <118735753+omera-nv@users.noreply.github.com>
2025-06-26 14:30:00 +08:00
QI JUN
478f668dcc
CI: update multi gpu test triggering file list ( #5466 )
Signed-off-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-06-25 15:51:02 +08:00
Emma Qiao
ff32caf4d7
[Infra] - Update dependencies with NGC PyTorch 25.05 and TRT 10.11 ( #4885 )
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Co-authored-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-06-17 23:48:34 +08:00
Tailing Yuan
0b60da2c45
feat: large-scale EP(part 7: DeepEP integration) ( #4792 )
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-06-14 19:12:38 +08:00
Zhanrui Sun
a97f4581d2
infra: upload imageTag info to artifactory and add ngc_staging to save ngc image ( #4764 )
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-06-12 15:38:47 +08:00