9 Commits

Author SHA1 Message Date
Yanchao Lu
a07bb163f7
[None][ci] Correct docker args for GPU devices and remove some stale CI codes (#7417)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-09-02 04:06:51 -04:00
Yiqing Yan
486bc763c3
[None][infra] Split DGX_B200 stage into multiple parts and pre-/post-merge (#7074)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-08-24 21:09:04 -04:00
Yanchao Lu
ec35481b0a
[None][infra] Prepare for single GPU GB200 test pipeline (#7073)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-08-24 21:46:39 +08:00
Yiqing Yan
4763e94156
[TRTLLM-5563][infra] Move test_rerun.py to script folder (#6571)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-08-04 13:26:04 +08:00
Yiqing Yan
3f7abf87bc
[TRTLLM-6224][infra] Upgrade dependencies to DLFW 25.06 and CUDA 12.9.1 (#5678)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-08-03 11:18:59 +08:00
Yiqing Yan
d38c26bb78
[Infra][TRTLLM-5633] - Fix merge waive list (#6504)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-07-31 14:57:51 +08:00
Yiqing Yan
0cf2f6f154
[TRTLLM-5633] - Merge current waive list with the TOT waive list (#5198)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-07-30 17:50:05 +08:00
Emma Qiao
1cc49494fe
[Infra] - Add waive list for pytest when using slurm (#6130)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-07-17 16:53:15 +08:00
yuanjingx87
a1c5704055
[feat] Multi-node CI testing support via Slurm (#4771)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
Signed-off-by: yuanjingx87 <197832395+yuanjingx87@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-06-19 01:11:12 +08:00