Yanchao Lu
|
a07bb163f7
|
[None][ci] Correct docker args for GPU devices and remove some stale CI codes (#7417)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
|
2025-09-02 04:06:51 -04:00 |
|
Yiqing Yan
|
486bc763c3
|
[None][infra] Split DGX_B200 stage into multiple parts and pre-/post-merge (#7074)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
|
2025-08-24 21:09:04 -04:00 |
|
Yanchao Lu
|
ec35481b0a
|
[None][infra] Prepare for single GPU GB200 test pipeline (#7073)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
|
2025-08-24 21:46:39 +08:00 |
|
Yiqing Yan
|
4763e94156
|
[TRTLLM-5563][infra] Move test_rerun.py to script folder (#6571)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
|
2025-08-04 13:26:04 +08:00 |
|
Yiqing Yan
|
3f7abf87bc
|
[TRTLLM-6224][infra] Upgrade dependencies to DLFW 25.06 and CUDA 12.9.1 (#5678)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
|
2025-08-03 11:18:59 +08:00 |
|
Yiqing Yan
|
d38c26bb78
|
[Infra][TRTLLM-5633] - Fix merge waive list (#6504)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
|
2025-07-31 14:57:51 +08:00 |
|
Yiqing Yan
|
0cf2f6f154
|
[TRTLLM-5633] - Merge current waive list with the TOT waive list (#5198)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
|
2025-07-30 17:50:05 +08:00 |
|
Emma Qiao
|
1cc49494fe
|
[Infra] - Add waive list for pytest when using slurm (#6130)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-07-17 16:53:15 +08:00 |
|
yuanjingx87
|
a1c5704055
|
[feat] Multi-node CI testing support via Slurm (#4771)
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
Signed-off-by: yuanjingx87 <197832395+yuanjingx87@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
|
2025-06-19 01:11:12 +08:00 |
|