Yi Zhang
5ac92bb8ff
[nvbugs/5336321][fix] Enable attention dp = False test case, Fix TRTLLM Gen Moe workspace allocation ( #5463 )
...
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
Signed-off-by: yizhan <187001205+yizhang-nv@users.noreply.github.com>
2025-07-04 23:23:41 +09:00
Yiqing Yan
3e44db11c9
[Infra][nvbugs/5370968] - Unwaive l0 test ( #5750 )
...
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-07-04 15:27:53 +08:00
Yi Zhang
53394e0030
test: Move some of the test from post merge to pre-merge, update dgx b200 test case ( #5640 )
...
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
2025-07-04 13:26:53 +09:00
brb-nv
2b66fe8fbd
[nvbug/5341178][fix] Fix OOM in Llama 4 accuracy test ( #5735 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-07-04 10:55:34 +08:00
Faraz
8a8d2e9901
[NVBUG:5355009] Modify check for fuse_fp4_quant on SM120 ( #5651 )
...
Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
Co-authored-by: peaceh-nv <103117813+peaceh-nv@users.noreply.github.com>
2025-07-03 22:08:15 +09:00
Emma Qiao
2f9d0619c3
[Infra] - Waive failed cases on release/0.21 ( #5674 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-07-02 22:23:54 -04:00
brb-nv
a3c0cf02ce
fix: Investigate Gemma3 1B decoder output discrepancy ( #5564 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-07-03 09:55:25 +08:00
bhsueh_NV
d5606b062a
fix: [ https://nvbugs/5355219 ] Fix bug of Qwen3 235B CI on dgx_gb200 ( #5602 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-07-02 10:07:01 +08:00
Yi Zhang
aa0b9278d2
test: add more tests for GB200 with 8 GPUs/2 nodes in L0 tests ( #5397 )
...
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
2025-07-01 01:06:47 -04:00
Zheng Duan
1824c44004
[nvbug 5300551] test: increase block count in eviction test ( #5465 )
...
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
2025-07-01 10:48:25 +08:00
nv-guomingz
9fe1dd6be1
fix: https://nvbugs/5362398 ( #5609 )
...
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-06-30 13:29:40 -04:00
Yan Chunwei
d6c81bad97
fix [nvbug5351244]: test_mpi_session submit sync/async ( #5608 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-07-01 00:48:59 +08:00
Venky
4fc0666daa
[cherry-pick] [CI] Waive test_fp8_block_scales_4gpus[ep4-mtp_nextn=0-fp8kv=True-attention_dp=True-cuda_graph=True-overlap_scheduler=True-torch_compile=False] ( #5553 )
...
Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>
2025-06-28 01:15:04 +08:00
Yan Chunwei
b78ad754c8
ci: unwaive llmapi launch test ( #5281 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-06-27 14:10:45 +08:00
Emma Qiao
e2054bb2aa
[Infra][release/0.21] - waive failed tests ( #5537 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-06-27 13:58:13 +08:00
Yan Chunwei
87ead4ecbe
[nvbug 5273941] fix: broken cyclic reference detect ( #5417 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-06-26 07:35:35 +08:00
Emma Qiao
b6d23d58c4
[Infra] - Waive failed tests on release/0.21 ( #5477 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-06-25 19:01:55 +08:00
HuiGao-NV
5cd87bee41
tests: Set kv cache free memory fraction in test case ( #5462 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-06-25 16:27:46 +08:00
ruodil
5e50fcc51b
test: set enable_attention_dp=True in default deepseek settings ( #5461 )
...
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-06-25 14:21:14 +08:00
brb-nv
32f50ded17
nvbugs-5331031; nvbugs-5344203 - address intermittent issues with Mistral Small multimodal for BS=8 ( #5453 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-06-25 11:45:14 +08:00
Ivy Zhang
9e110b2d11
tests: fix typos in qa test ( #5421 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-06-25 10:42:34 +08:00
Yi Zhang
2d5e202484
fix: Fix skip by mpi size fixture ( #5355 )
...
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
2025-06-22 02:51:01 +08:00
Emma Qiao
8686805a3b
[Infra]cherry pick sanity check yml change for 5080 and 5090 from main ( #5363 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-06-19 15:33:57 +08:00
ruodil
e87cf62c12
tests: cherry-pick from main branch, add qwen3 test cases and amend test name in perf test ( #5357 )
...
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
2025-06-19 14:34:05 +08:00
Yiqing Yan
da576bcafa
Waive L0 test ( #5349 )
...
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-06-19 12:01:11 +08:00
Fanrong Li
6c3210a8be
[test] add nvfp4 DeepSeek-V3-Lite-mtp tests ( #5125 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-06-19 09:48:22 +08:00
nv-guomingz
6a388b105a
chore: remove torch_compile prefix for TorchCompileConfig field members ( #5261 )
...
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-06-19 09:21:51 +08:00
Yan Chunwei
3946e798db
fix[nvbug5298640]: trtllm-llmapi-launch multiple LLM instances ( #4727 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-06-19 06:13:53 +08:00
Omer Ullman Argov
0b6d005ef6
[fix][test] clear cuda cache before unittests automatically ( #5121 )
...
Signed-off-by: Omer Ullman Argov <118735753+omera-nv@users.noreply.github.com>
2025-06-19 00:36:53 +03:00
Aurelien Chartier
d25f93c07f
chore: skip test_llm_gpt2_medium_fp8 for fp8_pc_pt + quant_lm_head ( #5293 )
...
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-06-18 11:13:12 -07:00
Omer Ullman Argov
5010f8719d
[fix][test] remove duplicate test runs ( #5241 )
...
Signed-off-by: Omer Ullman Argov <118735753+omera-nv@users.noreply.github.com>
2025-06-19 01:59:54 +08:00
Omer Ullman Argov
a28a152001
[fix][test] remove some cpp test cases from h100 ( #5335 )
...
Signed-off-by: Omer Ullman Argov <118735753+omera-nv@users.noreply.github.com>
2025-06-18 20:40:26 +03:00
yuanjingx87
a1c5704055
[feat] Multi-node CI testing support via Slurm ( #4771 )
...
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
Signed-off-by: yuanjingx87 <197832395+yuanjingx87@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-06-19 01:11:12 +08:00
Iman Tabrizian
e5ee5c5352
Unwaive disaggregated serving accuracy tests ( #5095 )
...
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Signed-off-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
2025-06-19 00:41:15 +08:00
HuiGao-NV
d13d2f460d
Remove duplicated test cases ( #5323 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
Signed-off-by: Hui Gao <huig@nvidia.com>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-06-18 21:20:20 +08:00
Emma Qiao
b29ac5b561
[Infra] Update 5080 and 5090 case condition due to the driver update ( #5317 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-06-18 20:01:36 +08:00
xinhe-nv
610a49f117
tests: add multi nodes tests ( #5196 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-06-18 18:08:04 +08:00
Yi Zhang
375dd0b971
Waive L0 ( #5311 )
...
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
2025-06-18 16:40:41 +08:00
Yuan Tong
f599ee63c1
test: correct unittest rerun behavior ( #5273 )
...
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
2025-06-18 16:37:19 +08:00
Robin Kobus
38547b92f3
refactor: Introduce ResourceManagerType enum for resource management ( #5246 )
...
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-06-18 09:55:59 +02:00
Wanli Jiang
3a02489e86
[TRTLLM-5758] test: Add Bielik-11B-v2.2 Model Support ( #5159 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-06-18 15:12:49 +08:00
QI JUN
9ea7bb67a4
CI: fix TensorRT H200 tests ( #5301 )
...
Signed-off-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-06-18 14:40:57 +08:00
ruodil
3b5d916250
test: cherry-pick deepseek rcca cases in main branch ( #5307 )
...
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-06-18 14:26:26 +08:00
Yiqing Yan
8f67e3604d
Waive L0 tests ( #5308 )
...
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-06-18 12:43:45 +08:00
Omer Ullman Argov
f501ce57b1
[fix][test] move deepseek single gpu tests to post merge ( #5280 )
...
Signed-off-by: Omer Ullman Argov <118735753+omera-nv@users.noreply.github.com>
2025-06-18 06:59:39 +03:00
dominicshanshan
3c0fecbf42
CI: extend model weights load time for dsv3 in stress test. ( #5275 )
...
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-06-18 11:51:48 +08:00
Ivy Zhang
41cfcaa964
test: update qa test list ( #5305 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-06-18 11:29:11 +08:00
Emma Qiao
ff32caf4d7
[Infra] - Update dependencies with NGC PyTorch 25.05 and TRT 10.11 ( #4885 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Co-authored-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-06-17 23:48:34 +08:00
QI JUN
f899c4d294
Re-implement LlmResponse in Python to reduce host overhead of pybind ( #5224 )
...
Signed-off-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-06-17 21:28:09 +08:00
Yanchao Lu
f4cdbfcdf0
None - Some clean-ups for the automation pipeline ( #5245 )
...
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-06-17 21:08:24 +08:00