Yiqing Yan
|
b8fef809ae
|
[Infra] - Waive L0 test (#5748)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
|
2025-07-04 15:04:49 +08:00 |
|
Yi Zhang
|
73d30a23c7
|
test: add more tests for GB200 with 8 GPUs/2 nodes in L0 tests (#5397)
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
|
2025-07-04 13:14:13 +08:00 |
|
Zheng Duan
|
cb9f596dbe
|
[nvbug 5300551] test: increase block count in eviction test (#5465)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-07-04 13:14:13 +08:00 |
|
xinhe-nv
|
7f837b6e8b
|
tests: waive failures on main (#5704)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-07-04 12:39:12 +09:00 |
|
Venky
|
4762e0b244
|
Waive tests : test_openai_lora, test_trtllm_serve_lora_example and test_openai_chat_structural_tag_example (#5740)
Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>
|
2025-07-04 11:01:08 +09:00 |
|
Netanel Haber
|
f91379b7e8
|
delete duplicate eagle3 and ngram tests (#5711)
Signed-off-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
|
2025-07-03 15:47:26 +03:00 |
|
Omer Ullman Argov
|
c72856188c
|
[ci] small multigpu speedups (#5643)
Signed-off-by: Omer Ullman Argov <118735753+omera-nv@users.noreply.github.com>
|
2025-07-03 08:06:10 -04:00 |
|
Emma Qiao
|
530897388c
|
[Infra] - Waive a failed case on main (#5702)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-07-03 06:09:27 -04:00 |
|
Emma Qiao
|
2a5fdebf10
|
[Infra] - Waive failed tests for main 0702 (#5671)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-07-02 22:05:07 -04:00 |
|
Emma Qiao
|
31699cbeb1
|
[Infra] - Set default timeout to 1hr and remove some specific settings (#5667)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-07-02 08:37:54 -04:00 |
|
Kaiyu Xie
|
f9a455651b
|
perf: Use tokenizers API to optimize incremental detokenization perf (#5574)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-07-01 09:35:25 -04:00 |
|
Yan Chunwei
|
3bc703d450
|
ci: unwaive llmapi launch test (#5281)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
|
2025-07-01 20:12:55 +08:00 |
|
brb-nv
|
4ef60d5fbb
|
nvbugs-5331031; nvbugs-5344203 - address intermittent issues with Mistral Small multimodal for BS=8 (#5453)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-07-01 20:12:55 +08:00 |
|
Yan Chunwei
|
a5eff139f1
|
[TRTLLM-5277] chore: refine llmapi examples for 1.0 (part1) (#5431)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
|
2025-07-01 19:06:41 +08:00 |
|
Emma Qiao
|
65c2b93284
|
[Infra] - Add some timeout and unwaive a test which dev fixed (#5631)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-07-01 05:01:32 -04:00 |
|
Pamela Peng
|
071ad758c4
|
[https://nvbugs/5318059][test] Unwaive test (#5624)
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
|
2025-07-01 04:54:44 -04:00 |
|
xinhe-nv
|
19c56f0374
|
test: [CI] Add failed cases into waives.txt (#5582)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
|
2025-07-01 14:57:03 +08:00 |
|
xinhe-nv
|
a8cf611baa
|
test: [CI] Add failed cases into waives.txt (#5569)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
|
2025-07-01 11:02:56 +08:00 |
|
xinhe-nv
|
9b17b29b6e
|
test: [CI] remove closed bugs (#5572)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
|
2025-07-01 10:15:43 +08:00 |
|
Omer Ullman Argov
|
42134b8b84
|
[ci] move eagle1 and medusa tests to post-merge (#5604)
Signed-off-by: Omer Ullman Argov <118735753+omera-nv@users.noreply.github.com>
|
2025-06-30 19:32:28 +08:00 |
|
Fanrong Li
|
6cbc9a5297
|
[nvbug/5354946][fix] Fix mtp vanilla draft inputs (#5568)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2025-06-30 15:59:12 +08:00 |
|
Yiqing Yan
|
4fef14da56
|
Deduplicate waive list (#5546)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
|
2025-06-30 11:12:26 +08:00 |
|
Talor Abramovich
|
70e34a3291
|
[TRTLLM-5831][feat] Add LoRA support for pytorch backend in trtllm-serve (#5376)
Signed-off-by: Talor Abramovich <talora@nvidia.com>
|
2025-06-29 12:46:30 +00:00 |
|
amirkl94
|
a985c0b7e6
|
tests: Move stress tests to be Post-Merge only (#5166)
Signed-off-by: Amir Klein <203507526+amirkl94@users.noreply.github.com>
|
2025-06-29 09:44:47 +03:00 |
|
Iman Tabrizian
|
26b953e29a
|
[nvbugs/5309940] Add support for input output token counts (#5445)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-06-28 04:39:39 +08:00 |
|
wili
|
56cdfe5c6c
|
[TRTLLM-5000][feat] NGrams V2 (#4569)
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
Co-authored-by: wili-65535 <wili-65535@users.noreply.github.com>
|
2025-06-27 23:00:17 +08:00 |
|
Iman Tabrizian
|
49af791f66
|
Add testing for trtllm-llmapi-launch with tritonserver (#5528)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-06-27 11:19:52 +08:00 |
|
xinhe-nv
|
a3494bebec
|
tests: waive failed tests on main (#5512)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
|
2025-06-27 10:13:22 +08:00 |
|
Frank
|
aa6e015ef8
|
Update trtllm-bench to support new Pytorch default. (#5491)
Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
|
2025-06-26 17:05:43 -07:00 |
|
jmydurant
|
8836990bde
|
[TRTLLM-3602][feat] support nvfp4 model and fp8 kv cache for MLA chunked prefill (Blackwell) (#5475)
Signed-off-by: Mingyang Jiang <13463932+jmydurant@users.noreply.github.com>
|
2025-06-26 22:18:08 +08:00 |
|
Omer Ullman Argov
|
6bae76d7ca
|
[fix][ci] move torch tests to run under torch stage (#5473)
Signed-off-by: Omer Ullman Argov <118735753+omera-nv@users.noreply.github.com>
|
2025-06-26 14:31:38 +03:00 |
|
Omer Ullman Argov
|
1633bd2bef
|
[CI] move flashinfer llama tests to post merge (#5506)
Signed-off-by: Omer Ullman Argov <118735753+omera-nv@users.noreply.github.com>
|
2025-06-26 19:27:32 +08:00 |
|
xinhe-nv
|
ff2dd72df4
|
tests: waive tests (#5458)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
|
2025-06-26 14:53:55 +08:00 |
|
Emma Qiao
|
32d1573c43
|
[Infra] - Add timeout setting for long tests found in post-merge (#5501)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-06-26 11:31:39 +08:00 |
|
Venky
|
d9b75f83fd
|
[CI] Waive test_fp8_block_scales_4gpus[ep4-mtp_nextn=0-fp8kv=True-attention_dp=True-cuda_graph=True-overlap_scheduler=True-torch_compile=False] (#5494)
Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>
|
2025-06-25 20:17:12 -07:00 |
|
jmydurant
|
578dbc8d9a
|
feat: chunked prefill for MLA (Blackwell) (#4651)
Signed-off-by: Mingyang Jiang <13463932+jmydurant@users.noreply.github.com>
|
2025-06-26 09:01:00 +08:00 |
|
HuiGao-NV
|
74ae15a26b
|
CI: enable test cases on single device type (#5484)
Signed-off-by: Hui Gao <huig@nvidia.com>
|
2025-06-26 08:03:44 +08:00 |
|
QI JUN
|
feaf789342
|
CI: reduce BF16 test cases in B200 (#5482)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-06-26 07:18:20 +08:00 |
|
HuiGao-NV
|
cc3c2b3be2
|
Move 3 disaggregated cases from 4 GPUs devices to 1 GPU device (#5457)
Signed-off-by: Hui Gao <huig@nvidia.com>
|
2025-06-25 21:38:14 +08:00 |
|
Kaiyu Xie
|
d6ada5ffce
|
[nvbug/5354956] fix: unexpected keyword argument 'streaming' (#5436)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-06-25 20:37:24 +08:00 |
|
Netanel Haber
|
3ca2f6ac51
|
start OAIServer with max_beam_width=1 for TorchSampler (#5427)
Signed-off-by: Netanel Haber <nhaber@nvidia.com>
|
2025-06-25 15:52:06 +08:00 |
|
Enwei Zhu
|
fc7a81ceb0
|
test: Add LLGuidance test and refine guided decoding (#5348)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-06-25 14:12:56 +08:00 |
|
Enwei Zhu
|
76da7fed86
|
fix (NvBug 5354925): Fix static EPLB (#5411)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-06-25 13:14:40 +08:00 |
|
dongxuy04
|
699520082b
|
Add MTP support for Online EPLB (#5213)
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
|
2025-06-25 07:58:13 +08:00 |
|
Emma Qiao
|
475272046a
|
[Infra] - Waive failed tests in post-merge and increase some timeout setting (#5424)
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
|
2025-06-24 17:19:31 +08:00 |
|
xinhe-nv
|
658fb5b54e
|
tests: update benchmark test lists (#5365)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
|
2025-06-24 15:23:38 +08:00 |
|
xinhe-nv
|
4b32a3f1a7
|
test: [CI] remove closed bugs (#5400)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
|
2025-06-24 13:39:57 +08:00 |
|
Fanrong Li
|
5d4ab47d5b
|
fix: refactor and fix mtp vanilla (#4762)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2025-06-20 05:23:39 +08:00 |
|
Kaiyu Xie
|
7246fd75d1
|
feat: Support stream_interval (#5284)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-06-19 21:57:10 +08:00 |
|
Enwei Zhu
|
bca758fce1
|
fix: Fix DS-R1 nvfp4 test case naming (#5361)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-06-19 15:50:43 +08:00 |
|