xinhe-nv
|
89bbb230cc
|
tests: waive failed cases on main (#5781)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-07-08 19:44:12 +10:00 |
|
nv-guomingz
|
c8fa08da5c
|
doc: update cuda_graph_config usage part in DS R1 docs (#5796)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-07-08 16:54:46 +09:00 |
|
liji-nv
|
95978e3044
|
[fix] https://nvbugs/5333654 Unwaive to check ci status and improve torch compile multi-gpu coverage (#5700)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-07-08 12:42:15 +08:00 |
|
nv-guomingz
|
0be41b6524
|
Revert "chore: [Breaking Change] Rename cuda_graph_config padding_enabled fie…" (#5818)
|
2025-07-08 13:15:30 +09:00 |
|
nv-guomingz
|
5a8173c121
|
chore: [Breaking Change] Rename cuda_graph_config padding_enabled fie… (#5795)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-07-08 08:52:36 +08:00 |
|
Robin Kobus
|
30a19fcf7c
|
[TRTLLM-6291] feat: Add user-provided speculative decoding support (#5204)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
|
2025-07-07 16:30:43 +02:00 |
|
Yi Zhang
|
ed1b3c884a
|
fix: Adjust free GPU memory fraction in KvCacheConfig for DeepSeek R1 tests (#5774)
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
|
2025-07-07 18:38:54 +09:00 |
|
xinhe-nv
|
ded38ebdbd
|
test: [CI] remove closed bugs (#5770)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
|
2025-07-07 18:06:07 +10:00 |
|
Yanchao Lu
|
2013034948
|
[Test] - Waive or fix few known test failures (#5769)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
|
2025-07-06 21:14:16 +08:00 |
|
Stefan Niebler
|
d1112aac37
|
[TRTLLM-3442] feat: added beam search support to the PyTorch Workflow (#5333)
Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>
|
2025-07-05 01:35:13 +09:00 |
|
Chuang Zhu
|
ffc0b8f5da
|
Cache transceiver support VSWA (#5505)
Signed-off-by: ShiXiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
Co-authored-by: ShiXiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
|
2025-07-05 01:18:42 +09:00 |
|
Yiqing Yan
|
7f3ea058f0
|
[Infra] - Waive L0 flaky test (#5759)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
|
2025-07-04 19:25:12 +09:00 |
|
xinhe-nv
|
3869b969a6
|
test: [CI] Add failed cases into waives.txt (#5718)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-07-04 17:24:48 +09:00 |
|
Faraz
|
81c0764012
|
Cherry pick "[NVBUG:5355009] Modify check for fuse_fp4_quant on SM120 (#5724)
Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
Co-authored-by: peaceh-nv <103117813+peaceh-nv@users.noreply.github.com>
|
2025-07-04 16:53:20 +09:00 |
|
Yiqing Yan
|
b8fef809ae
|
[Infra] - Waive L0 test (#5748)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
|
2025-07-04 15:04:49 +08:00 |
|
Yuan Tong
|
32b244af38
|
feat: reduce unnecessary kernel generation (#5476)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
|
2025-07-04 14:37:49 +08:00 |
|
brb-nv
|
cdaa6abce7
|
fix: Investigate Gemma3 1B decoder output discrepancy (#5564)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-07-04 13:14:13 +08:00 |
|
Yi Zhang
|
73d30a23c7
|
test: add more tests for GB200 with 8 GPUs/2 nodes in L0 tests (#5397)
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
|
2025-07-04 13:14:13 +08:00 |
|
Zheng Duan
|
cb9f596dbe
|
[nvbug 5300551] test: increase block count in eviction test (#5465)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-07-04 13:14:13 +08:00 |
|
xinhe-nv
|
7f837b6e8b
|
tests: waive failures on main (#5704)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-07-04 12:39:12 +09:00 |
|
Venky
|
4762e0b244
|
Waive tests : test_openai_lora, test_trtllm_serve_lora_example and test_openai_chat_structural_tag_example (#5740)
Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>
|
2025-07-04 11:01:08 +09:00 |
|
Netanel Haber
|
f91379b7e8
|
delete duplicate eagle3 and ngram tests (#5711)
Signed-off-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
|
2025-07-03 15:47:26 +03:00 |
|
Omer Ullman Argov
|
c72856188c
|
[ci] small multigpu speedups (#5643)
Signed-off-by: Omer Ullman Argov <118735753+omera-nv@users.noreply.github.com>
|
2025-07-03 08:06:10 -04:00 |
|
Emma Qiao
|
530897388c
|
[Infra] - Waive a failed case on main (#5702)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-07-03 06:09:27 -04:00 |
|
Emma Qiao
|
2a5fdebf10
|
[Infra] - Waive failed tests for main 0702 (#5671)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-07-02 22:05:07 -04:00 |
|
Emma Qiao
|
31699cbeb1
|
[Infra] - Set default timeout to 1hr and remove some specific settings (#5667)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-07-02 08:37:54 -04:00 |
|
qixiang-99
|
ca7b6ec8d8
|
Feat/pytorch vswa kvcachemanager (#5151)
Signed-off-by: qixiang-99 <203170375+qixiang-99@users.noreply.github.com>
|
2025-07-02 15:58:00 +08:00 |
|
liji-nv
|
c345f5876c
|
[feat] Support torch compile for attention dp (#5086)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-07-01 13:48:52 -04:00 |
|
Kaiyu Xie
|
f9a455651b
|
perf: Use tokenizers API to optimize incremental detokenization perf (#5574)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-07-01 09:35:25 -04:00 |
|
Yan Chunwei
|
3bc703d450
|
ci: unwaive llmapi launch test (#5281)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
|
2025-07-01 20:12:55 +08:00 |
|
ruodil
|
ded203d8aa
|
test: set enable_attention_dp=True in default deepseek settings (#5461)
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
|
2025-07-01 20:12:55 +08:00 |
|
brb-nv
|
4ef60d5fbb
|
nvbugs-5331031; nvbugs-5344203 - address intermittent issues with Mistral Small multimodal for BS=8 (#5453)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-07-01 20:12:55 +08:00 |
|
Ivy Zhang
|
61213e3562
|
tests: fix typos in qa test (#5421)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2025-07-01 20:12:55 +08:00 |
|
Yan Chunwei
|
a5eff139f1
|
[TRTLLM-5277] chore: refine llmapi examples for 1.0 (part1) (#5431)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
|
2025-07-01 19:06:41 +08:00 |
|
Emma Qiao
|
65c2b93284
|
[Infra] - Add some timeout and unwaive a test which dev fixed (#5631)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-07-01 05:01:32 -04:00 |
|
Pamela Peng
|
071ad758c4
|
[https://nvbugs/5318059][test] Unwaive test (#5624)
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
|
2025-07-01 04:54:44 -04:00 |
|
Robin Kobus
|
5f77d212ef
|
test: Reduce number of C++ test cases (#5437)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
|
2025-07-01 09:40:49 +02:00 |
|
xinhe-nv
|
19c56f0374
|
test: [CI] Add failed cases into waives.txt (#5582)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
|
2025-07-01 14:57:03 +08:00 |
|
Stanley Sun
|
7135b27284
|
rcca: test default kv_cache_reuse option for pytorch multimodal (#5544)
Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
|
2025-07-01 12:12:48 +08:00 |
|
xinhe-nv
|
a8cf611baa
|
test: [CI] Add failed cases into waives.txt (#5569)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
|
2025-07-01 11:02:56 +08:00 |
|
xinhe-nv
|
9b17b29b6e
|
test: [CI] remove closed bugs (#5572)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
|
2025-07-01 10:15:43 +08:00 |
|
Yi Zhang
|
7cf1209a19
|
[fix]: Fix main test skip issue (#5503)
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
|
2025-06-30 21:39:49 -04:00 |
|
nv-guomingz
|
6e48ac25a6
|
chore: remove cuda_graph_ prefix from cuda_graph_config filed members. (#5585)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-06-30 12:23:14 -04:00 |
|
Omer Ullman Argov
|
42134b8b84
|
[ci] move eagle1 and medusa tests to post-merge (#5604)
Signed-off-by: Omer Ullman Argov <118735753+omera-nv@users.noreply.github.com>
|
2025-06-30 19:32:28 +08:00 |
|
Fanrong Li
|
6cbc9a5297
|
[nvbug/5354946][fix] Fix mtp vanilla draft inputs (#5568)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2025-06-30 15:59:12 +08:00 |
|
Yiqing Yan
|
4fef14da56
|
Deduplicate waive list (#5546)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
|
2025-06-30 11:12:26 +08:00 |
|
nv-guomingz
|
578430e64c
|
[TRTLLM-5530][BREAKING CHANGE]: enhance the llm args pytorch config part 1(cuda_graph_config) (#5014)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-06-30 11:05:40 +08:00 |
|
Omer Ullman Argov
|
2780fc27a7
|
[ci] remove MMLU if followed by GSM8K (#5578)
Signed-off-by: Omer Ullman Argov <118735753+omera-nv@users.noreply.github.com>
|
2025-06-30 05:29:54 +03:00 |
|
Talor Abramovich
|
70e34a3291
|
[TRTLLM-5831][feat] Add LoRA support for pytorch backend in trtllm-serve (#5376)
Signed-off-by: Talor Abramovich <talora@nvidia.com>
|
2025-06-29 12:46:30 +00:00 |
|
amirkl94
|
a985c0b7e6
|
tests: Move stress tests to be Post-Merge only (#5166)
Signed-off-by: Amir Klein <203507526+amirkl94@users.noreply.github.com>
|
2025-06-29 09:44:47 +03:00 |
|