Yuan Tong
|
db8dc97b7b
|
[None][fix] Migrate to new cuda binding package name (#6700)
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
|
2025-08-07 16:29:55 -04:00 |
|
Raayan Dhar
|
4055b764db
|
[None][fix] disagg ctx pp4 + gen pp4 integ test (#6489)
Signed-off-by: raayandhar <rdhar@nvidia.com>
Signed-off-by: Raayan Dhar <58057652+raayandhar@users.noreply.github.com>
|
2025-08-07 11:18:02 -04:00 |
|
pcastonguay
|
453a06e6ab
|
[TRTLLM-6881][feat] Include attention dp rank info with KV cache events (#6563)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
|
2025-08-07 14:17:07 +02:00 |
|
Enwei Zhu
|
1b9781e8e7
|
[TRTLLM-6409][feat] Enable guided decoding with speculative decoding (part 1: two-model engine) (#6300)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-08-07 05:53:48 -04:00 |
|
xinhe-nv
|
0a467b00cc
|
[https://nvbugs/5409414][fix] fix Not registered specs (#6660)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
|
2025-08-07 17:55:53 +10:00 |
|
hlu1
|
8207d5fd39
|
[None] [feat] Add model gpt-oss (#6645)
Signed-off-by: Hao Lu <14827759+hlu1@users.noreply.github.com>
|
2025-08-07 03:04:18 -04:00 |
|
ruodil
|
6c1f7d8b91
|
[None][test] correct test-db context for perf yaml file (#6686)
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
|
2025-08-07 02:47:10 -04:00 |
|
YueWeng
|
157ea77549
|
[https://nvbugs/5375966][chore] Unwaive test_disaggregated_deepseek_v3_lite_fp8_attention_dp_one (#6658)
Signed-off-by: Yue Weng <25103990+yweng0828@users.noreply.github.com>
|
2025-08-07 10:25:17 +08:00 |
|
ruodil
|
780d7507f9
|
[None][test] remove trt backend cases in release perf test and move NIM cases to llm_perf_nim.yml (#6662)
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
|
2025-08-07 10:02:13 +10:00 |
|
ruodil
|
f30398470d
|
[None][chore] update readme for perf release test (#6664)
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
|
2025-08-07 10:00:45 +10:00 |
|
Yan Chunwei
|
5eae3184fa
|
[None][chore] add missing tests to test list (#6590)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
|
2025-08-06 22:12:27 +08:00 |
|
Yechan Kim
|
1aed7511fe
|
[https://nvbugs/5430124][fix] Mistral mixture_text_image test case fix (#6648)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
|
2025-08-06 06:58:58 -07:00 |
|
Iman Tabrizian
|
13ecb4aced
|
[https://nvbugs/5328160][fix] Unwaive disaggregated serving tests (#6644)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2025-08-06 09:08:29 -04:00 |
|
ruodil
|
907c180eb2
|
[None][test] align kv_frac in perf test with perflab and add more cases for 4 gpus GB200 (#6632)
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
|
2025-08-06 02:25:57 -04:00 |
|
ruodil
|
0bd99b5d6d
|
[TRTLLM-6764][test] add new feature cases in cluster(B200/GB200) and sanity test (#6650)
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
|
2025-08-06 01:45:13 -04:00 |
|
yunruis
|
3ff4f503ad
|
[None][opt] ADP schedule balance optimization (#6061)
Signed-off-by: yunruis <205571022+yunruis@users.noreply.github.com>
|
2025-08-06 09:38:02 +08:00 |
|
Yechan Kim
|
c17f4984e2
|
[None][feat] Refactor Llava-Next (#6478)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
|
2025-08-05 17:53:53 -07:00 |
|
ixlmar
|
1ebceb790d
|
[TRTLLM-5508][feat] check input tokens + improve error handling (#5170)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2025-08-05 18:27:43 +01:00 |
|
liji-nv
|
dcbfa7e509
|
[https://nvbugs/5252313][fix] Fix torch compile + MTP (#6554)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-08-05 10:31:29 -04:00 |
|
Venky
|
61da2daeb4
|
[TRTLLM-6761][refactor] Replace LogitBiasLogitsProcessor with embedding bias tensor system (#6464)
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
|
2025-08-05 07:14:24 -07:00 |
|
Emma Qiao
|
78a75c2990
|
[None][Infra] - Split gb200 stages for each test (#6594)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-08-05 07:10:00 -04:00 |
|
xinhe-nv
|
c32584125e
|
[TRTQA-2920][fix] Add failed cases into waives.txt (#6600)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-08-05 20:12:55 +10:00 |
|
Pengbo Wang @ NVIDIA
|
c289880afb
|
[None][fix] fix kimi k2 serving and add test for Kimi-K2 (#6589)
Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>
|
2025-08-05 18:05:33 +08:00 |
|
Ivy Zhang
|
08ed9d7305
|
[None][doc] add introduction doc on qa test (#6535)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2025-08-05 17:02:17 +08:00 |
|
Ivy Zhang
|
d101a6cebc
|
[https://nvbugs/5410279][test] resubmit timeout refactor (#6337)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2025-08-05 16:39:25 +08:00 |
|
Haohang Huang
|
c9eebcb454
|
[TRTLLM-6674][feat] (Breaking Change) Hopper SWA non-cyclic kernels + KV reuse + Spec Dec (#6379)
Signed-off-by: Haohang Huang <31998628+symphonylyh@users.noreply.github.com>
Signed-off-by: symphonylyh <31998628+symphonylyh@users.noreply.github.com>
|
2025-08-05 07:47:41 +00:00 |
|
Leslie Fang
|
164acfa31e
|
[None][infra] Skip test_eagle3 test with device memory check (#6617)
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
|
2025-08-05 02:36:03 -04:00 |
|
ruodil
|
7625845365
|
test: add README_release_test.md for perf test (#6443)
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
|
2025-08-05 02:07:42 -04:00 |
|
xinhe-nv
|
a178cea324
|
[TRTLLM-6856][feat] add disaggregated serving tests to QA list (#6536)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-08-05 12:47:53 +10:00 |
|
xinhe-nv
|
fe3d607c4b
|
[TRTQA-2920][fix] Add failed cases into waives.txt (#6581)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
|
2025-08-05 12:41:23 +10:00 |
|
Ivy Zhang
|
f3651adea8
|
[None][test] update invalid test name (#6596)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2025-08-04 08:01:05 -04:00 |
|
Emma Qiao
|
5d8a5a0cb8
|
[None][Infra]Waive failed case in post-merge on main (#6602)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-08-04 19:39:44 +08:00 |
|
brb-nv
|
87e4e9f468
|
[None][chore] Add unit test for Gemma3 lora (#6560)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-08-04 04:56:57 -04:00 |
|
Pengyun Lin
|
a15e33351d
|
[None][fix] Revert commit 48ddc3d & add test for disagg server with different max_num_tokens (#6259)
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
|
2025-08-04 15:09:51 +08:00 |
|
xinhe-nv
|
a54972e463
|
[None][fix] remove closed bugs (#6576)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
|
2025-08-04 15:52:11 +10:00 |
|
Leslie Fang
|
a60190836c
|
[None][infra] Enable accuracy test for eagle3 and chunked prefill (#6386)
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
|
2025-08-04 01:45:24 -04:00 |
|
ruodil
|
6459725bf9
|
test: move ministral_8b_fp8 to fp8_specific gpu list(exclude Ampere) (#6533)
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
|
2025-08-04 15:22:39 +10:00 |
|
Ivy Zhang
|
5eefdf2c75
|
tests: Add llama4 functional cases (#6392)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2025-08-04 11:19:58 +08:00 |
|
ruodil
|
8d82ccca63
|
test: modify max_lora_rank of phi4_multimodal to 320 (#6474)
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
|
2025-08-04 12:20:22 +10:00 |
|
Yechan Kim
|
ee6ab5be96
|
chore: add EXAONE4 accuracy test (#6397)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
|
2025-08-04 10:14:16 +08:00 |
|
Ivy Zhang
|
7547a7d0a2
|
[TRTLLM-6473][test] add speculative decoding and ep load balance cases into QA test list (#6436)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2025-08-03 22:11:26 -04:00 |
|
Yiqing Yan
|
3f7abf87bc
|
[TRTLLM-6224][infra] Upgrade dependencies to DLFW 25.06 and CUDA 12.9.1 (#5678)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
|
2025-08-03 11:18:59 +08:00 |
|
Jhao-Ting Chen
|
4da5cfc511
|
[None][infra] add eagle3 one model accuracy tests (#6264)
Signed-off-by: Jhao-Ting Chen <jhaotingc@nvidia.com>
|
2025-08-02 16:07:46 -07:00 |
|
Lizhi Zhou
|
6f34f3489b
|
[TRTLLM-6357][test] Add accuracy tests for Qwen3 (#6177)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2025-08-01 13:33:34 -04:00 |
|
xinhe-nv
|
263c6c0ad0
|
test: skip post blackwell (#6357)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-08-01 13:10:14 -04:00 |
|
Emma Qiao
|
16febefee0
|
[None][Infra] - Skip failed tests in post-merge (#6558)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-08-01 22:21:23 +08:00 |
|
brb-nv
|
7447d6ed85
|
[TRTLLM-6657][feat] Add LoRA support for Gemma3 (#6371)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-08-01 09:19:54 -04:00 |
|
liji-nv
|
1daa8c3232
|
[https://nvbugs/5340941][https://nvbugs/5375785] - fix: Wrap attentio… (#6355)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
|
2025-08-01 07:38:06 -04:00 |
|
xinhe-nv
|
fca0d37798
|
[None][fix] update nemotron nas tests free_gpu_memory_fraction=0.8 (#6552)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-08-01 20:27:22 +10:00 |
|
chenfeiz0326
|
ba5bdbb138
|
[None][chore] Disable add special tokens for Llama3.3 70B (#6482)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
|
2025-08-01 17:03:27 +08:00 |
|