Emma Qiao
14bc8571ae
[TRTLLM-8435][infra] Test existing rtxpro6000 stages on rtxpro6000d ( #8319 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-03 05:26:17 -08:00
Emma Qiao
d7176768cd
[None][infra] Waive the failed test for main on 11/3 ( #8875 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
2025-11-03 02:52:52 -08:00
Tailing Yuan
8303cfa477
[None][fix] Fix import issues in layer-wise benchmarks ( #8827 )
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-11-03 02:32:48 -08:00
xinhe-nv
4873ca04cc
[ https://nvbugs/5521799 ][fix] add harmony channel validation ( #8837 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-03 02:31:54 -08:00
xinhe-nv
64540451e7
[None][chore] Add failed cases into waives.txt ( #8872 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-03 01:19:04 -08:00
Fanrong Li
e9f78c687a
[ https://nvbugs/5625962 ][chore] unwaive DS-v32-fp4 tests ( #8853 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-11-03 00:34:52 -08:00
Yechan Kim
00c0e6c440
[ https://nvbugs/5523315 ][fix] Fix serve benchmark test ( #8255 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-03 00:30:13 -08:00
chenfeiz0326
cc4ab8d9d1
[TRTLLM-8825][feat] Support Pytest Perf Results uploading to Database ( #8653 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-11-03 16:23:13 +08:00
yufeiwu-nv
b4d17d1a4c
[TRTLLM-8991][test] Add Llama 3.3 70B model with different performance config ( #8753 )
...
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
2025-11-03 13:34:06 +08:00
Chang Liu
f57dc01e6f
[ https://nvbugs/5625380 ][chore] Remove multimodal related fields from decoder llm input ( #8846 )
2025-11-02 17:44:08 -08:00
dongfengy
6d6797c792
[None][test] Enhance GPT-OSS CI with GPQA Diamond and additional Spec Decoding Test ( #8661 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>
2025-11-02 16:44:02 -08:00
Yan Chunwei
1551ed8e5f
[ https://nvbugs/5437384 ][test] CHERRY-PICK: fix trtllm-llmapi-launch multi tests ( #8567 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-11-01 06:49:33 -07:00
QI JUN
89e0117097
[TRTLLM-8836][chore] Create ModelEngine from LlmArgs ( #8600 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-01 05:26:06 -07:00
dongxuy04
bba2519726
[TRTLLM-7008][fix] Enable GDRCopy and unwaive online eplb tests ( #8720 )
...
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-10-31 16:39:51 -07:00
Fanrong Li
f0dc746738
[TRTLLM-8541][feat] Add trtllm-gen sparse MLA kernels to support per-Tensor FP8 KV Cache ( #8692 )
...
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Signed-off-by: Tracin <10434017+Tracin@users.noreply.github.com>
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Co-authored-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Co-authored-by: Tracin <10434017+Tracin@users.noreply.github.com>
2025-10-31 14:38:31 -07:00
dongfengy
0edba5a7e2
[ https://nvbugs/5474119 ][fix] Re-enable test ( #8809 )
...
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
2025-10-31 10:17:58 -07:00
Patrice Castonguay
afa75c9494
[ https://nvbugs/5614506 ][chore] Adding e+p+d e2e test ( #8801 )
...
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-10-31 09:52:42 -07:00
Anthony Chang
852e5060aa
[ https://nvbugs/5558117 ][fix] Allow per-layer quant config from hf_quant_config.json ( #8617 )
...
Signed-off-by: Anthony Chang <27950904+rosenrodt@users.noreply.github.com>
2025-10-31 04:41:44 -07:00
Tailing Yuan
98453d2bb7
[None][fix] Waive layer-wise benchmark tests ( #8823 )
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-10-30 22:51:31 -07:00
Chang Liu
3a79d03874
[ https://nvbugs/5617275 ][fix] Extract py files from prebuilt wheel for editable installs ( #8738 )
...
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
2025-10-30 21:40:22 -07:00
Emma Qiao
aecc9655a0
[None][info] Waive failed case for main ( #8826 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-30 20:43:59 -07:00
HuiGao-NV
1a338e1a05
[None][chore] use cached vila model ( #8788 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-10-30 20:26:45 -07:00
Yuxian Qiu
025d2926df
[ https://nvbugs/5599515 ][fix] Fix PP bubbles. ( #8687 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-10-31 10:13:56 +08:00
Yilin Fan
f3224ccd32
[None][feat] Add disagg relay time to time breakdown tool ( #8465 )
...
Signed-off-by: nv-yilinf <206948969+nv-yilinf@users.noreply.github.com>
2025-10-30 18:21:45 -07:00
Mike Iovine
b87448b009
[TRTLLM-8978][test] Remove llama 4 spec dec tests ( #8766 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-10-30 15:47:04 -04:00
Tailing Yuan
ec31363a86
[None][fix] Layer wise benchmarks: use local models, lint ( #8799 )
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-10-30 09:47:46 -07:00
Emma Qiao
9112cffaf3
[None][infra] Waive failed case for main branch ( #8797 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-30 07:57:35 -07:00
Tailing Yuan
f9c7786dc8
[None][feat] Add layer wise benchmarks ( #8777 )
...
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
2025-10-30 20:29:34 +08:00
Anthony Chang
f666ad2f6b
[None][feat] Autotuner can iterate through all tactics for test purposes ( #8663 )
...
Signed-off-by: Anthony Chang <27950904+rosenrodt@users.noreply.github.com>
2025-10-30 13:11:25 +01:00
Emma Qiao
a5cc9fe0aa
[TRTLLM-5453][infra] Check all steps for test name and also check the test in waives.txt also exists in l0 or qa test list. ( #6256 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
2025-10-30 01:56:04 -07:00
ChristinaZ
13cfd70f57
[None][feat] Add unit tests and revision in block_level kernel for invalid input ( #8718 )
...
Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com>
2025-10-30 16:42:18 +08:00
WeiHaocheng
cc286687c4
[None][feat] Refactor scaffolding streaming feature and fix openai wo… ( #8622 )
...
Signed-off-by: Fred Wei <20514172+WeiHaocheng@users.noreply.github.com>
2025-10-30 16:02:40 +08:00
xinhe-nv
a4f75399b9
[ https://nvbugs/5481206 ][fix] update waives ( #8774 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-30 00:43:38 -07:00
Leslie Fang
2072185d76
[ https://nvbugs/5608461 ][fix] exclude InductorSubproc from thread leak check ( #8704 )
...
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
2025-10-30 15:35:15 +08:00
Emma Qiao
7d3cebf34e
[None][infra] Unwaive the tests passed in latest CI and disable a perf stage ( #8775 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-30 12:48:23 +08:00
Emma Qiao
db99a936b0
[TRTLLM-8971][infra] Update gpu key for B300/GB300 ( #8724 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-29 20:36:44 -07:00
Yuxian Qiu
3176bd3815
[None][fix] Fix UnboundLocalError. ( #8756 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-10-29 19:41:37 -07:00
HuiGao-NV
ae57738bae
[ https://nvbugs/5547414 ][fix] Use cached models ( #8755 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-10-29 19:10:10 -07:00
Iman Tabrizian
ae6875fe10
[TRTLLM-8976][feat] Move indexer-k-cache to KVCacheManager ( #8699 )
...
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2025-10-29 08:04:26 -07:00
Emma Qiao
579e1067bf
[None][infra] Waive failed tests on main ( #8759 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-10-29 21:32:23 +08:00
Yan Chunwei
fc3b6f5331
[None][ci] waive test_rpc.py ( #8745 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-10-29 05:17:40 -07:00
kris1025
e2c5a38879
[ https://nvbugs/5534574 ][fix] disable spec decoding forever once the request spec decoding is disabled ( #8446 )
...
Signed-off-by: linquanh <linquanh@nvidia.com>
2025-10-29 19:28:43 +08:00
Chang Liu
81eb861df0
[None][chore] Enable GPQA in CI for DeepSeek V3.2 ( #8712 )
...
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
2025-10-29 04:22:22 -07:00
Zheng Duan
d626d13d37
[ https://nvbugs/5607238 ][test] fix working dir in disagg worker test ( #8648 )
...
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
2025-10-29 16:13:52 +08:00
Pengyun Lin
2aade46d18
[TRTLLM-8214][feat] Support Qwen3 tool parser ( #8216 )
...
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
2025-10-29 15:48:29 +08:00
Stefan Niebler
19ca7b15c7
[ https://nvbugs/5593199 ][test] Enhance beam search tests deterministic dummy model ( #8625 )
...
Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>
2025-10-29 06:12:22 +01:00
Chang Liu
5f737b8dbe
[None][perf] Use fp8 quant kernel in DS3.2 indexer module ( #8701 )
...
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
2025-10-29 12:45:09 +08:00
Cheng Hang
15c293a90b
[None][feat] Enable nvfp4 cuda core for sm120 ( #8620 )
...
Signed-off-by: Cheng Hang <chang@nvidia.com>
2025-10-29 12:39:03 +08:00
xinhe-nv
7ba98a6b20
[None][chore] Add failed cases into waives.txt ( #8684 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-10-28 20:30:01 -07:00
Yan Chunwei
f2faf2809f
[None][ci] waive test_rpc.py temporarily ( #8743 )
...
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
2025-10-28 19:20:27 -07:00