791 Commits

Author SHA1 Message Date
HuiGao-NV
253af9f9af
[https://nvbugs/5410391][bug] Support to share device buffers in attention meta (#6557)
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-08-22 13:19:27 +08:00
Pamela Peng
1e5a6be55d
[https://nvbugs/5448442][fix] Skip trtllm moe backend for sm120 (#7010)
Signed-off-by: Pamela <179191831+pamelap-nvidia@users.noreply.github.com>
2025-08-21 13:34:07 -04:00
Venky
9eac744d72
[https://nvbugs/5464088] [fix] dequantize fp8 activation input to lora forward; update perf test config (#7014)
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2025-08-21 08:28:54 -04:00
Yan Chunwei
e77ec061db
[https://nvbugs/5451296][fix] zmq nonblock bug with retry (#7019)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-08-21 08:34:46 +08:00
chenfeiz0326
5acf213a15
[https://nvbugs/5440241][fix] Fix 70B GSM8K Accuracy drop (#7075)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-08-20 18:11:00 -04:00
yifeizhang-c
5959d72d74
[https://nvbugs/5394392][fix] Enlarge scheduler capacity under disagg bs == 1 (#6975)
Signed-off-by: Yifei Zhang <219273404+yifeizhang-c@users.noreply.github.com>
2025-08-20 16:32:27 +08:00
Jin Li
69846c6586
[https://nvbugs/5427801][fix] Torch compile support for Llama4 and Ea… (#6978)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2025-08-20 15:06:56 +08:00
Bo Deng
df00c81aea
[https://nvbugs/5448437][fix] fix some nixl tests (#6940)
Signed-off-by: Bo Deng <deemod@nvidia.com>
2025-08-20 14:19:48 +08:00
Emma Qiao
c4535e6c3a
[None][infra] Waive failed tests for release branch (#7036)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-08-19 20:42:47 +08:00
brb-nv
da91256503
[None][chore] Waive E2E GB200 tests for Gemma3 27B (#6916)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-08-19 05:19:34 -04:00
Yechan Kim
d6c2a6a81f
[https://nvbugs/5448579][fix] EXAONE-4.0 accuracy test bugfix (#6888)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-08-19 09:29:32 +02:00
Nave Assaf
d4dd5b4f4d
[https://nvbugs/5451028][fix] Constrain NemotronSuper test parameters… (#6987)
Signed-off-by: Nave Assaf <nassaf@nvidia.com>
2025-08-19 09:19:50 +02:00
William Zhang
790a105563
[https://nvbugs/5462007][ci] Unwaive Mistral Small 3.1 FP8 test (#7008)
The error was fixed by #6909.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-08-18 19:50:03 -04:00
Yiqing Yan
28c30e1bf8
[None][chore] Remove duplicate test waives (#6999)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-08-18 22:04:43 +08:00
Emma Qiao
2992e9cd58
[None][infra] Waive failed tests for release branch 0818 (#6993)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-08-18 20:31:50 +08:00
peaceh-nv
28526fe2b1
[https://nvbugs/5449218][fix] Fix KvCacheConfig error in test_perf (#6937)
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
2025-08-18 15:58:53 +08:00
Ivy Zhang
055fdd9e31
[None][fix] update skip config (#6891)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-08-18 13:50:46 +08:00
Guoming Zhang
96bda14fbd
[https://nvbugs/5375646][fix] update waives.txt for nvbug 5375646 (#6847)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-08-17 23:22:01 -04:00
William Zhang
c16aff5e3f
[https://nvbugs/5448525][fix] Mistral Small 3.1 accuracy tests (#6909)
This commit lowers the GPU memory allocated for KV cache in accuracy
tests, and adjusts a threshold for Mistral Small 3.1 24B for FP8.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-08-18 11:17:37 +08:00
Yan Chunwei
6d65b63b8d
[None][ci] unwaive test_ptp_star_attention_example (#6943)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-08-15 05:33:25 -04:00
xinhe-nv
c03ea1ba2d
[TRTLLM-7048][feat] add benchmark TRT flow test for MIG (#6884)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-08-15 14:01:05 +08:00
Yan Chunwei
54ffc6a250
[None][doc] add legacy section for tensorrt engine (#6724)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-08-15 11:08:38 +08:00
2ez4bz
ccb62ef97e
[TRTLLM-5252][feat] Add fp8 support for Mistral Small 3.1 (#6731)
This commit adds some level of FP8 support to Mistral Small 3.1 by:

* disabling quantization for the vision sub-model since `modelopt` does
  not support quantizing it (yet).
* extending existing accuracy tests to use a modelopt-produced FP8
  checkpoint.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-08-13 21:25:55 -04:00
brb-nv
3d95742d97
[https://nvbugs/5401114][fix] Unwaive Gemma3 tests (#6870)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-08-13 20:05:35 -04:00
Guoming Zhang
3e46624f09
[https://nvbugs/5375594][fix] fix oom issue on structural_tag test case (#6838)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-08-13 10:09:35 -04:00
Ivy Zhang
fd8f417bf2
[None][fix] fix Llama3 eagle3 test case OOM (#6832)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-08-13 02:21:05 -04:00
xinhe-nv
0958efdcff
[None][chore] waive GB300 known issues (#6812)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-08-13 13:13:36 +08:00
Ivy Zhang
15bcf80596
[TRTLLM-6975][test] Add multi-turn test cases for VLM models (#6749)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-08-13 13:10:13 +08:00
Yuxian Qiu
cf00003f3d
[None][fix] fix CUDA graph config for test_llm_api_pytorch.py. (#6826)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-08-13 10:24:15 +08:00
brb-nv
3d169bfdad
[https://nvbugs/5445774][fix] Unwaive Gemma3 27B fp8 test (#6799)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-08-12 08:54:15 -07:00
Yanchao Lu
c39454c617
[None][infra] Avoid intermittent access broken to nvcr.io (#6715)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-08-12 11:48:59 +08:00
Raayan Dhar
ddf8e8d1a0
[None][feat] adding support for disaggregated multi-instance tests (#6674)
Signed-off-by: raayandhar <rdhar@nvidia.com>
2025-08-11 13:00:57 -07:00
2ez4bz
efd0a51508
[TRTLLM-5252][fix] Propagate mapping to intermediate layers (#6611) (#6765)
This commit propagates the mapping to intermediate layers to enable
tensor parallelism (amongst other things) in them.

It also fixes issues with a unit test for TP for pixtral, and adds it to a
test list.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-08-11 10:13:10 -07:00
Yechan Kim
e6642eb68c
[https://nvbugs/5444095][infra] waive test_ptp_quickstart_multimodal llava test (#6795)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-08-11 11:58:37 -04:00
Emma Qiao
824feb8653
[None][infra] Waive failed tests on release branch (#6782)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-08-11 03:14:47 -04:00
Bo Deng
a4f9e637ae
[https://nvbugs/5431127][fix] Run test_disaggregated_deepseek_v3_lite_fp8_nixl[DeepSeek-V3-Lite-fp8] only on hopper (#6737)
Signed-off-by: Bo Deng <deemod@nvidia.com>
2025-08-11 13:29:11 +08:00
Yan Chunwei
21e4f51139
[TRTLLM-4721][test] Add qa test for llm-api (#6727)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-08-11 08:03:16 +08:00
Yuxian Qiu
2206e49554
[https://nvbugs/5442608][fix] Update CUDA graph config for get_model_yaml_config. (#6693)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Co-authored-by: ruodil <200874449+ruodil@users.noreply.github.com>
2025-08-10 01:48:55 -04:00
ruodil
28b762a2a2
[None][test] fix yml condition error under qa folder (#6733)
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
2025-08-08 15:59:09 +10:00
Bo Deng
d289d85bff
[TRTLLM-6675][infra] Nixl test completion (#6623)
Signed-off-by: Bo Deng <deemod@nvidia.com>
2025-08-08 10:15:54 +08:00
Ivy Zhang
232a39de1f
[TRTLLM-5574][test] Add NIM required VLM models multi-gpu test (#6687)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-08-08 11:58:58 +10:00
brb-nv
4adde41632
[TRTLLM-6656][chore] Validate FP8 support for Gemma3 (#6678)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-08-07 13:14:04 -04:00
ruodil
0f8242aed9
[None][test] cherry-pick: correct test-db context for perf yaml file and add mistral cases (#6688)
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
2025-08-07 06:16:42 -04:00
Stanley Sun
53f94a4a0e
[None][test] Add Mistral Small 3.1 24B accuracy test to QA test list (#6682)
Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
2025-08-07 03:24:35 -04:00
YueWeng
157ea77549
[https://nvbugs/5375966][chore] Unwaive test_disaggregated_deepseek_v3_lite_fp8_attention_dp_one (#6658)
Signed-off-by: Yue Weng <25103990+yweng0828@users.noreply.github.com>
2025-08-07 10:25:17 +08:00
ruodil
780d7507f9
[None][test] remove trt backend cases in release perf test and move NIM cases to llm_perf_nim.yml (#6662)
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-08-07 10:02:13 +10:00
ruodil
f30398470d
[None][chore] update readme for perf release test (#6664)
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-08-07 10:00:45 +10:00
Yan Chunwei
5eae3184fa
[None][chore] add missing tests to test list (#6590)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-08-06 22:12:27 +08:00
Yechan Kim
1aed7511fe
[https://nvbugs/5430124][fix] Mistral mixture_text_image test case fix (#6648)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-08-06 06:58:58 -07:00
Iman Tabrizian
13ecb4aced
[https://nvbugs/5328160][fix] Unwaive disaggregated serving tests (#6644)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2025-08-06 09:08:29 -04:00