Wanli Jiang
9632dba02e
feat: TRTLLM-6450 update long rope for phi3.5/phi4-mini/phi4-mm ( #6353 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-07-30 09:20:16 -07:00
xinhe-nv
d9ab3fd35e
tests: add TestNemotronH cuda graph tests ( #6390 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-07-30 18:45:58 +10:00
Yechan Kim
d6eb8e2366
fix: support mixture of text & multimodal prompts ( #6345 )
...
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-07-30 08:52:31 +08:00
ruodil
e11255e9d0
test:[nvbug 5415268] add kv_cache_free_gpu_mem_fraction param and llama4 rcca cases ( #6430 )
...
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
2025-07-29 15:52:45 +10:00
Michal Guzek
2573bb729d
feat: Add Phi-4-Mini-Instruct in Pytorch backend for LLM API accuracy tests ( #6303 )
...
Signed-off-by: moraxu <mguzek@nvidia.com>
2025-07-28 14:02:14 -07:00
ruodil
03632a679f
test: organize perf cases and add missing perflab cases in qa test list ( #6283 )
...
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
2025-07-28 20:33:32 +10:00
xinhe-nv
470544cf17
test: [CI] Add failed cases into waives.txt ( #6333 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-07-25 17:18:06 +10:00
xinhe-nv
6268a60ab3
tests: add test_chunked_prefill for llama4 ( #5549 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-07-24 23:02:00 -04:00
Iman Tabrizian
5fceaa6153
Revert "tests: add timeout_manager to tensorrt flow test cases ( #5942 )" ( #6309 )
2025-07-23 23:58:10 -04:00
Stanley Sun
04f2d4b2eb
test: update test list for RTX6KD ( #6213 )
...
Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
2025-07-22 18:55:24 +08:00
Ivy Zhang
eb5cb5b642
tests: add timeout_manager to tensorrt flow test cases ( #5942 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-07-22 10:23:41 +08:00
ruodil
6a3c9f8061
test: add phi-4 multimodel and bielik-11b-v2.2 models for perf test ( #5826 )
...
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-07-21 11:29:19 +10:00
wili
82d3587bb8
[refactor] Unify name of NGram speculative decoding ( #5937 )
...
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
Co-authored-by: wili-65535 <wili-65535@users.noreply.github.com>
2025-07-19 12:59:57 +08:00
Bo Deng
2c6fa145ee
[TRTLLM-6471] Infra: unwaive nixl tests and some disagg-serve tests ( #6095 )
...
Signed-off-by: Bo Deng <deemod@nvidia.com>
2025-07-19 00:48:44 +08:00
Chuang Zhu
44c70c88f9
chore:[BREAKING CHANGE] use cacheTransceiverConfig as knobs for disagg service ( #5234 )
...
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
2025-07-17 17:42:07 +08:00
chenfeiz0326
fe070a0168
test: Update Llama4 Scout FP4 & FP8 accuracy tests ( #5901 )
...
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
2025-07-17 09:41:18 +08:00
Wanli Jiang
2d2b8bae32
feat: TRTLLM-5574 Add phi-4-multimodal pytorch-backend support ( #5644 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-07-17 06:30:58 +08:00
Ivy Zhang
dda91b5117
tests: add QA test cases ( #5959 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-07-16 16:14:25 +08:00
Ivy Zhang
763012a88a
[nvbug/5359218][tests] add test llm api test case on lookahead with chunked prefill ( #6051 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-07-16 16:04:08 +08:00
Wanli Jiang
8679a058a3
fix: Unable to load phi4-model with tp_size>1 ( #5962 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-07-16 11:39:41 +08:00
ruodil
2a147c4d01
test: add llama_v3.3_70b_cases in perf test ( #6035 )
...
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
2025-07-15 17:53:59 +10:00
brb-nv
1a2d96919c
feat: Update Gemma3 Vision Encoder ( #5973 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-07-14 22:38:10 +08:00
ruodil
347520494b
test: remove duplicate cases in perf sanity test ( #5870 )
...
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
2025-07-14 17:17:30 +08:00
ruodil
278a1a7df3
test: fix some test failure and add llama_nemotron models in perf sanity test, add more torch cases ( #5693 )
...
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-07-14 17:17:30 +08:00
xinhe-nv
509363d858
tests: update sanity tests & fix tests ( #5906 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-07-11 19:48:19 +10:00
brb-nv
0385f89abc
test: Fix Gemma3 unit tests due to transformers upgrade ( #5921 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-07-10 17:24:10 -07:00
2ez4bz
87fe44fd29
feat(models): Mistral3.1 VLM pytorch backend support ( #5529 )
...
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-07-09 13:17:40 -07:00
DylanChen-NV
74dca0aa7b
[NVBUG-5304516/5319741]Qwen2.5VL FP8 support ( #5029 )
...
Signed-off-by: Dylan Chen <191843203+DylanChen-NV@users.noreply.github.com>
2025-07-09 23:16:42 +08:00
Venky
e27215ca03
test: Validate and add accuracy& perf tests for Ministral-8B-Instruct[-FP8](pytorch only) ( #5654 )
...
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
2025-07-08 18:16:21 -07:00
Pamela Peng
da8c7372d4
[TRTLLM-5366][feat]Add support for sm121 ( #5524 )
...
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Initial CI run failed a single step A30-CPP-3 due to timeout. Rerunning that step succeeded.
2025-07-08 14:27:00 -07:00
xinhe-nv
ff2dd72df4
tests: waive tests ( #5458 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-06-26 14:53:55 +08:00
Enwei Zhu
fc7a81ceb0
test: Add LLGuidance test and refine guided decoding ( #5348 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-06-25 14:12:56 +08:00
xinhe-nv
658fb5b54e
tests: update benchmark test lists ( #5365 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-06-24 15:23:38 +08:00
Fanrong Li
5d4ab47d5b
fix: refactor and fix mtp vanilla ( #4762 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-06-20 05:23:39 +08:00
ruodil
e22e884b02
test: amend test case name in perf cluster test ( #5356 )
...
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
2025-06-19 14:50:12 +08:00
ruodil
21ce9b6749
test: add qwen3 cases ( #5302 )
...
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-06-19 14:38:36 +08:00
bhsueh_NV
dce8620013
chore: enable moe_backend on Qwen3 test ( #5230 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-06-19 13:40:45 +08:00
xinhe-nv
e5400eeae0
tests: add ds r1 tp4 test ( #5197 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-06-19 12:48:33 +08:00
Fanrong Li
6c3210a8be
[test] add nvfp4 DeepSeek-V3-Lite-mtp tests ( #5125 )
...
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-06-19 09:48:22 +08:00
xinhe-nv
610a49f117
tests: add multi nodes tests ( #5196 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-06-18 18:08:04 +08:00
Wanli Jiang
3a02489e86
[TRTLLM-5758] test: Add Bielik-11B-v2.2 Model Support ( #5159 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-06-18 15:12:49 +08:00
ruodil
3b5d916250
test: cherry-pick deepseek rcca cases in main branch ( #5307 )
...
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-06-18 14:26:26 +08:00
Ivy Zhang
41cfcaa964
test: update qa test list ( #5305 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-06-18 11:29:11 +08:00
Ivy Zhang
2ad8758ecc
[TRTLLM-5786][https://nvbugspro.nvidia.com/bug/5310520][test] Add QA test cases ( #5073 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-06-17 17:14:01 +08:00
ruodil
bb2348372c
test: add more pytorch cases in perf test ( #5237 )
...
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
2025-06-17 11:11:28 +08:00
Ivy Zhang
64b7f04fdc
[test] split nemotron test cases from examples_test_list ( #5238 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-06-16 16:36:33 +08:00
ruodil
2848e012ae
test: add llama4 models for perf test ( #5187 )
...
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-06-16 11:24:35 +08:00
ruodil
3d22f27063
test: add more cases for llama_v3.3/3.1 70b fp8 and set enable_attention_dp to false to non-deepseek models ( #5155 )
...
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
2025-06-16 11:23:20 +08:00
Enwei Zhu
babdd9ce06
test: Add json_mode_eval for guided decoding evaluation ( #5179 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-06-16 10:03:55 +08:00
amitz-nv
109c426077
Enable trtllm-bench to run LoRA and add basic e2e perf testing capability for LoRA in PyT flow ( #5130 )
2025-06-15 18:54:04 +03:00