Commit Graph

88 Commits

Author SHA1 Message Date
shuyixiong
c73efe12e7
[None][chore] Use cached model in all ray tests (#8962)
Signed-off-by: shuyix <219646547+shuyixiong@users.noreply.github.com>
2025-11-06 15:14:15 +01:00
Aurelien Chartier
0a02f5f25d
[None][chore] Use a cached model path for Ray integration test (#8660)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-10-27 19:16:06 -07:00
gramnarayan
88b0fbc8ff
[#8245][feat] Autodeploy: Guided Decoding Support (#8551)
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Signed-off-by: Govind Ramnarayan <105831528+govind-ramnarayan@users.noreply.github.com>
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
Co-authored-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Co-authored-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2025-10-28 09:29:57 +08:00
xinhe-nv
2aaedd08cd
[TRTLLM-8638][fix] fix test issues (#8557)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-24 02:16:55 -04:00
Stanley Sun
6b793d5c3d
[TRTLLM-8738][test] Add end-to-end trtllm-serve negative tests (#8580)
Signed-off-by: Stanley Sun <stsun@nvidia.com>
2025-10-24 13:23:47 +08:00
Anish Shanbhag
15de45d782
[TRTLLM-8682][chore] Remove auto_parallel module (#8329)
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2025-10-22 20:53:08 -04:00
xinhe-nv
f70eff30b3
[TRTLLM-8638][fix] waive llam4 tests on H20 (#8416)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-10-16 03:14:56 -07:00
Jonas Yang CN
88ea2c4ee9
[TRTLLM-7349][feat] Adding new orchestrator type -- ray (#7520)
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2025-10-04 08:12:24 +08:00
Michal Guzek
38da871db3
[TRTLLM-6496][feat] Add LoRa Torch tests for the latest NIM model list (#6806)
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
2025-10-03 12:10:48 -07:00
Guoming Zhang
202bed4574 [None][chroe] Rename TensorRT-LLM to TensorRT LLM for source code. (#7851)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-25 21:02:35 +08:00
Yuxian Qiu
48a5270868
[https://nvbugs/5492485][fix] Use offline dataset from llm-models instead. (#7435)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-09-04 09:58:16 -07:00
Stanley Sun
db8eb0a447
[TRTLLM-7876][test] Test trtllm-serve with --extra_llm_api_options (#7492)
Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
2025-09-04 10:34:38 +08:00
Ivy Zhang
fc85e3db1c
[None][fix] fix llmapi import error (#7030)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-08-19 22:58:13 -04:00
xinhe-nv
2c86cee38c
[None][chore] Remove closed bugs (#6969)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-08-19 16:01:33 +08:00
Ivy Zhang
d101a6cebc
[https://nvbugs/5410279][test] resubmit timeout refactor (#6337)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-08-05 16:39:25 +08:00
Ivy Zhang
7547a7d0a2
[TRTLLM-6473][test] add speculative decoding and ep load balance cases into QA test list (#6436)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-08-03 22:11:26 -04:00
xinhe-nv
263c6c0ad0
test: skip post blackwell (#6357)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-08-01 13:10:14 -04:00
Ivy Zhang
71524a1a48
[https://nvbugs/5419066][fix] Use trt flow LLM (#6467)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-08-01 03:33:07 -04:00
brb-nv
0e16d1f070
test: Add time logging for lora tests (#6466)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-07-30 14:02:43 -07:00
xinhe-nv
4fbb344caf
test: [CI] Add failed cases into waives.txt (#6423)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-07-29 19:00:30 +10:00
Iman Tabrizian
5fceaa6153
Revert "tests: add timeout_manager to tensorrt flow test cases (#5942)" (#6309) 2025-07-23 23:58:10 -04:00
Ivy Zhang
eb5cb5b642
tests: add timeout_manager to tensorrt flow test cases (#5942)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-07-22 10:23:41 +08:00
wili
82d3587bb8
[refactor] Unify name of NGram speculative decoding (#5937)
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
Co-authored-by: wili-65535 <wili-65535@users.noreply.github.com>
2025-07-19 12:59:57 +08:00
Ivy Zhang
763012a88a
[nvbug/5359218][tests] add test llm api test case on lookahead with chunked prefill (#6051)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-07-16 16:04:08 +08:00
xavier-nvidia
b6013da198
Fix GEMM+AR fusion on blackwell (#5563)
Signed-off-by: xsimmons <xsimmons@nvidia.com>
2025-07-09 08:48:47 +08:00
xinhe-nv
3869b969a6
test: [CI] Add failed cases into waives.txt (#5718)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-07-04 17:24:48 +09:00
brb-nv
4ef60d5fbb nvbugs-5331031; nvbugs-5344203 - address intermittent issues with Mistral Small multimodal for BS=8 (#5453)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-07-01 20:12:55 +08:00
Yan Chunwei
a5eff139f1
[TRTLLM-5277] chore: refine llmapi examples for 1.0 (part1) (#5431)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2025-07-01 19:06:41 +08:00
Darragh Hanley
5437075def
ReDrafter support for Qwen (#4875)
Signed-off-by: darraghdog <darragh.hanley@gmail.com>
Signed-off-by: Darragh Hanley <darragh.hanley@gmail.com>
Co-authored-by: rakib-hasan <rhasan@nvidia.com>
2025-06-28 02:33:10 +08:00
xinhe-nv
ff2dd72df4
tests: waive tests (#5458)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-06-26 14:53:55 +08:00
Yan Chunwei
9bd42ecf9b
[TRTLLM-5208][BREAKING CHANGE] chore: make pytorch LLM the default (#5312)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-06-20 03:01:10 +08:00
Aurelien Chartier
d25f93c07f
chore: skip test_llm_gpt2_medium_fp8 for fp8_pc_pt + quant_lm_head (#5293)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
2025-06-18 11:13:12 -07:00
amirkl94
8451a87742
chore: Mass integration of release/0.20 (#5082)
Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
Co-authored-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Co-authored-by: Erin <14718778+hchings@users.noreply.github.com>
Co-authored-by: Frank <3429989+FrankD412@users.noreply.github.com>
Co-authored-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: Yechan Kim <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-06-17 14:32:02 +03:00
Aurelien Chartier
1389f5a4d3
feat: Add support for fp8 rowwise quantization (#4876)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
Co-authored-by: aikitoria <151776613+aikitoria@users.noreply.github.com>
2025-06-14 06:37:48 -07:00
nv-guomingz
cf35a079f9
fix:https://nvbugs/5298661 (#5022)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-06-12 20:41:44 +08:00
Omer Ullman Argov
8731f5f14f
chore: Mass integration of release/0.20 (#4898)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Hui Gao <huig@nvidia.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
Signed-off-by: Anurag Mukkara <134339030+amukkara@users.noreply.github.com>
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>
Signed-off-by: moraxu <mguzek@nvidia.com>
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Co-authored-by: HuiGao-NV <huig@nvidia.com>
Co-authored-by: brb-nv <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Co-authored-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
Co-authored-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
Co-authored-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
Co-authored-by: Anurag Mukkara <134339030+amukkara@users.noreply.github.com>
Co-authored-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: Faraz <58580514+farazkh80@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
Co-authored-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Co-authored-by: Yechan Kim <161688079+yechank-nvidia@users.noreply.github.com>
2025-06-08 23:26:26 +08:00
Ivy Zhang
8686868531
tests: [TRTQA-2905] improve timeout report for qa test cases (#4753)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-06-03 12:27:27 +08:00
xinhe-nv
53794b26f8
test: skip test_llm_hf_gemma_quantization_1gpu_vswa on A100 (#4779)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-05-30 15:12:12 +08:00
xinhe-nv
59f7622281
test: rcca https://nvbugs/5223130 (#4510)
* add rcca tests

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>

* skip tests on blackwell

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>

---------

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-05-27 09:59:47 +08:00
xinhe-nv
407ef08662
tests: add qwene fp4 tests into QA test list & update sanity test list (#4478)
* update sanity test list

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>

* update test list

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>

---------

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Larry <197874197+LarryXFly@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-05-21 16:52:02 +08:00
xinhe-nv
14bfb5e0d6
test: FIX test_ptp_quickstart_advanced_deepseek_v3_2nodes_8gpus (#4283)
* update test_ptp_quickstart_advanced_deepseek_v3_2nodes_8gpus

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>

* skip llava-v1.6-mistral-7b-hf-vision-trtllm on L40S

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>

---------

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-05-15 15:57:44 +08:00
brb-nv
1ef117688c
test: Validate FP8 and LoRA for Gemma3 (#3670)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-05-13 17:28:02 -07:00
brb-nv
cd5b3d21a0
feat: Support Mistral Small 3.1 24B VLM in TRT workflow (#4183)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-05-14 03:47:22 +08:00
xinhe-nv
0cebc16139
test: [CI] Add failed cases into waives.txt (#4205)
waive tests

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-05-13 10:22:42 +08:00
xinhe-nv
849d9c343c
tests: https://nvbugs/5219534 remove failed tests from test list (#4113)
remove unsupported tests

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-05-12 14:13:40 +08:00
Tracin
446f62bbab
chore: Deprecate evaltool (#4173)
Signed-off-by: Tracin <10434017+Tracin@users.noreply.github.com>
2025-05-09 20:31:53 +08:00
xinhe-nv
1d26a3fd7c
test: skip tests on b200 (#3913)
* skip tests on b200

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>

* skip phi-3-128k

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>

---------

Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-05-09 14:51:55 +08:00
Ivy Zhang
c91d03fa0a
test: move mistral / mixtral test cases in QA test list into the new accuracy test suite (#3440)
* add mistral-7b-v0.1 torch flow test case

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* rearrange mistral

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* rearrange mixtral case

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* remove api function test

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* move mistral nemo cases

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* move mixtral cases

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* update threshold

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* fix failure

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* fix name

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* fix failure cases

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* update list

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* update threshold

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* remove awq llmapi test

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* adjust threshold

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* fix ci

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* fix partial comments

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* fix path

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* update thres

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* update

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* remove duplicate test case

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

* fix ci

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>

---------

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-05-09 13:32:02 +08:00
Ivy Zhang
c2d4c2adb6
[https://nvbugspro.nvidia.com/bug/5260676]test: skip fp8 quantization case for pre-ada (#4095)
skip pre ada

Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-05-09 13:30:16 +08:00
nv-guomingz
dc344b6a4f
fix:https://nvbugs/5246733 (#3989)
Signed-off-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>
2025-05-01 22:52:31 +08:00