Commit Graph

28 Commits

Author SHA1 Message Date
Yi Zhang
7cf1209a19
[fix]: Fix main test skip issue (#5503)
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-06-30 21:39:49 -04:00
Darragh Hanley
5437075def
ReDrafter support for Qwen (#4875)
Signed-off-by: darraghdog <darragh.hanley@gmail.com>
Signed-off-by: Darragh Hanley <darragh.hanley@gmail.com>
Co-authored-by: rakib-hasan <rhasan@nvidia.com>
2025-06-28 02:33:10 +08:00
Yi Zhang
9b616db13b
test: Add fixture to skip tests based on MPI world size (#5028)
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
2025-06-16 11:25:01 +08:00
xinhe-nv
11b94feff8
test: skip disaggregated tests on arm (#5070)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-06-11 17:00:10 +08:00
Yi Zhang
1fca654bfd
tests: Update gb200 test case (#4754)
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
2025-06-04 18:49:20 +08:00
Emma Qiao
c945e92fdb
[Infra]Remove some old keyword (#4552)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-05-31 13:50:45 +08:00
amirkl94
fbec0c3552
Release 0.20 to main (#4577)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
Signed-off-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Venky <23023424+venkywonka@users.noreply.github.com>
Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>
Signed-off-by: Stefan Niebler <82932102+stnie@users.noreply.github.com>
Signed-off-by: Simeng Liu <simengl@nvidia.com>
Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>
Signed-off-by: moraxu <mguzek@nvidia.com>
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Signed-off-by: Jinyang Yuan <154768711+jinyangyuan-nvidia@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
Co-authored-by: Netanel Haber <58652339+netanel-haber@users.noreply.github.com>
Co-authored-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
Co-authored-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Co-authored-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Co-authored-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Co-authored-by: Venky <23023424+venkywonka@users.noreply.github.com>
Co-authored-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: stnie <82932102+stnie@users.noreply.github.com>
Co-authored-by: Simeng Liu <109828133+SimengLiu-nv@users.noreply.github.com>
Co-authored-by: Faraz <58580514+farazkh80@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
Co-authored-by: Jinyang Yuan <154768711+jinyangyuan-nvidia@users.noreply.github.com>
2025-05-28 16:25:33 +08:00
Emma Qiao
6f626af386
[TRTLLM-4535][infra]: Add marker TIMEOUT for test level (#3905)
* Add marker for TIMEOUT

Signed-off-by: qqiao <qqiao@nvidia.com>

* Remove workspace after tests

Signed-off-by: qqiao <qqiao@nvidia.com>

* Add missed property

Signed-off-by: qqiao <qqiao@nvidia.com>

* Add some debug info

Signed-off-by: qqiao <qqiao@nvidia.com>

* Fix errors

Signed-off-by: qqiao <qqiao@nvidia.com>

* Testing

Signed-off-by: qqiao <qqiao@nvidia.com>

* Special process for unittests

Signed-off-by: qqiao <qqiao@nvidia.com>

* Move special proecessing unittests to test generating stage

Signed-off-by: qqiao <qqiao@nvidia.com>

* Process for the whole test list

Signed-off-by: qqiao <qqiao@nvidia.com>

* Test more

Signed-off-by: qqiao <qqiao@nvidia.com>

* Add another test case

Signed-off-by: qqiao <qqiao@nvidia.com>

* Change back the setting for testing

Signed-off-by: qqiao <qqiao@nvidia.com>

* Revert another config file

Signed-off-by: qqiao <qqiao@nvidia.com>

* Add descriptionf or timeout in test readme

Signed-off-by: qqiao <qqiao@nvidia.com>

---------

Signed-off-by: qqiao <qqiao@nvidia.com>
2025-05-25 23:30:40 -07:00
Iman Tabrizian
4c7191af67
Move Triton backend to TRT-LLM main (#3549)
* Move TRT-LLM backend repo to TRT-LLM repo

Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>

* Address review comments

Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>

* debug ci

Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>

* Update triton backend

Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>

* Fixes after update

Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>

---------

Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2025-05-16 07:15:23 +08:00
Faraz
42de79d49e
test: Added tests for Llama3.1-70B-BF16 on SM120 (#4198)
* Added tests for Llama3.1-70B-BF16 on SM120

Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>

* solve conflicts add more tests

Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>

---------

Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>
2025-05-14 11:57:49 -04:00
brb-nv
cd5b3d21a0
feat: Support Mistral Small 3.1 24B VLM in TRT workflow (#4183)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-05-14 03:47:22 +08:00
Enwei Zhu
035d915fea
[TRTLLM-5081] [test] Align parametrize_with_ids to the pytest behavior (#4090)
* fix

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

* fix

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

* normalize mtp_nextn

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

* fix

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

* fix

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

* update test_durations

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

---------

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-05-13 07:41:51 +08:00
Tracin
446f62bbab
chore: Deprecate evaltool (#4173)
Signed-off-by: Tracin <10434017+Tracin@users.noreply.github.com>
2025-05-09 20:31:53 +08:00
Emma Qiao
2692daad2e
infra: Remove the WAR for test items incompletely (#3313)
* Remove the WAR for test items incompleted

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Complete test item manually

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix another test definition file

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Complete test name

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix some other test names

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix another test name after rebase

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Update name for waived case name, too

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix name for multi-gpu tests

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix test name after rebase

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix another test name

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix typo

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix test name after rebase

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix other qa tests

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>

* Fix tests name after rebase

Signed-off-by: qqiao <qqiao@nvidia.com>

* Fix name after rebase

Signed-off-by: qqiao <qqiao@nvidia.com>

* Correct test names in waive.txt

Signed-off-by: qqiao <qqiao@nvidia.com>

* Add new test_durations file

Signed-off-by: qqiao <qqiao@nvidia.com>

* Fix names after rebase

Signed-off-by: qqiao <qqiao@nvidia.com>

* Update test duration to latest

Signed-off-by: qqiao <qqiao@nvidia.com>

---------

Signed-off-by: EmmaQiaoCh <qqiao@nvidia.com>
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-05-04 11:31:59 +08:00
yuxianq
f568cbb671
chore: Remove duplicated get_sm_version. (#3935)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-04-30 11:43:53 +08:00
Pamela Peng
c8649ce3aa
skip blackwell tests for sm120 (#3815)
Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
2025-04-29 09:53:35 -07:00
Dom Brown
8709fe8b53
chore: bump version to 0.19.0 (#3598) (#3841)
test: add test cases for 0.19 release (#3608)

* fix test name



* add quickstart test for nemotron-ultra



* add rcca multi-node test case for deepseek-v3



* add rcca info



---------




squash (#3642)



fix: nvbugs/5187237: fix deterministic mode crash (#3448)

* nvbugs/5187237 nvbugs/5112075: fix deterministic mode error

* remove waive


* Revert "remove waive"

This reverts commit 0bf5486d19906d692bfb7a6262333c296b0087ac.



* revert ar fusion



---------



update fp8 doc (#3647)




tests: change qa perf test to trtllm-bench (#3619)




 fix: FP8 quantized lm_head (NvBug 5214229) (#3567)



infra: Add PR approval protection for the release branch (#3634)



fix: nvbugs/5231298: pytorch allreduce issue (#3673)



Fix: nvbugs/5222698 variable not defined (#3630)

* Fix: nvbugs/5222698 variable not defined



* Tidy code



---------



test:sync waives.txt from main branch by disabling test_perf/gpt_350m-cppmanager case (#3685)



test:restore fp8 kv cache testing for L0 (#3671)



doc: Update DeepSeek perf docs (#3693)

* Update DeepSeek perf docs



* update



* Apply suggestions from code review




---------




tests: waive test_llm_multi_node (#3664)



fix: update test_user_buffers_mm_add_prologue atol (#3711)



Fix: cherry-pick hmac encryption from main branch (#3635)

* security fix cherry-pick changes from main



* fix hmac in remote mpi session (#3649)



---------





Un-waive DS-V3-Lite tests. (#3621)



fix: FP8 kv accuracy (#3675)

* fix FP8 kv accuracy



* update doc



---------



Fix script options for engines. (#3622)



unwaive multi-node test (#3721)



chore : Split more tests out of gpt tests (#3524) (#3674)



doc:add torch examples link into torch backend documentation (#3749)




test: Get Eagle tests working (#3593) (#3722)




Waive L0 test (#3756)



waive failed case in perf test, change default max_batch_size to 512 and write config.json to output log (#3656)





Update ds v3 parameters in stress test. (#3676)

waive gemma on L20 (#3766)



https://nvbugs/5141291: Fix convert.py script for Qwen model. (#3758)

Include Qwen2VLDecoderLayer in the smooth_qwen2_model function.



fix: PP4 fixes and cleanup (#3688)




remove benchmark test list (#3643)



skip disagg deepseek test if sm!=90 (#3720)



test: skip failed cases on B200 (#3710)

* add skip condition to tests



* fix error



---------



test: [nvbug: 5234494] skip_pre_ada for fp8 cases (#3718)

* skip_pre_ada for fp8 cases



* update



* update after rebase



---------



add know issue to deepseek doc. (#3800)



Fix ModelOpt Mixtral AWQ OOM (#3714) (#3761)




Waive L0 tests (#3826)



fix: Reduce memory usage in fused moe op associated with AutoTuning and fix moe fallback issue. (#3793)

* Reduce memory usage in fused moe op associated with AutoTuning.
* Replace pre-defined bucket size strategy with a generating function based on the tune_max_num_tokens.
* Add free_memory logic of workspace in min_latency_mode fused moe path.



* Fix fused_moe fallback issue. (#3652)

min_latency_mode is only set to False during warmup phase. Thus when it becomes true during inference, all tactics fall back to the default one and thus cause perf regression.



---------



[doc] Better document for Draft-Target-Model (DTM) speculative decoding (#3797)




Fix pre-commit



Fix again



Address some review comments for the MI

Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
Co-authored-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
2025-04-29 16:57:22 +08:00
QI JUN
d51ae53940
move the reset models into examples/models/core directory (#3555)
* move rest models to examples/models/core directory

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* update multimodal readme

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix example path

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix cpp test

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix tensorrt test

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

* fix ci

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>

---------

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-04-19 20:48:59 -07:00
Pamela Peng
6cdfc54883
feat: Add FP8 support for SM 120 (#3248)
* Allow FP8 on SM120

Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>

* fix sm121

Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>

* fix

Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>

* fix pre-commit

Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>

* review update

Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>

---------

Signed-off-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
Co-authored-by: Sharan Chetlur <116769508+schetlur-nv@users.noreply.github.com>
2025-04-14 16:05:41 -07:00
yuxianq
7b03350527
Add thread leak check and fix thread/memory leak issues. (#3270)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-04-08 19:03:18 +08:00
amirkl94
e04f6a1b9b
fix: Fix p-tuning test bug (#3326)
* fix: Fix p-tuning test bug

* A change in the vocab_size calculation for T5Tokenizer,
introduced in transformers version 4.34, caused addition of incorrect vtokens for ptuning.
In general, instead of adding tokens which are outside the vocabulary, tokens inside the vocabulary were added.

Signed-off-by: Amir Klein <203507526+amirkl94@users.noreply.github.com>
2025-04-08 17:14:00 +08:00
Enwei Zhu
ba019a43d6
test: Accuracy test improvement (Part 3.3): Move DeepSeek tests (#3260)
add skip



fix



fix



update



update test list



fixqa list



move bf16 to postmerge

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-04-08 07:19:04 +08:00
bhsueh_NV
322ac565fc
chore: clean some ci of qa test (#3083)
* move some models to examples/models/contrib

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

* update the document

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

* remove arctic, blip2, cogvlm, dbrx from qa test list

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

* remove tests of dit, mmdit and stdit from qa test

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

* remove grok, jais, sdxl, skywork, smaug from qa test list

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

* re-organize the glm examples

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

* fix issues after running pre-commit

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

* fix some typo in glm_4_9b readme

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

* fix bug

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

---------

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-03-31 14:30:41 +08:00
Yechan Kim
3c7cb6629c
Add EXAONE-Deep (#3054)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-03-26 14:24:04 +08:00
bhsueh_NV
5724c61934
chore: fix bug of model paths in confset.py (#3011)
* fix bugs of model paths of models in examples/models/contrib/

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

* fix bug of code layout

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

* fix bug of test_multimodal.py

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

* add gptj_example_root back

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>

---------

Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-03-25 17:00:44 +08:00
Enwei Zhu
705eef68c2
test: Accuracy test improvement (Part 2): Incorporate mmlu to accuracy test suite (#2982)
* Accuracy test improvement (Part 2)

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

* WAR OOM

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

update

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

* fix

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

* fix

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

---------

Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-03-25 07:34:10 +08:00
nv-guomingz
ec4f43a0ab
test:remove opt/mpt/gptj/gptneox/bloom/falcon/baichuan/internlm/deep_… (#2987)
* test:remove opt/mpt/gptj/gptneox/bloom/falcon/baichuan/internlm/deep_seek_v2 test cases.

Signed-off-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>

* updatet test case per review comments

Signed-off-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>

---------

Signed-off-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>
2025-03-24 14:18:06 +08:00
Kaiyu Xie
2631f21089
Update (#2978)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-03-23 16:39:35 +08:00