Anurag Mukkara
b15f57763d
tests: PyTorch multimodal using keyword match ( #4215 )
...
* keyword accuracy check for pytorch multimodal
Signed-off-by: Anurag Mukkara <134339030+amukkara@users.noreply.github.com>
* Change keywords for some prompts
Signed-off-by: Anurag Mukkara <134339030+amukkara@users.noreply.github.com>
* Delete full text answers
Signed-off-by: Anurag Mukkara <134339030+amukkara@users.noreply.github.com>
* Cleanup debug code
Signed-off-by: Anurag Mukkara <134339030+amukkara@users.noreply.github.com>
---------
Signed-off-by: Anurag Mukkara <134339030+amukkara@users.noreply.github.com>
2025-05-14 17:18:43 +08:00
Yiqing Yan
a66a02a75a
[Infra] Waive L0 test ( #4295 )
...
Waive L0 test
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-05-14 16:38:33 +08:00
Zongfei Jing
bb17649517
test: Add UT for moe trtllmgen ( #4258 )
...
* Add ut for moe trtllmgen
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
* Update tests/unittest/_torch/modeling/test_modeling_deepseek.py
Co-authored-by: hlu1 <14827759+hlu1@users.noreply.github.com>
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
---------
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
Co-authored-by: hlu1 <14827759+hlu1@users.noreply.github.com>
2025-05-14 15:22:58 +08:00
bhsueh_NV
1a9298bc66
CI: add fp8/fp4 ci on Qwen3-30B-A3B ( #4266 )
...
add fp8/fp4 ci on Qwen3-30B-A3B
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-05-14 14:38:04 +08:00
brb-nv
8280c3d4f2
feat: Support Gemma3-1b-it in PyTorch workflow ( #3999 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-05-14 14:02:44 +08:00
Yi Zhang
86ae506b9d
[fix] Enable pp tests ( #3978 )
...
Fix misrebase issue
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
2025-05-14 10:51:20 +08:00
brb-nv
1ef117688c
test: Validate FP8 and LoRA for Gemma3 ( #3670 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-05-13 17:28:02 -07:00
Iman Tabrizian
f408de2d99
Waive disagg kv cache load balancer test ( #4276 )
...
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2025-05-14 06:03:24 +08:00
brb-nv
cd5b3d21a0
feat: Support Mistral Small 3.1 24B VLM in TRT workflow ( #4183 )
...
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-05-14 03:47:22 +08:00
Yiqing Yan
290649b6aa
[Infra] Waive L0 test ( #4269 )
...
Waive L0 test
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-05-13 23:06:13 +08:00
Yiqing Yan
bfa16a63d4
[Infra] Waive L0 test ( #4268 )
...
Waive L0 test
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-05-13 22:43:17 +08:00
dominicshanshan
44d6adfb68
Waive stress test. ( #4262 )
...
* Waive stress test.
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Yiqing Yan <yiqingy@nvidia.com>
Signed-off-by: dominicshanshan <30051912+dominicshanshan@users.noreply.github.com>
---------
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
Signed-off-by: dominicshanshan <30051912+dominicshanshan@users.noreply.github.com>
Co-authored-by: Yiqing Yan <yiqingy@nvidia.com>
2025-05-13 21:01:57 +08:00
Enwei Zhu
8f68d56cc1
[ https://nvbugs/5220763 ] [test] Unwaive Mixtral FP8 TP2 test ( #4252 )
...
unwaive
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-05-13 15:55:33 +08:00
Yiqing Yan
fda8b0277a
[Infra][TRTLLM-4374] Upgrade TRT 10.10.0 GA, CUDA 12.9 GA and DLFW 25.04 ( #4049 )
...
* [TRTLLM-4374] Upgrade TRT 10.10.0 GA, CUDA 12.9 GA and DLFW 25.04
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
* fix review
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
* update images
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
* Update jenkins/L0_Test.groovy
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
* update image name
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
---------
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-05-13 14:59:12 +08:00
ruodil
d555fe2530
test: fix for perf test script issue ( #4230 )
...
fix for perf test script issue
Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-05-13 10:29:20 +08:00
xinhe-nv
0cebc16139
test: [CI] Add failed cases into waives.txt ( #4205 )
...
waive tests
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-05-13 10:22:42 +08:00
xinhe-nv
7ebae4dcaa
test: [CI] Add failed cases into waives.txt ( #4203 )
...
* update waive list
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
* update waives
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
---------
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-05-13 10:08:02 +08:00
Enwei Zhu
035d915fea
[TRTLLM-5081] [test] Align parametrize_with_ids to the pytest behavior ( #4090 )
...
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* normalize mtp_nextn
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* update test_durations
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
---------
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-05-13 07:41:51 +08:00
wili
eba3623a54
Feat: Variable-Beam-Width-Search (VBWS) part4 ( #3979 )
...
* feat/vbws-part4-v1.8: rebase
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* feat/vbws-part4-v1.9: fix incorrect output when using short output length
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v1.9.1: remove useless variables
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v1.9.2: fix incorrect output when using short output length
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v1.9.3: rebase
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v1.9.4: rebase
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
* v1.9.5: remove API change
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
---------
Signed-off-by: wili-65535 <wili-65535@users.noreply.github.com>
Co-authored-by: wili-65535 <wili-65535@users.noreply.github.com>
2025-05-12 22:32:29 +02:00
Enwei Zhu
c31ca1688c
[ https://nvbugs/5214229 ] [fix] Unwaive lm_head quantization case ( #4222 )
...
unwaive
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-05-12 20:23:06 +08:00
Zheng Duan
c9e2a963e0
feat: add kv cache aware router ( #3831 )
...
* kv cache aware router
Signed-off-by: Zheng Duan <200704041+zhengd-nv@users.noreply.github.com>
* add tests
Signed-off-by: Zheng Duan <200704041+zhengd-nv@users.noreply.github.com>
* router config
Signed-off-by: Zheng Duan <200704041+zhengd-nv@users.noreply.github.com>
* eviction test
Signed-off-by: Zheng Duan <200704041+zhengd-nv@users.noreply.github.com>
add test
Signed-off-by: Zheng Duan <200704041+zhengd-nv@users.noreply.github.com>
* eviction detect in worker test
Signed-off-by: Zheng Duan <200704041+zhengd-nv@users.noreply.github.com>
* move worker tests to single gpu
Signed-off-by: Zheng Duan <200704041+zhengd-nv@users.noreply.github.com>
* reduce memory fraction
Signed-off-by: Zheng Duan <200704041+zhengd-nv@users.noreply.github.com>
* fix partial block
Signed-off-by: Zheng Duan <200704041+zhengd-nv@users.noreply.github.com>
---------
Signed-off-by: Zheng Duan <200704041+zhengd-nv@users.noreply.github.com>
2025-05-12 07:23:57 -04:00
Yixin Dong
c90ebadd84
feat: Support the Structural Tag in guided decoding ( #4066 )
...
* finish
Signed-off-by: Ubospica <ubospica@gmail.com>
* update
Signed-off-by: Ubospica <ubospica@gmail.com>
* update
Signed-off-by: Ubospica <ubospica@gmail.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* exc overlap scheduler
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* add test
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix api ref
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
---------
Signed-off-by: Ubospica <ubospica@gmail.com>
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-05-12 17:24:50 +08:00
Yechan Kim
3e9bda3a09
[feat] Support HyperCLOVAX-SEED-Text language part ( #3902 )
...
* feat: support HyperCLOVAX-SEED-Text language part
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* add PyTorch flow and remove test file
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* revert summarize
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* fix summarize
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
* remove from pytorch example
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
---------
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: QI JUN <22017000+QiJune@users.noreply.github.com>
2025-05-12 16:05:14 +08:00
Ivy Zhang
ee92edf2b4
[ https://nvbugspro.nvidia.com/bug/5270564 ][test] skip pre-hopper for llama4 ( #4211 )
...
skip pre-hopper for llama4
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-05-12 15:27:15 +08:00
ruodil
9c03a7ab74
test: add llama_3.2_1B model and fix for test lora script issue ( #4139 )
...
* test: add llama_v3.1_8b_fp8 model, llama_v3.1_405b model and llama_nemotron_49b model in perf test, and modify original llama models dtype from float16 to bfloat16 according to README.md
Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>
* add llama_3.2_1B model and fix for lora script issue
Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>
---------
Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>
2025-05-12 14:51:59 +08:00
xinhe-nv
849d9c343c
tests: https://nvbugs/5219534 remove failed tests from test list ( #4113 )
...
remove unsupported tests
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-05-12 14:13:40 +08:00
Yiqing Yan
3c54e84e47
[Infra] Waive L0 test ( #4212 )
...
Waive L0 test
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-05-12 11:37:49 +08:00
QI JUN
f021afa241
[CI] waive two multi-gpu test cases ( #4206 )
...
waive two multi-gpu test cases
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-05-12 08:04:48 +08:00
Enwei Zhu
7db368c72c
test: Remove CNN Dailymail tasks in favor of GSM8K ( #4187 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-05-10 09:02:07 +08:00
Dom Brown
2d0f93a054
Refactor: Restructure C++ tests for better modularisation of non-shared code ( #4027 )
...
* Refactor: Restructure C++ tests for better modularisation of non-shared code
Start cleanup of pytest code for C++ tests
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
Clean up names and remove references to test_cpp.py
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
WIP
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
Move multi-GPU code
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
Update doc and try un-waiving
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
* Update multi GPU file check
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
* Address minor multi-GPU setup bug
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
---------
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
2025-05-09 19:16:51 +01:00
Mike Iovine
4b8ba7ad61
[fix][nvbug/5244009] Fix llama 4 test lists/scout accuracy issue ( #4069 )
...
[fix] Fix llama 4 test lists
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-05-09 22:45:14 +08:00
Tracin
446f62bbab
chore: Deprecate evaltool ( #4173 )
...
Signed-off-by: Tracin <10434017+Tracin@users.noreply.github.com>
2025-05-09 20:31:53 +08:00
ruodil
bf5b2a2e0a
test: amend regex match for perf throughput ( #4186 )
...
amend regex match for perf throughput
Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>
2025-05-09 17:33:25 +08:00
xinhe-nv
9082411a50
test: [CI] Add failed cases into waives.txt ( #4165 )
...
waive OOM tests
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-05-09 16:56:30 +08:00
ruodil
5ce5b81281
test: amend default pytorch extra-llm-api-config.yml in perf test ( #4176 )
...
* amend default pytorch extra-llm-api-config.yml
Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>
* add print info to separate cases in output log
Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>
---------
Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>
2025-05-09 16:46:48 +08:00
xinhe-nv
1d26a3fd7c
test: skip tests on b200 ( #3913 )
...
* skip tests on b200
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
* skip phi-3-128k
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
---------
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-05-09 14:51:55 +08:00
Bo Li
e3cf3fd15f
test: Add fp8kv to DS-v3-lite integration tests. ( #3950 )
...
* Add fp8 kv cache tests to DSV3-Lite integration tests.
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
* Refactor. Make fp8kv parallel to attention_dp, overlap_scheduler and cuda_graph.
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
* Update gsm8k.
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
* Update CI list.
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
* Update TestDeepSeekR1.
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
* Fix test list.
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
* Need quant_config besides pytorch_config.
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
* Update waive list (bug 5239087).
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
* Update waive list.
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
* Correct test name.
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
* Update waive list.
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
---------
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
Signed-off-by: Bo Li <bobboli0202@gmail.com>
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-05-09 13:35:04 +08:00
Ivy Zhang
c91d03fa0a
test: move mistral / mixtral test cases in QA test list into the new accuracy test suite ( #3440 )
...
* add mistral-7b-v0.1 torch flow test case
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* rearrange mistral
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* rearrange mixtral case
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* remove api function test
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* move mistral nemo cases
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* move mixtral cases
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* update threshold
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* fix failure
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* fix name
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* fix failure cases
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* update list
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* update threshold
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* remove awq llmapi test
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* adjust threshold
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* fix ci
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* fix partial comments
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* fix path
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* update thres
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* update
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* remove duplicate test case
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* fix ci
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
---------
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-05-09 13:32:02 +08:00
Ivy Zhang
c2d4c2adb6
[ https://nvbugspro.nvidia.com/bug/5260676 ] test: skip fp8 quantization case for pre-ada ( #4095 )
...
skip pre ada
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-05-09 13:30:16 +08:00
Stanley Sun
fb31f91e15
test: add qwen3 and disaggregated serving accuracy tests to qa test list ( #4083 )
...
Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
2025-05-09 11:03:02 +08:00
Enwei Zhu
74df12bbaa
[TRTLLM-4480][doc] Documentation for new accuracy test suite and trtllm-eval ( #3946 )
...
* fix formula
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* update doc
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* 1st version
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* polish
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
---------
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-05-08 19:35:23 +08:00
Ivy Zhang
7666bec7c4
[TRTQA-2861][test]: add nemotron and llama4 cases into qa test ( #4053 )
...
* add MMLU, GPQADiamond check for llama-4 models
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* add nemotron cases
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* add online quant test cases
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* remove trt flow cases
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* update threshold
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* adjust parallelism strategy
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* fix fail
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* update sanity list
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* fix comment
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
* skip nemotron-h test case
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
---------
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-05-08 18:10:41 +08:00
xinhe-nv
4468158be4
test: [CI] remove closed bugs ( #4046 )
...
update waive list
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-05-08 18:04:43 +08:00
Yiqing Yan
ce8832e80f
[Infra] Waive L0 flaky test ( #4148 )
...
Waive L0 test
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2025-05-08 17:23:45 +08:00
yuanjingx87
6e1d2a1320
feat: Add Slurm support and enable RTX Pro 6000 testing pipeline in CI ( #4019 )
...
* Add slurm support with RTXPro6000 PostMerge Tests
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
* remove H100 post merge test from testing
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
---------
Signed-off-by: Yuanjing Xue <197832395+yuanjingx87@users.noreply.github.com>
2025-05-08 15:15:36 +08:00
Enwei Zhu
dae6781494
test: Waive disagg accuracy test ( #4124 )
...
* waive
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* waive
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
---------
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-05-08 13:39:07 +08:00
Ivy Zhang
d7c51c953b
test: add INTEGRATION_TEST env var to speed up integration test ( #3618 )
...
add INTEGRATION_TEST env var
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-05-08 10:44:50 +08:00
ruodil
4d0e462723
tests: skip writing prepare_dataset output to logs, and add llama_v3.1_8b_fp8, llama_v3.3_70b_fp8, llama_v3.1_405b_fp4 models ( #3864 )
...
* tests: skip writing prepare_dataset output to logs
Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>
* test: add llama_v3.1_8b_fp8 model, llama_v3.1_405b model and llama_nemotron_49b model in perf test, and modify original llama models dtype from float16 to bfloat16 according to README.md
Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>
---------
Signed-off-by: Ruodi <200874449+ruodil@users.noreply.github.com>
Signed-off-by: Larry <197874197+LarryXFly@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
2025-05-07 13:56:35 +08:00
Yan Chunwei
0c26059703
chore: Cleanup deprecated APIs from LLM-API (part 1/2) ( #3732 )
...
* beam_width and max_new_token
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
* remove beam_width
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
* remove min_length
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
* remove return_num_sequences
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
---------
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-05-07 13:20:25 +08:00
Enwei Zhu
c28b90984f
[TRTLLM-3925, https://nvbugs/5245262 ] [fix] Normalize LLM.generate API ( #3985 )
...
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
---------
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-05-07 11:06:23 +08:00