Xiwen Yu
5e7aa76bb4
Merge branch 'user/sm103_trtllmgen' into feat/b300_cu13
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-09-06 00:49:23 +08:00
Emma Qiao
d8ec546b73
[None][infra] Waive failed tests on main branch 0905 ( #7564 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-09-05 22:46:46 +08:00
Xiwen Yu
2c3f4cbeee
Merge remote-tracking branch 'origin/main' into feat/b300_cu13
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-09-05 15:53:43 +08:00
Xiwen Yu
22219bc37e
Add B300 & GB300 CI
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-09-05 15:29:50 +08:00
xinhe-nv
8e3962d278
[TRTLLM-6642][feat] add gptoss 20g tests ( #7361 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-09-05 02:20:28 -04:00
xinhe-nv
b3ba3d98d2
[None][chore] Remove closed bugs ( #7408 )
...
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-09-05 02:11:16 -04:00
QI JUN
ff3704897b
[None][ci] remove unnecessary test_modeling_deepseek.py ( #7542 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-09-04 20:05:27 -07:00
Jin Li
2189a2f3ff
[ https://nvbugs/5483615 ][fix] Remove unnecessary assertion to let mai… ( #7441 )
...
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2025-09-05 10:56:21 +08:00
Yuxian Qiu
48a5270868
[ https://nvbugs/5492485 ][fix] Use offline dataset from llm-models instead. ( #7435 )
...
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-09-04 09:58:16 -07:00
Ivy Zhang
b46e0ae5d4
[None][test] update nim and full test list ( #7468 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-09-04 09:06:01 -04:00
QI JUN
d38b8e3dd9
[None][ci] set TORCHINDUCTOR_COMPILE_THREADS for thop/parallel tests ( #7489 )
...
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-09-04 06:04:51 -07:00
kris1025
cce9556858
[ https://nvbugs/5485886 ][fix] Fix resource free of Eagle3ResourceManager ( #7437 )
...
Signed-off-by: linquanh <linquanh@nvidia.com>
2025-09-04 17:38:13 +08:00
Jin Li
2a2dfe273b
[ https://nvbugs/5485102 ][fix] Correctly set stride for piecewise outp… ( #7442 )
...
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
2025-09-04 10:48:15 +08:00
Stanley Sun
db8eb0a447
[TRTLLM-7876][test] Test trtllm-serve with --extra_llm_api_options ( #7492 )
...
Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
2025-09-04 10:34:38 +08:00
Lizhi Zhou
d97c1e6bd9
[ https://nvbugs/5470769 ][fix] fix disagg-serving accuracy test case ( #7338 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-09-04 09:11:01 +08:00
Enwei Zhu
5ff3a65b23
[TRTLLM-7028][feat] Enable guided decoding with speculative decoding (part 2: one-model engine) ( #6948 )
...
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-09-03 15:16:11 -07:00
Lizhi Zhou
7c73c2ff4b
[ https://nvbugs/5485593 ][fix] improve accuracy/test_disaggregated_serving.py ( #7366 )
...
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-09-03 09:38:53 -04:00
Stanley Sun
cebbf48b74
[TRTLLM-7363][test] Add 8-GPU test cases for RTX6000 ( #7083 )
...
Signed-off-by: Stanley Sun <190317771+StanleySun639@users.noreply.github.com>
2025-09-03 08:36:52 -04:00
Mike Iovine
79d93f9419
[ https://nvbugs/5488141 ][fix] Unwaive llama3 test_eagle3 ( #7486 )
...
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
2025-09-03 14:10:40 +08:00
Wanli Jiang
4223a9aada
[TRTLLM-7261][feat] Support phi-4 model in pytorch backend ( #7371 )
...
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
2025-09-03 10:27:42 +08:00
Xiwen Yu
5bd50d477e
update mha cubins and support 103a
...
Signed-off-by: Xiwen Yu <xiweny@nvidia.com>
2025-09-02 19:26:24 -07:00
Simeng Liu
bcc55bcdf3
[ https://nvbugs/5470782 ][fix] Add specific test names for test_deepseek.py ( #7318 )
...
Signed-off-by: Simeng Liu <simengl@nvidia.com>
2025-09-02 10:31:40 -07:00
Emma Qiao
aae5d22bfe
[None][infra] Waive failed tests on main branch 0902 ( #7482 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-09-02 10:16:49 -04:00
peaceh-nv
90479c50fb
[ https://nvbugs/5453992 ][unwaive] Unwaive llama quickstart test ( #7242 )
...
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
2025-09-02 20:28:32 +08:00
JunyiXu-nv
eefe5f2093
[TRTLLM-7208][feat] Implement basic functionalities for Responses API ( #7341 )
...
Signed-off-by: Junyi Xu <junyix@nvidia.com>
2025-09-02 07:08:22 -04:00
HuiGao-NV
7279297717
[None][infra] waive test case failed on post-merge ( #7471 )
...
Signed-off-by: Hui Gao <huig@nvidia.com>
2025-09-02 06:20:08 -04:00
aalanwyr
c3c95736a1
[TRTLLM-6643][feat] Add DeepSeek-v3-0324 e2e torch test ( #7413 )
...
Signed-off-by: Yaran Wu <28771492+aalanwyr@users.noreply.github.com>
2025-09-02 17:21:27 +08:00
Ivy Zhang
3799e5d460
[None][test] auto reuse torch empty cache on qa test ( #7421 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-09-02 04:44:47 -04:00
Yan Chunwei
f90375f37c
[ https://nvbugs/5476580 ][fix] unwaive test_nvfp4_4gpus ( #7454 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-09-02 04:17:14 -04:00
Xiwen Yu
62a78973a8
Merge remote-tracking branch 'origin/main' into user/xiweny/merge_0901
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-09-02 10:12:30 +08:00
Emma Qiao
01dfd3af1b
[None][infra] Waive failed case on main 0901 ( #7447 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-09-01 23:27:24 +08:00
bhsueh_NV
16e9d1121c
[ https://nvbugs/5481087 ][fix] fix bug of ci when we use mocker ( #7332 )
...
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-09-01 16:22:45 +08:00
nvamyt
efaefca2c8
[None][test] Update case that not support passing quantization fp8 for pytorch backend ( #7302 )
...
Signed-off-by: nvamyt <amyt@nvidia.com>
2025-09-01 12:59:21 +08:00
Xiwen Yu
38ef850552
Merge remote-tracking branch 'gitlab/main' into user/xiweny/merge_0901
...
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
2025-09-01 11:46:44 +08:00
Yiqing Yan
21291f3d8e
[None][chore] Remove duplicate test waives ( #6999 )
...
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-01 11:02:31 +08:00
Emma Qiao
09bca7ca82
[None][infra] Waive failed tests for release branch 0818 ( #6993 )
...
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-01 11:02:31 +08:00
peaceh-nv
f4dc1ed39c
[ https://nvbugs/5449218 ][fix] Fix KvCacheConfig error in test_perf ( #6937 )
...
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-01 11:02:31 +08:00
Ivy Zhang
29cdcdb56a
[None][fix] update skip config ( #6891 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-01 11:02:31 +08:00
Guoming Zhang
d5bc5cd4f2
[ https://nvbugs/5375646 ][fix] update waives.txt for nvbug 5375646 ( #6847 )
...
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-01 11:02:31 +08:00
William Zhang
d15dcdc4ae
[ https://nvbugs/5448525 ][fix] Mistral Small 3.1 accuracy tests ( #6909 )
...
This commit lowers the GPU memory allocated for KV cache in accuracy
tests, and adjusts a threshold for Mistral Small 3.1 24B for FP8.
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-01 11:02:31 +08:00
Yan Chunwei
ac07418968
[None][ci] unwaive test_ptp_star_attention_example ( #6943 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-01 11:02:31 +08:00
xinhe-nv
b4d41d6604
[TRTLLM-7048][feat] add benchmark TRT flow test for MIG ( #6884 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-01 11:02:31 +08:00
Yan Chunwei
612c26be22
[None][doc] add legacy section for tensorrt engine ( #6724 )
...
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-01 11:02:31 +08:00
2ez4bz
2480aedb73
[TRTLLM-5252][feat] Add fp8 support for Mistral Small 3.1 ( #6731 )
...
This commit adds some level of FP8 support to Mistral Small 3.1 by:
* disabling quantization for the vision sub-model since `modelopt` does
support quantizing it (yet).
* extending existing accuracy tests to use a modelopt produced FP8
checkpoint.
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-01 11:02:31 +08:00
Guoming Zhang
3e99744201
[ https://nvbugs/5375594 ][fix] fix oom issue on structural_tag test case ( #6838 )
...
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-01 11:02:31 +08:00
Ivy Zhang
deba2885c1
[None][fix] fix Llama3 eagle3 test case OOM ( #6832 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-01 11:02:31 +08:00
xinhe-nv
7841ea6255
[None][chore] waive GB300 known issues ( #6812 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-01 11:02:31 +08:00
Ivy Zhang
c7147d25dc
[TRTLLM-6975][test] Add multi-turn test cases for VLM models ( #6749 )
...
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-01 11:02:31 +08:00
Tian Zheng
e257cb3533
[None][feat] Support NVFP4 KV Cache ( #6244 )
...
Signed-off-by: Tian Zheng <29906817+Tom-Zheng@users.noreply.github.com>
2025-09-01 09:24:52 +08:00
xinhe-nv
5f939b9121
[None][chore] Add failed cases into waives.txt ( #7342 )
...
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-08-30 00:49:14 -04:00