Commit Graph

2054 Commits

Author SHA1 Message Date
mpikulski
cddc7549d1
[TRTLLM-9191][feat] support out-of-tree models in trtllm-serve (#9269)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-11-21 04:23:47 -08:00
mpikulski
095b6864a8
[TRTLLM-8650][fix] beam search request validation (#8433) (#9228)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-11-21 04:08:45 -08:00
Emma Qiao
041564188c
[None][infra] Waive failed cases in main post-merge on 11/21 (#9360)
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
2025-11-21 18:01:53 +08:00
QI JUN
b6483ef3e7
[None][ci] waive a test case of test_ad_build_small_multi.py (#9355)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-21 16:25:04 +08:00
Ivy Zhang
28e9bf6167
[None][chore] add periodic junit xml path in conftest (#9337)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-11-20 22:46:25 -08:00
QI JUN
e2a372a3b1
[None][ci] waive test_llm_context_only_timed_out_kv_cache_exhausted (#9351)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-20 20:20:57 -08:00
Barry Kang
a3433dd54e
[https://nvbugs/5325296][fix] Enable relaxed acceptance test on Blackwell (#8709)
Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Zhanrui Sun
62e20a5441
[None][infra] Remove invaild waived tests which not in release branch (#8841)
Signed-off-by: ZhanruiSunCh <184402041+ZhanruiSunCh@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Jin Li
6185225501
[https://nvbugs/5488118][fix] Unwaive passed tests (#8758)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Dom Brown
0c8de1f45d
[https://nvbugs/5575841] [test] Move test_moe.py to serial tests to improve stability + unwaive FP4 MoE torch unit tests (#8422)
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
xiweny
05aabfbc1e
[https://nvbugs/5601203] [fix]Restrict fp8 blockscale moe case (#8583)
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Eran Geva
3d66e56adb
[https://nvbugs/5572320][fix] Ported test_ad_trtllm_bench.py from main (#8671)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Yukun He
9a79f32f7a
[https://nvbugs/5608489][fix] Fix output unpack issues for Llama3/4 NVFP4 models. (#8679)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Ivy Zhang
25c0624750
[None][test] Clean cache for certain easily hang cases (#8619)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
Co-authored-by: Larry Xu <197874197+LarryXFly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Jie Li
36e244f35e
[https://nvbugs/5587456][fix] Remove multimodal test cases using TRT backend (#8611)
Signed-off-by: Jie Li <lijie@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Lizhi Zhou
348668e3ae
[https://nvbugs/5575902][fix] set max_batch_size=1 to stabilize accuracy test result (#8609)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Lizhi Zhou
33b0b945c7
[https://nvbugs/5582277][fix] rework DisaggPPTerminationHandler to fix hang issue (#8519)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Pengyun Lin
81fd9be87d
[https://nvbugs/5575829][fix] Unwaive gpt-oss test (#8576)
Signed-off-by: Pengyun Lin <81065165+LinPoly@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Bo Deng
4ca6fe83d8
[https://nvbugs/5565549][fix] unwaive test_disaggregated_spec_dec_bat… (#8500)
Signed-off-by: Bo Deng <deemod@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Guoming Zhang
af3900a195
[https://nvbugs/5504095][fix] Unwaive test_user_specify_workspace case. (#8316)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
Simeng Liu
9286223288
[https://nvbugs/5515753][ci] Add NCCL_DEBUG=INFO flag to collect more info with CI failure. (#8440)
Signed-off-by: Simeng Liu <simengl@nvidia.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
JunyiXu-nv
ee6944bfa2
[https://nvbugs/5569713][fix] Disable fp8 deep gemm for EXAONE-4.0-32B-FP8 (#8429)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
yufeiwu-nv
0e746fad45
[https://nvbugs/5667454][test] Fix Test Case as Chunked Attention not Supported on sm_120 (#9260)
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2025-11-20 00:58:42 -08:00
Liao Lanyu
04ad9f96fa
[https://nvbugs/5667687][fix] Set correct lm_head_tp_size_upper_bound (#9300)
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>
2025-11-20 00:41:00 -08:00
Emma Qiao
b018b2698d
[TRTLLM-9164][infra] Enable checking duplicate items in waives.txt in pre-commit (#9265)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-20 15:47:23 +08:00
mpikulski
a39e8c5567
[TRTLLM-9295][fix] use greedy decoding in test_openai_compatible_json_schema (#9305)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-11-20 08:32:23 +01:00
QI JUN
1bdd3ba173
[None][ci] waive test_disagg_server_restart (#9326)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-19 22:34:03 -08:00
Yechan Kim
d5622b2689
[None][fix] Multimodal InputProcessor dummy builder fix (#8916)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-11-19 22:32:21 -08:00
Chang Liu
79a6c9742b
[None][fix] Use fp32 for indexer weight_proj GEMM (#9243)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
2025-11-19 21:52:38 -08:00
Chenghao Zhang
cd44f80abd
[#9316][feat] AutoDeploy: Add the accuracy test for Nemotron MOE models (#9317)
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2025-11-19 21:48:50 -08:00
Bo Deng
2128f73d58
[TRTLLM-9247][infra] Upgrade NIXL to 0.7.1 (#9055)
Signed-off-by: Bo Deng <deemod@nvidia.com>
Signed-off-by: jthomson04 <jwillthomson19@gmail.com>
Co-authored-by: jthomson04 <jwillthomson19@gmail.com>
2025-11-20 11:01:02 +08:00
Yukun He
b6bced83c0
[TRTLLM-7963][feat] Use CUDAGraph to improve the tuning accuracy for AutoTuner. (#9089)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2025-11-20 08:54:29 +08:00
brb-nv
f6ec6e2222
[None][chore] Waive tests timing out on main (#9315)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2025-11-19 13:10:06 -08:00
NVShreyas
1eae941d77
[#9237][feat] enable iter stats in autodeploy (#9278)
Signed-off-by: Shreyas Misra <shreyasm@nvidia.com>
2025-11-19 19:29:29 +01:00
Neta Zmora
7ab02ad7b5
[None][feature] AutoDeploy: tighter MoE UT thresholds (#9195)
Scale down the weights in the MoE test so that the output has reasonable magnitude, allowing for tighter atol and rtol

Signed-off-by: Neta Zmora <96238833+nzmora-nvidia@users.noreply.github.com>
2025-11-19 08:37:51 -08:00
Bo Li
d8b05894ee
[None][perf] Adjust select_alltoall_method_type. (#8950)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
2025-11-19 07:43:55 -08:00
mpikulski
46dd9886bb
[https://nvbugs/5661877][fix] fix test regression in TestBatchedSampling::test_samples (#9215)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2025-11-19 01:44:44 -08:00
xinhe-nv
0f77fec932
[None][chore] Add failed cases into waives.txt (#9289)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-19 17:03:43 +08:00
CarstyYou
ee941ac779
[https://nvbugs/5456493][feat] add fp8 dense for sm120 (#9174)
Signed-off-by: CarstyYou <186021327+CarstyYou@users.noreply.github.com>
2025-11-19 14:40:34 +08:00
nvxuanyuc
a79c0dfb43
[None][fix] Update GLM model accuracy test (#9286)
Signed-off-by: Xuanyu Chen <xuanyuc@nvidia.com>
2025-11-18 21:59:01 -08:00
Emma Qiao
67d3eb26af
[None][infra] Waive failed cases for main branch on 11/17 (#9266)
Signed-off-by: qqiao <qqiao@nvidia.com>
2025-11-18 20:07:03 -08:00
ChristinaZ
941a54c66a
[None][feat] Update the indexer topK (#9255)
Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com>
2025-11-19 11:49:00 +08:00
xinhe-nv
286ace22ed
[None][chore] Add failed cases into waives.txt (#9242)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
2025-11-18 19:27:55 -08:00
Ivy Zhang
782dfca7e8
[TRTLLM-9050][test] add llama4 disagg case to cover kv cache overflow error (#9172)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2025-11-18 18:26:32 -08:00
Patrice Castonguay
9b0f45298f
[None][feat] Have ability to cancel disagg request if KV cache resource are exhausted (#9155)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-11-18 20:59:17 -05:00
xinhe-nv
35658eab55
[None][chore] Add failed cases into waives.txt (#9193)
Signed-off-by: xinhe-nv <200704525+xinhe-nv@users.noreply.github.com>
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
2025-11-18 17:47:55 -08:00
Enwei Zhu
7c4777a571
[TRTLLM-9286][feat] Integration of CuteDSL NVFP4 grouped GEMM (#8880)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
2025-11-18 17:40:12 -08:00
Lizhi Zhou
c789000a62
[https://nvbugs/5649010][fix] increase status-checking interval to avoid instability (#9203)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
2025-11-19 08:55:42 +08:00
Bo Deng
34f845bf69
[TRTLLM-9287][infra] Use NIXL backend for accuracy tests (#9247)
Signed-off-by: Bo Deng <deemod@nvidia.com>
2025-11-18 14:46:20 -08:00
Ajinkya Rasane
8d7cda2318
[None][chore] Update the Flux autodeploy example (#8434)
Signed-off-by: ajrasane <131806219+ajrasane@users.noreply.github.com>
Co-authored-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com>
2025-11-18 14:16:04 -08:00