Bo Li
|
582dec5bb5
|
[https://nvbugs/5774869][infra] Use 2 GPUs to test skip softmax attention on H100. (#10420)
Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>
|
2026-01-14 07:03:01 -05:00 |
|
jmydurant
|
e7882d5c74
|
[None][feat] MiniMax M2 support (#10532)
Signed-off-by: Mingyang Jiang <13463932+jmydurant@users.noreply.github.com>
|
2026-01-14 17:38:58 +08:00 |
|
mpikulski
|
052c36ddd2
|
[TRTLLM-9522][feat] support image_embeds in OpenAI API (#9715)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-14 10:31:03 +01:00 |
|
xinhe-nv
|
07d9390e9b
|
[None][test] add test into qa test list (#10627)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2026-01-13 22:43:00 -05:00 |
|
Balaram Buddharaju
|
ccdfa43a6e
|
[https://nvbugs/5791900][fix] Fix HelixCpMnnvlMemory init with PP (#10533)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2026-01-13 15:48:42 -05:00 |
|
Guoming Zhang
|
bdaee87895
|
[TRTLLM-10060][feat] Enable attention dp for Nemotron Super v3. (#10347)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2026-01-13 17:13:55 +08:00 |
|
JunyiXu-nv
|
e291a834db
|
[TRTLLM-8462][feat] Support GET/DELETE v1/responses/{response_id} (#9937)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2026-01-13 03:57:14 -05:00 |
|
JennyLiu
|
2967d299fb
|
[TRTLLM-10271][test] Add Spark QA functional and performance cases (#10564)
Signed-off-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
Co-authored-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
|
2026-01-13 13:20:15 +08:00 |
|
fredricz-20070104
|
bbe535fddf
|
[None][chore] Fix disagg assert (#10596)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
|
2026-01-12 21:39:57 -05:00 |
|
Iman Tabrizian
|
48b09e5a25
|
[https://nvbugs/5689235][fix] Fix cancellation+chunked prefill+disagg (#10111)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2026-01-12 18:23:26 -05:00 |
|
Anish Shanbhag
|
dacc881993
|
[https://nvbugs/5761391][fix] Use correct model names for config database regression tests (#10192)
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
|
2026-01-12 10:55:07 -08:00 |
|
Suyog Gupta
|
a1385243e1
|
[#10580][fix] re-enable NemotronH MOE MMLU test (#10594)
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
|
2026-01-12 09:26:07 -08:00 |
|
Wanli Jiang
|
11da7e3605
|
[None][fix] Solve pillow version conflict (#10537)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2026-01-12 04:05:54 -05:00 |
|
yingguo-trt
|
c5914f9085
|
[None][chore] update deepseekv3.2 test parameter (#10595)
Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>
|
2026-01-12 01:43:22 -05:00 |
|
chenfeiz0326
|
54459377d2
|
[TRTLLM-10248][feat] Support Bot to Send Perf Regression Msg to Slack Channel (#10489)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
|
2026-01-12 14:23:23 +08:00 |
|
Eran Geva
|
c5d5af9e7f
|
[#8391][chore] removed llama and added deepseek to AutoDeploy's L0 perf test (#10585)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
|
2026-01-11 16:31:24 -05:00 |
|
fredricz-20070104
|
f6045fac09
|
[None][chore] Fix Gitlab CI termination issues (#10576)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
|
2026-01-10 07:51:18 -05:00 |
|
William Zhang
|
ff7eb93f31
|
[https://nvbugs/5669097][tests] Add MMMU test for mistral small (#10530)
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
|
2026-01-09 16:09:28 -08:00 |
|
yingguo-trt
|
d80f01d205
|
[None][feat] Add support for DeepSeek v3.2 tests (#10561)
Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>
|
2026-01-09 10:20:29 -05:00 |
|
Yechan Kim
|
7295af68ba
|
[None][fix] Enable AttentionDP on Qwen3-VL and fix test (#10435)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
|
2026-01-10 00:13:26 +09:00 |
|
Jie Li
|
627d306df9
|
[None][chore] remove some model support; add device constraint (#10563)
Signed-off-by: Jie Li <lijie@nvidia.com>
|
2026-01-09 09:36:23 -05:00 |
|
ruodil
|
2b72d33fdc
|
[TRTLLM-9932][test] add kimi_k2 single node perf test (#10436)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
|
2026-01-09 05:36:50 -05:00 |
|
JadoTu
|
4c498bfe58
|
[TRTLLM-9676][fix] Fix mamba_cache_manager when enabling cuda_graph_padding and let test cover this case (#9873)
Signed-off-by: JadoTu <107457950+JadoTu@users.noreply.github.com>
|
2026-01-09 14:50:16 +08:00 |
|
ruodil
|
d707286ca8
|
[None][test] restrict max_num_tokens in disagg mtp config (#10442)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
|
2026-01-08 21:53:24 -05:00 |
|
bhsueh_NV
|
bea61bb17d
|
[None][fix] Mistral large 3 few code refine (#10405)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
|
2026-01-08 06:38:49 -05:00 |
|
Emma Qiao
|
43839c7d9b
|
[TRTLLM-9642][infra] Increase pytest verbosity for failed tests (#9657)
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
|
2026-01-08 02:33:48 -05:00 |
|
Barry Kang
|
f57aab5255
|
[https://nvbugs/5775402][fix] Fix concurrency list in Wide-EP perf tests (#10529)
Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>
|
2026-01-08 01:58:55 -05:00 |
|
yingguo-trt
|
f8b2a8fd30
|
[None][chore] Support multiple job submission at the same time (#10492)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Co-authored-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
|
2026-01-07 21:51:36 -05:00 |
|
yingguo-trt
|
cbf8357e5f
|
[https://nvbugs/5726086][fix] update kimi-k2-1k1k dataset (#10473)
Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>
|
2026-01-07 01:24:08 -05:00 |
|
Fanrong Li
|
a34aa63685
|
[https://nvbugs/5767223][feat] add pp support for DeepSeek-v3.2 (#10449)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2026-01-07 12:29:51 +08:00 |
|
Ivy Zhang
|
4a1b2e23b3
|
[https://nvbugs/5698434][test] add qwen3-4b accuracy test case (#10382)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2026-01-06 21:56:34 -05:00 |
|
JunyiXu-nv
|
7d62773c6c
|
[https://nvbugs/5760726][fix] Use random port in container port section (#10432)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2026-01-06 23:25:46 +08:00 |
|
Ivy Zhang
|
1e828587e5
|
[TRTLLM-9896][test] add vswa test cases coverage (#10146)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2026-01-06 02:02:29 -05:00 |
|
Ivy Zhang
|
22a1d31a27
|
[None][test] update test case constraint (#10381)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2026-01-06 12:28:59 +08:00 |
|
chenfeiz0326
|
8a04c05079
|
[None][fix] Only Use Throughput Metrics to Check Regression (#10404)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
|
2026-01-06 09:21:15 +08:00 |
|
Mike Iovine
|
91ff46d418
|
[https://nvbugs/5745152][fix] Unwaive gpt oss spec decode test (#10370)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2026-01-05 16:06:58 -05:00 |
|
Mike Iovine
|
7a2dab8e85
|
[https://nvbugs/5695984][fix] Unwaive llama3 eagle test (#10092)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2026-01-05 16:03:35 -05:00 |
|
Mike Iovine
|
db2614ef10
|
[https://nvbugs/5772414][fix] Fix draft token tree depth=1 corner case (#10385)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2026-01-05 17:20:14 +01:00 |
|
Gal Hubara-Agam
|
e98c27ee4f
|
[TRTLLM-10053][feat] AutoDeploy: Add Super v3 config file, improve test runtime (#10397)
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
|
2026-01-05 18:17:27 +02:00 |
|
Balaram Buddharaju
|
a792c23dcf
|
[TRTLLM-9465][fix] Swap TP-CP grouping order (#10350)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2026-01-05 20:08:03 +08:00 |
|
xinhe-nv
|
b1733d56f6
|
[TRTLLM-9381][test] add disag-serving kimi k2 thinking tests (#10357)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2026-01-05 05:15:52 -05:00 |
|
Fanrong Li
|
4931c5eb3a
|
[None][feat] update deepgemm to the DeepGEMM/nv_dev branch (#9898)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2026-01-05 16:43:42 +08:00 |
|
HuiGao-NV
|
2f768b76f8
|
[https://nvbugs/5715568][fix] Force release torch memory when LLM is destroyed (#10314)
Signed-off-by: Hui Gao <huig@nvidia.com>
|
2026-01-05 15:30:18 +08:00 |
|
Fanrong Li
|
b5a1e10bc0
|
[https://nvbugs/5779534][fix] fix buffer reuse for CUDA graph attention metadata (#10393)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2026-01-05 09:43:44 +08:00 |
|
Wanli Jiang
|
da0830670a
|
[TRTLLM-10065][feat] Add accuracy tests for super-v3 with multiple-gpus (#10234)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2026-01-05 09:41:49 +08:00 |
|
Lizhi Zhou
|
82c1ba84a7
|
[https://nvbugs/5649010][fix] use 0 port as arbitrary port when disagg service discovery is enabled (#10383)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2026-01-05 09:40:40 +08:00 |
|
chenfeiz0326
|
a65b0d4efa
|
[None][fix] Decrease Pre Merge Perf Tests (#10390)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
Co-authored-by: Yanchao Lu <yanchaol@nvidia.com>
|
2026-01-04 12:21:34 -05:00 |
|
Yanchao Lu
|
c4f27fa4c0
|
[None][ci] Some tweaks for the CI pipeline (#10359)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
|
2026-01-04 11:10:47 -05:00 |
|
dongfengy
|
afc533193d
|
[None][feat] Support nvfp4 for gptoss (#8956)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
|
2026-01-04 08:57:44 -05:00 |
|
Yuxian Qiu
|
6ba04eba06
|
[https://nvbugs/5748683][fix] Use get_free_port_in_ci to avoid port conflict. (#10392)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
|
2026-01-04 19:04:58 +08:00 |
|