Balaram Buddharaju
|
c7322d95d6
|
Merge 31f2ecd3cb into 6df2c8a074
|
2026-01-13 05:11:17 -08:00 |
|
benzh-2025
|
6df2c8a074
|
[None][feat] add fp4 gemm + allreduce (#9729)
Signed-off-by: benzh
Signed-off-by: benzh-2025
|
2026-01-13 21:11:13 +08:00 |
|
Guoming Zhang
|
c1b0b7350f
|
[None][test] Unwaive qwen3 next test case. (#9877)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2026-01-13 20:42:31 +08:00 |
|
Tailing Yuan
|
38296a472b
|
[None][feat] Layer-wise benchmarks: make model init more general and support weights loading (#10562)
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
|
2026-01-13 19:17:03 +08:00 |
|
mpikulski
|
50c78179dd
|
[TRTLLM-8425][doc] document Torch Sampler details (#10606)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-13 12:01:20 +01:00 |
|
Erin
|
55580f8ec1
|
[NVBUG-5670458][chore] Unwaive lp tests (#10524)
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Signed-off-by: Erin <14718778+hchings@users.noreply.github.com>
|
2026-01-13 04:31:27 -05:00 |
|
Void
|
7d16f3a28b
|
[https://nvbugs/5788127][fix] Use uint64_t as the dtype of lamport_buffer_size to avoid overflow (#10499)
Signed-off-by: Yilin Zhang <18275976+yilin-void@users.noreply.github.com>
|
2026-01-13 17:16:22 +08:00 |
|
Guoming Zhang
|
bdaee87895
|
[TRTLLM-10060][feat] Enable attention dp for Nemotron Super v3. (#10347)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2026-01-13 17:13:55 +08:00 |
|
JunyiXu-nv
|
e291a834db
|
[TRTLLM-8462][feat] Support GET/DELETE v1/responses/{response_id} (#9937)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2026-01-13 03:57:14 -05:00 |
|
Yuxian Qiu
|
04b112651b
|
[None][feat] Hang detection for executor loop and worker. (#10480)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
|
2026-01-13 02:34:32 -05:00 |
|
Yiteng Niu
|
50c22b80d7
|
[None][infra] Update allowlist 2026.01.08 (#10535)
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
|
2026-01-13 15:28:53 +08:00 |
|
tburt-nv
|
7d41475954
|
[None][infra] try removing shared cache dir mount (#10609)
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
|
2026-01-13 15:07:12 +08:00 |
|
JennyLiu
|
2967d299fb
|
[TRTLLM-10271][test] Add Spark QA functional and performance cases (#10564)
Signed-off-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
Co-authored-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
|
2026-01-13 13:20:15 +08:00 |
|
Balaram Buddharaju
|
31f2ecd3cb
|
address comments from Jin, Chuang and Yuxian
|
2026-01-13 03:28:51 +00:00 |
|
TensorRT LLM
|
ba1cb6831d
|
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
|
2026-01-13 03:08:08 +00:00 |
|
fredricz-20070104
|
bbe535fddf
|
[None][chore] Fix disagg assert (#10596)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
|
2026-01-12 21:39:57 -05:00 |
|
xxi
|
ba1037ca4a
|
[https://nvbugs/5762336][fix] support to parse the keyword modules_to_not_convert of the HF model config (#10527)
Signed-off-by: xxi <xxi@nvidia.com>
|
2026-01-12 20:21:01 -05:00 |
|
Iman Tabrizian
|
48b09e5a25
|
[https://nvbugs/5689235][fix] Fix cancellation+chunked prefill+disagg (#10111)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2026-01-12 18:23:26 -05:00 |
|
Gal Hubara-Agam
|
18a33764b5
|
[None][chore] Print correct backend name in benchmark report (#10597)
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
|
2026-01-12 14:46:00 -05:00 |
|
Anish Shanbhag
|
dacc881993
|
[https://nvbugs/5761391][fix] Use correct model names for config database regression tests (#10192)
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
|
2026-01-12 10:55:07 -08:00 |
|
Suyog Gupta
|
a1385243e1
|
[#10580][fix] re-enable NemotronH MOE MMLU test (#10594)
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
|
2026-01-12 09:26:07 -08:00 |
|
Emma Qiao
|
9f044b9dd9
|
[None][infra] Waive failed tests for main 01/12 (#10604)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2026-01-12 10:24:54 -05:00 |
|
mpikulski
|
bf7998f1b8
|
[TRTLLM-9522][test] cover LLM API multi_modal_embeddings (#9963)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-12 11:38:22 +01:00 |
|
Wanli Jiang
|
11da7e3605
|
[None][fix] Solve pillow version conflict (#10537)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2026-01-12 04:05:54 -05:00 |
|
Zhenhuan Chen
|
3bd319dc8e
|
[https://nvbugs/5794796][chore] waive test blocking premerge (#10593)
Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
|
2026-01-12 15:39:07 +08:00 |
|
yufeiwu-nv
|
8e806abac3
|
[None][test] Remove most TRT-backend test cases in llm_perf_nim.yml (#10572)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2026-01-12 15:34:55 +08:00 |
|
yingguo-trt
|
c5914f9085
|
[None][chore] update deepseekv3.2 test parameter (#10595)
Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>
|
2026-01-12 01:43:22 -05:00 |
|
chenfeiz0326
|
54459377d2
|
[TRTLLM-10248][feat] Support Bot to Send Perf Regression Msg to Slack Channel (#10489)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
|
2026-01-12 14:23:23 +08:00 |
|
Xianjie Qiao
|
3a9a00b544
|
[None][feat] Add ExpertStatistic and DUMMY_ALLREDUCE for configurable_moe (#10401)
Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
|
2026-01-12 14:10:31 +08:00 |
|
Jie Li
|
5e0dbba0c9
|
[None][chore]: update waive list (#10577)
Signed-off-by: Jie Li <lijie@nvidia.com>
|
2026-01-11 22:18:04 -05:00 |
|
TensorRT LLM
|
2de22f1a70
|
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
|
2026-01-12 03:09:53 +00:00 |
|
Balaram Buddharaju
|
4e456350c0
|
update test lists
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2026-01-11 18:10:12 -08:00 |
|
Pengbo Wang
|
c0e25e5418
|
[TRTLLM-10022][feat] Add hopper xqa decode support for skip softmax attention (#10264)
Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>
|
2026-01-11 19:26:10 -05:00 |
|
Balaram Buddharaju
|
8e1e538ad0
|
replace tp_allgather with tp_cp_allgather where apt
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2026-01-11 15:54:42 -08:00 |
|
Balaram Buddharaju
|
9f8ee9800d
|
simplify further
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2026-01-11 15:51:32 -08:00 |
|
Balaram Buddharaju
|
0452a58a32
|
append group_entries[0]
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2026-01-11 15:51:31 -08:00 |
|
Balaram Buddharaju
|
2b746f0ad4
|
condition on tp/cp size for allgather
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2026-01-11 15:51:30 -08:00 |
|
Balaram Buddharaju
|
2fe16ab305
|
cleaner mapping definitions
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2026-01-11 15:51:28 -08:00 |
|
Balaram Buddharaju
|
9eb8e60c29
|
Simplify by removing comms repurpose
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2026-01-11 15:51:27 -08:00 |
|
Balaram Buddharaju
|
61f63d06b8
|
[TRTLLM-10264][feat] Enable attention DP + Helix CP
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2026-01-11 15:51:22 -08:00 |
|
Eran Geva
|
c5d5af9e7f
|
[#8391][chore] removed llama and added deepseek to AutoDeploy's L0 perf test (#10585)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
|
2026-01-11 16:31:24 -05:00 |
|
Ivy Zhang
|
7f018c89e9
|
[None][test] update core test list (#10538)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2026-01-11 14:08:20 -05:00 |
|
Yechan Kim
|
8e0d20d901
|
[TRTLLM-10195][feat] K-EXAONE support (#10355)
Signed-off-by: Jaedeok Kim <jaedeokk@nvidia.com>
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: Jaedeok Kim <jaedeokk@nvidia.com>
|
2026-01-12 00:29:51 +09:00 |
|
Yanchao Lu
|
80649a8b78
|
[None][ci] Workaround OCI-NRT slowdown issue (#10587)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
|
2026-01-11 22:08:19 +08:00 |
|
Guoming Zhang
|
0371cbfd88
|
[None][doc] Update Qwen3-Next doc by adding known issues section (#10582)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2026-01-11 14:47:47 +08:00 |
|
TensorRT LLM
|
b2e2538fcd
|
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
|
2026-01-11 03:07:48 +00:00 |
|
HuiGao-NV
|
3c65ec3c55
|
[None][chore] waive test case (#10581)
Signed-off-by: Hui Gao <huig@nvidia.com>
|
2026-01-10 18:53:36 -05:00 |
|
fredricz-20070104
|
f6045fac09
|
[None][chore] Fix Gitlab CI termination issues (#10576)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
|
2026-01-10 07:51:18 -05:00 |
|
tcherckez-nvidia
|
f6c4dd885f
|
[None][chore] Update AutoDeploy model list (#10505)
Signed-off-by: Tal Cherckez <127761168+tcherckez-nvidia@users.noreply.github.com>
|
2026-01-10 08:47:37 +02:00 |
|
TensorRT LLM
|
6ab996d635
|
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
|
2026-01-10 03:09:09 +00:00 |
|