mpikulski
|
2d45b482e0
|
Merge 01cf98132a into 6df2c8a074
|
2026-01-13 21:25:08 +08:00 |
|
benzh-2025
|
6df2c8a074
|
[None][feat] add fp4 gemm + allreduce (#9729)
Signed-off-by: benzh
Signed-off-by: benzh-2025
|
2026-01-13 21:11:13 +08:00 |
|
Guoming Zhang
|
c1b0b7350f
|
[None][test] Unwaive qwen3 next test case. (#9877)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2026-01-13 20:42:31 +08:00 |
|
Tailing Yuan
|
38296a472b
|
[None][feat] Layer-wise benchmarks: make model init more general and support weights loading (#10562)
Signed-off-by: Tailing Yuan <yuantailing@gmail.com>
|
2026-01-13 19:17:03 +08:00 |
|
mpikulski
|
50c78179dd
|
[TRTLLM-8425][doc] document Torch Sampler details (#10606)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-13 12:01:20 +01:00 |
|
Erin
|
55580f8ec1
|
[NVBUG-5670458][chore] Unwaive lp tests (#10524)
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Signed-off-by: Erin <14718778+hchings@users.noreply.github.com>
|
2026-01-13 04:31:27 -05:00 |
|
Void
|
7d16f3a28b
|
[https://nvbugs/5788127][fix] Use uint64_t as the dtype of lamport_buffer_size to avoid overflow (#10499)
Signed-off-by: Yilin Zhang <18275976+yilin-void@users.noreply.github.com>
|
2026-01-13 17:16:22 +08:00 |
|
Guoming Zhang
|
bdaee87895
|
[TRTLLM-10060][feat] Enable attention dp for Nemotron Super v3. (#10347)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2026-01-13 17:13:55 +08:00 |
|
JunyiXu-nv
|
e291a834db
|
[TRTLLM-8462][feat] Support GET/DELETE v1/responses/{response_id} (#9937)
Signed-off-by: Junyi Xu <219237550+JunyiXu-nv@users.noreply.github.com>
|
2026-01-13 03:57:14 -05:00 |
|
Yuxian Qiu
|
04b112651b
|
[None][feat] Hang detection for executor loop and worker. (#10480)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
|
2026-01-13 02:34:32 -05:00 |
|
Yiteng Niu
|
50c22b80d7
|
[None][infra] Update allowlist 2026.01.08 (#10535)
Signed-off-by: Yiteng Niu <6831097+niukuo@users.noreply.github.com>
|
2026-01-13 15:28:53 +08:00 |
|
tburt-nv
|
7d41475954
|
[None][infra] try removing shared cache dir mount (#10609)
Signed-off-by: Tyler Burt <195370667+tburt-nv@users.noreply.github.com>
|
2026-01-13 15:07:12 +08:00 |
|
JennyLiu
|
2967d299fb
|
[TRTLLM-10271][test] Add Spark QA functional and performance cases (#10564)
Signed-off-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
Co-authored-by: Jenny Liu <JennyLiu-nv+JennyLiu@users.noreply.github.com>
|
2026-01-13 13:20:15 +08:00 |
|
TensorRT LLM
|
ba1cb6831d
|
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
|
2026-01-13 03:08:08 +00:00 |
|
fredricz-20070104
|
bbe535fddf
|
[None][chore] Fix disagg assert (#10596)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
|
2026-01-12 21:39:57 -05:00 |
|
xxi
|
ba1037ca4a
|
[https://nvbugs/5762336][fix] support to parse the keyword modules_to_not_convert of the HF model config (#10527)
Signed-off-by: xxi <xxi@nvidia.com>
|
2026-01-12 20:21:01 -05:00 |
|
Iman Tabrizian
|
48b09e5a25
|
[https://nvbugs/5689235][fix] Fix cancellation+chunked prefill+disagg (#10111)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
|
2026-01-12 18:23:26 -05:00 |
|
Gal Hubara-Agam
|
18a33764b5
|
[None][chore] Print correct backend name in benchmark report (#10597)
Signed-off-by: Gal Hubara Agam <96368689+galagam@users.noreply.github.com>
|
2026-01-12 14:46:00 -05:00 |
|
Anish Shanbhag
|
dacc881993
|
[https://nvbugs/5761391][fix] Use correct model names for config database regression tests (#10192)
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
|
2026-01-12 10:55:07 -08:00 |
|
Suyog Gupta
|
a1385243e1
|
[#10580][fix] re-enable NemotronH MOE MMLU test (#10594)
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
|
2026-01-12 09:26:07 -08:00 |
|
ixlmar
|
01cf98132a
|
fix: remove unused import
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-12 17:44:04 +01:00 |
|
ixlmar
|
240eff4bd8
|
fix: conform to upstream
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-12 17:44:04 +01:00 |
|
ixlmar
|
eb9a24e6f0
|
chore: refine MultimodalDataTracker.add_data
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-12 17:44:04 +01:00 |
|
ixlmar
|
e4233671a9
|
address remaining review comments
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-12 17:44:04 +01:00 |
|
ixlmar
|
e42a8f7d64
|
chore: use torch.save/load
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-12 17:44:04 +01:00 |
|
ixlmar
|
db14542c35
|
chore: add is_embedding to MultimodalData
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-12 17:44:04 +01:00 |
|
ixlmar
|
0543bf01fb
|
fix: update docs
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-12 17:44:04 +01:00 |
|
ixlmar
|
bebc2e4317
|
do not run 'test_multimodal_content_mm_encoder' twice
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-12 17:44:04 +01:00 |
|
ixlmar
|
b2a328c706
|
add nested "data"
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-12 17:44:04 +01:00 |
|
ixlmar
|
045331d494
|
fix: run test_single_chat_session_image_embeds on L40S
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-12 17:44:04 +01:00 |
|
ixlmar
|
5e3c26ebfb
|
feat: support image_embeds in OpenAI API
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-12 17:44:04 +01:00 |
|
Emma Qiao
|
9f044b9dd9
|
[None][infra] Waive failed tests for main 01/12 (#10604)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2026-01-12 10:24:54 -05:00 |
|
mpikulski
|
bf7998f1b8
|
[TRTLLM-9522][test] cover LLM API multi_modal_embeddings (#9963)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2026-01-12 11:38:22 +01:00 |
|
Wanli Jiang
|
11da7e3605
|
[None][fix] Solve pillow version conflict (#10537)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2026-01-12 04:05:54 -05:00 |
|
Zhenhuan Chen
|
3bd319dc8e
|
[https://nvbugs/5794796][chore] waive test blocking premerge (#10593)
Signed-off-by: Zhenhuan Chen <zhenhuanc@nvidia.com>
|
2026-01-12 15:39:07 +08:00 |
|
yufeiwu-nv
|
8e806abac3
|
[None][test] Remove most TRT-backend test cases in llm_perf_nim.yml (#10572)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
|
2026-01-12 15:34:55 +08:00 |
|
yingguo-trt
|
c5914f9085
|
[None][chore] update deepseekv3.2 test parameter (#10595)
Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>
|
2026-01-12 01:43:22 -05:00 |
|
chenfeiz0326
|
54459377d2
|
[TRTLLM-10248][feat] Support Bot to Send Perf Regression Msg to Slack Channel (#10489)
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
|
2026-01-12 14:23:23 +08:00 |
|
Xianjie Qiao
|
3a9a00b544
|
[None][feat] Add ExpertStatistic and DUMMY_ALLREDUCE for configurable_moe (#10401)
Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
|
2026-01-12 14:10:31 +08:00 |
|
Jie Li
|
5e0dbba0c9
|
[None][chore]: update waive list (#10577)
Signed-off-by: Jie Li <lijie@nvidia.com>
|
2026-01-11 22:18:04 -05:00 |
|
TensorRT LLM
|
2de22f1a70
|
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
|
2026-01-12 03:09:53 +00:00 |
|
Pengbo Wang
|
c0e25e5418
|
[TRTLLM-10022][feat] Add hopper xqa decode support for skip softmax attention (#10264)
Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>
|
2026-01-11 19:26:10 -05:00 |
|
Eran Geva
|
c5d5af9e7f
|
[#8391][chore] removed llama and added deepseek to AutoDeploy's L0 perf test (#10585)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
|
2026-01-11 16:31:24 -05:00 |
|
Ivy Zhang
|
7f018c89e9
|
[None][test] update core test list (#10538)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
|
2026-01-11 14:08:20 -05:00 |
|
Yechan Kim
|
8e0d20d901
|
[TRTLLM-10195][feat] K-EXAONE support (#10355)
Signed-off-by: Jaedeok Kim <jaedeokk@nvidia.com>
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: Jaedeok Kim <jaedeokk@nvidia.com>
|
2026-01-12 00:29:51 +09:00 |
|
Yanchao Lu
|
80649a8b78
|
[None][ci] Workaround OCI-NRT slowdown issue (#10587)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
|
2026-01-11 22:08:19 +08:00 |
|
Guoming Zhang
|
0371cbfd88
|
[None][doc] Update Qwen3-Next doc by adding known issues section (#10582)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2026-01-11 14:47:47 +08:00 |
|
TensorRT LLM
|
b2e2538fcd
|
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
|
2026-01-11 03:07:48 +00:00 |
|
HuiGao-NV
|
3c65ec3c55
|
[None][chore] waive test case (#10581)
Signed-off-by: Hui Gao <huig@nvidia.com>
|
2026-01-10 18:53:36 -05:00 |
|
fredricz-20070104
|
f6045fac09
|
[None][chore] Fix Gitlab CI termination issues (#10576)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
|
2026-01-10 07:51:18 -05:00 |
|