Ivy Zhang
7f018c89e9
[None][test] update core test list (#10538)
Signed-off-by: Ivy Zhang <25222398+crazydemo@users.noreply.github.com>
2026-01-11 14:08:20 -05:00
Yechan Kim
8e0d20d901
[TRTLLM-10195][feat] K-EXAONE support (#10355)
Signed-off-by: Jaedeok Kim <jaedeokk@nvidia.com>
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: Jaedeok Kim <jaedeokk@nvidia.com>
2026-01-12 00:29:51 +09:00
Yanchao Lu
80649a8b78
[None][ci] Workaround OCI-NRT slowdown issue (#10587)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2026-01-11 22:08:19 +08:00
Guoming Zhang
0371cbfd88
[None][doc] Update Qwen3-Next doc by adding known issues section (#10582)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2026-01-11 14:47:47 +08:00
TensorRT LLM
b2e2538fcd
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2026-01-11 03:07:48 +00:00
HuiGao-NV
3c65ec3c55
[None][chore] waive test case (#10581)
Signed-off-by: Hui Gao <huig@nvidia.com>
2026-01-10 18:53:36 -05:00
fredricz-20070104
f6045fac09
[None][chore] Fix Gitlab CI termination issues (#10576)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Signed-off-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
Co-authored-by: yufeiwu-nv <230315618+yufeiwu-nv@users.noreply.github.com>
2026-01-10 07:51:18 -05:00
tcherckez-nvidia
f6c4dd885f
[None][chore] Update AutoDeploy model list (#10505)
Signed-off-by: Tal Cherckez <127761168+tcherckez-nvidia@users.noreply.github.com>
2026-01-10 08:47:37 +02:00
TensorRT LLM
6ab996d635
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2026-01-10 03:09:09 +00:00
William Zhang
ff7eb93f31
[https://nvbugs/5669097][tests] Add MMMU test for mistral small (#10530)
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2026-01-09 16:09:28 -08:00
Chenghao Zhang
38f249b479
[https://nvbugs/5548861][fix] AutoDeploy: Fix the test (#10521)
Signed-off-by: Chenghao Zhang <211069071+nvchenghaoz@users.noreply.github.com>
2026-01-09 13:30:24 -08:00
Linda
82dfef2e56
[https://nvbugs/5628848][fix] Fix nanobind stub generation (#10516)
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
2026-01-09 11:32:21 -08:00
Faraz
fdbdbba540
[https://nvbugs/5752687][fix] Choose register model config over root config for VLM (#10553)
Signed-off-by: Faraz Khoubsirat <58580514+farazkh80@users.noreply.github.com>
2026-01-09 12:10:52 -05:00
yingguo-trt
d80f01d205
[None][feat] Add support for DeepSeek v3.2 tests (#10561)
Signed-off-by: yingguo-trt <244492186+yingguo-trt@users.noreply.github.com>
2026-01-09 10:20:29 -05:00
Yechan Kim
7295af68ba
[None][fix] Enable AttentionDP on Qwen3-VL and fix test (#10435)
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2026-01-10 00:13:26 +09:00
Kaiyu Xie
1c69aad850
[TRTLLM-10309] [feat] Optimize qk rope/nope concat for DSA (#10571)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2026-01-09 09:50:57 -05:00
Iman Tabrizian
ced88424ef
[https://nvbugs/5756008][fix] unwaive test (#10523)
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2026-01-09 09:40:07 -05:00
Jie Li
627d306df9
[None][chore] remove some model support; add device constraint (#10563)
Signed-off-by: Jie Li <lijie@nvidia.com>
2026-01-09 09:36:23 -05:00
ruodil
2b72d33fdc
[TRTLLM-9932][test] add kimi_k2 single node perf test (#10436)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2026-01-09 05:36:50 -05:00
Fanrong Li
4632a8642d
[None][doc] blog: Optimizing DeepSeek-V3.2 on NVIDIA Blackwell GPUs (#10565)
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2026-01-09 05:16:00 -05:00
Yuxian Qiu
80f261ea36
[https://nvbugs/5622938][feat] Run sample_async on extra stream. (#10215)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2026-01-09 18:15:18 +08:00
Chang Liu
78bb245554
[https://nvbugs/5787453][fix] Better align MLA chunking with indexer chunking when chunked prefill enabled for DSV32 (#10552)
2026-01-09 00:49:39 -08:00
bhsueh_NV
4a09acd012
[https://nvbugs/5785206][infra] unwaive the accuracy/test_llm_api_pytorch.py::TestQwen3_30B_A3B (#10560)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2026-01-09 03:13:29 -05:00
JadoTu
4c498bfe58
[TRTLLM-9676][fix] Fix mamba_cache_manager when enabling cuda_graph_padding and let test cover this case (#9873)
Signed-off-by: JadoTu <107457950+JadoTu@users.noreply.github.com>
2026-01-09 14:50:16 +08:00
Yukun He
c5331e6dbb
[None][fix] Setup dist for AutoTuner in Layerwise benchmarking. (#10534)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2026-01-09 14:16:39 +08:00
Jie Li
6fcd4e7099
[None][chore] Add failed cases into waives.txt (#10541)
Signed-off-by: Jie Li <lijie@nvidia.com>
2026-01-09 01:03:47 -05:00
TensorRT LLM
5df03b2ea7
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2026-01-09 03:43:08 +00:00
ruodil
d707286ca8
[None][test] restrict max_num_tokens in disagg mtp config (#10442)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
2026-01-08 21:53:24 -05:00
Yuxian Qiu
afa55c12b6
[None][fix] revert https://github.com/NVIDIA/TensorRT-LLM/pull/10445. (#10547)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2026-01-08 21:50:04 -05:00
Balaram Buddharaju
56e779d09f
[None][chore] Waive tests blocking premerge 01/08 (#10555)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
2026-01-08 20:22:28 -05:00
Mike Iovine
4092a87b6f
[https://nvbugs/5740075][fix] Fix sm120 speculation (#10049)
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2026-01-08 19:55:43 -05:00
Eran Geva
489dd60312
[#10513][fix] AutoDeploy: removed self.mlp_type leftovers from last moe refactor (#10512)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2026-01-08 14:49:40 -05:00
mpikulski
e0331297a6
[TRTLLM-9522][fix] broken cast (#9975)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
2026-01-08 06:47:39 -05:00
William Zhang
c0ae6bbdbe
[None][feat] EPD for Qwen3 VL (#10470)
* Why?
We would like to support EPD disaggregated serving for Qwen3 VL.
* What?
This commit adds such support, and extends existing unit tests for
correctness checks.
Some minor (protected) interface changes had to be made to the
weight mapper as a side-effect.
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2026-01-08 06:45:54 -05:00
Eran Geva
6511dbaea0
[#10417][fix] AutoDepoloy - Reverted to direct computation of minusA (#10509)
Signed-off-by: Eran Geva <19514940+MrGeva@users.noreply.github.com>
2026-01-08 13:43:41 +02:00
bhsueh_NV
bea61bb17d
[None][fix] Mistral large 3 few code refine (#10405)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2026-01-08 06:38:49 -05:00
Yiqing Yan
dc6b743fb6
[None][chore] Bump version to 1.2.0rc8 (#10542)
Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>
2026-01-08 04:51:44 -05:00
Emma Qiao
43839c7d9b
[TRTLLM-9642][infra] Increase pytest verbosity for failed tests (#9657)
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Emma Qiao <qqiao@nvidia.com>
2026-01-08 02:33:48 -05:00
dongfengy
8d4b09dac6
[None][doc] Update GPTOSS Doc (#10536)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
2026-01-08 02:30:53 -05:00
HuiGao-NV
22c81cb5fa
[None][chore] Enable seg fault cases since one race condition is fixed (#10398)
Signed-off-by: Hui Gao <huig@nvidia.com>
2026-01-08 02:15:30 -05:00
Barry Kang
f57aab5255
[https://nvbugs/5775402][fix] Fix concurrency list in Wide-EP perf tests (#10529)
Signed-off-by: Barry Kang <43644113+Barry-Delaney@users.noreply.github.com>
2026-01-08 01:58:55 -05:00
Lucas Liebenwein
30f8455d29
[https://nvbugs/5747878][fix] unwaive llama4 scout tests (#10468)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-07 23:33:45 -05:00
TensorRT LLM
342a47bf47
[None][infra] Check in most recent lock file from nightly pipeline
Signed-off-by: TensorRT LLM <90828364+tensorrt-cicd@users.noreply.github.com>
2026-01-08 03:12:45 +00:00
yingguo-trt
f8b2a8fd30
[None][chore] Support multiple job submission at the same time (#10492)
Signed-off-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
Co-authored-by: FredricZ-2007 <226039983+fredricz-20070104@users.noreply.github.com>
2026-01-07 21:51:36 -05:00
Yuxian Qiu
b85c447ceb
[https://nvbugs/5784543][fix] Setup dist before using autotuner. (#10491)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2026-01-08 10:32:50 +08:00
Yukun He
09d9878385
[TRTLLM-9661][chore] Further reduce tuning time for cuteDSL nvFP4 dense gemm. (#10339)
Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com>
2026-01-08 10:21:02 +08:00
xxi
81f878c279
[https://nvbugs/5707392][fix] unwaive test_fused_moe_fp8_blockwise_wide_ep[NotEnabled] (#10428)
Signed-off-by: xxi <xxi@nvidia.com>
2026-01-08 09:17:59 +08:00
Lucas Liebenwein
d736c7f290
[https://nvbugs/5761665][fix] AutoDeploy: handle bugs for 25.12 dlfw upgrade (#10511)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
2026-01-07 20:16:53 -05:00
Ziyi Xiong
7187afe7b9
[https://nvbugs/5781589][fix] Skip spec dec for non-last rank (#10445)
Signed-off-by: ziyixiong-nv <219238287+ziyixiong-nv@users.noreply.github.com>
2026-01-07 13:55:45 -05:00
Patrice Castonguay
e8cceb06b2
[None][doc] Adding parallelism types in feature combination matrix (#9849)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2026-01-07 12:52:05 -05:00