bhsueh_NV
|
d3059dbd8a
|
[https://nvbugs/5547416][fix] unwaive no_cache test (#8213)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
|
2025-10-10 01:50:13 -07:00 |
|
xinhe-nv
|
b555f1ff98
|
[None][chore] Add failed cases into waives.txt (#8229)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-10-09 23:45:28 -07:00 |
|
HuiGao-NV
|
795a051765
|
[None][chore] Print log with time for starting to load safetensor weights (#8218)
Signed-off-by: Hui Gao <huig@nvidia.com>
|
2025-10-10 13:54:54 +08:00 |
|
xinhe-nv
|
e8c9bae37e
|
[None][chore] Remove closed bugs (#8151)
Signed-off-by: Xin He (SW-GPU) <200704525+xinhe-nv@users.noreply.github.com>
|
2025-10-10 16:39:40 +11:00 |
|
Jonas Li
|
76a47c7bef
|
[None][fix] Enable FP8 ContextMLA on GB300 (#8080)
Signed-off-by: Jonas Li <6110159+longlee0622@users.noreply.github.com>
|
2025-10-10 10:20:46 +08:00 |
|
Pengbo Wang
|
7da4b05289
|
[https://nvbugs/5501820][fix] Add requirements for numba-cuda version to WAR mem corruption (#7992)
Signed-off-by: Pengbo Wang <221450789+pengbowang-nv@users.noreply.github.com>
|
2025-10-10 10:18:27 +08:00 |
|
mpikulski
|
7b6803b6e9
|
[TRTLLM-7769][chore] document the role of 'd2t' (#8174)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2025-10-09 13:13:50 -04:00 |
|
Emma Qiao
|
ccd949ea5b
|
[None][infra] Waive failed tests on main 10/09 (#8230)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-10-09 22:46:07 +08:00 |
|
amitz-nv
|
d560054e1b
|
[None][chore] Restore asserts in pytorch flow LoRA tests (#8227)
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
|
2025-10-09 17:10:38 +03:00 |
|
QI JUN
|
e10121345e
|
[None][ci] pin flashinfer-python version (#8217)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-10-09 02:48:49 -07:00 |
|
Guoming Zhang
|
a193867f8f
|
[None][doc] Refine deployment guide by renaming TRT-LLM to TensorRT L… (#8214)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-10-09 17:11:24 +08:00 |
|
bhsueh_NV
|
27677a36f5
|
[https://nvbugs/5516666][fix] unwaive some Qwen3 CI tests (#8130)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
|
2025-10-09 09:44:58 +08:00 |
|
Lizhi Zhou
|
fdf29ab8fa
|
[TRTLLM-7846][feat] Http disagg-cluster management implementation (#7869)
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>
|
2025-10-09 09:44:01 +08:00 |
|
QI JUN
|
6884d06aed
|
[None][ci] move some llama4 test cases to pre merge (#8189)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-10-08 18:34:08 -07:00 |
|
dongfengy
|
9f2a3ae88c
|
[None][fix] Restrict tinygemm use to certain SMs (#8182)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>
|
2025-10-08 17:55:57 -07:00 |
|
Liao Lanyu
|
ed8e00ad4a
|
[https://nvbugs/5522746][fix] unwaive tests caused by node issues after rebooting (#8193)
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>
|
2025-10-09 08:45:56 +08:00 |
|
Mike Iovine
|
c88913dc03
|
[https://nvbugs/5541545][fix] Remove test_llama4 (#8031)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2025-10-08 15:20:15 -07:00 |
|
brb-nv
|
80517b7812
|
[None][chore] Waive some tests failing on main post merge (#8186)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-10-08 06:52:30 -07:00 |
|
mpikulski
|
8298e93bd8
|
[TRTLLM-8414][chore] BREAKING CHANGE: refine sampling strategy selection (#8132)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2025-10-08 15:46:50 +02:00 |
|
xxi
|
e98616512f
|
[https://nvbugs/5550283][fix] update test case to the latest MoE API (#8165)
|
2025-10-07 22:54:34 -07:00 |
|
Liao Lanyu
|
d57b8f0951
|
[https://nvbugs/5455140][fix] unwaive tests related to GB200 OOM (#8159)
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
Co-authored-by: Lanyu Liao <lancelly@users.noreply.github.com>
|
2025-10-08 13:14:12 +08:00 |
|
ruodil
|
971610e3ff
|
[None][test] add test-model-suites option in integration conftest.py (#8016)
Signed-off-by: Ruodi Lu <ruodil@users.noreply.github.com>
Co-authored-by: Ruodi Lu <ruodil@users.noreply.github.com>
|
2025-10-08 10:38:31 +08:00 |
|
Sergey Klevtsov
|
017583a949
|
[https://nvbugs/5488576][fix] Propagate disable_finalize_fusion config flag in WIDEEP MoE backend (#8141)
Signed-off-by: Sergey Klevtsov <sklevtsov@nvidia.com>
|
2025-10-07 14:44:54 -07:00 |
|
Mike Iovine
|
7facac077b
|
[None][fix] Fix MTP illegal memory access (#8161)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2025-10-07 14:02:55 -04:00 |
|
Emma Qiao
|
ca9da1f1c2
|
[None][infra] Skip failed cases for main (#8176)
Signed-off-by: qqiao <qqiao@nvidia.com>
|
2025-10-07 06:37:51 -07:00 |
|
xiweny
|
9298f1bdcc
|
[None] [test] Add B300 cases to CI (#8056)
Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>
|
2025-10-06 19:23:31 -07:00 |
|
Kanghwan
|
2b8722b671
|
[None][chore] Increase operations-per-run to 1000 for stale action (#8162)
Signed-off-by: Kanghwan Jang <861393+karljang@users.noreply.github.com>
|
2025-10-06 15:02:43 -07:00 |
|
Faraz
|
27a5091fcb
|
[None][feat] GPT-OSS Sm120/Sm121 Support (#7937)
Signed-off-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Signed-off-by: list <58580514+farazkh80@users.noreply.github.com>
Signed-off-by: Vincent Huang <vincenth@nvidia.com>
Co-authored-by: Perkz Zheng <67892460+PerkzZheng@users.noreply.github.com>
Co-authored-by: Vincent Huang <vincenth@nvidia.com>
|
2025-10-06 16:59:06 -04:00 |
|
Izzy Putterman
|
f2657c1ae9
|
[None][fix] Eagle: Attention DP (#7939)
Signed-off-by: Izzy Putterman <iputterman@nvidia.com>
|
2025-10-06 16:52:35 -04:00 |
|
Lucas Liebenwein
|
3492391feb
|
[None][chore] AutoDeploy: clean up accuracy test configs (#8134)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2025-10-06 12:51:01 -07:00 |
|
mpikulski
|
98b3af4d4e
|
[TRTLLM-8413][chore] resolve sampling defaults in OpenAI API backend (#8121)
Signed-off-by: ixlmar <206748156+ixlmar@users.noreply.github.com>
|
2025-10-06 06:09:43 -07:00 |
|
Yan Chunwei
|
54ab9767b5
|
[None][chore] fix llmargs conflict (#8152)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
|
2025-10-06 02:34:27 -07:00 |
|
Patrice Castonguay
|
fba351a211
|
[None][fix] Adding docker folder to Dockerfile (#8138)
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
|
2025-10-05 13:41:40 -04:00 |
|
amitz-nv
|
8060aad239
|
[https://nvbugs/5521949][fix] Re-enable test_bielik_11b_v2_2_instruct_multi_lora, fix its API use with pytorch flow LoRA (#8146)
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
|
2025-10-05 04:28:20 -07:00 |
|
Yan Chunwei
|
fb51de6c2e
|
[TRTLLM-8189][chore] enhance GenerationExecutor with RPC (part1) (#5543)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: chunweiy <chunweiy@nvidia.com>
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: chunweiy <328693+Superjomn@users.noreply.github.com>
|
2025-10-05 17:28:20 +08:00 |
|
Frida Hou
|
f6654f26a4
|
[#5255][autodeploy] Update FuseAllreduceResidualRMSNorm to use pattern matcher utility; remove fuse_collective (#7545)
Signed-off-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com>
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
|
2025-10-05 01:15:46 -07:00 |
|
Frida Hou
|
744246d316
|
[None][autodeploy] small refactors on attention matching (#8079)
Signed-off-by: Frida Hou <201670829+Fridah-nv@users.noreply.github.com>
Signed-off-by: Fridah-nv <201670829+Fridah-nv@users.noreply.github.com>
|
2025-10-03 22:00:27 -07:00 |
|
Jonas Yang CN
|
88ea2c4ee9
|
[TRTLLM-7349][feat] Adding new orchestrator type -- ray (#7520)
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
|
2025-10-04 08:12:24 +08:00 |
|
Lucas Liebenwein
|
9d098e3142
|
[None][feat] AutoDeploy: graph/module inputs with kwargs instead of args (#8137)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2025-10-03 16:53:42 -07:00 |
|
Lucas Liebenwein
|
2c454e8003
|
[None][feat] AutoDeploy: Nemotron-H accuracy test (#8133)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2025-10-03 15:39:03 -07:00 |
|
Michal Guzek
|
38da871db3
|
[TRTLLM-6496][feat] Add LoRA Torch tests for the latest NIM model list (#6806)
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
|
2025-10-03 12:10:48 -07:00 |
|
Mike Iovine
|
ca8291133a
|
[None][fix] Fix MTP 2-model (#8115)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
|
2025-10-03 10:13:50 -07:00 |
|
Lucas Liebenwein
|
aaf2c3c2e5
|
[None][feat] AutoDeploy: compiler backends based on nn.Module (#8126)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2025-10-03 12:14:21 -04:00 |
|
Ziyi Xiong
|
7bc2d9e993
|
[https://nvbugs/5537878][fix] Reserve an extra slot for padded batch (#7998)
Signed-off-by: ziyixiong-nv <219238287+ziyixiong-nv@users.noreply.github.com>
|
2025-10-03 08:42:52 -07:00 |
|
Suyog Gupta
|
d8215241d8
|
[None][feat] AutoDeploy add autotuning when capturing cudagraphs (#8120)
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
|
2025-10-03 08:33:21 -07:00 |
|
Aurelien Chartier
|
9db4366903
|
[None][fix] Fix Qwen3 FP8 per-tensor when requesting TRTLLM-GEN MoE backend (#8075)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
|
2025-10-03 07:52:52 -07:00 |
|
Lucas Liebenwein
|
5faa5e9dd8
|
[None][feat] AutoDeploy: dive deeper into token generation bugs + enable_block_reuse (#8108)
Signed-off-by: Lucas Liebenwein <11156568+lucaslie@users.noreply.github.com>
|
2025-10-03 04:57:26 -07:00 |
|
Robin Kobus
|
e2f69c5c23
|
[None] [refactor] Minor cleanup and improvements (#7619)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
|
2025-10-03 11:40:06 +02:00 |
|
Erin
|
ba3dbb6c94
|
[https://nvbugs/5548098][fix] Fix flakey unit test for dynamic spec d… (#8129)
|
2025-10-02 22:58:37 -07:00 |
|
Nikita Korobov
|
9b3d7cc3e6
|
[None][feat] Update TRT-LLM Gen MoE kernels (#7970)
Signed-off-by: Nikita Korobov <14355239+nekorobov@users.noreply.github.com>
|
2025-10-03 09:22:45 +08:00 |
|