Commit Graph

  • b3d0f4a883 add log_samples and output_path for trtllm_eval Zhenhuan Chen 2026-01-11 17:22:49 -0800
  • 3f9421ac3d Use old lm_eval version to avoid a bug for custom config Zhenhuan Chen 2026-01-12 20:51:38 -0800
  • f610f1b69c [None][chore] improve the readability of log for cutlass can only support fp8-blockwise on hopper xxi 2026-01-13 01:25:43 -0800
  • c9e0a07932 Merge branch 'main' into user/qa/post_update_waive_20260112_LLM_FUNCTION_CLUSTER_TEST_1215 xinhe-nv 2026-01-13 17:22:42 +0800
  • 6c0a33406a add test into qa test list Xin He (SW-GPU) 2026-01-13 15:39:26 +0800
  • f2fe0da825 Merge f08d4e3aae into 7d16f3a28b Yi Zhang 2026-01-13 09:16:55 +0000
  • 0407b94037 Merge e5f8fcf158 into 7d16f3a28b xinhe-nv 2026-01-13 09:16:53 +0000
  • 3e60515191 Merge 1937e725a2 into 7d16f3a28b Bo Deng 2026-01-13 09:16:48 +0000
  • 7d6fa88075 Merge 33e2a8bfc8 into 7d16f3a28b Zongfei Jing 2026-01-13 09:16:27 +0000
  • 7d16f3a28b [https://nvbugs/5788127][fix] Use uint64_t as the dtype of lamport_buffer_size to avoid overflow (#10499) Void 2026-01-13 17:16:22 +0800
  • 7192210b1e revert changes Perkz Zheng 2026-01-05 04:31:59 +0000
  • b0e802e50b fix Perkz Zheng 2026-01-04 05:29:48 +0000
  • bdaee87895 [TRTLLM-10060][feat] Enable attention dp for Nemotron Super v3. (#10347) Guoming Zhang 2026-01-13 17:13:55 +0800
  • 7077b9133b Add DeepEP backend. Bo Li 2026-01-13 09:10:49 +0000
  • 6c29b43313 Merge d891022ebc into e291a834db Lizhi Zhou 2026-01-13 17:07:42 +0800
  • 9900551173 add gpqa accuracy test config for wideep Zhenhuan Chen 2026-01-11 19:19:31 -0800
  • 30a6a40d2d Merge branch 'main' into spark-weekly-newcases Larry Xu 2026-01-13 17:01:38 +0800
  • 3054100ea1 fix compilation error Perkz Zheng 2025-12-24 14:17:20 +0000
  • 55a7b4db1d fix some failing tests Perkz Zheng 2025-12-24 13:06:47 +0000
  • fe96fd7524 fix a compilation error Perkz Zheng 2025-12-24 08:25:14 +0000
  • 557646b453 update trtllm-gen to support groupsTokensHeadsQ Perkz Zheng 2025-12-24 06:12:00 +0000
  • e291a834db [TRTLLM-8462][feat] Support GET/DELETE v1/responses/{response_id} (#9937) JunyiXu-nv 2026-01-13 16:57:14 +0800
  • 3479d73b4f Refine quantization support. Bo Li 2026-01-13 08:49:50 +0000
  • 33e2a8bfc8 Skip finalize_fusion test for DENSEGEMM backend Zongfei Jing 2026-01-13 00:46:01 -0800
  • 5c5aec0e01 Merge 80daa5e153 into 04b112651b Jin Li 2026-01-13 16:44:17 +0800
  • 6d0e83559d Merge 3ba882f7ad into 04b112651b Emma Qiao 2026-01-13 16:41:16 +0800
  • e896179263 Also update scratch in script ZhanruiSunCh 2026-01-13 00:40:28 -0800
  • ba1e63519d Add Nemotron Nano 3 30B FP8 autodeploy perf test Eran Geva 2026-01-13 00:40:20 -0800
  • db09dafbc9 Merge branch 'main' into spark-weekly-newcases Larry Xu 2026-01-13 16:39:38 +0800
  • 65c56a1ed3 Merge branch 'main' into user/qa/post_update_waive_20260112_LLM_FUNCTION_CLUSTER_TEST_1215 xinhe-nv 2026-01-13 16:31:36 +0800
  • 194229ad52 change testlist name and perf yml format Jenny Liu 2026-01-13 08:21:53 +0000
  • 91bc17a32e test and fix for agent Shixiaowei02 2026-01-13 08:19:31 +0000
  • ffe5c41977 Merge 37504c823d into 04b112651b Yibin Li 2026-01-13 09:16:47 +0100
  • 538f8cec1b Merge branch 'main' into user/nzmora/add_mem_logs Gal Hubara-Agam 2026-01-13 10:10:51 +0200
  • ff0ecc1a2a Update waives.txt bhsueh_NV 2026-01-13 16:08:14 +0800
  • 1937e725a2 [none][infra] trigger multi-gpu tests when install_nixl/ucx.sh is modified Bo Deng 2026-01-13 08:04:25 +0000
  • 52bd2f82f2 Merge 2660898db4 into 04b112651b Iman Tabrizian 2026-01-13 16:01:37 +0800
  • 8488cc6c91 Merge 4fee1914a5 into 04b112651b Wanli Jiang 2026-01-13 16:01:16 +0800
  • a16f684e11 sleep zhengd-nv 2026-01-13 04:04:03 +0000
  • d40a314772 Merge 26a3add5af into 04b112651b Wanli Jiang 2026-01-13 16:01:13 +0800
  • 0c51a45224 only waive mooncake+indexerkcache zhengd-nv 2025-12-24 07:29:46 +0000
  • e5f8fcf158 update waive list xinhe-nv 2026-01-13 15:59:34 +0800
  • f08d4e3aae Fix comment Yi Zhang 2026-01-12 03:29:59 +0000
  • cfda206924 Fix CI Yi Zhang 2026-01-09 10:28:38 +0000
  • 9c199efad3 Fix according to comments Yi Zhang 2025-12-16 03:00:51 +0000
  • 34b6baebe6 Avoid Double update for previous batch Yi Zhang 2025-12-11 03:02:45 +0000
  • 8bfddf5651 Revert the support of placement group list from yaml into RayPlacementConfig. Wangshanshan 2026-01-12 23:40:05 -0800
  • 04b112651b [None][feat] Hang detection for executor loop and worker. (#10480) Yuxian Qiu 2026-01-13 15:34:32 +0800
  • 60a621e4e1 Address comment Hui Gao 2026-01-13 07:33:33 +0000
  • 37b193614a Merge 6af3e00364 into 50c22b80d7 Liao Lanyu 2026-01-13 15:28:56 +0800
  • 50c22b80d7 [None][infra] Update allowlist 2026.01.08 (#10535) Yiteng Niu 2026-01-13 15:28:53 +0800
  • 694ef7afb3 Merge c3716885f2 into 7d41475954 Zhanrui Sun 2026-01-13 15:17:31 +0800
  • f8c8e2a342 Merge branch 'main' into user/qa/post_update_waive_20260112_LLM_FUNCTION_CLUSTER_TEST_1215 xinhe-nv 2026-01-13 15:11:18 +0800
  • a6aa11c69e Update waives.txt xinhe-nv 2026-01-13 15:10:38 +0800
  • eb7fb12b76 Merge 4356aea2c2 into 7d41475954 Void 2026-01-13 15:10:21 +0800
  • b46c5c5ba1 Only keep a limited number of performance statistic records Hui Gao 2026-01-09 06:39:13 +0000
  • b3a120de29 Merge aa21da84a7 into 7d41475954 Yanchao Lu 2026-01-13 15:10:06 +0800
  • e0b3e26d1b Merge b3d794f43d into 7d41475954 Yuxian Qiu 2026-01-13 07:09:08 +0000
  • 7d41475954 [None][infra] try removing shared cache dir mount (#10609) tburt-nv 2026-01-13 02:07:12 -0500
  • 2f17268409 fix pre-commit Yibin Li 2026-01-13 06:58:34 +0000
  • 14ad0cb4a7 add tests for serialized handles Yibin Li 2026-01-13 06:57:18 +0000
  • 4e4fa1712c replace pickle.load with RestrictedUnpickler Yibin Li 2026-01-13 06:56:46 +0000
  • a08e8f7bbc update torch_ext API and debugging test for FusedAddRMSNorm JtaoPeng 2026-01-13 06:28:18 +0000
  • 0df36fb73a Merge branch 'main' into emma/waive_tests_main_0107 Emma Qiao 2026-01-13 14:42:58 +0800
  • e536870187 update Chenfei Zhang 2026-01-12 22:32:19 -0800
  • 7973e8f7a8 Merge 477b309897 into 2967d299fb Tian Zheng 2026-01-13 14:29:26 +0800
  • 52b5cc5445 update Chenfei Zhang 2026-01-12 22:27:41 -0800
  • 3c5f97bf57 fix(custom_ops): update candidates for MMA tiling and cluster shapes Zongfei Jing 2026-01-08 06:11:55 -0800
  • 45b468c66e fix(custom_ops): refactor kernel handling Zongfei Jing 2026-01-07 20:44:45 -0800
  • b6b7aa3592 fix(moe): reshape fc1_output_sf for compatibility in DenseGEMMFusedMoE Zongfei Jing 2026-01-07 16:44:41 -0800
  • a0f523c628 refactor(moe): improve FC1/FC2 get_valid_tactics and tuning config Zongfei Jing 2026-01-07 08:11:00 -0800
  • 0f2541586c Enable ray tests shuyix 2026-01-08 05:24:30 -0800
  • 672df6a422 Rename test file to test_moe_densegemm.py Zongfei Jing 2026-01-06 21:12:22 -0800
  • 41f81093e2 Add FC2 kernel integration and unit tests for MoE dense GEMM Zongfei Jing 2026-01-06 20:34:57 -0800
  • 35631a37ad Optimize gen_fc2_alpha with fused kernel Zongfei Jing 2026-01-06 19:00:17 -0800
  • 1df9e8a0fc Add missing __init__.py to moe_as_dense_gemm package Zongfei Jing 2026-01-06 18:46:22 -0800
  • 49d887f521 Fix dense GEMM integration and add scale factor validation Zongfei Jing 2026-01-06 01:45:03 -0800
  • 84dbc447f4 Add NVFP4 dense GEMM with SwiGLU fusion integration and unit tests Zongfei Jing 2026-01-06 01:19:19 -0800
  • 80daa5e153 Merge branch 'main' into dev-liji-gpt-oss-compile Jin Li 2026-01-13 14:10:40 +0800
  • 7800f0885c Merge ca2219fa0c into 2967d299fb Chuang Zhu 2026-01-13 14:06:26 +0800
  • 250ad4ebde refactor(fc1): remove num_fused_gemm and compute weight_per_expert from n // expert_count Zongfei Jing 2026-01-05 22:42:35 -0800
  • 4ddfa08a28 Add SwiGLU + FP4 quantization fusion with SFC verification for fc1 kernel Zongfei Jing 2026-01-05 07:47:28 -0800
  • 12e7f117c5 Add SwiGLU activation and quantization fusion to FC1 kernel Zongfei Jing 2026-01-05 06:11:02 -0800
  • dad843fba6 Add MoE as dense GEMM kernels for Blackwell Zongfei Jing 2026-01-05 00:19:09 -0800
  • 578c0a8e28 Overlap gen_fc2_alpha with fc1 using multistream in DenseGEMMFusedMoE Zongfei Jing 2025-12-24 02:10:38 -0800
  • 5ddbe3ca76 Add DenseGEMM backend for MoE Zongfei Jing 2025-12-24 02:10:11 -0800
  • b1d57e5767 Revert pre_merge section from l0_dgx_h200.yml and add test to l0_dgx_h100.yml Yan Chunwei 2026-01-10 16:13:48 +0800
  • 89f78d27a2 fix bench script Yan Chunwei 2026-01-07 14:19:11 +0800
  • 0ed300bb2c fix Yifei Zhang 2026-01-13 05:48:17 +0000
  • 1f61d76cf0 Merge bb24a77458 into 2967d299fb Frida Hou 2026-01-12 21:42:27 -0800
  • 42b52c38e8 Perf optimization to avoid switching the context inside a loop ziyixiong-nv 2026-01-12 19:33:24 -0800
  • 832dde7b29 Fix for CUDA graph padding ziyixiong-nv 2026-01-12 18:00:49 -0800
  • 60ae6e70fa Merge 0f56e191dc into 2967d299fb Bala Marimuthu 2026-01-12 21:27:21 -0800
  • 1fee67097d Waive another failure and update sbsa build process qqiao 2026-01-13 05:27:07 +0000
  • ab61aaecf7 Merge ba83017510 into 2967d299fb ameynaik-hub 2026-01-13 13:22:55 +0800
  • 41a1d549a8 Merge 12db3c223b into 2967d299fb Yuxian Qiu 2026-01-13 13:20:45 +0800
  • 2967d299fb [TRTLLM-10271][test] Add Spark QA functional and performance cases (#10564) JennyLiu 2026-01-13 13:20:15 +0800
  • 0cdf9fa3e4 num_tokens_per_expert Enwei Zhu 2026-01-13 05:19:31 +0000
  • f283a3288a [TRTLLM-10305][feat] Support customized seq len larger than model config Wanli Jiang 2026-01-12 00:05:11 -0800
  • 55b6bebbaf update Chenfei Zhang 2026-01-12 20:51:19 -0800