Commit Graph

  • f709acf93f chore: use deepgemm as moe backend for sm100 Mingyang Jiang 2026-01-13 12:41:58 +0800
  • 889e974f81 chore: move gb200 test case to b200 (because model is not sync for gb200) Mingyang Jiang 2026-01-12 20:56:14 +0800
  • 75fac06103 chore: remove fuse_qk_norm_rope Mingyang Jiang 2026-01-12 16:31:55 +0800
  • 5704619137 chore: modify by code review Mingyang Jiang 2026-01-12 16:26:29 +0800
  • 1e6e0582b3 chore: modify for code review Mingyang Jiang 2026-01-11 14:54:36 +0800
  • c73a53678d doc: modify doc Mingyang Jiang 2026-01-09 13:43:00 +0800
  • d6543ec0a2 fix: chang kv cache dtype to bf16 by default Mingyang Jiang 2026-01-09 13:19:39 +0800
  • 173661f3e7 chore: add h100 4gpu test case Mingyang Jiang 2026-01-08 20:14:43 +0800
  • f981363fac chore: add attention dp test case Mingyang Jiang 2026-01-07 14:01:14 +0800
  • 49cddcb27a fix: fix bugs when enable attention dp Mingyang Jiang 2026-01-06 16:09:48 +0800
  • 6f946b8e81 fix: fix acc issue Mingyang Jiang 2025-12-17 17:50:52 +0800
  • 6453a5face chore: gather qk to use rms norm Mingyang Jiang 2025-12-17 14:03:28 +0800
  • 8726623261 chore: directly use QKNormRoPEAttention Mingyang Jiang 2025-12-16 14:51:24 +0800
  • f451f08b76 chore: add qk norm for attention Mingyang Jiang 2025-12-15 14:58:28 +0800
  • 8c8ae8b4f6 fix: minor bug fix for acc test Mingyang Jiang 2025-12-15 13:25:23 +0800
  • a4bfd6ba7f chore: add acc test Mingyang Jiang 2025-12-12 14:33:15 +0800
  • 3b844b822b chore: add bias for moe routing method Mingyang Jiang 2025-12-11 14:58:17 +0800
  • bb3af897e9 draft: init commit Mingyang Jiang 2025-12-10 15:05:21 +0800
  • 8d998af03b [None][fix] enable EPLB for DEEPGEMM xxi 2026-01-13 03:26:50 +0000
  • d1c5a9aed6 Merge efbc0fc6dd into ba1cb6831d Fadi Saady 2026-01-13 12:29:34 +0800
  • 1b55c765a7 Merge 0a07a27835 into ba1cb6831d Lucas Liebenwein 2026-01-13 12:29:34 +0800
  • d7948b8d61 Merge 30f707a8f4 into ba1cb6831d Venky 2026-01-13 12:29:34 +0800
  • d65b5e823c Merge dc9398147c into e4a6c9995d xxi 2026-01-13 12:29:18 +0800
  • dc9398147c [None][fix] enable EPLB for DEEPGEMM xxi 2026-01-13 03:26:50 +0000
  • 12db3c223b Merge branch 'main' into isolate-ds-bf16-h200 Yuxian Qiu 2026-01-13 11:53:24 +0800
  • c37c80d3ac Fix pre-reload and add cuda graph warning Shuyi Xiong 2026-01-12 19:47:01 -0800
  • 14af1593a3 add test code Yiqing Yan 2026-01-13 03:37:11 +0000
  • d42fe2b568 [None][feat] Fix regression yocox 2026-01-12 19:33:01 -0800
  • 0959a57130 Add docstring Yiqing Yan 2026-01-13 03:30:37 +0000
  • 31f2ecd3cb address comments from Jin, Chuang and Yuxian Balaram Buddharaju 2026-01-13 03:28:51 +0000
  • 4406c1d76a Merge 678948a8bc into ba1cb6831d Mike Iovine 2026-01-13 11:26:17 +0800
  • 0d413dcbd1 fix Yiqing Yan 2026-01-13 03:24:23 +0000
  • f8b0b4dc02 Merge 9a39920a08 into ba1cb6831d Lizhi Zhou 2026-01-12 22:19:44 -0500
  • ba1cb6831d [None][infra] Check in most recent lock file from nightly pipeline TensorRT LLM 2026-01-13 03:08:06 +0000
  • f198b8dfcc remove test code Yiqing Yan 2025-12-12 06:00:22 +0000
  • f5976268bf Move code to python script Yiqing Yan 2025-12-11 09:41:37 +0000
  • d029276121 fix generateTimeoutTestResultsXML Yiqing Yan 2025-12-10 08:02:13 +0000
  • 7b8b749d3a fix pre-commit Yiqing Yan 2025-12-10 04:51:05 +0000
  • b3fbd56ab5 add test code Yiqing Yan 2025-12-10 03:36:14 +0000
  • 709fe1a6ae add test code Yiqing Yan 2025-12-09 07:54:09 +0000
  • 968b84c64a fix pre-commit Yiqing Yan 2025-12-09 05:47:08 +0000
  • b93e250459 fix parse test name Yiqing Yan 2025-12-09 03:07:14 +0000
  • 424e2e56bc add test code Yiqing Yan 2025-12-08 09:13:49 +0000
  • 515308866e Fix the testcase name in timeout xml Yiqing Yan 2025-12-08 09:04:52 +0000
  • 0a07a27835 separate AD tests into own stage Lucas Liebenwein 2026-01-12 17:47:32 -0800
  • ab5e836019 [TRTLLM-9581][Infra] Use /home/scratch.trt_llm_data_ci in computelab ZhanruiSunCh 2026-01-12 18:40:26 -0800
  • bbe535fddf [None][chore] Fix disagg assert (#10596) fredricz-20070104 2026-01-13 10:39:57 +0800
  • 0afc827387 update Chenfei Zhang 2026-01-12 18:37:25 -0800
  • ca2219fa0c enable_partial_reuse Chuang Zhu 2026-01-09 01:59:09 +0000
  • c86398c164 enable system memory to transfer active message in NIXL ucx Chuang Zhu 2026-01-12 10:29:29 +0000
  • ac02b370f8 skip nvfp4 on pre blackwell leslie-fang25 2026-01-12 18:19:03 -0800
  • ce44fabe8e add disagg_request_id in OpenAI protocol Lizhi Zhou 2026-01-11 21:13:06 -0800
  • 207cce4ba5 revert client id changes Lizhi Zhou 2026-01-07 22:57:00 -0800
  • c71c8922be fix ci failures Lizhi Zhou 2026-01-05 21:52:40 -0800
  • 4ad2427e02 fix failed tests Lizhi Zhou 2026-01-05 03:36:11 -0800
  • 4e39308bf9 multi thread test for global disagg req id Lizhi Zhou 2025-12-23 00:28:37 -0800
  • 77f3c745cc update by review comments Lizhi Zhou 2025-12-22 23:45:26 -0800
  • ab404f72b2 run tests Lizhi Zhou 2025-12-22 00:18:14 -0800
  • b337d9fff9 WIP Lizhi Zhou 2025-12-19 00:46:27 -0800
  • 8d858f912e Merge branch 'main' into user/yuhangh/support_export_data_in_eval heyuhhh 2026-01-13 10:01:12 +0800
  • aa21da84a7 Add more debug dump Yanchao Lu 2026-01-12 08:53:21 +0800
  • d4e83aae6e Merge branch 'main' into feature/fix_disagg_asset fredricz-20070104 2026-01-13 09:50:58 +0800
  • ea07026d66 [https://nvbugs/5769712][fix] fix timeout in AutoDeploy llama accuracy test Lucas Liebenwein 2026-01-06 08:04:50 -0800
  • 778cffa91b layerwise bench w/ balanced random Enwei Zhu 2026-01-09 06:38:27 +0000
  • 6ad103d061 autotuner w/ balanced random Enwei Zhu 2026-01-09 05:13:13 +0000
  • 9f0df4a835 add metadata Enwei Zhu 2026-01-08 09:08:55 +0000
  • fe2369be54 generator Enwei Zhu 2025-12-24 13:15:38 +0000
  • ff8adae4f4 Merge a343a70be3 into ba1037ca4a Anish Shanbhag 2026-01-13 09:37:36 +0800
  • 71784c2c51 Merge 744e749e9d into ba1037ca4a Bo Li 2026-01-13 09:37:35 +0800
  • ccf14ddc1e Merge 95a306adf4 into ba1037ca4a JadoTu 2026-01-13 09:37:35 +0800
  • 2d3c01d553 Merge 72f28103af into ba1037ca4a Michal Guzek 2026-01-13 09:37:35 +0800
  • da1c5aceb3 Merge branch 'main' into spec-skip-forward Yuxian Qiu 2026-01-13 09:30:35 +0800
  • c815ee3edc Merge 587c0cbf05 into ba1037ca4a Iman Tabrizian 2026-01-13 09:21:45 +0800
  • cca1acddca Merge 0795caa2db into ba1037ca4a Thor Johnsen 2026-01-12 19:21:04 -0600
  • ba1037ca4a [https://nvbugs/5762336][fix] support to parse the keyword modules_to_not_convert of the HF model config" (#10527) xxi 2026-01-13 09:21:01 +0800
  • 96f7ea5f29 Merge 62258f6ce3 into 48b09e5a25 danielafrimi 2026-01-13 08:59:53 +0800
  • 9ce7d91f14 [#9306][refactor] Deprecate free_mem_ratio and cuda_graph_batch_sizes William Zhang 2026-01-12 14:40:36 -0800
  • ac5a5db1c5 [#9306][refactor] Inherit from TorchLlmArgs and remove duplicate fields William Zhang 2026-01-09 14:49:06 -0800
  • 205b997d8b [#9306][refactor] Inline AutoDeployConfig into LlmArgs William Zhang 2026-01-09 13:42:17 -0800
  • 23df84ee21 Merge 5525a8544f into 48b09e5a25 Lucas Liebenwein 2026-01-12 19:43:48 -0500
  • b34957ec8d Merge 3fb19de304 into 48b09e5a25 Faraz 2026-01-13 08:42:32 +0800
  • 0795caa2db precommit run thorjohnsen 2026-01-13 00:39:09 +0000
  • 3beb25ceb9 Resolve merge conflicts thorjohnsen 2026-01-13 00:32:59 +0000
  • 4085bf5374 Bug fix thorjohnsen 2026-01-13 00:28:53 +0000
  • 9c75c051b7 change fp8 to skip_pre_hopper and fix confusion model name Jenny Liu 2026-01-13 00:21:44 +0000
  • be0949bcab Fix unittest failure - model_kwargs is an optional parameter, and it is not always set. - This is handled in the standard ConfigLoader, but not in custom ConfigLoader such as MistralConfigLoader - In current update, let model_kwargs be supported only by ConfigLoader and added warning in the custom loader Taylor Yeonbok Lee 2026-01-10 01:39:23 -0800
  • 4439e84db1 Fix unittest failure Taylor Yeonbok Lee 2026-01-07 22:08:47 -0800
  • 1497113a05 Applied review comment Taylor Yeonbok Lee 2026-01-06 00:40:08 -0800
  • 25524bed2f Fix CI failure due to unregistered reference parameter for LLM API Taylor Yeonbok Lee 2026-01-02 20:51:52 -0800
  • 144a903896 Added unittest Taylor Yeonbok Lee 2026-01-02 14:45:04 -0800
  • 769587f5de Applied review comment Taylor Yeonbok Lee 2026-01-02 13:36:25 -0800
  • 61f40c0159 Support model_kwargs for torch backend Taylor Yeonbok Lee 2025-12-28 01:03:24 -0800
  • 2bbd7d52c8 Merge cc441386fc into 48b09e5a25 Eran Geva 2026-01-12 16:02:23 -0800
  • 3da678db0e Merge 7d7e438670 into 48b09e5a25 Chenghao Zhang 2026-01-12 15:26:52 -0800
  • 48b09e5a25 [https://nvbugs/5689235][fix] Fix cancellation+chunked prefill+disagg (#10111) Iman Tabrizian 2026-01-12 15:23:26 -0800
  • 3c608ca972 Merge 83770e9e34 into 18a33764b5 Yukun He 2026-01-12 23:09:12 +0000
  • 42b49cf364 Merge a20c7383fc into 18a33764b5 Karthik 2026-01-12 22:38:10 +0000
  • d21f1e2b7e Merge 3182873658 into 18a33764b5 jthomson04 2026-01-12 14:22:10 -0800
  • 1392fd3172 [https://nvbugs/5670108][fix] Fix overlap scheduler race condition in KV cache rewind. SimengLiu-nv 2026-01-12 10:25:58 -0800
  • 342d5e1cdc Merge branch 'main' into chenghao/fi_update_0107 Chenghao Zhang 2026-01-12 12:58:28 -0800