WeiHaocheng
|
259cc66c34
|
[None][doc] scaffolding tech blog part one (#7835)
Signed-off-by: Fred Wei <20514172+WeiHaocheng@users.noreply.github.com>
Signed-off-by: zheyuf <zheyuf@NVIDIA.com>
Co-authored-by: zheyuf <zheyuf@NVIDIA.com>
|
2025-09-25 14:41:59 +08:00 |
|
Aurelien Chartier
|
98726a3bed
|
[None][chore] Update trtllm-bench documentation on setting FP8 KV cache (#7885)
Signed-off-by: Aurelien Chartier <2567591+achartier@users.noreply.github.com>
|
2025-09-25 09:28:53 +08:00 |
|
Leslie Fang
|
342014069e
|
[None][chore] Validate features combination (#7630)
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
|
2025-09-25 08:01:13 +08:00 |
|
xxi
|
d471655242
|
[TRTLLM-7831][feat] Cherry-pick from #7423 Support fp8 block wide ep cherry pick (#7712)
|
2025-09-23 08:41:38 +08:00 |
|
Guoming Zhang
|
edbe270198
|
[TRTLLM-7958][doc] add 1.0 release notes (#7605)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: pcastonguay <55748270+pcastonguay@users.noreply.github.com>
Signed-off-by: Sharan Chetlur <116769508+schetlur-nv@users.noreply.github.com>
Co-authored-by: pcastonguay <55748270+pcastonguay@users.noreply.github.com>
Co-authored-by: Sharan Chetlur <116769508+schetlur-nv@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-22 14:28:38 +08:00 |
|
Yan Chunwei
|
ba2864a2c6
|
[None][doc] Enhance api reference doc by labeling stable APIs (#7751)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-22 14:28:38 +08:00 |
|
Guoming Zhang
|
e8a3e21b87
|
[https://nvbugs/5519525][fix] fix doc invalid link for bug 5519525 (#7753)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-22 14:28:38 +08:00 |
|
Guoming Zhang
|
bc7b50334c
|
[None][doc] Add labels description note into llm api section (#7696)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-22 14:28:38 +08:00 |
|
Guoming Zhang
|
ab915fb333
|
[None][doc] Use hash id for external link (#7641)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-22 14:28:38 +08:00 |
|
Guoming Zhang
|
5c54173054
|
[None][doc] Fix a invalid link and a typo. (#7634)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-22 14:28:38 +08:00 |
|
Guoming Zhang
|
8fed8ee066
|
[None][doc] add blackwell information into support matrix (#6740)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-22 14:28:38 +08:00 |
|
Yan Chunwei
|
2ffc33921f
|
[https://nvbugs/5416501][doc] add known issues to llmapi doc (#7560)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: Ryan McCormick <mccormick.codes@gmail.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-22 14:28:38 +08:00 |
|
Enwei Zhu
|
e943a39cbd
|
[None][doc] Update tech blog12 (#7884)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-09-20 18:15:39 +08:00 |
|
Kanghwan
|
8fcd11515d
|
[#7704][chore] Enable MathJax to fix formulas in documentation (#7744)
Signed-off-by: Kanghwan Jang <861393+karljang@users.noreply.github.com>
|
2025-09-19 08:42:26 -07:00 |
|
Enwei Zhu
|
c8cc16d38d
|
[None][doc] Tech blog: Combining Guided Decoding and Speculative Decoding: Making CPU and GPU Cooperate Seamlessly (#7864)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-09-19 18:38:12 +08:00 |
|
dongfengy
|
026f22eb50
|
[None][doc] Cherry-pick deployment guide update from 1.1.0rc2 branch to main branch (#7774)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
|
2025-09-18 22:50:26 +08:00 |
|
Wanli Jiang
|
fe104dc20d
|
[TRTLLM-7918][feat] Support kvcache reuse and chunk prefill for phi4mm (#7723)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2025-09-18 17:37:16 +08:00 |
|
Wanli Jiang
|
a7ca0fff54
|
[TRTLLM-6577][feat] Support nano_v2_vlm in pytorch backend (#7207)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2025-09-18 16:26:20 +08:00 |
|
William Zhang
|
2614d71994
|
[TRTLLM-7410][feat] Enable KV cache reuse and chunked prefill for mistral3.1 (#7628)
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
|
2025-09-17 08:11:16 -07:00 |
|
QI JUN
|
39248320d4
|
[None][feat] add an example of KV cache host offloading (#7767)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-09-17 13:51:15 +08:00 |
|
Chang Liu
|
98f533453a
|
[TRTLLM-7398][doc] Add doc for KV cache salting support (#7772)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
|
2025-09-16 14:49:14 -07:00 |
|
Guoming Zhang
|
085271eceb
|
[None][doc] Clean the doc folder and move the outdated docs into lega… (#7729)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-09-16 11:43:19 +08:00 |
|
Shi Xiaowei
|
809c4d20c0
|
[None][doc] Fix the link in the doc (#7713)
Signed-off-by: Shi Xiaowei <39303645+Shixiaowei02@users.noreply.github.com>
|
2025-09-16 09:50:25 +08:00 |
|
Wanli Jiang
|
e080294725
|
[TRTLLM-7918][feat] Revert "Support kvcache reuse for phi4mm (#7563)" (#7722)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2025-09-15 17:19:44 +08:00 |
|
Wanli Jiang
|
fc9f4c9295
|
[TRTLLM-7918][feat] Support kvcache reuse for phi4mm (#7563)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2025-09-15 15:47:00 +08:00 |
|
Chang Liu
|
47e37755a3
|
[TRTLLM-6903][feat] Support chunked prefill for multimodal models (#6843)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
|
2025-09-14 20:10:10 -07:00 |
|
v-shobhit
|
0652514c6d
|
[None][feat] Use a shell context to install dependancies (#7383)
Signed-off-by: Shobhit Verma <shobhitv@nvidia.com>
Signed-off-by: v-shobhit <161510941+v-shobhit@users.noreply.github.com>
Co-authored-by: Zhihan Jiang <68881590+nvzhihanj@users.noreply.github.com>
|
2025-09-10 09:57:37 -07:00 |
|
Chang Liu
|
faa2f46554
|
[TRTLLM-5059][feat] Enable KV-cache reuse and add E2E tests for llava-next (#7349)
Signed-off-by: Chang Liu (Enterprise Products) <9713593+chang-l@users.noreply.github.com>
|
2025-09-09 14:51:36 -04:00 |
|
Guoming Zhang
|
7f3f658d5f
|
[None][doc] Rename TensorRT-LLM to TensorRT LLM. (#7554)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-09 12:16:03 +08:00 |
|
Guoming Zhang
|
35dac55716
|
[None][doc] Update kvcache part (#7549)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-09 12:16:03 +08:00 |
|
Guoming Zhang
|
f53fb4c803
|
[TRTLLM-5930][doc] 1.0 Documentation. (#6696)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-09 12:16:03 +08:00 |
|
dominicshanshan
|
c9dca69e1b
|
[None][chore] Mass integration of release/1.0 - 3rd (#7519)
Signed-off-by: Nave Assaf <nassaf@nvidia.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
Signed-off-by: qqiao <qqiao@nvidia.com>
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Bo Deng <deemod@nvidia.com>
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Signed-off-by: Yifei Zhang <219273404+yifeizhang-c@users.noreply.github.com>
Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Signed-off-by: Chenfei Zhang <chenfeiz@nvidia.com>
Signed-off-by: Christina Zhang <83400082+ChristinaZ@users.noreply.github.com>
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
Signed-off-by: Pamela <179191831+pamelap-nvidia@users.noreply.github.com>
Signed-off-by: Hui Gao <huig@nvidia.com>
Signed-off-by: Alexandre Milesi <30204471+milesial@users.noreply.github.com>
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
Signed-off-by: Michal Guzek <mguzek@nvidia.com>
Signed-off-by: peaceh <103117813+peaceh-nv@users.noreply.github.com>
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Signed-off-by: ruodil <200874449+ruodil@users.noreply.github.com>
Signed-off-by: Linda-Stadter <57756729+Linda-Stadter@users.noreply.github.com>
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Signed-off-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
Co-authored-by: Nave Assaf <55059536+Naveassaf@users.noreply.github.com>
Co-authored-by: Yechan Kim <161688079+yechank-nvidia@users.noreply.github.com>
Co-authored-by: brb-nv <169953907+brb-nv@users.noreply.github.com>
Co-authored-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>
Co-authored-by: Emma Qiao <qqiao@nvidia.com>
Co-authored-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Co-authored-by: Bo Deng <deemod@nvidia.com>
Co-authored-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
Co-authored-by: yifeizhang-c <219273404+yifeizhang-c@users.noreply.github.com>
Co-authored-by: amitz-nv <203509407+amitz-nv@users.noreply.github.com>
Co-authored-by: Erin <14718778+hchings@users.noreply.github.com>
Co-authored-by: chenfeiz0326 <chenfeiz@nvidia.com>
Co-authored-by: ChristinaZ <83400082+ChristinaZ@users.noreply.github.com>
Co-authored-by: Venky <23023424+venkywonka@users.noreply.github.com>
Co-authored-by: Pamela Peng <179191831+pamelap-nvidia@users.noreply.github.com>
Co-authored-by: HuiGao-NV <huig@nvidia.com>
Co-authored-by: milesial <milesial@users.noreply.github.com>
Co-authored-by: Shi Xiaowei <39303645+Shixiaowei02@users.noreply.github.com>
Co-authored-by: Michal Guzek <moraxu@users.noreply.github.com>
Co-authored-by: peaceh-nv <103117813+peaceh-nv@users.noreply.github.com>
Co-authored-by: Guoming Zhang <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
Co-authored-by: pcastonguay <55748270+pcastonguay@users.noreply.github.com>
Co-authored-by: ruodil <200874449+ruodil@users.noreply.github.com>
Co-authored-by: Linda <57756729+Linda-Stadter@users.noreply.github.com>
Co-authored-by: Zhanrui Sun <184402041+ZhanruiSunCh@users.noreply.github.com>
Co-authored-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
Co-authored-by: Jiagan Cheng <jiaganc@nvidia.com>
Co-authored-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
Co-authored-by: Larry <197874197+LarryXFly@users.noreply.github.com>
Co-authored-by: Sharan Chetlur <116769508+schetlur-nv@users.noreply.github.com>
Co-authored-by: Dom Brown <3886319+DomBrown@users.noreply.github.com>
|
2025-09-08 14:03:04 +08:00 |
|
binghanc
|
14ee43e254
|
[None][docs] refine docs for accuracy evaluation of gpt-oss models (#7252)
Signed-off-by: 176802681+binghanc@users.noreply.github.com
|
2025-09-08 09:56:23 +08:00 |
|
Enwei Zhu
|
5ff3a65b23
|
[TRTLLM-7028][feat] Enable guided decoding with speculative decoding (part 2: one-model engine) (#6948)
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
|
2025-09-03 15:16:11 -07:00 |
|
Izzy Putterman
|
f156221c27
|
[None][doc] add GPT OSS Eagle3 blog (#7140)
Signed-off-by: Izzy Putterman <iputterman@nvidia.com>
|
2025-09-03 12:28:01 -04:00 |
|
Wanli Jiang
|
4223a9aada
|
[TRTLLM-7261][feat] Support phi-4 model in pytorch backend (#7371)
Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>
|
2025-09-03 10:27:42 +08:00 |
|
Yan Chunwei
|
612c26be22
|
[None][doc] add legacy section for tensorrt engine (#6724)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-01 11:02:31 +08:00 |
|
Robin Kobus
|
e09c025ffb
|
[None] [fix] store blog 10 media via lfs (#7375)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
|
2025-08-30 10:17:53 +08:00 |
|
yunruis
|
f617b03bfc
|
[None][fix] fix doc formula (#7367)
Signed-off-by: yunruis <205571022+yunruis@users.noreply.github.com>
|
2025-08-29 04:48:10 -04:00 |
|
dongfengy
|
367ff88a5e
|
[None][feat] Refactor llama4 for multimodal encoder IFB (#6844)
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
|
2025-08-28 13:22:19 -07:00 |
|
yunruis
|
c4f823319b
|
[None][doc] add adp balance blog (#7213)
Signed-off-by: yunruis <205571022+yunruis@users.noreply.github.com>
Co-authored-by: Kefeng-Duan <176893526+Kefeng-Duan@users.noreply.github.com>
|
2025-08-28 11:19:34 -04:00 |
|
Maurits de Groot
|
2d0c9b383f
|
[None][fix] Updated blog9_Deploying_GPT_OSS_on_TRTLLM (#7260)
Signed-off-by: Maurits de Groot <63357890+Maurits-de-Groot@users.noreply.github.com>
|
2025-08-26 11:26:19 -04:00 |
|
Guoming Zhang
|
bf377d0b8e
|
[None][doc] Display tech blog for nvidia.github.io domain. (#7241)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-08-26 15:36:28 +08:00 |
|
Zheng Duan
|
4f84a45899
|
[https://nvbugs/5452463][doc] update disagg doc about UCX_MAX_RNDV_RAILS (#7205)
Signed-off-by: zhengd-nv <200704041+zhengd-nv@users.noreply.github.com>
|
2025-08-25 22:42:42 -04:00 |
|
Leslie Fang
|
9df15b2104
|
[None][doc] update feature_combination_matrix doc (#6691)
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
|
2025-08-26 08:25:31 +08:00 |
|
dongfengy
|
48155f52bf
|
[TRTLLM-7321][doc] Refine GPT-OSS doc (#7180)
Signed-off-by: Dongfeng Yu
|
2025-08-24 08:53:53 -04:00 |
|
Suyog Gupta
|
e3de5758a3
|
[#7136][feat] trtllm-serve + autodeploy integration (#7141)
Signed-off-by: Suyog Gupta <41447211+suyoggupta@users.noreply.github.com>
|
2025-08-22 08:30:53 -07:00 |
|
dongfengy
|
d94cc3fa3c
|
[TRTLLM-7321][doc] Add GPT-OSS Deployment Guide into official doc site (#7143)
Signed-off-by: Dongfeng Yu
|
2025-08-22 16:17:01 +08:00 |
|
Farshad Ghodsian
|
2d40e8750b
|
[None][doc] Update gpt-oss deployment guide to latest release image (#7101)
Signed-off-by: Farshad Ghodsian <47931571+farshadghodsian@users.noreply.github.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
|
2025-08-21 02:33:07 -04:00 |
|
Leslie Fang
|
3f6a9267f1
|
[None][infra] update feature_combination_matrix of disaggregated and chunked prefill (#6661)
Signed-off-by: leslie-fang25 <leslief@nvidia.com>
|
2025-08-20 13:14:34 +08:00 |
|