Venky
fd1270b9ab
[TRTC-43] [feat] Add config db and docs ( #9420 )
Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
Co-authored-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
2025-12-12 04:00:03 +08:00
Kaiyu Xie
069b05cf3d
[TRTLLM-9706] [doc] Update wide EP documents ( #9724 )
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-08 11:21:11 +08:00
QI JUN
0915c4e3a1
[TRTLLM-9086][doc] Clean up TODOs in documentation ( #9292 )
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-12-05 17:50:12 -05:00
Kaiyu Xie
cb87c44912
[TRTLLM-9562] [doc] Add Deployment Guide for Kimi K2 Thinking on TensorRT LLM - Blackwell ( #9711 )
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-12-04 19:20:06 -08:00
QI JUN
d11acee22d
[TRTLLM-9085][doc] fix math formula rendering issues in github ( #9605 )
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-12-02 10:18:16 +08:00
Liao Lanyu
5425d96757
[TRTLLM-9513][docs] Qwen3 deployment guide ( #9488 )
Signed-off-by: Lanyu Liao <laliao@laliao-mlt.client.nvidia.com>
Co-authored-by: Lanyu Liao <laliao@laliao-mlt.client.nvidia.com>
2025-11-27 14:12:35 +08:00
Jiagan Cheng
14762e0287
[None][fix] Replace PYTORCH_CUDA_ALLOC_CONF with PYTORCH_ALLOC_CONF to fix deprecation warning ( #9294 )
Signed-off-by: Jiagan Cheng <jiaganc@nvidia.com>
2025-11-27 12:22:01 +08:00
QI JUN
c6fa042332
[TRTLLM-9085][doc] fix math formula rendering issues ( #9481 )
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
2025-11-27 10:09:12 +08:00
Anish Shanbhag
6a6317727b
[TRTLLM-8680][doc] Add table with one-line deployment commands to docs ( #8173 )
Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>
2025-11-03 17:42:41 -08:00
dongfengy
6424f7e55f
[None][doc] Clarify the perf best practice and supported hardware for gptoss ( #8665 )
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
Signed-off-by: dongfengy <99041270+dongfengy@users.noreply.github.com>
2025-10-31 10:11:59 -07:00
Kaiyu Xie
c822c117ce
[None] [docs] Update TPOT/ITL docs ( #8378 )
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-10-14 20:50:54 -07:00
Guoming Zhang
989c25fcba
[None][doc] Add qwen3-next doc into deployment guide and test case into L0. ( #8288 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: Faradawn Yang <faradawny@gmail.com>
Co-authored-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
2025-10-13 10:25:45 +08:00
Guoming Zhang
656d73087e
[None][doc] Fix several invalid ref links in deployment guide sections. ( #8287 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-10-13 10:22:32 +08:00
Guoming Zhang
a193867f8f
[None][doc] Refine deployment guide by renaming TRT-LLM to TensorRT L… ( #8214 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-10-09 17:11:24 +08:00
xxi
d471655242
[TRTLLM-7831][feat] Cherry-pick from #7423 Support fp8 block wide ep cherry pick ( #7712 )
2025-09-23 08:41:38 +08:00
dongfengy
026f22eb50
[None][doc] Cherry-pick deployment guide update from 1.1.0rc2 branch to main branch ( #7774 )
Signed-off-by: Dongfeng Yu <dongfengy@nvidia.com>
2025-09-18 22:50:26 +08:00
Guoming Zhang
7f3f658d5f
[None][doc] Rename TensorRT-LLM to TensorRT LLM. ( #7554 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-09 12:16:03 +08:00
Guoming Zhang
35dac55716
[None][doc] Update kvcache part ( #7549 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-09 12:16:03 +08:00
Guoming Zhang
f53fb4c803
[TRTLLM-5930][doc] 1.0 Documentation. ( #6696 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
2025-09-09 12:16:03 +08:00
binghanc
14ee43e254
[None][docs] refine docs for accuracy evaluation of gpt-oss models ( #7252 )
Signed-off-by: 176802681+binghanc@users.noreply.github.com
2025-09-08 09:56:23 +08:00
dongfengy
48155f52bf
[TRTLLM-7321][doc] Refine GPT-OSS doc ( #7180 )
Signed-off-by: Dongfeng Yu
2025-08-24 08:53:53 -04:00
dongfengy
d94cc3fa3c
[TRTLLM-7321][doc] Add GPT-OSS Deployment Guide into official doc site ( #7143 )
Signed-off-by: Dongfeng Yu
2025-08-22 16:17:01 +08:00
Kaiyu Xie
9a74ee9dae
[None] [doc] Add more documents for large scale EP ( #7029 )
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-08-19 19:04:39 +08:00
Daniel Cámpora
53312eeebd
[TRTLLM-7157][feat] BREAKING CHANGE Introduce sampler_type, detect sampler according to options ( #6831 )
Signed-off-by: Daniel Campora <961215+dcampora@users.noreply.github.com>
2025-08-16 00:27:24 -04:00
JunyiXu-nv
70e352a6f7
[ https://nvbugs/5437106 ][fix] Add L4 Scout benchmarking WAR option in deploy guide ( #6829 )
Signed-off-by: Junyi Xu <junyix@nvidia.com>
2025-08-15 08:53:13 +08:00
Tao Li @ NVIDIA
345d3d3524
[None][doc] update moe support matrix for DS R1 ( #6883 )
Signed-off-by: taoli <litaotju@users.noreply.github.com>
Co-authored-by: taoli <litaotju@users.noreply.github.com>
2025-08-14 13:55:11 +08:00
Zhenhua Wang
868c5d166e
[None][chore] fix markdown format for the deployment guide ( #6879 )
Signed-off-by: Zhenhua Wang <zhenhuaw@nvidia.com>
2025-08-13 22:19:11 -04:00
Zhenhua Wang
8416d7fea8
[ https://nvbugs/5412885 ][doc] Add the workaround doc for H200 OOM ( #6853 )
Signed-off-by: Zhenhua Wang <4936589+zhenhuaw-me@users.noreply.github.com>
2025-08-13 19:51:38 +08:00
Guoming Zhang
0223de0727
[None][doc] Add deployment guide section for VDR task ( #6669 )
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-08-07 10:30:47 -04:00