Kaiyu Xie
|
db2a42f641
|
[None][chore] Add sample yaml for wide-ep example and minor fixes (#8825)
Signed-off-by: Zero Zeng <38289304+zerollzeng@users.noreply.github.com>
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: Zero Zeng <38289304+zerollzeng@users.noreply.github.com>
|
2025-11-03 07:48:34 -08:00 |
|
Zero Zeng
|
4545700fcf
|
[None][chore] Move submit.sh to python and use yaml configuration (#8003)
Signed-off-by: Zero Zeng <38289304+zerollzeng@users.noreply.github.com>
|
2025-10-20 22:36:50 -04:00 |
|
Guoming Zhang
|
9f0f52249e
|
[None][doc] Rename TensorRT-LLM to TensorRT LLM for homepage and the … (#7850)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
Signed-off-by: Wangshanshan <30051912+dominicshanshan@users.noreply.github.com>
|
2025-09-25 21:02:35 +08:00 |
|
Zero Zeng
|
16bb76c31d
|
[None][chore] Update benchmark script (#7860)
Signed-off-by: Zero Zeng <38289304+zerollzeng@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-09-23 03:15:42 -07:00 |
|
Raayan Dhar
|
82bd1871ea
|
[None][chore] update disagg readme and scripts for pipeline parallelism (#6875)
Signed-off-by: raayandhar <rdhar@nvidia.com>
|
2025-08-27 00:53:57 -04:00 |
|
Xianjie Qiao
|
19667304b5
|
[None] [chore] Update wide-ep genonly scripts (#6995)
Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-08-19 07:44:07 -04:00 |
|
Xianjie Qiao
|
c2fe8b03a2
|
[https://nvbugs/5405041][fix] Update wide-ep doc (#6933)
Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
|
2025-08-15 05:32:32 -04:00 |
|
Shi Xiaowei
|
fe7dda834d
|
[TRTLLM-7030][fix] Refactor the example doc of dist-serving (#6766)
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
|
2025-08-13 17:39:27 +08:00 |
|
Kaiyu Xie
|
47806f09d9
|
feat: Support custom repo_dir for SLURM script (#6546)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Co-authored-by: xxi <xxi@nvidia.com>
|
2025-08-12 22:06:59 -04:00 |
|
Kaiyu Xie
|
aee35e2dbd
|
chore: Make example SLURM scripts more parameterized (#6511)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-08-01 12:53:15 +08:00 |
|
Kaiyu Xie
|
e58afa510e
|
doc: Add README for wide EP (#6356)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-07-29 00:36:12 -04:00 |
|
Kaiyu Xie
|
f08286c679
|
doc: Refactor documents and examples of disaggregated serving and wide ep (#6054)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
|
2025-07-23 09:20:57 +08:00 |
|
nv-guomingz
|
4e4d18826f
|
chore: [Breaking Change] Rename cuda_graph_config padding_enabled fie… (#6003)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-07-15 15:50:03 +09:00 |
|
Xianjie Qiao
|
c7ffadf692
|
Fix errors in wide-ep scripts (#5992)
Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
|
2025-07-14 14:07:27 +09:00 |
|
nv-guomingz
|
0be41b6524
|
Revert "chore: [Breaking Change] Rename cuda_graph_config padding_enabled fie…" (#5818)
|
2025-07-08 13:15:30 +09:00 |
|
nv-guomingz
|
5a8173c121
|
chore: [Breaking Change] Rename cuda_graph_config padding_enabled fie… (#5795)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
|
2025-07-08 08:52:36 +08:00 |
|
Xianjie Qiao
|
b1976c2add
|
Add wide-ep benchmarking scripts (#5760)
Signed-off-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
Signed-off-by: Xianjie Qiao <5410381+qiaoxj07@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
|
2025-07-05 19:29:39 +08:00 |
|