Fanrong Li
ebadc13086
[doc] update mtp documents ( #5387 )
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
2025-06-21 16:05:52 +08:00
Shi Xiaowei
1e35be5840
doc: subsequent modifications of blog 5 ( #5366 )
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-06-19 18:23:13 +08:00
Shi Xiaowei
9a53e58a58
blog: Disaggregated Serving in TensorRT-LLM ( #5353 )
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-06-19 18:02:15 +08:00
Julien Demouth
bb79ba7c35
Edits for tech blog 4 ( #5006 )
Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
Co-authored-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
2025-06-09 09:38:41 +08:00
juney-nvidia
a761cc2f8d
doc: refinement based on Julien's feedback ( #4967 )
Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
2025-06-06 08:56:14 +08:00
Kaiyu Xie
5a5427f86e
blog: Scaling Expert Parallelism in TensorRT-LLM (Part 1: Design and Implementation of Large-scale EP) ( #4958 )
Signed-off-by: juney-nvidia <143764042+juney-nvidia@users.noreply.github.com>
Co-authored-by: Xianjie <5410381+qiaoxj07@users.noreply.github.com>
Co-authored-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Co-authored-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
2025-06-05 22:24:04 +08:00
juney-nvidia
49f2f1f8eb
Expose new tech blog about DSR1 throughput optimization to the main R… ( #4803 )
Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
2025-05-30 20:44:12 +08:00
Tao Li @ NVIDIA
3b7120d60e
DeepSeek R1 throughput optimization tech blog for Blackwell GPUs ( #4791 )
Signed-off-by: Tao Li
2025-05-30 18:54:19 +08:00
Yan Chunwei
5506f60037
chore [BREAKING CHANGE]: Flatten PyTorchConfig knobs into TorchLlmArgs ( #4603 )
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-05-28 18:43:04 +08:00
Fanrong Li
862bde99b6
draft[doc]: add mtp tech blog ( #4580 )
* add mtp tech blog.
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
* update figure size.
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
* update the figure caption style and add some code/pr links.
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
* fix figure captions.
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
* fix figure size and perf data.
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
* fix.
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
* fix.
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
* fix.
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
* fix.
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
* fix based on comments
Signed-off-by: Yue Weng <25103990+yweng0828@users.noreply.github.com>
* fix figure links.
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
---------
Signed-off-by: Fanrong Li <23290157+lfr-0531@users.noreply.github.com>
Signed-off-by: Yue Weng <25103990+yweng0828@users.noreply.github.com>
Co-authored-by: Yue Weng <25103990+yweng0828@users.noreply.github.com>
2025-05-23 13:54:21 +08:00
Shi Xiaowei
a98e7ea26b
fix: replace the image links in the blog ( #4489 )
Signed-off-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2025-05-20 22:39:25 +08:00
juney-nvidia
ddf01f6266
refine doc ( #4422 )
2025-05-19 06:06:22 +08:00
juney-nvidia
58e2d6ffa7
Refine doc ( #4421 )
2025-05-19 06:03:05 +08:00
juney-nvidia
ac610b394a
Refine doc ( #4420 )
2025-05-19 05:05:24 +08:00
Kefeng-Duan
f5b6d453aa
doc: DS r1 min latency blog ( #4386 )
* add best perf practice on DSR1
Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
* add ds-r1 min latency tech blog
Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
* rm redundant doc
Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
* refine table content
Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
* refine table content
Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
* relative path for images
Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
* refine precommit
Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
* pr4280 is merged
Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
---------
Signed-off-by: Jun Yang <143764042+juney-nvidia@users.noreply.github.com>
2025-05-16 20:20:28 +08:00