Mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-01-13 22:18:36 +08:00)
Add Latest News section (#365)
This commit is contained in:
parent 24cf8de078
commit ec769d63f9
@@ -33,8 +33,7 @@ For practical examples of H200's performance:

 **Max Throughput TP8:**
 an online chat agent scenario (ISL/OSL=80/200) with GPT3-175B on a full HGX (TP8) H200 is 1.6x more performant than H100.

-<img src="media/H200launch_Llama70B_tps.png" alt="max throughput llama TP1" width="250" height="auto">
-<img src="media/H200launch_GPT175B_tps.png" alt="max throughput GPT TP8" width="250" height="auto">
+<img src="media/H200launch_tps.png" alt="max throughput llama TP1" width="500" height="auto">

 <sub>Preliminary measured performance, subject to change.
 TensorRT-LLM v0.5.0, TensorRT v9.1.0.4. | Llama-70B: H100 FP8 BS 8, H200 FP8 BS 32 | GPT3-175B: H100 FP8 BS 64, H200 FP8 BS 128 </sub>
Binary file not shown. (Before: 14 KiB)
Binary file not shown. (Before: 14 KiB)
BIN docs/source/blogs/media/H200launch_tps.png (new file)
Binary file not shown. (After: 22 KiB)
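For context on the "1.6x more performant" figure in the diff above: it is a relative-throughput ratio between the two GPUs under the same scenario. A minimal sketch of the arithmetic, using hypothetical tokens/sec numbers (these are illustrative only, not measurements from the blog):

```python
def speedup(h200_tokens_per_sec: float, h100_tokens_per_sec: float) -> float:
    """Relative throughput: values > 1.0 mean H200 is faster."""
    return h200_tokens_per_sec / h100_tokens_per_sec

# Hypothetical measurements for the ISL/OSL=80/200 chat-agent scenario.
ratio = speedup(1600.0, 1000.0)
print(f"{ratio:.1f}x")  # prints "1.6x"
```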