Update gh-pages for the Windows part of the docs. (#1979)

Co-authored-by: Guoming Zhang <37257613+nv-guomingz@users.noreply.github.com>
nv-guomingz 2024-07-18 11:18:09 +08:00 committed by GitHub
parent 10588d0bfe
commit 85f78df69c
GPG Key ID: B5690EEEBB952194
48 changed files with 51 additions and 51 deletions

View File

@@ -4724,7 +4724,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a046e6800>
+<jinja2.runtime.BlockReference object at 0x7f8e37d07880>
<div class="footer">
<p>

View File

@@ -10645,7 +10645,7 @@ one more than decoding draft tokens for prediction from primary head </p>
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08542620>
+<jinja2.runtime.BlockReference object at 0x7f8e36a87370>
<div class="footer">
<p>

View File

@@ -4,7 +4,7 @@
```{note}
The Windows release of TensorRT-LLM is currently in beta.
-We recommend checking out the [v0.10.0 tag](https://github.com/NVIDIA/TensorRT-LLM/releases/tag/v0.10.0) for the most stable experience.
+We recommend checking out the [v0.11.0 tag](https://github.com/NVIDIA/TensorRT-LLM/releases/tag/v0.11.0) for the most stable experience.
```
**Prerequisites**
@@ -47,7 +47,7 @@ We recommend checking out the [v0.10.0 tag](https://github.com/NVIDIA/TensorRT-L
before installing TensorRT-LLM with the following command.
```bash
-pip install tensorrt_llm==0.10.0 --extra-index-url https://pypi.nvidia.com
+pip install tensorrt_llm==0.11.0 --extra-index-url https://pypi.nvidia.com
```
Run the following command to verify that your TensorRT-LLM installation is working properly.

View File

@@ -411,7 +411,7 @@ the TensorRT-LLM batch manager.</p>
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a09129930>
+<jinja2.runtime.BlockReference object at 0x7f0d22958670>
<div class="footer">
<p>

View File

@@ -169,7 +169,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a0912bf10>
+<jinja2.runtime.BlockReference object at 0x7f0d22922650>
<div class="footer">
<p>

View File

@@ -486,7 +486,7 @@ is computed as:</p>
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a091076d0>
+<jinja2.runtime.BlockReference object at 0x7f0d2226a290>
<div class="footer">
<p>

View File

@@ -378,7 +378,7 @@ one.</p></li>
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a090f4f10>
+<jinja2.runtime.BlockReference object at 0x7f0d21b56bf0>
<div class="footer">
<p>

View File

@@ -349,7 +349,7 @@ techniques to optimize the underlying graph. It provides a wrapper similar to P
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a092e8430>
+<jinja2.runtime.BlockReference object at 0x7f0d222528f0>
<div class="footer">
<p>

View File

@@ -365,7 +365,7 @@ The mandatory input tensors to create a valid <code class="docutils literal notr
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a09121810>
+<jinja2.runtime.BlockReference object at 0x7f0d229342b0>
<div class="footer">
<p>

View File

@@ -323,7 +323,7 @@ The following tensors are for a LoRA which has a <code class="docutils literal n
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a092eb700>
+<jinja2.runtime.BlockReference object at 0x7f0d229341f0>
<div class="footer">
<p>

View File

@@ -206,7 +206,7 @@ python3<span class="w"> </span>examples/summarize.py<span class="w"> </span><spa
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a091051e0>
+<jinja2.runtime.BlockReference object at 0x7f0d2294c3a0>
<div class="footer">
<p>

View File

@@ -240,7 +240,7 @@ python<span class="w"> </span>../summarize.py<span class="w"> </span>--engine_di
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08927550>
+<jinja2.runtime.BlockReference object at 0x7f0d22913c40>
<div class="footer">
<p>

View File

@@ -506,7 +506,7 @@ trtllm-build<span class="w"> </span>--checkpoint_dir<span class="w"> </span>./op
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a09121720>
+<jinja2.runtime.BlockReference object at 0x7f0d22993580>
<div class="footer">
<p>

View File

@@ -377,7 +377,7 @@ issues and may be less efficient in terms of GPU utilization.</p>
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a09397220>
+<jinja2.runtime.BlockReference object at 0x7f0d2294dfc0>
<div class="footer">
<p>

View File

@@ -158,7 +158,7 @@ Server</a> to easily create web-based services for LLMs. TensorRT-LLM supports m
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08941990>
+<jinja2.runtime.BlockReference object at 0x7f0d220022f0>
<div class="footer">
<p>

View File

@@ -336,7 +336,7 @@ The usage of this API looks like this:</p>
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a089629e0>
+<jinja2.runtime.BlockReference object at 0x7f0d22002080>
<div class="footer">
<p>

View File

@@ -295,7 +295,7 @@ ISL = Input Sequence Length
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a093abd00>
+<jinja2.runtime.BlockReference object at 0x7f0d22912fe0>
<div class="footer">
<p>

View File

@@ -247,7 +247,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08ce38e0>
+<jinja2.runtime.BlockReference object at 0x7f0d22324ee0>
<div class="footer">
<p>

View File

@@ -239,7 +239,7 @@ TensorRT-LLM v0.5.0, TensorRT v9.1.0.4 | H200, H100 FP8. </sub></p>
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08d0ca90>
+<jinja2.runtime.BlockReference object at 0x7f0d2216f460>
<div class="footer">
<p>

View File

@@ -204,7 +204,7 @@ ISL = Input Sequence Length
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08e203a0>
+<jinja2.runtime.BlockReference object at 0x7f0d2216f070>
<div class="footer">
<p>

View File

@@ -359,7 +359,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08d688b0>
+<jinja2.runtime.BlockReference object at 0x7f0d2216e110>
<div class="footer">
<p>

View File

@@ -190,7 +190,7 @@ post-processor as <code class="docutils literal notranslate"><span class="pre">R
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a04c7b640>
+<jinja2.runtime.BlockReference object at 0x7f0d22d17cd0>
<div class="footer">
<p>

View File

@@ -3773,7 +3773,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08c552d0>
+<jinja2.runtime.BlockReference object at 0x7f8e3781d390>
<div class="footer">
<p>

View File

@@ -364,7 +364,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08e59ae0>
+<jinja2.runtime.BlockReference object at 0x7f8e35b33af0>
<div class="footer">
<p>

View File

@@ -301,7 +301,7 @@ relevant classes. The associated unit tests should also be consulted for underst
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a093968f0>
+<jinja2.runtime.BlockReference object at 0x7f0d21e1b1f0>
<div class="footer">
<p>

View File

@@ -358,7 +358,7 @@ pip<span class="w"> </span>uninstall<span class="w"> </span>-y<span class="w"> <
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08bbf370>
+<jinja2.runtime.BlockReference object at 0x7f0d22326920>
<div class="footer">
<p>

View File

@@ -181,7 +181,7 @@ git<span class="w"> </span>lfs<span class="w"> </span>install
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08d15ab0>
+<jinja2.runtime.BlockReference object at 0x7f0d21ff4fd0>
<div class="footer">
<p>

View File

@@ -135,7 +135,7 @@
<div class="admonition note">
<p class="admonition-title">Note</p>
<p>The Windows release of TensorRT-LLM is currently in beta.
-We recommend checking out the <a class="reference external" href="https://github.com/NVIDIA/TensorRT-LLM/releases/tag/v0.10.0">v0.10.0 tag</a> for the most stable experience.</p>
+We recommend checking out the <a class="reference external" href="https://github.com/NVIDIA/TensorRT-LLM/releases/tag/v0.11.0">v0.11.0 tag</a> for the most stable experience.</p>
</div>
<p><strong>Prerequisites</strong></p>
<ol class="arabic">
@@ -177,7 +177,7 @@ pip<span class="w"> </span>uninstall<span class="w"> </span>-y<span class="w"> <
</pre></div>
</div>
<p>before installing TensorRT-LLM with the following command.</p>
-<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>pip<span class="w"> </span>install<span class="w"> </span><span class="nv">tensorrt_llm</span><span class="o">==</span><span class="m">0</span>.10.0<span class="w"> </span>--extra-index-url<span class="w"> </span>https://pypi.nvidia.com
+<div class="highlight-bash notranslate"><div class="highlight"><pre><span></span>pip<span class="w"> </span>install<span class="w"> </span><span class="nv">tensorrt_llm</span><span class="o">==</span><span class="m">0</span>.11.0<span class="w"> </span>--extra-index-url<span class="w"> </span>https://pypi.nvidia.com
</pre></div>
</div>
<p>Run the following command to verify that your TensorRT-LLM installation is working properly.</p>
@@ -201,7 +201,7 @@ pip<span class="w"> </span>uninstall<span class="w"> </span>-y<span class="w"> <
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08e52dd0>
+<jinja2.runtime.BlockReference object at 0x7f0d221498a0>
<div class="footer">
<p>

View File

@@ -220,7 +220,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08bc65f0>
+<jinja2.runtime.BlockReference object at 0x7f0d22d15e40>
<div class="footer">
<p>

View File

@@ -194,7 +194,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08e45480>
+<jinja2.runtime.BlockReference object at 0x7f0d21e1a320>
<div class="footer">
<p>

View File

@@ -222,7 +222,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a091f68c0>
+<jinja2.runtime.BlockReference object at 0x7f0d22959db0>
<div class="footer">
<p>

View File

@@ -506,7 +506,7 @@ recommended to start from <code class="docutils literal notranslate"><span class
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08e58940>
+<jinja2.runtime.BlockReference object at 0x7f8e35b9d990>
<div class="footer">
<p>

View File

@@ -1398,7 +1398,7 @@ that can be compared with the table in the <a class="reference internal" href="#
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08902ec0>
+<jinja2.runtime.BlockReference object at 0x7f0d220e4490>
<div class="footer">
<p>

View File

@@ -140,7 +140,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a088bc640>
+<jinja2.runtime.BlockReference object at 0x7f8e35c9dd20>
<div class="footer">
<p>

View File

@@ -167,7 +167,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a088bc6d0>
+<jinja2.runtime.BlockReference object at 0x7f8e35c9c4c0>
<div class="footer">
<p>

View File

@@ -140,7 +140,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a088d1c90>
+<jinja2.runtime.BlockReference object at 0x7f8e35bf3f40>
<div class="footer">
<p>

View File

@@ -140,7 +140,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a088f2410>
+<jinja2.runtime.BlockReference object at 0x7f8e35b9c040>
<div class="footer">
<p>

View File

@@ -140,7 +140,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a0976c4f0>
+<jinja2.runtime.BlockReference object at 0x7f8e359ca8f0>
<div class="footer">
<p>

View File

@@ -140,7 +140,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08e53eb0>
+<jinja2.runtime.BlockReference object at 0x7f8e35b9ee90>
<div class="footer">
<p>

View File

@@ -290,7 +290,7 @@ python<span class="w"> </span>/opt/scripts/launch_triton_server.py<span class="w
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08bcfbb0>
+<jinja2.runtime.BlockReference object at 0x7f0d220f53f0>
<div class="footer">
<p>

View File

@@ -276,7 +276,7 @@ Here some explanations on how these values affect the memory:</p>
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08a86f80>
+<jinja2.runtime.BlockReference object at 0x7f0d21637ac0>
<div class="footer">
<p>

View File

@@ -725,7 +725,7 @@ are:</p>
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08a95510>
+<jinja2.runtime.BlockReference object at 0x7f0eb89db7f0>
<div class="footer">
<p>

View File

@@ -286,7 +286,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a089f5660>
+<jinja2.runtime.BlockReference object at 0x7f0d21e09090>
<div class="footer">
<p>

View File

@@ -404,7 +404,7 @@ dedicated MPI environment, not the one provided by your Slurm allocation.</p>
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08e46a10>
+<jinja2.runtime.BlockReference object at 0x7f0d22326a40>
<div class="footer">
<p>

View File

@@ -791,7 +791,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08a84430>
+<jinja2.runtime.BlockReference object at 0x7f0d21696b90>
<div class="footer">
<p>

View File

@@ -149,7 +149,7 @@
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08c8c9a0>
+<jinja2.runtime.BlockReference object at 0x7f8e361dcd60>
<div class="footer">
<p>

File diff suppressed because one or more lines are too long

View File

@@ -436,7 +436,7 @@ However, similar to any new model, you can follow the same approach to define yo
<hr/>
<div role="contentinfo">
-<jinja2.runtime.BlockReference object at 0x7f8a08a70490>
+<jinja2.runtime.BlockReference object at 0x7f8e35c9c6a0>
<div class="footer">
<p>