Mirror of https://github.com/NVIDIA/TensorRT-LLM.git, synced 2026-01-13 22:18:36 +08:00
docs: update 0.19 docs (#3986)
Signed-off-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>
Co-authored-by: nv-guomingz <37257613+nv-guomingz@users.noreply.github.com>
This commit is contained in:
parent cd9c7498d0
commit 034f6f2d91
@@ -778,7 +778,7 @@ Refer to the {ref}`support-matrix-software` section for a list of supported models
 - System prompt caching
 - Enabled split-k for weight-only cutlass kernels
 - FP8 KV cache support for XQA kernel
-- New Python builder API and `trtllm-build` command (already applied to [blip2](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/models/contrib/blip2) and [OPT](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/models/contrib/opt#3-build-tensorrt-engines))
+- New Python builder API and `trtllm-build` command (already applied to [blip2](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/blip2) and [OPT](https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/opt#3-build-tensorrt-engines))
 - Support `StoppingCriteria` and `LogitsProcessor` in Python generate API
 - FHMA support for chunked attention and paged KV cache
 - Performance enhancements include:
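The hunk's context mentions `StoppingCriteria` and `LogitsProcessor` support in the Python generate API. As a library-agnostic sketch of what a logits processor does (edit token scores before sampling), here is a toy example in plain Python; the function names `ban_tokens_processor` and `greedy_pick` are illustrative, not TensorRT-LLM's actual API:

```python
import math

def ban_tokens_processor(banned_ids):
    """Return a processor that sets banned token ids' scores to -inf."""
    def process(logits):
        return [-math.inf if i in banned_ids else score
                for i, score in enumerate(logits)]
    return process

def greedy_pick(logits, processors):
    """Apply each processor in order, then pick the argmax token id."""
    for p in processors:
        logits = p(logits)
    return max(range(len(logits)), key=lambda i: logits[i])

logits = [0.1, 2.5, 0.3, 1.9]      # scores for token ids 0..3
proc = ban_tokens_processor({1})   # forbid token id 1
print(greedy_pick(logits, [proc])) # prints 3: id 1 is masked, id 3 wins
```

A real implementation operates on GPU tensors per decoding step, but the hook shape is the same: logits in, modified logits out, before the sampler runs.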
@@ -250,7 +250,7 @@ llm = LLM(

 </details>

-For more examples on TRT-LLM LLM API, visit [`this page`](https://nvidia.github.io/TensorRT-LLM/llm-api-examples/).
+For more examples on TRT-LLM LLM API, visit [`this page`](https://nvidia.github.io/TensorRT-LLM/examples/llm_api_examples.html).