Mirror of https://github.com/NVIDIA/TensorRT-LLM.git, synced 2026-01-14 06:27:45 +08:00
doc: fix invalid link in llama 4 example documentation (#6340)
Signed-off-by: Liana Koleva <43767763+lianakoleva@users.noreply.github.com>
This commit is contained in:
parent 54f68287fc
commit 96d004d800
@@ -134,7 +134,7 @@ python -m tensorrt_llm.serve.scripts.benchmark_serving \
- `max_batch_size` and `max_num_tokens` have a significant impact on performance. The default values are carefully chosen and should deliver good performance in most cases, but you may still need to tune them for peak performance (see the configuration sketch after this list).
- `max_batch_size` should not be set so low that it bottlenecks throughput. Note that with Attention DP, the whole system's effective max batch size is `max_batch_size*dp_size`.
- The CUDA graph `max_batch_size` should be set to the same value as the TensorRT-LLM server's `max_batch_size`.
-- For more details on `max_batch_size` and `max_num_tokens`, refer to [Tuning Max Batch Size and Max Num Tokens](../performance/performance-tuning-guide/tuning-max-batch-size-and-max-num-tokens.md).
+- For more details on `max_batch_size` and `max_num_tokens`, refer to [Tuning Max Batch Size and Max Num Tokens](../../../../docs/source/performance/performance-tuning-guide/tuning-max-batch-size-and-max-num-tokens.md).
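As a rough illustration of how the knobs above fit together, the sketch below starts `trtllm-serve` with explicit `max_batch_size`/`max_num_tokens` values and an extra-options YAML that pins the CUDA graph batch size to the same value and enables Attention DP. The concrete numbers, the model placeholder, and the YAML field spellings (`cuda_graph_config`, `enable_attention_dp`) are illustrative assumptions; verify them against the options of your TensorRT-LLM release before copying.

```bash
# Sketch only: option and field names follow recent TensorRT-LLM releases and may differ in yours.
# Keep cuda_graph_config.max_batch_size equal to the server's --max_batch_size.
cat > extra_llm_api_options.yaml <<'EOF'
enable_attention_dp: true   # with dp_size = 8, the whole system's max batch size is 256 * 8 = 2048
cuda_graph_config:
  enable_padding: true
  max_batch_size: 256       # same value as --max_batch_size below
EOF

trtllm-serve <model_checkpoint_path> \
  --tp_size 8 \
  --max_batch_size 256 \
  --max_num_tokens 8192 \
  --extra_llm_api_options extra_llm_api_options.yaml
```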
### Troubleshooting