[#7704][chore] Enable MathJax to fix formulas in documentation (#7744)

Signed-off-by: Kanghwan Jang <861393+karljang@users.noreply.github.com>
2026-01-14 06:27:45 +08:00 · 2025-09-19 08:42:26 -07:00 · 2025-09-19 08:42:26 -07:00 · 8fcd11515d
commit 8fcd11515d
parent 8030b540ac
2 changed files with 26 additions and 20 deletions
--- a/docs/source/blogs/tech_blog/blog10_ADP_Balance_Strategy.md
+++ b/docs/source/blogs/tech_blog/blog10_ADP_Balance_Strategy.md
@@ -45,44 +45,44 @@ To address this critical performance limitation, we introduce the **ADP (Attenti

 We model and quantify the performance impact of load imbalance in Attention DP. Since workloads across ranks can be heterogeneous, the execution time for the Attention module in any given iteration is bounded by the rank with the highest workload:

-```math
+$$
 time_i = \max_{0 \leq m < N} time_{i,m}
-```
+$$

 where $time_{i,m}$ represents the execution time of rank $m$ in iteration $i$, and $N$ is the data parallel size.

 To quantify load balance and theoretical performance bounds, we define two key metrics:

 #### 1. Balance Ratio
-The $balance\\_ratio$ measures the load distribution across ranks within the Attention module for each iteration:
+The balance ratio measures the load distribution across ranks within the Attention module for each iteration:

-```math
-balance\_ratio = \frac{avg\_tokens}{max\_tokens}
-```
+$$
+balance = \frac{tokens_{avg}}{tokens_{max}}
+$$

 where:
-- $avg\\_tokens$ represents the average number of tokens across all ranks
-- $max\\_tokens$ represents the maximum number of tokens across all ranks
+- $tokens_{avg}$ represents the average number of tokens across all ranks  
+- $tokens_{max}$ represents the maximum number of tokens across all ranks
 - $tokens_i$ represents the number of tokens processed by rank $i$

 Note: MoE module load balancing is handled separately by the Expert Parallel Load Balancer (EPLB) module and is not considered during the early scheduling phase.

 #### 2. Speed-of-Light Throughput (SOL TPS)
-The $sol\\_tps$ represents the theoretical upper-bound throughput achievable with perfect load balancing:
+The Speed-of-Light throughput represents the theoretical upper-bound throughput achievable with perfect load balancing:

-```math
-sol\_time = \sum_{i=0}^{\infty} time_i * balance\_ratio_i
-```
+$$
+time_{sol} = \sum_{i=0}^{\infty} time_i \times balance
+$$

-```math
-sol\_tps = \frac{elapsed\_time}{sol\_time} \times actual\_tps
-```
+$$
+tps_{sol} = \frac{time_{elapsed}}{time_{sol}} \times tps_{actual}
+$$

 where:
 - $time_i$: Measured execution time of iteration $i$
-- $elapsed\\_time$: Total empirically measured end-to-end execution time
-- $actual\\_tps$: Observed throughput in tokens per second
-- $sol\\_tps$: Theoretical maximum throughput under perfect load balance
+- $time_{elapsed}$: Total empirically measured end-to-end execution time
+- $tps_{actual}$: Observed throughput in tokens per second
+- $tps_{sol}$: Theoretical maximum throughput under perfect load balance

 This theoretical framework enables us to quantify the performance gap between current and optimal system utilization, providing clear targets for optimization.

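The two metrics defined in the blog hunk above can be illustrated with a small Python sketch. This is not code from the repository: the function names, argument names, and sample numbers are hypothetical, and the sketch assumes the balance ratio is evaluated per iteration (as the pre-change formula `balance\_ratio_i` indicates) over a finite number of measured iterations.

```python
from typing import Sequence


def balance_ratio(tokens_per_rank: Sequence[int]) -> float:
    """Return tokens_avg / tokens_max for one iteration (1.0 = perfectly balanced)."""
    max_tokens = max(tokens_per_rank)
    if max_tokens == 0:
        # Assumption: an iteration with no tokens on any rank counts as balanced.
        return 1.0
    avg_tokens = sum(tokens_per_rank) / len(tokens_per_rank)
    return avg_tokens / max_tokens


def sol_tps(
    iter_times: Sequence[float],
    iter_tokens: Sequence[Sequence[int]],
    elapsed_time: float,
    actual_tps: float,
) -> float:
    """Estimate speed-of-light throughput from per-iteration measurements."""
    # time_sol: each iteration's measured time scaled by that iteration's balance ratio.
    sol_time = sum(
        t * balance_ratio(tokens) for t, tokens in zip(iter_times, iter_tokens)
    )
    return elapsed_time / sol_time * actual_tps


# Hypothetical measurements: two 1-second iterations on 4 ranks.
iter_times = [1.0, 1.0]
iter_tokens = [
    [100, 100, 100, 100],  # balanced: ratio 1.0
    [400, 0, 0, 0],        # one rank holds everything: ratio 0.25
]
estimate = sol_tps(iter_times, iter_tokens, elapsed_time=2.0, actual_tps=400.0)
print(f"SOL TPS estimate: {estimate:.1f}")
```

In this made-up case the second iteration wastes three quarters of the available capacity, so the estimated upper bound is noticeably higher than the observed 400 tokens per second.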
--- a/docs/source/conf.py
+++ b/docs/source/conf.py
@@ -50,6 +50,7 @@ extensions = [
    'sphinx.ext.autosummary',
    'sphinx.ext.viewcode',
    'sphinx.ext.napoleon',
+    'sphinx.ext.mathjax',
    'myst_parser',  # for markdown support
    "breathe",
    'sphinx.ext.todo',
@@ -86,6 +87,8 @@ myst_heading_anchors = 4
 myst_enable_extensions = [
    "deflist",
    "substitution",
+    "dollarmath",
+    "amsmath",
 ]

 myst_substitutions = {
@@ -167,8 +170,11 @@ def tag_role(name, rawtext, text, lineno, inliner, options=None, content=None):
 def setup(app):
    from helper import generate_examples, generate_llmapi

-    from tensorrt_llm.llmapi.utils import tag_llm_params
-    tag_llm_params()
+    try:
+        from tensorrt_llm.llmapi.utils import tag_llm_params
+        tag_llm_params()
+    except ImportError:
+        print("Warning: tensorrt_llm not available, skipping tag_llm_params")

    app.add_role('tag', tag_role)
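
Taken in isolation, the MathJax-related settings from the `conf.py` hunks can be sketched as the minimal fragment below. It assumes `myst_parser` is installed and omits every other entry from the project's real `conf.py`; treat it as an illustration of which settings work together, not as the full configuration.

```python
# Minimal conf.py fragment (a sketch; the real file lists more extensions).
extensions = [
    "sphinx.ext.mathjax",  # render math in HTML output via MathJax
    "myst_parser",         # Markdown support
]

myst_enable_extensions = [
    "dollarmath",  # parse $...$ and $$...$$ as inline and display math
    "amsmath",     # parse top-level amsmath environments
]
```

With `dollarmath` enabled, the `$$ ... $$` blocks introduced in the blog hunk are parsed as display math instead of relying on the `math` code fences they replace.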