diff --git a/docs/source/features/disagg-serving.md b/docs/source/features/disagg-serving.md
index 88feb11b08..6e600793fb 100644
--- a/docs/source/features/disagg-serving.md
+++ b/docs/source/features/disagg-serving.md
@@ -10,6 +10,7 @@
 - [Usage](#Usage)
   - [Dynamo](#Dynamo)
   - [trtllm-serve](#trtllm-serve)
+  - [Multiple Instances](#multiple-instances)
 - [Environment Variables](#Environment-Variables)
 - [Troubleshooting and FAQ](#Troubleshooting-and-FAQ)
 
@@ -215,6 +216,23 @@ curl http://localhost:8000/v1/completions \
 
 Please refer to [Disaggregated Inference Benchmark Scripts](../../../examples/disaggregated/slurm).
 
+### Multiple Instances
+
+To raise maximum concurrency without adding GPU nodes, you can run multiple disaggregated server instances on different nodes, with every instance managing the same set of context/generation servers. This helps when a single disaggregated server becomes a performance bottleneck or runs out of ephemeral ports.
+
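+Example (two-node deployment, one disaggregated server instance per node, both pointing at the same context and generation servers):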
+
+- **Node A**
+  - Context servers: `node-a:8001`
+  - Generation servers: `node-b:8002`
+  - Disaggregated server endpoint: `node-a:8000`
+- **Node B**
+  - Context servers: `node-a:8001`
+  - Generation servers: `node-b:8002`
+  - Disaggregated server endpoint: `node-b:8000`
+- **Client entrypoint**
+  - Send requests directly to `node-a:8000` or `node-b:8000`, or place a load balancer in front that forwards to both
+
 ## Environment Variables
 
 TRT-LLM uses some environment variables to control the behavior of disaggregated service.