TensorRT-LLMs/tensorrt_llm/executor
William Zhang ffc0f54959
[https://nvbugs/5848756][fix] Re-take ownership of mrope tensors in prefill worker (#11217)
* Why?

Previously, the mrope tensors' IPC handles would just be forwarded from
encode -> prefill -> decode workers. While this is fine for the
prefill worker, it is not for the decode worker, since by the time it
tries to rebuild those tensors, they could have been garbage collected
due to their refcounts reaching zero in the producer (encode) worker.

This could lead to nasty runtime errors when running E/P/D
disaggregated serving.

* What?

This commit fixes this by having the prefill worker take ownership of
those reconstructed tensors, and stand up new copies for the decode
worker.

Closes: NvBug 5848756

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2026-02-06 22:37:42 -05:00
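
The lifetime problem described in the commit message above can be illustrated with a toy, single-process model. This is only a sketch under stated assumptions: `Worker`, `share`, `rebuild`, `release`, and `retake_ownership` are hypothetical names invented for illustration and are not TensorRT-LLM APIs. Real workers are separate processes exchanging IPC handles, and a plain dict stands in for the producer's memory here.

```python
class HandleExpiredError(RuntimeError):
    """Raised when a handle refers to a buffer its owner already released."""


class Worker:
    """Hypothetical stand-in for an encode / prefill / decode worker."""

    def __init__(self, name):
        self.name = name
        self._owned = {}      # key -> buffer this worker keeps alive
        self._next_key = 0

    def share(self, data):
        """Store a buffer and return a handle other workers can rebuild from."""
        key = self._next_key
        self._next_key += 1
        self._owned[key] = list(data)
        return (self, key)

    def release(self, handle):
        """Drop the owner's reference, as when its refcount reaches zero."""
        _, key = handle
        self._owned.pop(key, None)

    @staticmethod
    def rebuild(handle):
        """Resolve a handle to its buffer; fails if the owner released it."""
        owner, key = handle
        try:
            return owner._owned[key]
        except KeyError:
            raise HandleExpiredError(
                f"buffer {key} of {owner.name} is gone") from None

    def retake_ownership(self, handle):
        """Rebuild while the buffer is alive, copy it, and share the copy."""
        return self.share(self.rebuild(handle))


def forward_only():
    """Old behaviour: prefill forwards the encode worker's handle unchanged."""
    encode = Worker("encode")
    handle = encode.share([0, 1, 2])
    Worker.rebuild(handle)        # prefill rebuilds in time, so this works
    forwarded = handle            # the same handle is passed on to decode
    encode.release(handle)        # encode drops its reference
    try:
        Worker.rebuild(forwarded)  # decode arrives too late
        return "ok"
    except HandleExpiredError:
        return "expired"


def retake():
    """Fixed behaviour: prefill owns a copy and hands decode a new handle."""
    encode, prefill = Worker("encode"), Worker("prefill")
    handle = encode.share([0, 1, 2])
    new_handle = prefill.retake_ownership(handle)
    encode.release(handle)        # no longer affects the decode worker
    return Worker.rebuild(new_handle)


if __name__ == "__main__":
    print(forward_only())
    print(retake())
```

In this model, forwarding the original handle fails once the producer releases its buffer, while the handle produced by `retake_ownership` stays valid because the prefill worker keeps its own copy alive.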
| File | Last commit | Date |
| --- | --- | --- |
| rpc | [None][doc] update readme for rpc (#9972) | 2025-12-15 10:16:50 +08:00 |
| __init__.py | chore: rename ExecutorBindingsWorker/Proxy (#4716) | 2025-05-29 10:32:35 +08:00 |
| base_worker.py | [TRTLLM-8921][feat] implement gen-first disagg_service (#11020) | 2026-02-03 15:46:11 -05:00 |
| executor.py | [TRTLLM-8921][feat] implement gen-first disagg_service (#11020) | 2026-02-03 15:46:11 -05:00 |
| ipc.py | [https://nvbugs/5720482][fix] Fix test rpc streaming (#9902) | 2025-12-13 01:14:43 -08:00 |
| postproc_worker.py | [None][feat] perf_metrics endpoint functionality improvement (#8005) | 2025-10-02 17:43:25 -07:00 |
| proxy.py | [None][refactor] simplify get_stats and get_kvcache_events with rpc (#9980) | 2025-12-22 18:23:43 +08:00 |
| ray_executor.py | [None][chore] refine placement group in ray executor (#10235) | 2026-01-23 19:31:20 +08:00 |
| ray_gpu_worker.py | [TRTLLM-9737][chore] Add rl perf reproduce script and enhance the robustness of Ray tests (#9939) | 2025-12-24 15:27:01 +08:00 |
| request.py | [None][feat] Add opentelemetry tracing (#5897) | 2025-10-27 18:51:07 +08:00 |
| result.py | [https://nvbugs/5848756][fix] Re-take ownership of mrope tensors in prefill worker (#11217) | 2026-02-06 22:37:42 -05:00 |
| rpc_proxy_mixin.py | [None][refactor] simplify get_stats and get_kvcache_events with rpc (#9980) | 2025-12-22 18:23:43 +08:00 |
| rpc_proxy.py | [None][refactor] simplify get_stats and get_kvcache_events with rpc (#9980) | 2025-12-22 18:23:43 +08:00 |
| rpc_worker_mixin.py | [None][refactor] simplify get_stats and get_kvcache_events with rpc (#9980) | 2025-12-22 18:23:43 +08:00 |
| rpc_worker.py | [None][fix] enable hmac in RPC (#9745) | 2025-12-07 08:24:46 +08:00 |
| utils.py | [https://nvbugs/5783876][fix] fix hmac launch (#10434) | 2026-01-22 23:20:53 +08:00 |
| worker.py | [None][feat] Hang detection for executor loop and worker. (#10480) | 2026-01-13 02:34:32 -05:00 |