TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-17 00:04:57 +08:00

William Zhang ffc0f54959 [https://nvbugs/5848756][fix] Re-take ownership of mrope tensors in prefill worker (#11217)	2026-02-06 22:37:42 -05:00

* Why?

  Previously, the mrope tensors' IPC handles would just be forwarded from encode -> prefill -> decode workers. While this is fine for the prefill worker, it is not for the decode worker, since by the time it tries to rebuild those tensors, they could have been garbage collected due to their refcounts reaching zero in the producer (encode) worker.

  This could lead to nasty runtime errors when running E/P/D disaggregated serving.

* What?

  This commit fixes this by having the prefill worker take ownership of those reconstructed tensors, and stand up new copies for the decode worker.

Closes: NvBug 5848756

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
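The lifetime problem described in the commit message above can be illustrated with a small pure-Python analogy. This is only a sketch under stated assumptions, not the TensorRT-LLM code: `Buffer`, `share`, and `rebuild` are hypothetical names, and a `weakref` stands in for an IPC handle because, like the handle in the message, it does not keep the producer's memory alive.

```python
import gc
import weakref


class Buffer:
    """Hypothetical stand-in for a tensor owned by one worker."""

    def __init__(self, values):
        self.values = list(values)


def share(buf):
    # Hypothetical "handle": refers to the buffer without keeping it alive.
    return weakref.ref(buf)


def rebuild(handle):
    # Rebuilding fails once the owner has released the underlying buffer.
    buf = handle()
    if buf is None:
        raise RuntimeError("producer already released the buffer")
    return buf


# Encode worker: produces a buffer and exports a handle to it.
encode_buf = Buffer([1, 2, 3])
handle_from_encode = share(encode_buf)

# Prefill worker: rebuilds while the producer is still alive, then takes
# ownership by copying the data and exporting a handle to its own copy.
prefill_buf = Buffer(rebuild(handle_from_encode).values)
handle_from_prefill = share(prefill_buf)

# Encode worker drops its last reference before decode gets to run.
del encode_buf
gc.collect()

# Decode worker: the forwarded original handle can no longer be rebuilt...
try:
    rebuild(handle_from_encode)
    forwarded_ok = True
except RuntimeError:
    forwarded_ok = False

# ...but the handle exported by the prefill worker still can.
decode_values = rebuild(handle_from_prefill).values
```

In this analogy the decode side only succeeds because the intermediate (prefill) stage owns the data it hands on, which mirrors the approach the message describes.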
rpc	[None][doc] update readme for rpc (#9972)	2025-12-15 10:16:50 +08:00
__init__.py	chore: rename ExecutorBindingsWorker/Proxy (#4716)	2025-05-29 10:32:35 +08:00
base_worker.py	[TRTLLM-8921][feat] implement gen-first disagg_service (#11020)	2026-02-03 15:46:11 -05:00
executor.py	[TRTLLM-8921][feat] implement gen-first disagg_service (#11020)	2026-02-03 15:46:11 -05:00
ipc.py	[https://nvbugs/5720482][fix] Fix test rpc streaming (#9902)	2025-12-13 01:14:43 -08:00
postproc_worker.py	[None][feat] perf_metrics endpoint functionality improvement (#8005)	2025-10-02 17:43:25 -07:00
proxy.py	[None][refactor] simplify get_stats and get_kvcache_events with rpc (#9980)	2025-12-22 18:23:43 +08:00
ray_executor.py	[None][chore] refine placement group in ray executor (#10235)	2026-01-23 19:31:20 +08:00
ray_gpu_worker.py	[TRTLLM-9737][chore] Add rl perf reproduce script and enhance the robustness of Ray tests (#9939)	2025-12-24 15:27:01 +08:00
request.py	[None][feat] Add opentelemetry tracing (#5897)	2025-10-27 18:51:07 +08:00
result.py	[https://nvbugs/5848756][fix] Re-take ownership of mrope tensors in prefill worker (#11217)	2026-02-06 22:37:42 -05:00
rpc_proxy_mixin.py	[None][refactor] simplify get_stats and get_kvcache_events with rpc (#9980)	2025-12-22 18:23:43 +08:00
rpc_proxy.py	[None][refactor] simplify get_stats and get_kvcache_events with rpc (#9980)	2025-12-22 18:23:43 +08:00
rpc_worker_mixin.py	[None][refactor] simplify get_stats and get_kvcache_events with rpc (#9980)	2025-12-22 18:23:43 +08:00
rpc_worker.py	[None][fix] enable hmac in RPC (#9745)	2025-12-07 08:24:46 +08:00
utils.py	[https://nvbugs/5783876][fix] fix hmac launch (#10434)	2026-01-22 23:20:53 +08:00
worker.py	[None][feat] Hang detection for executor loop and worker. (#10480)	2026-01-13 02:34:32 -05:00