TensorRT-LLMs/tests/unittest
William Zhang ffc0f54959
[https://nvbugs/5848756][fix] Re-take ownership of mrope tensors in prefill worker (#11217)
* Why?

Previously, the mrope tensors' IPC handles would just be forwarded from
encode -> prefill -> decode workers. While this is fine for the
prefill worker, it is not for the decode worker, since by the time it
tries to rebuild those tensors, they could have been garbage collected
due to their refcounts reaching zero in the producer (encode) worker.

This could lead to nasty runtime errors when running E/P/D
disaggregated serving.

* What?

This commit fixes the problem by having the prefill worker take
ownership of the reconstructed tensors and stand up new copies for the
decode worker.
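
The ownership problem described above can be illustrated with a toy model. This is only a sketch under stated assumptions: plain Python objects stand in for worker processes and IPC handles, and `Worker`, `share`, `release`, and `rebuild` are hypothetical names for illustration, not TensorRT-LLM or PyTorch APIs.

```python
# Toy model only: in-process objects stand in for workers and IPC handles.
# None of these names come from TensorRT-LLM.


class Worker:
    """A worker that owns buffers and can export handles to them."""

    def __init__(self, name):
        self.name = name
        self._buffers = {}  # buffer id -> payload owned by this worker
        self._next_id = 0

    def share(self, payload):
        """Store a private copy of the payload and return a handle to it."""
        buf_id = self._next_id
        self._next_id += 1
        self._buffers[buf_id] = list(payload)
        return (self, buf_id)

    def release(self, handle):
        """Drop ownership, as happens when the refcount reaches zero."""
        _, buf_id = handle
        self._buffers.pop(buf_id, None)


def rebuild(handle):
    """Resolve a handle; fails if the owning worker already freed the buffer."""
    owner, buf_id = handle
    if buf_id not in owner._buffers:
        raise RuntimeError(f"buffer {buf_id} was already freed by {owner.name}")
    return owner._buffers[buf_id]


encode, prefill = Worker("encode"), Worker("prefill")
handle_from_encode = encode.share([0, 1, 2])

# Prefill rebuilds while the producer's buffer is still alive,
# then re-shares a copy that it owns itself.
handle_from_prefill = prefill.share(rebuild(handle_from_encode))

# The encode worker drops its buffer before the decode worker runs.
encode.release(handle_from_encode)

# Forwarding the original handle to decode would now fail ...
try:
    rebuild(handle_from_encode)
    forwarded_ok = True
except RuntimeError:
    forwarded_ok = False

# ... while the handle owned by the prefill worker is still valid.
decoded = rebuild(handle_from_prefill)
```

In this model the decode step depends only on the prefill worker's buffer, so the lifetime of the encode worker's copy no longer matters.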

Closes: NvBug 5848756

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2026-02-06 22:37:42 -05:00
_torch [https://nvbugs/5848756][fix] Re-take ownership of mrope tensors in prefill worker (#11217) 2026-02-06 22:37:42 -05:00
api_stability [TRTLLM-9457][feat] Add cute dsl fp8 gemm for Blackwell (#10130) 2026-02-06 09:49:30 +08:00
bindings [TRTLLM-9527][feat] change context params and disagg params (step3) (#10495) 2026-01-27 16:34:17 +08:00
disaggregated [TRTLLM-9527][feat] Modularization of the transceiver for KV manager v2 (step 4) (#11225) 2026-02-06 07:15:18 -05:00
executor [https://nvbugs/5720482][fix] Fix test rpc streaming (#9902) 2025-12-13 01:14:43 -08:00
kv_cache_manager_v2_tests [None][feat] Enhance support for complex models (#11254) 2026-02-05 17:28:26 +08:00
llmapi [#11037][fix] Fix proto-to-SamplingParams conversion bugs and add gRPC tests (#11292) 2026-02-05 05:00:29 -05:00
others [https://nvbugs/5761391][fix] Include triton-kernels as a packaged dependency (#10471) 2026-01-28 19:56:32 -08:00
scaffolding [None][feat] Refactor scaffolding streaming feature and fix openai wo… (#8622) 2025-10-30 16:02:40 +08:00
tools [None][feat] Add performance alignment to layer-wise benchmarks (#11018) 2026-01-29 14:01:51 +08:00
trt [TRTLLM-8682][chore] Remove auto_parallel module (#8329) 2025-10-22 20:53:08 -04:00
utils [https://nvbugs/5761391][fix] Include triton-kernels as a packaged dependency (#10471) 2026-01-28 19:56:32 -08:00
conftest.py [TRTLLM-10415][feat] Dump thread stacks for hanging tests before time… (#10708) 2026-01-29 20:43:34 +08:00
dump_checkpoint_stats.py Update TensorRT-LLM (#2936) 2025-03-18 21:25:19 +08:00
gc_utils.py [nvbug 5273941] fix: broken cyclic reference detect (#5417) 2025-07-01 20:12:55 +08:00
profile_utils.py Update TensorRT-LLM (#2936) 2025-03-18 21:25:19 +08:00
pytest.ini [TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic (#9726) 2025-12-16 05:16:32 -08:00
test_model_runner_cpp.py Update TensorRT-LLM (#2936) 2025-03-18 21:25:19 +08:00
test_pip_install.py [TRTLLM-10561][infra] Fix jaraco-context and wheel vulnerability (#10901) 2026-02-03 09:54:11 +08:00