TensorRT-LLMs/tests/unittest/_torch/multimodal
William Zhang ffc0f54959
[https://nvbugs/5848756][fix] Re-take ownership of mrope tensors in prefill worker (#11217)
* Why?

Previously, the mrope tensors' IPC handles were simply forwarded from
encode -> prefill -> decode workers. This is fine for the prefill
worker, but not for the decode worker: by the time the decode worker
tries to rebuild those tensors, they could have been garbage collected
because their refcounts reached zero in the producer (encode) worker.

This could lead to nasty runtime errors when running E/P/D
disaggregated serving.

* What?

This commit fixes the problem by having the prefill worker take
ownership of the reconstructed tensors and stand up new copies for the
decode worker.
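
The lifetime issue described above can be illustrated with a small, pure-Python toy model. This is only a sketch under stated assumptions: `Worker`, `Handle`, `StaleHandleError`, `forward_only`, and `retake_ownership` are hypothetical names invented for illustration, and nothing here is the TensorRT-LLM or CUDA IPC API. A handle is modeled as a reference that stays valid only while the producing worker still holds the buffer.

```python
from dataclasses import dataclass


class StaleHandleError(RuntimeError):
    """Raised when a handle points at a buffer its owner already freed."""


class Worker:
    """Hypothetical worker that owns buffers and hands out handles to them."""

    def __init__(self, name):
        self.name = name
        self._owned = {}
        self._next_key = 0

    def share(self, data):
        # Store a private copy, so this worker owns the buffer's lifetime.
        key = self._next_key
        self._next_key += 1
        self._owned[key] = list(data)
        return Handle(self, key)

    def release_all(self):
        # Stand-in for refcounts dropping to zero in the producer.
        self._owned.clear()


@dataclass(frozen=True)
class Handle:
    """Hypothetical stand-in for an IPC handle: valid only while the owner lives."""

    owner: Worker
    key: int

    def rebuild(self):
        try:
            return self.owner._owned[self.key]
        except KeyError:
            raise StaleHandleError(
                f"buffer {self.key} was freed by {self.owner.name}"
            ) from None


def forward_only(positions):
    """Old behavior: the encode worker's handle is forwarded unchanged."""
    encode = Worker("encode")
    handle = encode.share(positions)
    handle.rebuild()          # prefill: fine, encode still holds the buffer
    encode.release_all()      # encode is done with the request
    return handle.rebuild()   # decode: the buffer is gone, so this raises


def retake_ownership(positions):
    """New behavior: prefill copies the data and shares a handle it owns."""
    encode, prefill = Worker("encode"), Worker("prefill")
    handle = encode.share(positions)
    new_handle = prefill.share(handle.rebuild())  # prefill takes ownership
    encode.release_all()                          # no longer affects decode
    return new_handle.rebuild()                   # decode: still valid
```

In this model, `forward_only` fails at the decode step once the encode worker releases its buffers, while `retake_ownership` keeps working because the decode worker's handle now depends only on the prefill worker's copy.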

Closes: NvBug 5848756

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2026-02-06 22:37:42 -05:00
test_external_embedding.py [None][fix] InputProcessor config naming convention fix (#8705) 2025-11-03 22:29:21 -08:00
test_find_num_image_tokens.py [None][fix] InputProcessor config naming convention fix (#8705) 2025-11-03 22:29:21 -08:00
test_fuse_input_embeds.py [TRTLLM-7440][fix] Split fused_input_embed to separate out host sync (#7280) 2025-09-06 23:11:39 -04:00
test_mm_encoder_standalone.py [https://nvbugs/5848756][fix] Re-take ownership of mrope tensors in prefill worker (#11217) 2026-02-06 22:37:42 -05:00
test_multimodal_runtime.py [TRTLLM-6903][feat] Support chunked prefill for multimodal models (#6843) 2025-09-14 20:10:10 -07:00
test_share_multiparams.py [TRTLLM-7385][feat] Optimize Qwen2/2.5-VL performance (#7250) 2025-09-22 03:40:02 -07:00