TensorRT-LLMs/tensorrt_llm/llmapi
William Zhang a6a88985cf
[TRTLLM-9409][feat] Pass MRoPE tensors for EPD disagg (#9758)
* Why?

Certain VLMs, such as the Qwen family, need more than the multimodal
embeddings in the language model: they also need the MRoPE position IDs
and deltas. Prior to this commit, only the embeddings could be
communicated from the encoder worker to the prefill worker.

* What?

This commit extends `DisaggregatedParams` to include the MRoPE
information. It also adjusts the code paths required to communicate
that information between the encoder (E), prefill (P) and decode (D)
workers.
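
A rough sketch of the idea, not the actual implementation: the class
`SketchDisaggParams`, its field names, and the two helper functions
below are hypothetical stand-ins invented for illustration, and plain
Python lists stand in for tensors. It only shows why the prefill side
needs the MRoPE data to travel with the embeddings.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SketchDisaggParams:
    """Hypothetical stand-in for DisaggregatedParams (names are assumptions)."""
    request_type: str
    mm_embeddings: Optional[List[List[float]]] = None
    # Assumed additions: MRoPE data carried alongside the embeddings.
    mrope_position_ids: Optional[List[List[int]]] = None
    mrope_position_deltas: Optional[List[int]] = None


def encoder_step(num_tokens: int) -> SketchDisaggParams:
    """Pretend encoder worker: emit embeddings plus MRoPE information."""
    embeddings = [[0.0, 0.0] for _ in range(num_tokens)]
    # Assumption: one row of position IDs per MRoPE axis (three axes here).
    position_ids = [list(range(num_tokens)) for _ in range(3)]
    return SketchDisaggParams(
        request_type="context_only",
        mm_embeddings=embeddings,
        mrope_position_ids=position_ids,
        mrope_position_deltas=[0],
    )


def prefill_step(params: SketchDisaggParams) -> int:
    """Pretend prefill worker: refuse to run without the MRoPE fields."""
    if params.mrope_position_ids is None or params.mrope_position_deltas is None:
        raise ValueError("MRoPE position IDs and deltas are required")
    # Return the number of positions received, as a trivial sanity value.
    return len(params.mrope_position_ids[0])
```

With only the embeddings populated, `prefill_step` raises, which
mirrors the gap described under "Why?".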

Closes TRTLLM-9409.

Signed-off-by: William Zhang <133824995+2ez4bz@users.noreply.github.com>
2025-12-22 06:32:49 -05:00
__init__.py [TRTLLM-9805][feat] Skip Softmax Attention. (#9821) 2025-12-21 02:52:42 -05:00
build_cache.py [TRTLLM-8684][chore] Migrate BuildConfig to Pydantic, add a Python wrapper for KVCacheType enum (#8330) 2025-10-28 09:17:26 -07:00
disagg_utils.py [TRTLLM-8920][feat] decouple disagg service from fastapi (#8714) 2025-12-05 10:44:16 +08:00
kv_cache_type.py [TRTLLM-8684][chore] Migrate BuildConfig to Pydantic, add a Python wrapper for KVCacheType enum (#8330) 2025-10-28 09:17:26 -07:00
llm_args.py [None][feat] Support Eagle3 on Mistral Large3 (#9971) 2025-12-21 10:25:45 -05:00
llm_utils.py [https://nvbugs/5558117][fix] Allow per-layer quant config from hf_quant_config.json (#8617) 2025-10-31 04:41:44 -07:00
llm.py [TRTLLM-9409][feat] Pass MRoPE tensors for EPD disagg (#9758) 2025-12-22 06:32:49 -05:00
mgmn_leader_node.py [None][chore] replace print_colored_debug with logger_debug (#8417) 2025-10-22 17:54:38 +08:00
mgmn_worker_node.py Update TensorRT-LLM (#2333) 2024-10-15 15:28:40 +08:00
mm_encoder.py [TRTLLM-9409][feat] Pass MRoPE tensors for EPD disagg (#9758) 2025-12-22 06:32:49 -05:00
mpi_session.py [cherry-pick][https://nvbugs/5670793][fix] Solve trtllm-serve launch_disaggregated issue (#9346) 2025-11-27 16:13:58 +08:00
reasoning_parser.py [None][feat] Update reasoning parser for nano-v3 (#9944) 2025-12-15 05:39:37 -08:00
rlhf_utils.py [TRTLLM-9736][feat] AsyncLLM and verl integ (#9353) 2025-12-11 09:33:25 -08:00
tokenizer.py [TRTLLM-9654][feat] Support DeepSeek-V32 chat template (#9814) 2025-12-19 17:05:38 +08:00
tracer.py Update TensorRT-LLM (#2413) 2024-11-05 16:27:06 +08:00
tracing.py [None][feat] Add opentelemetry tracing (#5897) 2025-10-27 18:51:07 +08:00
trtllm-llmapi-launch [https://nvbugs/5569754][fix] trtllm-llmapi-launch port conflict (#8582) 2025-11-20 12:43:13 -05:00
utils.py [TRTLLM-9144][fix] enhance RPC robustness (#8711) 2025-12-02 21:37:59 +08:00