TensorRT-LLMs/tensorrt_llm/scheduling_params.py
Shunkangz 67a3fd858b
[None][feat] Add support of scheduling attention dp request (#6246)
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
2025-08-01 20:38:01 -04:00


from dataclasses import dataclass
from typing import Optional


@dataclass(slots=True, kw_only=True)
class SchedulingParams:
    """Scheduling parameters.

    Args:
        attention_dp_rank (Optional[int]): The rank of the target attention DP.
        attention_dp_relax (Optional[bool]): Whether to allow the request to be
            scheduled to another attention DP rank for better throughput.
    """

    attention_dp_rank: Optional[int] = None
    attention_dp_relax: Optional[bool] = None