Mirror of https://github.com/NVIDIA/TensorRT-LLM.git, synced 2026-02-20 17:55:20 +08:00

- Refactor get_valid_tactics for FC1 and FC2 runners to define mma_tiler_mn_candidates and cluster_shape_mn_candidates together
- Use itertools.product for cleaner iteration pattern
- Update get_tuning_config for FC1 and FC2 to use DynamicTensorSpec and ConstraintSpec for proper dynamic tensor handling
- FC1: Add constraint for input scale factor shape inference
- FC2: Add constraints for input scale factor and alpha_scale shape

Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>

| | | |
|---|---|---|
| __init__.py | ||
| cpp_custom_ops.py | ||
| cute_dsl_custom_ops.py | ||
| flashinfer_custom_ops.py | ||
| torch_custom_ops.py | ||
| trtllm_gen_custom_ops.py | ||
| userbuffers_custom_ops.py | ||
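
The commit message above describes defining the MMA tiler and cluster shape candidates side by side and iterating over them with itertools.product. A minimal sketch of that pattern follows. It is not the code from the FC1 or FC2 runners: the candidate values, the function name `enumerate_tactics`, and the validity rule are all hypothetical and chosen only for illustration.

```python
from itertools import product

# Hypothetical candidate lists (assumption: the real runners define their own values).
MMA_TILER_MN_CANDIDATES = [(128, 128), (256, 128)]
CLUSTER_SHAPE_MN_CANDIDATES = [(1, 1), (2, 1), (1, 2)]


def is_valid(mma_tiler_mn, cluster_shape_mn):
    """Illustrative rule only: a 256-wide M tile requires an even cluster M."""
    if mma_tiler_mn[0] == 256 and cluster_shape_mn[0] % 2 != 0:
        return False
    return True


def enumerate_tactics(predicate=is_valid):
    """Return every (mma_tiler_mn, cluster_shape_mn) pair that passes the predicate."""
    return [
        (tiler, cluster)
        for tiler, cluster in product(
            MMA_TILER_MN_CANDIDATES, CLUSTER_SHAPE_MN_CANDIDATES
        )
        if predicate(tiler, cluster)
    ]


if __name__ == "__main__":
    for tactic in enumerate_tactics():
        print(tactic)
```

Keeping both candidate lists next to each other and filtering a single product replaces nested loops with one flat iteration, so adding a third tuning dimension only means adding one more list to the product call.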