TensorRT-LLMs/examples/configs/curated/qwen3.yaml
Venky fd1270b9ab
[TRTC-43] [feat] Add config db and docs (#9420)
Signed-off-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
Signed-off-by: Venky Ganesh <23023424+venkywonka@users.noreply.github.com>
Co-authored-by: Frank Di Natale <3429989+FrankD412@users.noreply.github.com>
2025-12-12 04:00:03 +08:00

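max_batch_size: 161  # maximum number of requests batched in one iteration
max_num_tokens: 1160  # maximum number of tokens batched in one iteration
kv_cache_free_gpu_memory_fraction: 0.8  # fraction of free GPU memory given to the KV cache
tensor_parallel_size: 1  # number of GPUs used for tensor parallelism
moe_expert_parallel_size: 1  # expert-parallel degree for MoE layers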
cuda_graph_config:
  enable_padding: true
  batch_sizes:
  - 1
  - 2
  - 4
  - 8
  - 16
  - 32
  - 64
  - 128
  - 256
  - 384
print_iter_log: true  # log per-iteration statistics
enable_attention_dp: true  # enable attention data parallelism
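
The relationship between `max_batch_size` and the CUDA graph `batch_sizes` list can be sketched in Python. The sketch assumes that `enable_padding: true` rounds a batch up to the nearest listed size, which this file does not state. `padded_batch_size` is a hypothetical helper, not a TensorRT-LLM API, and the constants are copied from the config above.

```python
from bisect import bisect_left

# Values copied from the config above.
MAX_BATCH_SIZE = 161
GRAPH_BATCH_SIZES = [1, 2, 4, 8, 16, 32, 64, 128, 256, 384]


def padded_batch_size(batch, sizes=GRAPH_BATCH_SIZES):
    """Return the smallest listed size >= batch, or None if none is large enough.

    Hypothetical helper: models the assumed round-up behaviour of enable_padding.
    """
    ordered = sorted(sizes)
    idx = bisect_left(ordered, batch)
    return ordered[idx] if idx < len(ordered) else None


if __name__ == "__main__":
    # Show which graph size each sample batch would be padded to.
    for batch in (1, 3, 100, 129, MAX_BATCH_SIZE):
        print(batch, "->", padded_batch_size(batch))
```

Under that assumption, a full batch of 161 requests would be padded to the 256 entry, so a listed size above `max_batch_size` can still be reached, while no batch of 161 or fewer would select the 384 entry.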