TensorRT-LLM/triton_backend/inflight_batcher_llm/tests/third.json
Iman Tabrizian 4c7191af67
Move Triton backend to TRT-LLM main (#3549)
* Move TRT-LLM backend repo to TRT-LLM repo
* Address review comments
* debug ci
* Update triton backend
* Fixes after update

Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
2025-05-16 07:15:23 +08:00

{
    "parameters": {
        "gpu_device_ids": {
            "string_value": "0"
        },
        "max_beam_width": {
            "string_value": "4"
        },
        "batch_scheduler_policy": {
            "string_value": "guaranteed_no_evict"
        },
        "executor_worker_path": {
            "string_value": "/opt/tritonserver/backends/tensorrtllm/trtllmExecutorWorker"
        },
        "normalize_log_probs": {
            "string_value": "false"
        },
        "gpt_model_type": {
            "string_value": "inflight_fused_batching"
        }
    },
    "model_transaction_policy": {
        "decoupled": true
    }
}
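
Each parameter in this fixture is stored as a string under a "string_value" key, matching the layout of parameters in a Triton model configuration. Below is a minimal sketch, not part of the repository, of how a Python test might read such a fixture; the helper name get_parameter and the relative file path are assumptions.

import json

def get_parameter(config, name, default=None):
    # Parameters are nested as {"<name>": {"string_value": "<value>"}}.
    return config.get("parameters", {}).get(name, {}).get("string_value", default)

with open("third.json") as f:  # assumed to run from inflight_batcher_llm/tests/
    config = json.load(f)

assert get_parameter(config, "gpt_model_type") == "inflight_fused_batching"
assert get_parameter(config, "max_beam_width") == "4"
assert config["model_transaction_policy"]["decoupled"] is True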