TensorRT-LLMs/examples/high-level-api/run_quant_examples.sh
Kaiyu Xie 5955b8afba
Update TensorRT-LLM Release branch (#1192)
* Update TensorRT-LLM

---------

Co-authored-by: Shixiaowei02 <39303645+Shixiaowei02@users.noreply.github.com>
2024-02-29 17:20:55 +08:00

#!/bin/bash
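# Exit on the first error and echo each command as it runs.
set -ex
PROMPT="Tell a story"
# First argument: path to the Hugging Face LLaMA model directory (required).
LLAMA_MODEL_DIR="${1:?usage: $0 <llama_model_dir>}"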
# Run the quantization example with INT4 AWQ.
python3 llm_examples.py --task run_llm_with_quantization \
    --prompt="$PROMPT" \
    --hf_model_dir="$LLAMA_MODEL_DIR" \
    --quant_type="int4_awq"
# Run the same example with FP8 quantization.
python3 llm_examples.py --task run_llm_with_quantization \
    --prompt="$PROMPT" \
    --hf_model_dir="$LLAMA_MODEL_DIR" \
    --quant_type="fp8"