(add-model)= # Adding a Model This document describes how to add a typical decoder-only model in TensorRT-LLM. ## Step 1. Write Modeling Part TensorRT-LLM provides different levels of APIs: - Low-level functions, for example, `concat`, `add`, and `sum`. - Basic layers, such as, `Linear` and `LayerNorm`. - High-level layers, such as, `MLP` and `Attention`. - Base class for typical decoder-only models, such as, `DecoderModelForCausalLM`. 1. Create a model directory in `tensorrt_llm/models`, for example `my_model`. 2. Write a `model.py` with TensorRT-LLM's APIs ```python class MyDecoderLayer(Module): def __init__(self, config: PretrainedConfig, layer_idx: int): self.layer_idx = layer_idx self.config = config self.input_layernorm = LayerNorm(...) self.attention = Attention(...) self.post_layernorm = LayerNorm(...) self.mlp = MLP(...) def forward(self, hidden_states, ...): # decoder layer forward return hidden_states class MyModel(Module): def __init__(self, config: PretrainedConfig): self.config = config self.vocab_embedding = Embedding(...) self.layers = DecoderLayerList(MyDecoderLayer, config) self.ln_f = LayerNorm(...) def forward(self, input_ids, ...): # model forward return hidden_states class MyModelForCausalLM(DecoderModelForCausalLM): def __init__(self, config: PretrainedConfig): transformer = MyModel(config) lm_head = ColumnLinear(...) super().__init__(config, transformer, lm_head) ``` ## Step 2. Implement Weight Conversion The weights from source framework need to be converted and bound to the new added TensorRT-LLM model. Here is an example of converting HuggingFace weights: ```python class MyModelForCausalLM(DecoderModelForCausalLM): @classmethod def from_hugging_face( cls, hf_model_dir, dtype='float16', mapping: Optional[Mapping] = None) -> MyModelForCausalLM # create a TensorRT-LLM MyModelForCausalLM model object # convert HuggingFace checkpoint to TensorRT-LLM expected weights dict # load the weights to MyModelForCausalLM object ``` It's optional to develop a `convert_checkpoint.py` script in the `examples/my_model/` directory for the convenience of offline weights conversion. ## Step 3. Register New Model Please register the new model class `MyModelForCausalLM` in `tensorrt_llm/models/__init__.py`. ## Step 4. Verify New Model At last, let's verify the new model. The typical commands are as following: ```bash cd examples/my_model/ python convert_checkpoint.py --model_dir hf_model_dir --output_dir tllm_ckpt_dir trtllm-build --checkpoint_dir tllm_ckpt_dir --output_dir tllm_engine_dir # try the model with a single prompt python ../run.py --engine_dir tllm_engine_dir --tokenizer_dir hf_model_dir --input_text "Born in north-east France, Soyer trained as a" # run summarization task python ../summarize.py --engine_dir tllm_engine_dir --hf_model_dir hf_model_dir --test_trt_llm ``` ## Reference It's recommended to read the workflow[./workflow.md] and checkpoint[./checkpoint.md] documents for more details.