
LLM API with TensorRT Engine


A simple inference example with TinyLlama using the LLM API:

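A minimal sketch of such a script, assuming the `tensorrt_llm` package's `LLM` and `SamplingParams` classes; the model is loaded by its Hugging Face ID, and actually running it requires a CUDA-capable GPU:

```python
def main():
    # Imported inside main() so the module can be inspected without a GPU.
    from tensorrt_llm import LLM, SamplingParams

    prompts = [
        "Hello, my name is",
        "The capital of France is",
    ]
    # Sampling controls: temperature sharpens or flattens the token
    # distribution, top_p restricts sampling to the most likely tokens.
    sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

    # Passing a Hugging Face model ID builds a TensorRT engine on first use.
    llm = LLM(model="TinyLlama/TinyLlama-1.1B-Chat-v1.0")

    # generate() returns one result per prompt, each holding the prompt
    # and a list of completions.
    for output in llm.generate(prompts, sampling_params):
        print(f"Prompt: {output.prompt!r} -> {output.outputs[0].text!r}")


if __name__ == "__main__":
    main()
```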

For more advanced usage, including distributed inference, multimodal models, and speculative decoding, please refer to this README.
