TensorRT-LLMs/_sources/llm-api-examples/llm_quantization.rst.txt
2024-12-04 14:25:18 +08:00

Generation with Quantization
============================
Source https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/llm-api/llm_quantization.py.

.. literalinclude:: ../../../examples/llm-api/llm_quantization.py
    :language: python
    :linenos: