Generation with Quantization
============================

Source https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/llm-api/llm_quantization.py.

.. literalinclude:: ../../../examples/llm-api/llm_quantization.py
    :language: python
    :linenos: