Mirror of https://github.com/NVIDIA/TensorRT-LLM.git, synced 2026-01-14 06:27:45 +08:00
Signed-off-by: fredw (generated by with_the_same_user script) <20514172+WeiHaocheng@users.noreply.github.com>
This example shows how to use the StreamGenerationTask and stream_generation_handler to enable efficient streaming-based generation workflows.
How to run the example?
python stream_generation_run.py
For more details, see tensorrt_llm/scaffolding/contrib/AsyncGeneration/README.md.
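
To give a rough feel for the streaming pattern this example is about, here is a minimal, self-contained sketch in plain asyncio. It does not import or reproduce the TensorRT-LLM API: ToyStreamTask, toy_stream_handler, collect_chunks, chunk_size, and finished are hypothetical names invented for illustration, and the "model" simply upper-cases the prompt words. The idea shown is only that a task object carries the request state while a handler yields output in small chunks instead of waiting for the full result.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator, List


@dataclass
class ToyStreamTask:
    """Hypothetical stand-in for a streaming task (not the TensorRT-LLM class)."""
    prompt: str
    chunk_size: int = 2                      # assumed knob: tokens emitted per step
    output_tokens: List[str] = field(default_factory=list)
    finished: bool = False


async def toy_stream_handler(task: ToyStreamTask) -> AsyncIterator[List[str]]:
    """Yield the output in chunks of `chunk_size` tokens until none remain."""
    # Fake "model": echo the prompt words back in upper case.
    pending = [word.upper() for word in task.prompt.split()]
    while pending:
        chunk = pending[:task.chunk_size]
        pending = pending[task.chunk_size:]
        await asyncio.sleep(0)               # hand control back to the event loop
        task.output_tokens.extend(chunk)     # accumulate the partial result on the task
        task.finished = not pending          # mark the task done after the last chunk
        yield chunk


async def collect_chunks(prompt: str, chunk_size: int = 2) -> List[List[str]]:
    """Drive the handler and gather every streamed chunk into a list."""
    task = ToyStreamTask(prompt=prompt, chunk_size=chunk_size)
    return [chunk async for chunk in toy_stream_handler(task)]


if __name__ == "__main__":
    for chunk in asyncio.run(collect_chunks("stream tokens as they arrive")):
        print(chunk)
```

A caller that wants partial results can iterate over the handler directly with async for and act on each chunk as it arrives; collect_chunks is only a convenience for inspecting the whole stream at once. For the actual classes and their parameters, refer to stream_generation_run.py and the README mentioned above.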