Mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-01-29 07:02:56 +08:00)
Add YaRN support for general models (e.g. llama, qwen) other than deepseek in trt-flow.

Signed-off-by: Zeyu Wang <zeyuw@nvidia.com>

| File |
|---|
| `__init__.py` |
| `flashinfer.py` |
| `interface.py` |
| `star_flashinfer.py` |
| `trtllm.py` |
| `utils.py` |
| `vanilla.py` |