TensorRT-LLM/tensorrt_llm/_torch/attention_backend
Zeyu WANG 2681b26e48
[TRTLLM-2795] feat: Add yarn support for other models in trt-flow (#3840)
Add YaRN support for general models (e.g. Llama, Qwen) other than DeepSeek in trt-flow.

Signed-off-by: Zeyu Wang <zeyuw@nvidia.com>
2025-05-15 11:03:57 +08:00
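
The commit above adds YaRN rotary-embedding scaling for Llama/Qwen-style models. For orientation only, below is a minimal sketch of the YaRN inverse-frequency blending and attention scaling described in the YaRN paper; the parameter names (`beta_fast`, `beta_slow`) and defaults are assumptions following common conventions, and the actual trt-flow implementation in interface.py may differ.

```python
# Minimal sketch of YaRN-style RoPE frequency scaling (Peng et al., "YaRN").
# Illustration only -- not the trt-flow implementation; names/defaults assumed.
import math
import torch


def yarn_inv_freq(dim: int,
                  base: float = 10000.0,
                  factor: float = 4.0,            # new_max_len / original_max_len
                  original_max_pos: int = 4096,
                  beta_fast: float = 32.0,        # common defaults, assumed
                  beta_slow: float = 1.0):
    # Standard RoPE inverse frequencies and their position-interpolated version.
    pos_freqs = base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    extrapolation = 1.0 / pos_freqs
    interpolation = 1.0 / (factor * pos_freqs)

    # Dimension index whose wavelength completes `num_rot` rotations
    # within the original context window.
    def correction_dim(num_rot):
        return (dim * math.log(original_max_pos / (num_rot * 2 * math.pi))) / (2 * math.log(base))

    low = max(math.floor(correction_dim(beta_fast)), 0)
    high = min(math.ceil(correction_dim(beta_slow)), dim - 1)

    # Linear ramp in [0, 1] blending extrapolation (high-frequency dims)
    # into interpolation (low-frequency dims).
    ramp = torch.clamp(
        (torch.arange(dim // 2, dtype=torch.float32) - low) / max(high - low, 1e-3),
        0.0, 1.0)
    extrapolation_factor = 1.0 - ramp

    inv_freq = interpolation * (1.0 - extrapolation_factor) + extrapolation * extrapolation_factor
    # Attention temperature ("mscale") applied to q/k, per the paper.
    attn_scale = 0.1 * math.log(factor) + 1.0
    return inv_freq, attn_scale


inv_freq, attn_scale = yarn_inv_freq(dim=128, factor=4.0, original_max_pos=4096)
print(inv_freq.shape, attn_scale)  # torch.Size([64]) ~1.139
```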
File | Last commit | Last update
__init__.py | feat: Add group_rms_norm kernel to normalize multiple inputs in a single operator. (#3438) | 2025-05-02 13:25:30 +08:00
flashinfer.py | [feat] Enable chunked context for flashinfer (#4132) | 2025-05-15 10:59:38 +08:00
interface.py | [TRTLLM-2795] feat: Add yarn support for other models in trt-flow (#3840) | 2025-05-15 11:03:57 +08:00
star_flashinfer.py | Remove dummy forward path (#3669) | 2025-04-18 16:17:50 +08:00
trtllm.py | feat: Support Gemma3-1b-it in Pytorch workflow (#3999) | 2025-05-14 14:02:44 +08:00
utils.py | feat: Add group_rms_norm kernel to normalize multiple inputs in a single operator. (#3438) | 2025-05-02 13:25:30 +08:00
vanilla.py | Remove dummy forward path (#3669) | 2025-04-18 16:17:50 +08:00