TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-16 15:55:08 +08:00

Wanli Jiang 421eb9e39c [None][feat] Optimize NemotronH model with elementwise and nvfp4 fusion (#11273) Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>	2026-02-12 09:25:31 -05:00
layer_wise_benchmarks	[None][feat] Optimize NemotronH model with elementwise and nvfp4 fusion (#11273)	2026-02-12 09:25:31 -05:00
plugin_gen	[None][chroe] Rename TensorRT-LLM to TensorRT LLM for source code. (#7851)	2025-09-25 21:02:35 +08:00
profiler/nsys_profile_tools	[None] [feat] nsys profile output kernel classifier (#7020)	2025-08-23 00:57:37 -04:00
__init__.py	Update TensorRT-LLM (#465)	2023-11-24 22:12:26 +08:00
importlib_utils.py	[None][fix] Refactoring input prep to allow out-of-tree models (#6497)	2025-08-12 20:29:10 -04:00
multimodal_builder.py	[TRTLLM-8310][feat] Add Qwen3-VL-MoE (#9689)	2025-12-15 20:05:20 -08:00
onnx_utils.py	Update TensorRT-LLM (#1954)	2024-07-16 15:30:25 +08:00
ppl.py	Update TensorRT-LLM (#302)	2023-11-07 19:51:58 +08:00