TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-17 00:04:57 +08:00

Wanli Jiang 421eb9e39c [None][feat] Optimize NemotronH model with elementwise and nvfp4 fusion (#11273) Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>	2026-02-12 09:25:31 -05:00
__init__.py	[None][feat] Add performance alignment to layer-wise benchmarks (#11018)	2026-01-29 14:01:51 +08:00
calibrator.py	[None][feat] Add performance alignment to layer-wise benchmarks (#11018)	2026-01-29 14:01:51 +08:00
mark_utils.py	[None][feat] Layer-wise benchmarks: make model init more general and support weights loading (#10562)	2026-01-13 19:17:03 +08:00
runner.py	[None][feat] Optimize NemotronH model with elementwise and nvfp4 fusion (#11273)	2026-02-12 09:25:31 -05:00