TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-31 08:11:27 +08:00

Cheng Hang 64db7d27f6 [feat] Optimizations on weight-only batched gemv kernel (#5420) Signed-off-by: Cheng Hang <chang@nvidia.com>		2025-06-30 10:20:16 +08:00
attention	feat: Add FP8 support for SM 120 (#3248)	2025-04-14 16:05:41 -07:00
functional	chore: Mass integration of release/0.20 (#4898)	2025-06-08 23:26:26 +08:00
model	move the reset models into `examples/models/core` directory (#3555)	2025-04-19 20:48:59 -07:00
model_api	test: reorganize tests folder hierarchy (#2996)	2025-03-27 12:07:53 +08:00
python_plugin	test: reorganize tests folder hierarchy (#2996)	2025-03-27 12:07:53 +08:00
quantization	[feat] Optimizations on weight-only batched gemv kernel (#5420)	2025-06-30 10:20:16 +08:00
__init__.py	test: reorganize tests folder hierarchy (#2996)	2025-03-27 12:07:53 +08:00