TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-17 00:04:57 +08:00

Yuxian Qiu af68c29d3d [None][chore] Reduce attention module repeated warnings. (#11335) Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>	2026-02-10 08:58:21 +08:00
batch_manager	[None][chore] Reduce attention module repeated warnings. (#11335)	2026-02-10 08:58:21 +08:00
common	[https://nvbugs/5814309][fix] Use NCCL as fallback to avoid crash due to insufficient memory (#10928)	2026-02-02 16:26:46 +08:00
cutlass_extensions/include/cutlass_extensions	[None][feat] sm100 weight-only kernel (#10190)	2026-01-05 09:44:36 +08:00
deep_ep	[TRTLLM-9197][infra] Move thirdparty stuff to it's own listfile (#8986)	2025-11-20 16:44:23 -08:00
deep_gemm	[None][feat] update deepgemm to the DeepGEMM/nv_dev branch (#9898)	2026-01-05 16:43:42 +08:00
executor	[https://nvbugs/5863392][fix] fix partial reuse disabled for disagg (#11247)	2026-02-06 14:23:51 -05:00
executor_worker	Update TensorRT-LLM (#2792)	2025-02-18 21:27:39 +08:00
flash_mla	[TRTLLM-9211][infra] Minor fixes to 3rdparty/CMakelists (#9365)	2025-11-23 22:57:02 -08:00
kernels	[None][fix] Fix amax to avoid NaN issue in fp8_blockscale_gemm_kernel. (#11256)	2026-02-06 00:28:29 +08:00
layers	[None][feat] Support ignored prompt length for penalties via new sampling config parameter (#8127)	2025-10-27 13:12:31 -04:00
nanobind	[https://nvbugs/5863392][fix] fix partial reuse disabled for disagg (#11247)	2026-02-06 14:23:51 -05:00
plugins	[None][fix] Remove unused params in attn (#10652)	2026-01-20 03:08:59 -05:00
runtime	[https://nvbugs/5825514][fix] Add null pointer check to parseNpyHeader (#10944)	2026-01-30 03:01:33 -05:00
testing	fix: Improve chunking test and skip empty kernel calls (#5710)	2025-07-04 09:08:15 +02:00
thop	[TRTLLM-9457][feat] Add cute dsl fp8 gemm for Blackwell (#10130)	2026-02-06 09:49:30 +08:00
CMakeLists.txt	[None][chore] Removing pybind11 bindings and references (#10550)	2026-01-26 08:19:12 -05:00