TensorRT-LLM/tensorrt_llm/_torch/modules/mamba
Latest commit: 6e712dd1cc by tomeras91 (2025-09-09 11:42:22 +03:00)
[None][fix] enable NvFP4/FP8 quantization for Nemotron-H architecture (#7589)
Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
__init__.py
causal_conv1d.py
layernorm_gated.py
mamba2_metadata.py [None][doc] fix example in docstring (#7410) 2025-09-02 11:59:49 +03:00
mamba2_mixer.py [None][fix] enable NvFP4/FP8 quantization for Nemotron-H architecture (#7589) 2025-09-09 11:42:22 +03:00
selective_state_update.py
softplus.py
ssd_bmm.py
ssd_chunk_scan.py [TRTLLM-4921][feat] Enable chunked prefill for Nemotron-H (#6334) 2025-08-22 12:15:20 -04:00
ssd_chunk_state.py
ssd_combined.py [TRTLLM-4921][feat] Enable chunked prefill for Nemotron-H (#6334) 2025-08-22 12:15:20 -04:00
ssd_state_passing.py [TRTLLM-4921][feat] Enable chunked prefill for Nemotron-H (#6334) 2025-08-22 12:15:20 -04:00
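Of the files listed above, `softplus.py` presumably provides the softplus activation used in Mamba's time-step (Δt) parameterization. A minimal, numerically stable sketch of the standard function itself — an assumption about the module's purpose, not this repository's actual implementation:

```python
import math

def softplus(x: float, threshold: float = 20.0) -> float:
    """Numerically stable softplus: log(1 + e^x).

    For large x, log(1 + e^x) approaches x, so returning x directly
    past a threshold avoids overflow in exp(). (Hypothetical helper,
    not the repo's implementation.)
    """
    if x > threshold:
        return x
    return math.log1p(math.exp(x))
```

Using `log1p` rather than `log(1 + ...)` preserves precision for very negative inputs, where `e^x` is tiny; the threshold shortcut mirrors the clamping commonly applied in deep-learning softplus kernels.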