TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

tomeras91 c232ba8157 [TRTLLM-4921][feat] Enable chunked prefill for Nemotron-H (#6334) Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com> Signed-off-by: tomeras91 <57313761+tomeras91@users.noreply.github.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>	2025-08-22 12:15:20 -04:00
__init__.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
cpp_paths.py	chore: fix llm_root when LLM_ROOT is not set (#4741)	2025-05-29 19:44:34 -07:00
llm_data.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
runtime_defaults.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
test_medusa_utils.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
torch_ref.py	[TRTLLM-4921][feat] Enable chunked prefill for Nemotron-H (#6334)	2025-08-22 12:15:20 -04:00
util.py	[None][fix] Migrate to new cuda binding package name (#6700)	2025-08-07 16:29:55 -04:00