TensorRT-LLM/tests/unittest/_torch
Latest commit: 028235404b by Jin Li, 2025-08-26 18:31:33 -04:00
[TRTLLM-6633][feat] Padding for piecewise cudagraph (#6750)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
| Name | Last commit | Date |
| --- | --- | --- |
| attention | [None][ci] move unittests to sub-directories (#6635) | 2025-08-20 05:42:22 -04:00 |
| auto_deploy | [None][doc] Update autodeploy README.md, deprecate lm_eval in examples folder (#7233) | 2025-08-26 10:47:57 -07:00 |
| compilation | [TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804) | 2025-05-09 11:04:01 +08:00 |
| debugger | Fix: fix nvbug 5356427 (#5464) | 2025-06-25 22:24:26 +08:00 |
| executor | fix/improve kvcache allocation in PyTorch runtime (#5933) | 2025-08-26 12:40:22 +08:00 |
| misc | [None][perf] Make finalize fusion part of the tactic selection logic (#6915) | 2025-08-21 14:08:03 -07:00 |
| modeling | [None][refactor] refactor the CUDA graph runner to manage all CUDA graphs (#6846) | 2025-08-25 20:52:05 +08:00 |
| models/checkpoints/hf | [None][feat] Skip prefetching consolidated safetensors when appropriate (#7013) | 2025-08-25 23:56:21 -04:00 |
| modules | [TRTLLM-7346][fix] Improve performance of PyTorchModelEngine._get_lora_params_from_requests (#7033) | 2025-08-25 10:37:40 +03:00 |
| multi_gpu | [None][ci] move unittests to sub-directories (#6635) | 2025-08-20 05:42:22 -04:00 |
| multi_gpu_modeling | [None][fix] Fix llama4 multimodal by skipping request validation (#6957) | 2025-08-20 21:58:53 -04:00 |
| multimodal | [TRTLLM-7326][feat] Add standalone multimodal encoder (#6743) | 2025-08-19 21:42:50 -07:00 |
| sampler | [TRTLLM-7155][feat] Unify sampler handle logits implementation. (#6867) | 2025-08-22 08:09:30 +02:00 |
| speculative | [None][feat] Deepseek: Start Eagle work (#6210) | 2025-08-22 12:57:17 -04:00 |
| thop | [TRTLLM-6633][feat] Padding for piecewise cudagraph (#6750) | 2025-08-26 18:31:33 -04:00 |
| helpers.py | [None][refactor] refactor the CUDA graph runner to manage all CUDA graphs (#6846) | 2025-08-25 20:52:05 +08:00 |
| pattern_watcher.py | [TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804) | 2025-05-09 11:04:01 +08:00 |
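The entries above are pytest-style unit-test sub-directories and modules. As a minimal sketch only (assuming pytest as the test runner and the TensorRT-LLM repository root as the working directory; the `attention` target path is taken from the listing above and is just one example), a single sub-directory can be run programmatically like this:

```python
# Sketch: run one of the _torch unittest sub-directories with pytest.
# Assumes pytest is installed and the current working directory is the
# TensorRT-LLM repository root; the target path comes from the listing above.
import sys

import pytest

if __name__ == "__main__":
    # -q keeps output short; any extra CLI arguments are forwarded to pytest.
    exit_code = pytest.main(["-q", "tests/unittest/_torch/attention", *sys.argv[1:]])
    sys.exit(exit_code)
```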