TensorRT-LLM/tests/unittest/_torch
Kaiyu Xie 1455074c91
[None] [test] Add MNNVL AlltoAll tests to pre-merge (#7465)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Signed-off-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
Co-authored-by: Zongfei Jing <20381269+zongfeijing@users.noreply.github.com>
2025-09-05 10:19:08 -07:00
| Name | Last commit | Date |
| --- | --- | --- |
| attention | [None][ci] move unittests to sub-directories (#6635) | 2025-08-20 05:42:22 -04:00 |
| auto_deploy | [https://nvbugs/5474453][fix] fix path to tested model (#7272) | 2025-08-28 08:01:48 -04:00 |
| compilation | [TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804) | 2025-05-09 11:04:01 +08:00 |
| debugger | Fix: fix nvbug 5356427 (#5464) | 2025-06-25 22:24:26 +08:00 |
| executor | [None][opt] Balance the request based on number of tokens in AttentionDP (#7183) | 2025-08-27 11:16:12 +08:00 |
| misc | [None][perf] Make finalize fusion part of the tactic selection logic (#6915) | 2025-08-21 14:08:03 -07:00 |
| modeling | [None][refactor] refactor the CUDA graph runner to manage all CUDA graphs (#6846) | 2025-08-25 20:52:05 +08:00 |
| models/checkpoints/hf | [None][feat] Skip prefetching consolidated safetensors when appropriate (#7013) | 2025-08-25 23:56:21 -04:00 |
| modules | [None] [test] Add MNNVL AlltoAll tests to pre-merge (#7465) | 2025-09-05 10:19:08 -07:00 |
| multi_gpu | [None][ci] move unittests to sub-directories (#6635) | 2025-08-20 05:42:22 -04:00 |
| multi_gpu_modeling | [None][fix] Fix llama4 multimodal by skipping request validation (#6957) | 2025-08-20 21:58:53 -04:00 |
| multimodal | [TRTLLM-7326][feat] Add standalone multimodal encoder (#6743) | 2025-08-19 21:42:50 -07:00 |
| sampler | [TRTLLM-7155][feat] Unify sampler handle logits implementation. (#6867) | 2025-08-22 08:09:30 +02:00 |
| speculative | [TRTLLM-7457][ci] Update & cleanup unittest parallel config (#7254) | 2025-08-27 00:45:58 -04:00 |
| thop | [TRTLLM-7457][ci] Update unittest parallel config (#7297) | 2025-08-29 09:28:04 +08:00 |
| helpers.py | [None][refactor] refactor the CUDA graph runner to manage all CUDA graphs (#6846) | 2025-08-25 20:52:05 +08:00 |
| pattern_watcher.py | [TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804) | 2025-05-09 11:04:01 +08:00 |
| test_connector.py | [None][feat] KV Cache Connector API (#7228) | 2025-08-28 23:09:27 -04:00 |