TensorRT-LLM/tests/unittest/_torch
Latest commit: 19a0ea363b by dongxuy04, 2025-08-24 08:15:29 -04:00
[TRTLLM-6743][feat] Optimize and refactor alltoall in WideEP (#6973)
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
Signed-off-by: Fred Wei <20514172+WeiHaocheng@users.noreply.github.com>
Signed-off-by: Dongxu Yang <dongxuy@nvidia.com>
Co-authored-by: Fred Wei <20514172+WeiHaocheng@users.noreply.github.com>
Contents (name, last commit, last commit date):
attention [None][ci] move unittests to sub-directories (#6635) 2025-08-20 05:42:22 -04:00
auto_deploy [#4403][refactor] Move fusion, kvcache, and compile to modular inference optimizer (#7057) 2025-08-21 10:30:36 -07:00
compilation [TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804) 2025-05-09 11:04:01 +08:00
debugger Fix: fix nvbug 5356427 (#5464) 2025-06-25 22:24:26 +08:00
executor [None][ci] move unittests to sub-directories (#6635) 2025-08-20 05:42:22 -04:00
misc [None][perf] Make finalize fusion part of the tactic selection logic (#6915) 2025-08-21 14:08:03 -07:00
modeling [TRTLLM-4921][feat] Enable chunked prefill for Nemotron-H (#6334) 2025-08-22 12:15:20 -04:00
modules [None][infra] Waive failed tests on main branch 8/20 (#7092) 2025-08-20 06:33:44 -04:00
multi_gpu [None][ci] move unittests to sub-directories (#6635) 2025-08-20 05:42:22 -04:00
multi_gpu_modeling [None][fix] Fix llama4 multimodal by skipping request validation (#6957) 2025-08-20 21:58:53 -04:00
multimodal [TRTLLM-7326][feat] Add standalone multimodal encoder (#6743) 2025-08-19 21:42:50 -07:00
sampler [TRTLLM-7155][feat] Unify sampler handle logits implementation. (#6867) 2025-08-22 08:09:30 +02:00
speculative [None][feat] Deepseek: Start Eagle work (#6210) 2025-08-22 12:57:17 -04:00
thop [TRTLLM-6743][feat] Optimize and refactor alltoall in WideEP (#6973) 2025-08-24 08:15:29 -04:00
helpers.py [TRTLLM-5863][feat] Support MoE INT8 Weight-Only-Quantization in PyTorch Workflow (#6629) 2025-08-15 17:15:49 -04:00
pattern_watcher.py [TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804) 2025-05-09 11:04:01 +08:00
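The entries above are the PyTorch-backend unit-test suites. As a minimal sketch (not part of the repository), assuming these suites are pytest-compatible and are invoked from the repository root, a single subdirectory such as `attention` could be run programmatically like this; the chosen path and flags are illustrative only:

```python
# Hypothetical sketch: run one of the listed suites with pytest.
# Assumes pytest is installed and the working directory is the repo root.
import sys

import pytest

if __name__ == "__main__":
    # Select only the `attention` suite; "-x" stops at the first failure,
    # "-q" keeps the output terse.
    exit_code = pytest.main(["tests/unittest/_torch/attention", "-x", "-q"])
    sys.exit(exit_code)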