TensorRT-LLM/tests/unittest/_torch
Latest commit: 19a0ea363b by dongxuy04, 2025-08-24 08:15:29 -04:00
[TRTLLM-6743][feat] Optimize and refactor alltoall in WideEP (#6973)
Signed-off-by: Dongxu Yang <78518666+dongxuy04@users.noreply.github.com>
Signed-off-by: Fred Wei <20514172+WeiHaocheng@users.noreply.github.com>
Signed-off-by: Dongxu Yang <dongxuy@nvidia.com>
Co-authored-by: Fred Wei <20514172+WeiHaocheng@users.noreply.github.com>
Contents (name, last commit, last commit date):
attention [None][ci] move unittests to sub-directories (#6635) 2025-08-20 05:42:22 -04:00
auto_deploy [#4403][refactor] Move fusion, kvcache, and compile to modular inference optimizer (#7057) 2025-08-21 10:30:36 -07:00
compilation [TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804) 2025-05-09 11:04:01 +08:00
debugger Fix: fix nvbug 5356427 (#5464) 2025-06-25 22:24:26 +08:00
executor [None][ci] move unittests to sub-directories (#6635) 2025-08-20 05:42:22 -04:00
misc [None][perf] Make finalize fusion part of the tactic selection logic (#6915) 2025-08-21 14:08:03 -07:00
modeling [TRTLLM-4921][feat] Enable chunked prefill for Nemotron-H (#6334) 2025-08-22 12:15:20 -04:00
modules [None][infra] Waive failed tests on main branch 8/20 (#7092) 2025-08-20 06:33:44 -04:00
multi_gpu [None][ci] move unittests to sub-directories (#6635) 2025-08-20 05:42:22 -04:00
multi_gpu_modeling [None][fix] Fix llama4 multimodal by skipping request validation (#6957) 2025-08-20 21:58:53 -04:00
multimodal [TRTLLM-7326][feat] Add standalone multimodal encoder (#6743) 2025-08-19 21:42:50 -07:00
sampler [TRTLLM-7155][feat] Unify sampler handle logits implementation. (#6867) 2025-08-22 08:09:30 +02:00
speculative [None][feat] Deepseek: Start Eagle work (#6210) 2025-08-22 12:57:17 -04:00
thop [TRTLLM-6743][feat] Optimize and refactor alltoall in WideEP (#6973) 2025-08-24 08:15:29 -04:00
helpers.py [TRTLLM-5863][feat] Support MoE INT8 Weight-Only-Quantization in PyTorch Workflow (#6629) 2025-08-15 17:15:49 -04:00
pattern_watcher.py [TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804) 2025-05-09 11:04:01 +08:00
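The entries above are the PyTorch-backend unit-test suites. As a minimal sketch (not part of the repository), assuming these suites are pytest-compatible and are invoked from the repository root, a single subdirectory such as `attention` could be run programmatically like this; the chosen path and flags are illustrative only:

```python
# Hypothetical sketch: run one of the listed suites with pytest.
# Assumes pytest is installed and the working directory is the repo root.
import sys

import pytest

if __name__ == "__main__":
    # Select only the `attention` suite; "-x" stops at the first failure,
    # "-q" keeps the output terse.
    exit_code = pytest.main(["tests/unittest/_torch/attention", "-x", "-q"])
    sys.exit(exit_code)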