Mirror of https://github.com/NVIDIA/TensorRT-LLM.git (synced 2026-02-10 13:03:34 +08:00)
* remove tensorrt_llm._torch.distributed.ParallelConfig
* fix ci
* fix ci
* clean
* fix embedding test
* fix
* fix comments
* polish
* fix ci
* rebase

Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
Co-authored-by: hlu1 <14827759+hlu1@users.noreply.github.com>

| Name | | |
|---|---|---|
| auto_deploy | ||
| compilation | ||
| modeling | ||
| modules/tests_lora_modules | ||
| multi_gpu | ||
| multi_gpu_modeling | ||
| speculative | ||
| thop | ||
| deep_gemm_tests.py | ||
| helpers.py | ||
| pattern_watcher.py | ||
| test_attention_no_cache.py | ||
| test_attention.py | ||
| test_autotuner.py | ||
| test_flashinfer_attention.py | ||
| test_flashinfer_star_attn.py | ||
| test_fp4_bmm_quantize.py | ||
| test_fp4_gemm_quantize.py | ||
| test_fp4_linear.py | ||
| test_fp8_block_scale_gemm.py | ||
| test_fp8_linear.py | ||
| test_fp8_quantize.py | ||
| test_fused_moe.py | ||
| test_moe_routing.py | ||
| test_moe.py | ||
| test_overlap_scheduler_input.json | ||
| test_overlap_scheduler.py | ||
| test_pytorch_model_engine.py | ||
| test_resource_manager.py | ||
| test_vanilla_attention.py | ||