auto_deploy
test: reorganize tests folder hierarchy ( #2996 )
2025-03-27 12:07:53 +08:00
compilation
Update ( #2978 )
2025-03-23 16:39:35 +08:00
modeling
test: reorganize tests folder hierarchy ( #2996 )
2025-03-27 12:07:53 +08:00
multi_gpu
perf: Add optimizations for deepseek in min latency mode ( #3093 )
2025-04-02 09:05:24 +08:00
multi_gpu_modeling
reduce test cases for deepseek ( #3211 )
2025-04-02 13:57:55 +08:00
speculative
Add initial EAGLE-3 implementation ( #3035 )
2025-03-29 22:31:24 +08:00
thop
perf: Add optimizations for deepseek in min latency mode ( #3093 )
2025-04-02 09:05:24 +08:00
helpers.py
Update TensorRT-LLM ( #2936 )
2025-03-18 21:25:19 +08:00
pattern_watcher.py
Update TensorRT-LLM ( #2936 )
2025-03-18 21:25:19 +08:00
test_attention.py
test: reorganize tests folder hierarchy ( #2996 )
2025-03-27 12:07:53 +08:00
test_autotuner.py
Fix minor issues in test_autotuner.py and loosen the cache check for test gemms. ( #3261 )
2025-04-03 18:24:08 +08:00
test_flashinfer_attention.py
Update ( #2978 )
2025-03-23 16:39:35 +08:00
test_flashinfer_star_attn.py
Update ( #2978 )
2025-03-23 16:39:35 +08:00
test_fp4_bmm_quantize.py
test: reorganize tests folder hierarchy ( #2996 )
2025-03-27 12:07:53 +08:00
test_fp4_gemm_quantize.py
test: reorganize tests folder hierarchy ( #2996 )
2025-03-27 12:07:53 +08:00
test_fp4_linear.py
test: reorganize tests folder hierarchy ( #2996 )
2025-03-27 12:07:53 +08:00
test_fp8_block_scale_gemm.py
fix: Update FP8 sf layout for Blackwell and relax blockwise GEMM assertions ( #3144 )
2025-04-01 13:08:29 -07:00
test_fp8_linear.py
test: reorganize tests folder hierarchy ( #2996 )
2025-03-27 12:07:53 +08:00
test_fp8_quantize.py
test: reorganize tests folder hierarchy ( #2996 )
2025-03-27 12:07:53 +08:00
test_fused_moe.py
test: reorganize tests folder hierarchy ( #2996 )
2025-03-27 12:07:53 +08:00
test_moe_routing.py
Update TensorRT-LLM ( #2936 )
2025-03-18 21:25:19 +08:00
test_moe.py
test: reorganize tests folder hierarchy ( #2996 )
2025-03-27 12:07:53 +08:00
test_overlap_scheduler_input.json
Update TensorRT-LLM ( #2936 )
2025-03-18 21:25:19 +08:00
test_overlap_scheduler.py
Update TensorRT-LLM ( #2936 )
2025-03-18 21:25:19 +08:00
test_pytorch_model_engine.py
Update ( #2978 )
2025-03-23 16:39:35 +08:00
test_vanilla_attention.py
Update ( #2978 )
2025-03-23 16:39:35 +08:00