TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-21 10:15:46 +08:00


Shijie dcf5c86720 [None][feat] Unify nvfp4 gemm backend (#8963) Signed-off-by: Shijie Wang <jaywan@nvidia.com> Signed-off-by: Yukun He <23156053+hyukn@users.noreply.github.com> Signed-off-by: Shijie <jaywan@nvidia.com> Co-authored-by: Yukun He <23156053+hyukn@users.noreply.github.com>		2025-12-02 11:03:51 +08:00
deep_gemm_tests.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_causal_conv1d_op.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_cublas_mm.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_custom_ops.py	[TRTLLM-8160][feat] Add draft token tree runtime on CDL (#8586)	2025-11-25 09:40:55 -05:00
test_cute_dsl_moe.py	[TRTLLM-9370][feat] Integration of CuteDSL NVFP4 grouped GEMM (Part 2: SwiGLU Fusion and Finalize Fusion) (#9288)	2025-11-21 14:03:38 -08:00
test_dsv3_fused_a_gemm.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_dsv3_router_gemm.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_finegrained_mixed_dtype_gemm.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_fp4_bmm_quantize.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_fp4_calculate_global_scale.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_fp4_gemm_quantize.py	[OMNIML-2336][feat] Add NVFP4 x FP8 (#6809)	2025-09-04 09:03:38 -07:00
test_fp4_linear.py	[None][feat] Unify nvfp4 gemm backend (#8963)	2025-12-02 11:03:51 +08:00
test_fp4_swizzle.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_fp8_block_scale_gemm.py	[https://nvbugs/5456493][feat] add fp8 dense for sm120 (#9174)	2025-11-19 14:40:34 +08:00
test_fp8_linear.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_fp8_per_tensor_scale_tllmg_gemm.py	[TRTLLM-4629] [feat] Add support of CUDA13 and sm103 devices (#7568)	2025-09-16 09:56:18 +08:00
test_fp8_quantize.py	[None][perf] Use fp8 quant kernel in DS3.2 indexer module (#8701)	2025-10-29 12:45:09 +08:00
test_fp8_rowwise_linear.py	[None][infra] Remove invaild waived tests which not in release branch (#8841)	2025-11-20 12:43:13 -05:00
test_fused_qk_norm_rope.py	[None][feat] Support Yarn on QwQ-32B model (#9059)	2025-11-25 07:27:28 +08:00
test_helix_postprocess.py	[TRTLLM-5966][feat] Helix: add full MLA support for Helix (#8104)	2025-11-04 09:06:58 +08:00
test_indexer_topk.py	[None] [feat] Optimize the algorithm part of RocketKV (#9333)	2025-12-01 09:04:09 +08:00
test_logits_bitmask_op.py	[TRTLLM-8209][feat] Support new structural tag API (upgrade XGrammar to 0.1.25) (#7893)	2025-09-23 09:10:09 +08:00
test_mamba2_chunk_ss_update.py	[TRTLLM-8994][infra] upgrade to DLFW 25.10 and pytorch 2.9.0 / triton 3.5.0 (#8838)	2025-11-04 18:59:34 +08:00
test_mamba_conv1d_op.py	[https://nvbugs/5640873][fix] Move thop tests to pre-merge (#9094)	2025-11-13 13:08:13 +08:00
test_noaux_tc.py	[TRTLLM-8637][feat] Optimize the routing kernel for DeepseekV3 (MoE CUTLASS backend); Add support for KimiK2 and Qwen-next (MoE TRTLLM backend) (#7761)	2025-10-20 10:08:31 +08:00
test_scaled_mm.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_selective_scan_op.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_tinygemm2.py	[TRTLLM-7775][feat] Integrate tinygemm2 for gpt-oss (#7916)	2025-10-02 10:47:04 -07:00
test_tllmg_bmm.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_w4a8_linear.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_w4a8_mxfp4_mxfp8_gemm.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_w4a16_linear.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_weight_only_quant_gemm.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00
test_weight_only_quant_linear.py	[TRTLLM-7457][ci] Update unittest parallel config (#7297)	2025-08-29 09:28:04 +08:00