| Name | Last commit | Last commit date |
| --- | --- | --- |
| auto_deploy | chore: Rename nvsmall to nemotron nas (#3447) | 2025-04-10 23:16:52 +08:00 |
| compilation | Update (#2978) | 2025-03-23 16:39:35 +08:00 |
| modeling | chore: Rename nvsmall to nemotron nas (#3447) | 2025-04-10 23:16:52 +08:00 |
| modules/tests_lora_modules | lora_tests (#3201) | 2025-04-09 18:06:52 +03:00 |
| multi_gpu | feat: Introduce UB allocator for pytorch flow (#3257) | 2025-04-08 18:39:49 +08:00 |
| multi_gpu_modeling | Add Llama 4 (#3302) | 2025-04-09 03:35:21 +08:00 |
| speculative | Add thread leak check and fix thread/memory leak issues. (#3270) | 2025-04-08 19:03:18 +08:00 |
| thop | feat: Introduce UB allocator for pytorch flow (#3257) | 2025-04-08 18:39:49 +08:00 |
| deep_gemm_tests.py | feat: use NVRTC for DeepGEMM JIT compilation (#3239) | 2025-04-07 20:29:23 +08:00 |
| helpers.py | Update TensorRT-LLM (#2936) | 2025-03-18 21:25:19 +08:00 |
| pattern_watcher.py | Update TensorRT-LLM (#2936) | 2025-03-18 21:25:19 +08:00 |
| test_attention_no_cache.py | feat: no-cache attention in PyTorch workflow (#3085) | 2025-04-05 01:54:32 +08:00 |
| test_attention.py | Add thread leak check and fix thread/memory leak issues. (#3270) | 2025-04-08 19:03:18 +08:00 |
| test_autotuner.py | feat: Apply the new torch-flow compatible AutoTuner to both Fused MoE and NVFP4 Linear operators. (#3151) | 2025-04-08 14:28:36 +08:00 |
| test_flashinfer_attention.py | Add thread leak check and fix thread/memory leak issues. (#3270) | 2025-04-08 19:03:18 +08:00 |
| test_flashinfer_star_attn.py | Add thread leak check and fix thread/memory leak issues. (#3270) | 2025-04-08 19:03:18 +08:00 |
| test_fp4_bmm_quantize.py | feat: Apply the new torch-flow compatible AutoTuner to both Fused MoE and NVFP4 Linear operators. (#3151) | 2025-04-08 14:28:36 +08:00 |
| test_fp4_gemm_quantize.py | feat: trtllm-gen fp4 GEMM for pytorch workflow (#3423) | 2025-04-11 02:28:07 +08:00 |
| test_fp4_linear.py | feat: Apply the new torch-flow compatible AutoTuner to both Fused MoE and NVFP4 Linear operators. (#3151) | 2025-04-08 14:28:36 +08:00 |
| test_fp8_block_scale_gemm.py | feat: enable DeepGEMM by default (#3341) | 2025-04-08 13:58:57 +08:00 |
| test_fp8_linear.py | test: reorganize tests folder hierarchy (#2996) | 2025-03-27 12:07:53 +08:00 |
| test_fp8_quantize.py | test: reorganize tests folder hierarchy (#2996) | 2025-03-27 12:07:53 +08:00 |
| test_fused_moe.py | feat: Apply the new torch-flow compatible AutoTuner to both Fused MoE and NVFP4 Linear operators. (#3151) | 2025-04-08 14:28:36 +08:00 |
| test_moe_routing.py | Update TensorRT-LLM (#2936) | 2025-03-18 21:25:19 +08:00 |
| test_moe.py | test: reorganize tests folder hierarchy (#2996) | 2025-03-27 12:07:53 +08:00 |
| test_overlap_scheduler_input.json | Update TensorRT-LLM (#2936) | 2025-03-18 21:25:19 +08:00 |
| test_overlap_scheduler.py | Add thread leak check and fix thread/memory leak issues. (#3270) | 2025-04-08 19:03:18 +08:00 |
| test_pytorch_model_engine.py | Add thread leak check and fix thread/memory leak issues. (#3270) | 2025-04-08 19:03:18 +08:00 |
| test_resource_manager.py | feat: Support PeftCacheManager in Torch (#3186) | 2025-04-04 12:38:08 +08:00 |
| test_vanilla_attention.py | Add thread leak check and fix thread/memory leak issues. (#3270) | 2025-04-08 19:03:18 +08:00 |
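The `test_*.py` entries above are standalone pytest modules. Below is a minimal sketch of invoking one of them programmatically; the directory path is an assumption about where this listing sits in a TensorRT-LLM checkout and should be adjusted to the actual location.

```python
# Minimal sketch: run one of the listed test modules with pytest.
# The path below is an assumed location (tests/unittest/_torch/); adjust as needed.
import pytest

if __name__ == "__main__":
    exit_code = pytest.main([
        "tests/unittest/_torch/test_attention.py",  # assumed path to the listed file
        "-q",                                       # quiet output
    ])
    raise SystemExit(exit_code)
```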