TensorRT-LLM/tests/unittest/_torch/modules
Latest commit: b558232ce1 by hlu1 — Refactor CutlassFusedMoE (#5344), 2025-06-19 00:04:07 -07:00
Signed-off-by: Hao Lu <14827759+hlu1@users.noreply.github.com>
| Name | Last commit | Date |
| --- | --- | --- |
| tests_lora_modules | added loraOp into lora layer + test for mlp and comparison to lora plugin (#3455) | 2025-04-17 12:48:27 +08:00 |
| test_fused_moe.py | Refactor CutlassFusedMoE (#5344) | 2025-06-19 00:04:07 -07:00 |
| test_moe_host_sharer.py | feat: large-scale EP (part 6: Online EP load balancer integration for GB200 nvfp4) (#4818) | 2025-06-08 10:25:18 +08:00 |
| test_moe_load_balancer.py | fix: [nvbugs/5324229] Fix broken WInt4AFP8FusedMoEMethod since FusedMoE refactor. (#4930) | 2025-06-13 10:21:32 +08:00 |
| test_moe_routing.py | [https://nvbugspro.nvidia.com/bug/5332927][fix] Fix the bug in the routing unit test (#5065) | 2025-06-11 09:44:35 +08:00 |