TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-01 08:41:13 +08:00

amitz-nv 66f0657716 [TRTLLM-7346][fix] Improve performance of PyTorchModelEngine._get_lora_params_from_requests (#7203) Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>		2025-08-28 16:06:32 +08:00
tests_lora_modules	[TRTLLM-7346][fix] Improve performance of PyTorchModelEngine._get_lora_params_from_requests (#7203)	2025-08-28 16:06:32 +08:00
test_fused_moe.py	[None][feat] Add support for fused gate_up_proj scales for FP8 blockwise (#6496)	2025-08-05 11:22:32 -07:00
test_moe_host_sharer.py	feat: large-scale EP(part 6: Online EP load balancer integration for GB200 nvfp4) (#4818)	2025-06-08 10:25:18 +08:00
test_moe_load_balancer.py	feat: large-scale EP(part 8: Online EP load balancer integration for PCIe fp8) (#5226)	2025-06-25 22:25:13 -07:00
test_moe_routing.py	[https://nvbugspro.nvidia.com/bug/5332927][fix] Fix the bug in the routing unit test (#5065)	2025-06-11 09:44:35 +08:00