TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-23 04:03:22 +08:00

nv-guomingz 03e38c9087 chore: update trtllm-serve usage doc by removing backend parameter when it use torch as backend. (#6419) Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>		2025-07-30 11:11:06 -04:00
blog1_Pushing_Latency_Boundaries_Optimizing_DeepSeek-R1_Performance_on_NVIDIA_B200_GPUs.md
blog2_DeepSeek_R1_MTP_Implementation_and_Optimization.md
blog3_Optimizing_DeepSeek_R1_Throughput_on_NVIDIA_Blackwell_GPUs.md
blog4_Scaling_Expert_Parallelism_in_TensorRT-LLM.md	doc: Add README for wide EP (#6356)	2025-07-29 00:36:12 -04:00
blog5_Disaggregated_Serving_in_TensorRT-LLM.md	chore: update trtllm-serve usage doc by removing backend parameter when it use torch as backend. (#6419)	2025-07-30 11:11:06 -04:00
blog6_Llama4_maverick_eagle_guide.md	chore: update trtllm-serve usage doc by removing backend parameter when it use torch as backend. (#6419)	2025-07-30 11:11:06 -04:00
blog_7_NGram_performance_Analysis_And_Auto_Enablement.md