TensorRT-LLM/triton_backend
Latest commit: 578430e64c by nv-guomingz — [TRTLLM-5530][BREAKING CHANGE]: enhance the llm args pytorch config part 1 (cuda_graph_config) (#5014)
Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
2025-06-30 11:05:40 +08:00
| Name                 | Last commit                                                                                             | Date                      |
|----------------------|---------------------------------------------------------------------------------------------------------|---------------------------|
| all_models           | [TRTLLM-5530][BREAKING CHANGE]: enhance the llm args pytorch config part 1 (cuda_graph_config) (#5014)  | 2025-06-30 11:05:40 +08:00 |
| ci                   | Move Triton backend to TRT-LLM main (#3549)                                                             | 2025-05-16 07:15:23 +08:00 |
| inflight_batcher_llm | [nvbugs/5309940] Add support for input output token counts (#5445)                                      | 2025-06-28 04:39:39 +08:00 |
| scripts              | Add testing for trtllm-llmapi-launch with tritonserver (#5528)                                          | 2025-06-27 11:19:52 +08:00 |
| tools                | [nvbug 5283506] fix: Fix spec decode triton test (#4845)                                                | 2025-06-09 08:40:17 -04:00 |
| requirements.txt     | Move Triton backend to TRT-LLM main (#3549)                                                             | 2025-05-16 07:15:23 +08:00 |