TensorRT-LLM/triton_backend

Latest commit: f4f2176cd5 by amirkl94, 2025-07-22 12:48:00 +08:00
    chore: Port leftover 0.20 (#5907)
    Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
    Signed-off-by: Yingge He <yinggeh@nvidia.com>
    Signed-off-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
    Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
    Co-authored-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>
    Co-authored-by: Yingge He <157551214+yinggeh@users.noreply.github.com>
    Co-authored-by: Martin Marciniszyn Mehringer <11665257+MartinMarciniszyn@users.noreply.github.com>
    Co-authored-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
Directory contents (name, last commit, date):

all_models            feat: Add support for Triton request cancellation (#5898)              2025-07-15 20:52:43 -04:00
ci                    chore: Port leftover 0.20 (#5907)                                      2025-07-22 12:48:00 +08:00
inflight_batcher_llm  [Chore] Replace MODEL_CACHE_DIR with LLM_MODELS_ROOT and unwaive triton_server/test_triton.py::test_gpt_ib[gpt-ib] (#5859)  2025-07-21 15:46:37 -07:00
scripts               [nvbug/5308432] fix: extend triton exit time for test_llava (#5971)    2025-07-12 12:56:37 +09:00
tools                 feat: Add support for Triton request cancellation (#5898)              2025-07-15 20:52:43 -04:00
requirements.txt      Move Triton backend to TRT-LLM main (#3549)                            2025-05-16 07:15:23 +08:00