TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

pcastonguay add5e5cd93 feat: Add option to run disaggregated serving without ctx servers,… (#3243)	2025-04-07 21:56:03 -04:00
* feat: Add option to run disaggregated serving without ctx servers, to benchmark gen only
* Fixing comment in sanity check
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
batch_manager	feat: Support PeftCacheManager in Torch (#3186)	2025-04-04 12:38:08 +08:00
common	feat: support abort disconnected requests (#3214)	2025-04-07 16:14:58 +08:00
executor	feat: Add option to run disaggregated serving without ctx servers,… (#3243)	2025-04-07 21:56:03 -04:00
runtime	refactor: Expose DecoderState via bindings and integrate in TRTLLMDecoder (#3139)	2025-04-05 07:42:35 +08:00
userbuffers	Update TensorRT-LLM (#2783)	2025-02-13 18:40:22 +08:00
bindings.cpp	feat: Support PeftCacheManager in Torch (#3186)	2025-04-04 12:38:08 +08:00
CMakeLists.txt	Update TensorRT-LLM (#2849)	2025-03-04 18:44:00 +08:00