TensorRT-LLM/cpp/include/tensorrt_llm
Latest commit: 64d5eba9c7 by Daniel Cámpora (2025-06-04 22:33:12 +08:00)
Fix: max_num_sequences calculation with overlap scheduling into release/0.20 (#4889)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
Signed-off-by: Daniel Campora <961215+dcampora@users.noreply.github.com>
Co-authored-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
Directory      | Last commit                                                                           | Date
batch_manager  | Fix: max_num_sequences calculation with overlap scheduling into release/0.20 (#4889) | 2025-06-04 22:33:12 +08:00
common         | feat: NIXL interface integration (#3934)                                              | 2025-05-19 18:18:22 +08:00
deep_gemm      | fix: add SM90 guard for FP8 Blockscale GEMM (#3575)                                   | 2025-04-16 14:44:37 +08:00
executor       | feat: NIXL interface integration (#3934)                                              | 2025-05-19 18:18:22 +08:00
kernels        | Update TensorRT-LLM (#2873)                                                           | 2025-03-11 21:13:42 +08:00
layers         | v1.2 (#3082)                                                                          | 2025-03-26 23:31:29 +08:00
plugins/api    | Update TensorRT-LLM (#2532)                                                           | 2024-12-04 21:16:56 +08:00
runtime        | fix: [nvbugs/5287097] Align PP layer distribution between pytorch and TRT flow. (#4399) | 2025-05-19 14:25:36 -07:00