TensorRT-LLM/cpp/tensorrt_llm
| Name | Last commit | Date |
| --- | --- | --- |
| batch_manager | [TRTLLM-8044][refactor] Rename data -> cache for cacheTransceiver (#7659) | 2025-09-16 08:43:56 -04:00 |
| common | [TRTLLM-4629] [feat] Add support of CUDA13 and sm103 devices (#7568) | 2025-09-16 09:56:18 +08:00 |
| cutlass_extensions/include/cutlass_extensions | [TRTLLM-4629] [feat] Add support of CUDA13 and sm103 devices (#7568) | 2025-09-16 09:56:18 +08:00 |
| deep_ep | [TRTLLM-4629] [feat] Add support of CUDA13 and sm103 devices (#7568) | 2025-09-16 09:56:18 +08:00 |
| deep_gemm | [https://nvbugs/5433581][fix] DeepGEMM installation on SBSA (#6588) | 2025-08-06 16:44:21 +08:00 |
| executor | [TRTLLM-8044][refactor] Rename data -> cache for cacheTransceiver (#7659) | 2025-09-16 08:43:56 -04:00 |
| executor_worker | Update TensorRT-LLM (#2792) | 2025-02-18 21:27:39 +08:00 |
| kernels | [TRTLLM-4629] [feat] Add support of CUDA13 and sm103 devices (#7568) | 2025-09-16 09:56:18 +08:00 |
| layers | refactor: Remove enforced sorted order of batch slots (#3502) | 2025-07-14 17:23:02 +02:00 |
| nanobind | [TRTLLM-7192][feat] optimize MLA chunked prefill && support fp8 mla chunked prefill (#7477) | 2025-09-15 21:43:49 +08:00 |
| plugins | [None][feat] support gpt-oss with fp8 kv cache (#7612) | 2025-09-15 02:17:37 +08:00 |
| pybind | [TRTLLM-7192][feat] optimize MLA chunked prefill && support fp8 mla chunked prefill (#7477) | 2025-09-15 21:43:49 +08:00 |
| runtime | [TRTLLM-4629] [feat] Add support of CUDA13 and sm103 devices (#7568) | 2025-09-16 09:56:18 +08:00 |
| testing | fix: Improve chunking test and skip empty kernel calls (#5710) | 2025-07-04 09:08:15 +02:00 |
| thop | [TRTLLM-4629] [feat] Add support of CUDA13 and sm103 devices (#7568) | 2025-09-16 09:56:18 +08:00 |
| CMakeLists.txt | [https://nvbugs/5453827][fix] Fix RPATH of th_common shared library to find pip-installed NCCL (#6984) | 2025-08-21 17:58:30 +08:00 |