TensorRT-LLM/tensorrt_llm/_torch/pyexecutor
Latest commit: ef268e2062 — [TRTLLM-9904][feat] Changes for future KVCacheV2 MTP support (#11029)
Jin Li <59594262+liji-nv@users.noreply.github.com>, 2026-01-30 01:49:17 -05:00
__init__.py
_util.py
config_utils.py
cuda_graph_runner.py
executor_request_queue.py — [TRTLLM-10264][feat] Support attention DP + Helix CP (#10477), 2026-01-29 02:57:13 -05:00
finish_reason.py
grammar_matcher.py
guided_decoder.py
handle_additional_outputs.py
handle_logits.py
hang_detector.py
kv_cache_connector.py
kv_cache_transceiver.py
layerwise_nvtx_marker.py
llm_request.py
make_decoding_batch_input_output.py
mamba_cache_manager.py
model_engine.py — [TRTLLM-9904][feat] Changes for future KVCacheV2 MTP support (#11029), 2026-01-30 01:49:17 -05:00
model_loader.py
py_executor_creator.py — [None][feat] Add performance alignment to layer-wise benchmarks (#11018), 2026-01-29 14:01:51 +08:00
py_executor.py — [None][chore] Consolidate duplicate kv cache reuse variables. (#10935), 2026-01-29 11:03:27 -08:00
resource_manager.py
sampler.py — [TRTLLM-10312][perf] Improve performance of _write_finish_reasons in TorchSampler (#10459), 2026-01-29 11:06:09 -05:00
sampling_utils_flashinfer.py
sampling_utils.py
scheduler.py
seq_slot_manager.py