TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Yueh-Ting (eop) Chen 85088dce05 [None][chore] Update feature combination matrix for SWA kv cache reuse (#8529). Signed-off-by: eopXD <yuehtingc@nvidia.com>	2025-10-21 04:41:44 -04:00
auto_deploy	[None][chore] AutoDeploy: cleanup old inference optimizer configs (#8039)	2025-10-17 15:55:57 -04:00
attention.md	[None][doc] Rename TensorRT-LLM to TensorRT LLM. (#7554)	2025-09-09 12:16:03 +08:00
checkpoint-loading.md	[None][doc] Rename TensorRT-LLM to TensorRT LLM. (#7554)	2025-09-09 12:16:03 +08:00
disagg-serving.md	[TRTLLM-7964][infra] Set nixl to default cache transceiver backend (#7926)	2025-10-19 19:24:43 +08:00
feature-combination-matrix.md	[None][chore] Update feature combination matrix for SWA kv cache reuse (#8529)	2025-10-21 04:41:44 -04:00
kvcache.md	[None][doc] Rename TensorRT-LLM to TensorRT LLM for homepage and the … (#7850)	2025-09-25 21:02:35 +08:00
long-sequence.md	[None][doc] Rename TensorRT-LLM to TensorRT LLM for homepage and the … (#7850)	2025-09-25 21:02:35 +08:00
lora.md	[TRTLLM-5930][doc] 1.0 Documentation. (#6696)	2025-09-09 12:16:03 +08:00
multi-modality.md	[None][doc] Rename TensorRT-LLM to TensorRT LLM. (#7554)	2025-09-09 12:16:03 +08:00
overlap-scheduler.md	[TRTLLM-5930][doc] 1.0 Documentation. (#6696)	2025-09-09 12:16:03 +08:00
paged-attention-ifb-scheduler.md	[None][doc] Use hash id for external link (#7641)	2025-09-22 14:28:38 +08:00
parallel-strategy.md	[None][doc] Rename TensorRT-LLM to TensorRT LLM. (#7554)	2025-09-09 12:16:03 +08:00
quantization.md	[None][doc] Fix a invalid link and a typo. (#7634)	2025-09-22 14:28:38 +08:00
ray-orchestrator.md	[None][doc] Ray orchestrator initial doc (#8373)	2025-10-14 21:17:57 -07:00
sampling.md	[None][doc] Use hash id for external link (#7641)	2025-09-22 14:28:38 +08:00
speculative-decoding.md	[None][doc] Rename TensorRT-LLM to TensorRT LLM. (#7554)	2025-09-09 12:16:03 +08:00