TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-17 00:04:57 +08:00

Bo Li 639051e98b [TRTLLM-10021][docs] Skip Softmax Attention blog and docs. (#10592) Signed-off-by: Bo Li <22713281+bobboli@users.noreply.github.com>	2026-02-06 12:11:21 +08:00
auto_deploy	[#10966][feat] AutoDeploy: kv cache manager integration [2/2] (#11149)	2026-02-04 09:44:27 -05:00
additional-outputs.md	[TRTLLM-7159][docs] Add documentation for additional outputs (#8325)	2025-10-27 09:52:04 +01:00
attention.md	[None][doc] Rename TensorRT-LLM to TensorRT LLM. (#7554)	2025-09-09 12:16:03 +08:00
checkpoint-loading.md	[TRTLLM-7136][feat] Update load_weights method to include mapping parameter in checkpoint loaders (#9583)	2025-12-05 16:07:20 +01:00
disagg-serving.md	[https://nvbugs/5834212][fix] prevent routing ctx and gen requests to the same worker; update doc for unique disagg ID (#11095)	2026-02-02 09:54:33 +08:00
feature-combination-matrix.md	[None][docs] Add CUDA Graph + LoRA in Feature Combination Matrix (#11187)	2026-02-05 15:01:59 +01:00
guided-decoding.md	[TRTC-102][docs] `--extra_llm_api_options`->`--config` in docs/examples/tests (#10005)	2025-12-19 13:48:43 -05:00
helix.md	[None][doc] Add feature docs for helix parallelism (#9684)	2025-12-04 18:08:40 -08:00
kv-cache-connector.md	[TRTLLM-9199][docs] KV Connector Docs (#9325)	2025-12-05 17:50:12 -05:00
kvcache.md	[None][doc] Rename TensorRT-LLM to TensorRT LLM for homepage and the … (#7850)	2025-09-25 21:02:35 +08:00
long-sequence.md	[None][doc] Add the missing content for model support section and fix valid links for long_sequence.md (#8869)	2025-11-03 02:06:04 -08:00
lora.md	[TRTC-102][docs] `--extra_llm_api_options`->`--config` in docs/examples/tests (#10005)	2025-12-19 13:48:43 -05:00
multi-modality.md	[None][fix] add missing CLI option in multimodal example (#8977)	2025-11-07 09:06:08 +01:00
overlap-scheduler.md	[TRTLLM-5930][doc] 1.0 Documentation. (#6696)	2025-09-09 12:16:03 +08:00
paged-attention-ifb-scheduler.md	[None] [doc] Update IFB performance guide & GPTOSS deployment guide (#10283)	2025-12-25 05:52:04 -05:00
parallel-strategy.md	[TRTC-102][docs] `--extra_llm_api_options`->`--config` in docs/examples/tests (#10005)	2025-12-19 13:48:43 -05:00
quantization.md	[None][feat] sm100 weight-only kernel (#10190)	2026-01-05 09:44:36 +08:00
ray-orchestrator.md	[None][doc] Ray orchestrator initial doc (#8373)	2025-10-14 21:17:57 -07:00
sampling.md	[TRTLLM-8425][doc] Update sampling documentation (#10083)	2026-01-16 16:58:49 +08:00
sparse-attention.md	[TRTLLM-10021][docs] Skip Softmax Attention blog and docs. (#10592)	2026-02-06 12:11:21 +08:00
speculative-decoding.md	[TRTC-122][feat] Eagle3 Specdec UX improvements (#10124)	2026-01-22 07:24:11 -08:00
torch_compile_and_piecewise_cuda_graph.md	[TRTC-122][feat] Eagle3 Specdec UX improvements (#10124)	2026-01-22 07:24:11 -08:00