TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

Jatin Gangani 97b38ac403 [None] [doc] Update IFB performance guide & GPTOSS deployment guide (#10283) Signed-off-by: Jatin Gangani <jgangani@dc2-container-xterm-014.prd.it.nvidia.com> Co-authored-by: Jatin Gangani <jgangani@dc2-container-xterm-014.prd.it.nvidia.com>		2025-12-25 05:52:04 -05:00
auto_deploy	[TRTC-102][docs] `--extra_llm_api_options`->`--config` in docs/examples/tests (#10005)	2025-12-19 13:48:43 -05:00
additional-outputs.md	[TRTLLM-7159][docs] Add documentation for additional outputs (#8325)	2025-10-27 09:52:04 +01:00
attention.md	[None][doc] Rename TensorRT-LLM to TensorRT LLM. (#7554)	2025-09-09 12:16:03 +08:00
checkpoint-loading.md	[TRTLLM-7136][feat] Update load_weights method to include mapping parameter in checkpoint loaders (#9583)	2025-12-05 16:07:20 +01:00
disagg-serving.md	[None][docs] Add NIXL-Libfabric Usage to Documentation (#10205)	2025-12-23 23:05:40 -05:00
feature-combination-matrix.md	[None][chore] Update feature combination matrix for SWA kv cache reuse (#8529)	2025-10-21 04:41:44 -04:00
guided-decoding.md	[TRTC-102][docs] `--extra_llm_api_options`->`--config` in docs/examples/tests (#10005)	2025-12-19 13:48:43 -05:00
helix.md	[None][doc] Add feature docs for helix parallelism (#9684)	2025-12-04 18:08:40 -08:00
kv-cache-connector.md	[TRTLLM-9199][docs] KV Connector Docs (#9325)	2025-12-05 17:50:12 -05:00
kvcache.md	[None][doc] Rename TensorRT-LLM to TensorRT LLM for homepage and the … (#7850)	2025-09-25 21:02:35 +08:00
long-sequence.md	[None][doc] Add the missing content for model support section and fix valid links for long_sequence.md (#8869)	2025-11-03 02:06:04 -08:00
lora.md	[TRTC-102][docs] `--extra_llm_api_options`->`--config` in docs/examples/tests (#10005)	2025-12-19 13:48:43 -05:00
multi-modality.md	[None][fix] add missing CLI option in multimodal example (#8977)	2025-11-07 09:06:08 +01:00
overlap-scheduler.md	[TRTLLM-5930][doc] 1.0 Documentation. (#6696)	2025-09-09 12:16:03 +08:00
paged-attention-ifb-scheduler.md	[None] [doc] Update IFB performance guide & GPTOSS deployment guide (#10283)	2025-12-25 05:52:04 -05:00
parallel-strategy.md	[TRTC-102][docs] `--extra_llm_api_options`->`--config` in docs/examples/tests (#10005)	2025-12-19 13:48:43 -05:00
quantization.md	[https://nvbugs/5729847][doc] fix broken links to modelopt (#9868)	2025-12-16 13:33:20 -05:00
ray-orchestrator.md	[None][doc] Ray orchestrator initial doc (#8373)	2025-10-14 21:17:57 -07:00
sampling.md	[TRTLLM-9157][doc] Guided decoding doc improvement (#9359)	2025-12-05 17:50:12 -05:00
sparse-attention.md	[None][doc] Add Sparse Attention feature doc (#9648)	2025-12-25 00:26:18 -05:00
speculative-decoding.md	[TRTC-102][docs] `--extra_llm_api_options`->`--config` in docs/examples/tests (#10005)	2025-12-19 13:48:43 -05:00
torch_compile_and_piecewise_cuda_graph.md	[TRTC-102][docs] `--extra_llm_api_options`->`--config` in docs/examples/tests (#10005)	2025-12-19 13:48:43 -05:00