| Path | Last commit | Date |
| --- | --- | --- |
| `attention_backend` | feat: Custom masking utils for Gemma3 VLM (#5853) | 2025-07-10 06:18:04 +09:00 |
| `auto_deploy` | [refactor] Simplification of Speculative decoding configs (#5639) | 2025-07-10 11:37:30 -04:00 |
| `compilation` | [feat] Piecewise cuda graph support for MLA (#4467) | 2025-06-17 18:58:38 +08:00 |
| `custom_ops` | [TRTLLM-6100] fix: Nvbug 5356427: autotuned TRTLLM Gen fp8 block scale MoE illegal memory access (#5676) | 2025-07-14 17:17:30 +08:00 |
| `debug` | Add debug hook to support dump tensor data and add new debug functions easily (#5182) | 2025-06-24 17:45:28 +08:00 |
| `distributed` | [feat] Support torch compile for attention dp (#5086) | 2025-07-01 13:48:52 -04:00 |
| `models` | [refactor] Move vision parts from processor to model for Gemma3 (#5888) | 2025-07-11 15:13:51 -07:00 |
| `modules` | [NvBug 5378370] fix: Fix alltoall for llama4 (apply_router_weight_on_input=True) (#5902) | 2025-07-12 15:50:31 +09:00 |
| `peft` | feat: support multi lora adapters and TP (#3885) | 2025-05-08 23:45:45 +08:00 |
| `pyexecutor` | [nvbug/5337601][fix] Fix disagg + speculative decoding (#5558) | 2025-07-14 17:17:30 +08:00 |
| `shared_tensor` | [1/N][TRTLLM-5195][feat] Share PyTorch tensor between processes (#5396) | 2025-07-10 05:12:53 +09:00 |
| `speculative` | [fix] Remove SpecConfig and fix thread leak issues (#5931) | 2025-07-12 21:03:24 +09:00 |
| `__init__.py` | [TRTLLM-5208][BREAKING CHANGE] chore: make pytorch LLM the default (#5312) | 2025-06-20 03:01:10 +08:00 |
| `autotuner.py` | [TRTLLM-5770] feat: Integrate TRT-LLM Gen FP8 block scale MoE with Pytorch workflow kernel autotuner (#5207) | 2025-06-17 21:01:56 +08:00 |
| `expert_statistic.py` | Add MTP support for Online EPLB (#5213) | 2025-06-25 07:58:13 +08:00 |
| `llm.py` | [TRTLLM-5208][BREAKING CHANGE] chore: make pytorch LLM the default (#5312) | 2025-06-20 03:01:10 +08:00 |
| `metadata.py` | feat: no-cache attention in PyTorch workflow (#3085) | 2025-04-05 01:54:32 +08:00 |
| `model_config.py` | [refactor] Simplification of Speculative decoding configs (#5639) | 2025-07-10 11:37:30 -04:00 |
| `utils.py` | [feat] Support torch compile for attention dp (#5086) | 2025-07-01 13:48:52 -04:00 |