| Name | Last commit message | Last commit date |
| --- | --- | --- |
| `attention_backend` | feat: TRTLLM-5574 Add phi-4-multimodal pytorch-backend support (#5644) | 2025-07-17 06:30:58 +08:00 |
| `auto_deploy` | [refactor] Simplification of Speculative decoding configs (#5639) | 2025-07-10 11:37:30 -04:00 |
| `compilation` | [feat] Piecewise cuda graph support for MLA (#4467) | 2025-06-17 18:58:38 +08:00 |
| `custom_ops` | [TRTLLM-6100] fix: Nvbug 5356427: autotuned TRTLLM Gen fp8 block scale MoE illegal memory access (#5676) | 2025-07-14 17:17:30 +08:00 |
| `debug` | Add debug hook to support dump tensor data and add new debug functions easily (#5182) | 2025-06-24 17:45:28 +08:00 |
| `distributed` | [feat] Support torch compile for attention dp (#5086) | 2025-07-01 13:48:52 -04:00 |
| `models` | [fix] Performance Optimization for MNNVL TwoShot Kernel (#5934) | 2025-07-17 10:49:51 +08:00 |
| `modules` | test: Update Llama4 Scout FP4 & FP8 accuracy tests (#5901) | 2025-07-17 09:41:18 +08:00 |
| `peft` | feat: support multi lora adapters and TP (#3885) | 2025-05-08 23:45:45 +08:00 |
| `pyexecutor` | [fix] Release slots with spec decode + disagg (#5975) (#6032) | 2025-07-17 12:58:18 +08:00 |
| `shared_tensor` | [1/N][TRTLLM-5195][feat] Share PyTorch tensor between processes (#5396) | 2025-07-10 05:12:53 +09:00 |
| `speculative` | [refactor] Clean up drafter/resource manager creation logic (#5805) | 2025-07-16 12:45:46 -07:00 |
| `__init__.py` | [TRTLLM-5493] Add core infrastructure to enable loading of custom checkpoint formats (#5372) | 2025-07-17 00:50:30 +08:00 |
| `autotuner.py` | [TRTLLM-5770] feat: Integrate TRT-LLM Gen FP8 block scale MoE with Pytorch workflow kernel autotuner (#5207) | 2025-06-17 21:01:56 +08:00 |
| `expert_statistic.py` | Add MTP support for Online EPLB (#5213) | 2025-06-25 07:58:13 +08:00 |
| `llm.py` | [TRTLLM-5208][BREAKING CHANGE] chore: make pytorch LLM the default (#5312) | 2025-06-20 03:01:10 +08:00 |
| `metadata.py` | feat: no-cache attention in PyTorch workflow (#3085) | 2025-04-05 01:54:32 +08:00 |
| `model_config.py` | Fix: Enhance ModelConfig for kv cache size calculations (#5868) | 2025-07-16 14:41:31 -07:00 |
| `utils.py` | chore: Cleanup disable_fp4_allgather. (#6006) | 2025-07-16 17:54:36 +08:00 |