attention_backend
feat: Add support for YARN in NemotronNAS models ( #4906 )
2025-06-29 09:45:49 +03:00
auto_deploy
Reintroduce with perf fixes: feature: unify new_tokens format sample state to trtllm sampler tokens format ( #5513 )
2025-06-30 11:58:59 -07:00
compilation
[feat] Piecewise cuda graph support for MLA ( #4467 )
2025-06-17 18:58:38 +08:00
custom_ops
Make moe permute and final as custom op ( #5412 )
2025-06-27 15:48:33 -07:00
debug
Add debug hook to support dump tensor data and add new debug functions easily ( #5182 )
2025-06-24 17:45:28 +08:00
distributed
Feat/ds r1 min latency opt round3, add router gemm, fused a gemm, PDL ( #4560 )
2025-06-14 17:36:22 +08:00
models
feat: support duplicate_kv_weight for qwen3 blockwise scale ( #5459 )
2025-06-30 11:49:22 +08:00
modules
refactor: [TRTLLM-6150] Refactor moe permute and finalize op by removing duplicated code ( #5557 )
2025-06-30 08:48:04 -07:00
peft
feat: support multi lora adapters and TP ( #3885 )
2025-05-08 23:45:45 +08:00
pyexecutor
Reintroduce with perf fixes: feature: unify new_tokens format sample state to trtllm sampler tokens format ( #5513 )
2025-06-30 11:58:59 -07:00
speculative
Reintroduce with perf fixes: feature: unify new_tokens format sample state to trtllm sampler tokens format ( #5513 )
2025-06-30 11:58:59 -07:00
__init__.py
[TRTLLM-5208][BREAKING CHANGE] chore: make pytorch LLM the default ( #5312 )
2025-06-20 03:01:10 +08:00
autotuner.py
[TRTLLM-5770] feat: Integrate TRT-LLM Gen FP8 block scale MoE with Pytorch workflow kernel autotuner ( #5207 )
2025-06-17 21:01:56 +08:00
expert_statistic.py
Add MTP support for Online EPLB ( #5213 )
2025-06-25 07:58:13 +08:00
llm.py
[TRTLLM-5208][BREAKING CHANGE] chore: make pytorch LLM the default ( #5312 )
2025-06-20 03:01:10 +08:00
metadata.py
feat: no-cache attention in PyTorch workflow ( #3085 )
2025-04-05 01:54:32 +08:00
model_config.py
[TRTLLM-5825][fix] Fix torch LoRA TP ( #5338 )
2025-06-19 09:12:00 +03:00
utils.py
perf: Avoid reswizzle_sf after allgather. ( #5504 )
2025-06-29 21:25:50 +08:00