| Name | Last commit | Date |
|---|---|---|
| attention_backend | [TRTLLM-9766][feat] Integration of the KVCacheManager V2 to TRTLLM Runtime (#10659) | 2026-02-02 14:29:02 +08:00 |
| auto_deploy | [#8242][feat] Add int4 GPTQ support for AutoDeploy (#8248) | 2026-01-30 23:07:24 -08:00 |
| compilation | [TRTLLM-8821][feat] Apply AutoTuner to AllReduce Op for strategy tuning. (#8531) | 2026-01-05 15:44:37 +08:00 |
| configs | | |
| custom_ops | [TRTLLM-10398][feat] Enable TRTLLM moe backend for Nemotron Super (#10791) | 2026-01-31 13:48:25 +08:00 |
| cute_dsl_kernels | [TRTLLM-9831][perf] Use TMA.RED to improve effective memory bandwidth (#10987) | 2026-01-27 16:15:32 +08:00 |
| debug | | |
| disaggregation | [TRTLLM-9527][feat] Python transceiver components (step 2) (#10494) | 2026-01-22 10:14:50 -08:00 |
| distributed | [TRTLLM-10264][feat] Support attention DP + Helix CP (#10477) | 2026-01-29 02:57:13 -05:00 |
| models | [None][feat] Perfect routing for Deepseek models (#11127) | 2026-01-30 23:46:35 -05:00 |
| modules | [TRTLLM-10398][feat] Enable TRTLLM moe backend for Nemotron Super (#10791) | 2026-01-31 13:48:25 +08:00 |
| peft | [https://nvbugs/5322131][feat] Multi-LoRA serving with CUDA Graph (#8279) | 2026-01-22 14:01:18 +01:00 |
| pyexecutor | [TRTLLM-9766][feat] Integration of the KVCacheManager V2 to TRTLLM Runtime (#10659) | 2026-02-02 14:29:02 +08:00 |
| shared_tensor | | |
| speculative | [TRTLLM-10312][perf] Improve performance of _write_finish_reasons in TorchSampler (#10459) | 2026-01-29 11:06:09 -05:00 |
| __init__.py | | |
| async_llm.py | [TRTLLM-9736][feat] AsyncLLM and verl integ (#9353) | 2025-12-11 09:33:25 -08:00 |
| autotuner.py | [TRTLLM-10264][feat] Support attention DP + Helix CP (#10477) | 2026-01-29 02:57:13 -05:00 |
| cublaslt_utils.py | | |
| cute_dsl_utils.py | | |
| device_mesh.py | [TRTLLM-9465][fix] Swap TP-CP grouping order (#10350) | 2026-01-05 20:08:03 +08:00 |
| expert_statistic.py | | |
| flashinfer_utils.py | [TRTLLM-9578][feat] make PDL enabled by default (#9695) | 2025-12-25 07:15:24 -05:00 |
| hostfunc.py | | |
| llm.py | | |
| memory_buffer_utils.py | [https://nvbugs/5811697][fix] Fix buffer reuse. (#10716) | 2026-01-25 18:12:21 +08:00 |
| metadata.py | | |
| model_config.py | [TRTLLM-9771][feat] Allow overriding quantization configs (#11062) | 2026-01-31 10:48:51 -05:00 |
| utils.py | [TRTLLM-9771][feat] Support partial update weight for fp8 (#10456) | 2026-01-22 14:46:05 +08:00 |
| virtual_memory.py | [TRTLLM-9736][feat] AsyncLLM and verl integ (#9353) | 2025-12-11 09:33:25 -08:00 |