TensorRT-LLMs/tests/unittest/_torch
Erin 812bc8c954
[TRTLLM-8513][feat] Add back worker extension (#8482)
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
2025-10-24 20:30:28 -04:00
attention [TRTLLM-8535][feat] Support DeepSeek V3.2 with FP8 + BF16 KV cache/NVFP4 + BF16 KV cache (#8405) 2025-10-24 13:40:41 -04:00
auto_deploy [None][feat] Enable rms norm fusion for Nemotron MOE (#8563) 2025-10-23 00:09:42 -04:00
compilation [TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804) 2025-05-09 11:04:01 +08:00
debugger Fix: fix nvbug 5356427 (#5464) 2025-06-25 22:24:26 +08:00
executor [TRTLLM-8754][chore] Refine PyTorchModelEngine with llm args (#8493) 2025-10-22 20:03:18 -04:00
misc [TRTLLM-4501][feat] Add input tensor pre-hook function API for the tuning process. (#6924) 2025-10-15 21:18:11 +08:00
modeling [TRTLLM-7954][feat] Target model KV cache relocation (#8421) 2025-10-23 09:36:50 +08:00
models/checkpoints/hf [None][feat] Skip prefetching consolidated safetensors when appropriate (#7013) 2025-08-25 23:56:21 -04:00
modules [None][feat] Update TRTLLM MoE MxFP4 cubins; autotune tileN (#8156) 2025-10-23 09:14:18 +08:00
multi_gpu [https://nvbugs/5501820][fix] Add requirements for numba-cuda version to WAR mem corruption (#7992) 2025-10-10 10:18:27 +08:00
multi_gpu_modeling [https://nvbugs/5536131][fix] Fix illegal access issue when scale is not provided in Llama3/4. (#7960) 2025-10-16 22:46:19 +08:00
multimodal [TRTLLM-8737][feat] Support media_io_kwargs on trtllm-serve (#8528) 2025-10-24 12:53:40 -04:00
ray_orchestrator [TRTLLM-8513][feat] Add back worker extension (#8482) 2025-10-24 20:30:28 -04:00
sampler [TRTLLM-8436][feat] batched sampling and top-k logprobs improvements (#8398) 2025-10-20 11:15:41 +02:00
speculative [TRTLLM-8160][feat] Add max_total_draft_tokens (#8366) 2025-10-21 11:11:04 -04:00
thop [https://nvbugs/5451205][feat] Add cuBLASLt NVFP4 GEMM backend support (#7943) 2025-10-23 15:55:10 +08:00
helpers.py [TRTLLM-7330][feat] Eagle3 cuda graph support for the first draft model inference (#7363) 2025-09-26 11:28:05 +08:00
pattern_watcher.py [TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804) 2025-05-09 11:04:01 +08:00
test_connector.py [None][feat] KV Cache Connector API (#7228) 2025-08-28 23:09:27 -04:00