TensorRT-LLM/tests/unittest/_torch
Latest commit: 028235404b by Jin Li, 2025-08-26 18:31:33 -04:00
[TRTLLM-6633][feat] Padding for piecewise cudagraph (#6750)
Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>
| Name | Last commit | Date |
| --- | --- | --- |
| attention | [None][ci] move unittests to sub-directories (#6635) | 2025-08-20 05:42:22 -04:00 |
| auto_deploy | [None][doc] Update autodeploy README.md, deprecate lm_eval in examples folder (#7233) | 2025-08-26 10:47:57 -07:00 |
| compilation | [TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804) | 2025-05-09 11:04:01 +08:00 |
| debugger | Fix: fix nvbug 5356427 (#5464) | 2025-06-25 22:24:26 +08:00 |
| executor | fix/improve kvcache allocation in PyTorch runtime (#5933) | 2025-08-26 12:40:22 +08:00 |
| misc | [None][perf] Make finalize fusion part of the tactic selection logic (#6915) | 2025-08-21 14:08:03 -07:00 |
| modeling | [None][refactor] refactor the CUDA graph runner to manage all CUDA graphs (#6846) | 2025-08-25 20:52:05 +08:00 |
| models/checkpoints/hf | [None][feat] Skip prefetching consolidated safetensors when appropriate (#7013) | 2025-08-25 23:56:21 -04:00 |
| modules | [TRTLLM-7346][fix] Improve performance of PyTorchModelEngine._get_lora_params_from_requests (#7033) | 2025-08-25 10:37:40 +03:00 |
| multi_gpu | [None][ci] move unittests to sub-directories (#6635) | 2025-08-20 05:42:22 -04:00 |
| multi_gpu_modeling | [None][fix] Fix llama4 multimodal by skipping request validation (#6957) | 2025-08-20 21:58:53 -04:00 |
| multimodal | [TRTLLM-7326][feat] Add standalone multimodal encoder (#6743) | 2025-08-19 21:42:50 -07:00 |
| sampler | [TRTLLM-7155][feat] Unify sampler handle logits implementation. (#6867) | 2025-08-22 08:09:30 +02:00 |
| speculative | [None][feat] Deepseek: Start Eagle work (#6210) | 2025-08-22 12:57:17 -04:00 |
| thop | [TRTLLM-6633][feat] Padding for piecewise cudagraph (#6750) | 2025-08-26 18:31:33 -04:00 |
| helpers.py | [None][refactor] refactor the CUDA graph runner to manage all CUDA graphs (#6846) | 2025-08-25 20:52:05 +08:00 |
| pattern_watcher.py | [TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804) | 2025-05-09 11:04:01 +08:00 |
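The entries above are pytest-style unit-test sub-directories and modules. As a minimal sketch only (assuming pytest as the test runner and the TensorRT-LLM repository root as the working directory; the `attention` target path is taken from the listing above and is just one example), a single sub-directory can be run programmatically like this:

```python
# Sketch: run one of the _torch unittest sub-directories with pytest.
# Assumes pytest is installed and the current working directory is the
# TensorRT-LLM repository root; the target path comes from the listing above.
import sys

import pytest

if __name__ == "__main__":
    # -q keeps output short; any extra CLI arguments are forwarded to pytest.
    exit_code = pytest.main(["-q", "tests/unittest/_torch/attention", *sys.argv[1:]])
    sys.exit(exit_code)
```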