TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-29 23:23:48 +08:00

Jin Li d49374bc45 [TRTLLM-7408][feat] Wrap MOE with custom op. (#7277) Signed-off-by: Jin Li <59594262+liji-nv@users.noreply.github.com>	2025-09-09 12:18:56 -04:00
attention	[https://nvbugs/5453806][unwaive] Unwaive fp8 kvcache attention test (#7243)	2025-09-05 12:13:57 -04:00
auto_deploy	[None][feat] Extend VLM factory and add Mistral3 factory (#7583)	2025-09-09 02:47:18 -04:00
compilation	[TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804)	2025-05-09 11:04:01 +08:00
debugger	Fix: fix nvbug 5356427 (#5464)	2025-06-25 22:24:26 +08:00
executor	[TRTLLM-7353][feat] Implement capturable drafting loops for speculation (#7100)	2025-09-01 14:37:44 -04:00
misc	[None][perf] Make finalize fusion part of the tactic selection logic (#6915)	2025-08-21 14:08:03 -07:00
modeling	[None][ci] remove unnecessary test_modeling_deepseek.py (#7542)	2025-09-04 20:05:27 -07:00
models/checkpoints/hf	[None][feat] Skip prefetching consolidated safetensors when appropriate (#7013)	2025-08-25 23:56:21 -04:00
modules	[TRTLLM-7408][feat] Wrap MOE with custom op. (#7277)	2025-09-09 12:18:56 -04:00
multi_gpu	[None][ci] add DGX_H100-2_GPUs-PyTorch-Others-1 pipeline (#7629)	2025-09-09 11:06:32 -04:00
multi_gpu_modeling	[None][chore] Mass integration of release/1.0 - 3rd (#7519)	2025-09-08 14:03:04 +08:00
multimodal	[None][feat] Update multimodal utility `get_num_tokens_per_image` for better generalization (#7544)	2025-09-08 07:42:46 -04:00
sampler	[TRTLLM-7155][feat] Unify sampler handle logits implementation. (#6867)	2025-08-22 08:09:30 +02:00
speculative	[https://nvbugs/5502352][fix] Fix 2-model CDL path (#7543)	2025-09-06 23:53:27 -04:00
thop	[OMNIML-2336][feat] Add NVFP4 x FP8 (#6809)	2025-09-04 09:03:38 -07:00
helpers.py	[None][chore] share input_ids buffers among different cuda graphs (#7236)	2025-09-06 17:49:42 -04:00
pattern_watcher.py	[TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804)	2025-05-09 11:04:01 +08:00
test_connector.py	[None][feat] KV Cache Connector API (#7228)	2025-08-28 23:09:27 -04:00
test_torch_sampler.py	[TRTLLM-7153] [feat] Move stop_criteria to sample_async (#7041)	2025-09-07 17:36:49 +03:00