TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-31 08:11:27 +08:00

Xiwen Yu 019b1db438 fix 5505835 Signed-off-by: Xiwen Yu <13230610+VALLIS-NERIA@users.noreply.github.com>		2025-09-08 14:52:00 +08:00
attention	[https://nvbugs/5453806][unwaive] Unwaive fp8 kvcache attention test (#7243)	2025-09-05 12:13:57 -04:00
auto_deploy	fix 5505835	2025-09-08 14:52:00 +08:00
compilation	[TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804)	2025-05-09 11:04:01 +08:00
debugger	Fix: fix nvbug 5356427 (#5464)	2025-06-25 22:24:26 +08:00
executor	[TRTLLM-7353][feat] Implement capturable drafting loops for speculation (#7100)	2025-09-01 14:37:44 -04:00
misc	Merge remote-tracking branch 'gitlab/main' into user/xiweny/merge_main_0819	2025-08-23 16:13:30 +08:00
modeling	remove waivers and cleanup	2025-09-08 10:24:52 +08:00
models/checkpoints/hf	[None][feat] Skip prefetching consolidated safetensors when appropriate (#7013)	2025-08-25 23:56:21 -04:00
modules	Add B300 & GB300 CI	2025-09-05 15:29:50 +08:00
multi_gpu	update flashinfer and waive bug	2025-09-05 15:09:25 +08:00
multi_gpu_modeling	[None][fix] Fix llama4 multimodal by skipping request validation (#6957)	2025-08-20 21:58:53 -04:00
multimodal	[TRTLLM-7410][feat] Support hashing and KV cache reuse for videos (#7360)	2025-09-04 14:39:23 -04:00
sampler	[TRTLLM-7155][feat] Unify sampler handle logits implementation. (#6867)	2025-08-22 08:09:30 +02:00
speculative	[None][ci] Revert "[https://nvbugs/5461761][fix] Remove the waiver (#7476)" (#7584)	2025-09-05 22:02:09 -07:00
thop	Merge remote-tracking branch 'origin/main' into feat/b300_cu13	2025-09-05 15:53:43 +08:00
helpers.py	[None][chore] share input_ids buffers among different cuda graphs (#7236)	2025-09-06 17:49:42 -04:00
pattern_watcher.py	[TRTLLM-3105][feat] Add Piecewise CUDA Graph Support (#3804)	2025-05-09 11:04:01 +08:00
test_connector.py	[None][feat] KV Cache Connector API (#7228)	2025-08-28 23:09:27 -04:00