TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-07 03:31:58 +08:00

Author	SHA1	Message	Date
QI JUN	34a6d2d28f	[TRTLLM-9302][chore] Move build config from BaseLlmArgs to TrtLlmArgs (#9249) Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>	2025-11-24 10:54:41 +08:00
Anish Shanbhag	a09b38a862	[TRTLLM-8684][chore] Migrate BuildConfig to Pydantic, add a Python wrapper for KVCacheType enum (#8330) Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>	2025-10-28 09:17:26 -07:00
QI JUN	6ee1c87595	[TRTLLM-8817][chore] Set default value of KvCacheConfig.free_gpu_memory_fraction explicitly (#8561) Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>	2025-10-24 08:55:49 +08:00
Anish Shanbhag	15de45d782	[TRTLLM-8682][chore] Remove auto_parallel module (#8329) Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>	2025-10-22 20:53:08 -04:00
Anish Shanbhag	5ff4f88be6	[TRTLLM-8683][chore] Migrate PluginConfig to Pydantic (#8277) Signed-off-by: Anish Shanbhag <ashanbhag@nvidia.com>	2025-10-17 16:13:22 -04:00
Leslie Fang	870cfcf9a0	[None][chore] Remove executor config in create_py_executor (#7599) Signed-off-by: leslie-fang25 <leslief@nvidia.com>	2025-09-18 14:24:58 +08:00
Leslie Fang	20922b7d1f	[None][chore] Create PyExecutor from TorchLlmArgs Part 1 (#7105) Signed-off-by: leslie-fang25 <leslief@nvidia.com>	2025-08-26 10:42:01 +08:00
Yanchao Lu	d26a5a93ad	[https://nvbugs/5451296][bug] Cherry-pick #7017 from release/1.0 branch (#7043) Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com> Co-authored-by: Iman Tabrizian <10105175+Tabrizian@users.noreply.github.com>	2025-08-19 11:25:05 -04:00
Shi Xiaowei	1095dfd03c	[None][fix] BREAKING CHANGE: Mismatch between docs and actual commands (#6323)	2025-08-14 03:48:57 -04:00
pcastonguay	453a06e6ab	[TRTLLM-6881][feat] Include attention dp rank info with KV cache events (#6563) Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>	2025-08-07 14:17:07 +02:00
amitz-nv	1ee7a08d2b	[5830][feat] Improve LoRA cache memory control (#6220) Signed-off-by: Amit Zuker <203509407+amitz-nv@users.noreply.github.com>	2025-07-31 09:26:38 +03:00
Yan Chunwei	ad662ddcdd	chore: disallow arbitrary in llm_args.Configs (#6367) Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>	2025-07-29 16:16:52 -04:00
nv-guomingz	9b45499caa	test: update max_beam_width to 1 due to torchsampler changes. (#6101) Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>	2025-07-17 18:05:45 +08:00
Wanli Jiang	9354114f68	fix: Update trtllm args issues with extra nested config (#5996) Signed-off-by: Wanli Jiang <35160485+Wanli-Jiang@users.noreply.github.com>	2025-07-16 12:41:45 -04:00
nv-guomingz	4e4d18826f	chore: [Breaking Change] Rename cuda_graph_config padding_enabled fie… (#6003) Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>	2025-07-15 15:50:03 +09:00
nv-guomingz	0be41b6524	Revert "chore: [Breaking Change] Rename cuda_graph_config padding_enabled fie…" (#5818)	2025-07-08 13:15:30 +09:00
nv-guomingz	5a8173c121	chore: [Breaking Change] Rename cuda_graph_config padding_enabled fie… (#5795) Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>	2025-07-08 08:52:36 +08:00
Yan Chunwei	2d69b55fe8	chore: enhance yaml loading arbitrary options in LlmArgs (#5610) Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>	2025-07-02 14:21:37 +08:00
nv-guomingz	6e48ac25a6	chore: remove cuda_graph_ prefix from cuda_graph_config filed members. (#5585) Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>	2025-06-30 12:23:14 -04:00
nv-guomingz	578430e64c	[TRTLLM-5530][BREAKING CHANGE]: enhance the llm args pytorch config part 1(cuda_graph_config) (#5014) Signed-off-by: nv-guomingz <137257613+nv-guomingz@users.noreply.github.com>	2025-06-30 11:05:40 +08:00
Yan Chunwei	9bd42ecf9b	[TRTLLM-5208][BREAKING CHANGE] chore: make pytorch LLM the default (#5312) Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>	2025-06-20 03:01:10 +08:00
Yan Chunwei	c84e41fd9d	fix: build_config in TorchLlmArgs and avoid arbitrary args (#4972) Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>	2025-06-15 17:51:56 -07:00
QI JUN	b8c5e3892b	Revert "fix: build_config in TorchLlmArgs and avoid invalid args" (#4949) Signed-off-by: QI JUN <22017000+QiJune@users.noreply.github.com>	2025-06-05 17:43:30 +08:00
Yan Chunwei	ac20159d32	fix: build_config in TorchLlmArgs and avoid invalid args (#4600) Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>	2025-06-04 13:17:29 +08:00
Yan Chunwei	5506f60037	chore [BREAKING CHANGE]: Flatten PyTorchConfig knobs into TorchLlmArgs (#4603) Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>	2025-05-28 18:43:04 +08:00
xiweny	6979afa6f2	test: reorganize tests folder hierarchy (#2996) 1. move TRT path tests to 'trt' folder 2. optimize some import usage	2025-03-27 12:07:53 +08:00
Yan Chunwei	531b98ed62	feat: Add several pure python configs to LlmArgs (#2997) * add SchedulerConfig * add PeftCacheConfig	2025-03-24 16:16:17 +08:00
Kaiyu Xie	2631f21089	Update (#2978) Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>	2025-03-23 16:39:35 +08:00

28 Commits