TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-14 06:27:45 +08:00

brb-nv 9106b5d9a5 fix: Skip rope scaling for local layers in Gemma3 VLM (#5773) Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>	2025-07-07 13:36:23 +08:00
_torch	fix: Skip rope scaling for local layers in Gemma3 VLM (#5773)	2025-07-07 13:36:23 +08:00
api_stability	Speculation: Draft Target in new FW (#4558)	2025-06-17 02:26:08 +08:00
bindings	Solve underallocation in VSWA+/VGQA (#4667)	2025-06-12 12:12:46 +08:00
disaggregated	Add disaggregated unittest (#4899)	2025-06-05 19:14:31 +08:00
llmapi	[Infra] - Waive failed cases on release/0.21 (#5674)	2025-07-02 22:23:54 -04:00
others	test: reorganize tests folder hierarchy (#2996)	2025-03-27 12:07:53 +08:00
scaffolding	[TRTLLM-4638] feat(scaffolding): update Reward Controller to PRM specific controller with step split (#4337)	2025-05-19 17:53:41 +08:00
tools	Enable trtllm-bench to run LoRA and add basic e2e perf testing capability for LoRA in PyT flow (#5130)	2025-06-15 18:54:04 +03:00
trt	Mxfp8xmxfp4 quant mode (#4978)	2025-06-10 22:01:37 +08:00
utils	chore: fix llm_root when LLM_ROOT is not set (#4741)	2025-05-29 19:44:34 -07:00
conftest.py	[fix][test] clear cuda cache before unittests automatically (#5121)	2025-06-19 00:36:53 +03:00
dump_checkpoint_stats.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
gc_utils.py	[nvbug 5273941] fix: broken cyclic reference detect (#5417)	2025-06-26 07:35:35 +08:00
profile_utils.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
pytest.ini	[TRTLLM-5053] Refactoring and Unifying the Multimodal input preparation (#4506)	2025-06-03 12:02:07 -07:00
test_model_runner_cpp.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
test_pip_install.py	relax the limitation of setuptools (#2992)	2025-03-24 13:36:10 +08:00