TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-05 18:51:38 +08:00

shuyixiong fd2af8d58a [TRTLLM-9771][feat] Support partial update weight for fp8 (#10456) Signed-off-by: Shuyi Xiong <219646547+shuyixiong@users.noreply.github.com> Signed-off-by: shuyixiong <219646547+shuyixiong@users.noreply.github.com>		2026-01-22 14:46:05 +08:00
_torch	[TRTLLM-9771][feat] Support partial update weight for fp8 (#10456)	2026-01-22 14:46:05 +08:00
api_stability	[#8241][feat] Support model_kwargs for pytorch backend (#10351)	2026-01-21 20:51:38 -08:00
bindings	[TRTLLM-9527][feat] Add transferAgent binding (step 1) (#10113)	2026-01-06 08:40:38 +08:00
disaggregated	[TRTLLM-10059][feat] Use global unique id as disagg request id (#10187)	2026-01-21 22:52:34 -05:00
executor	[https://nvbugs/5720482][fix] Fix test rpc streaming (#9902)	2025-12-13 01:14:43 -08:00
llmapi	[TRTLLM-10154][feat] Enable guided decoding with reasoning parsers (#10890)	2026-01-22 14:14:28 +08:00
others	[TRTLLM-9581][infra] Use /home/scratch.trt_llm_data_ci in computelab (#10616)	2026-01-19 00:40:40 -05:00
scaffolding	[None][feat] Refactor scaffolding streaming feature and fix openai wo… (#8622)	2025-10-30 16:02:40 +08:00
tools	[None][feat] Layer-wise benchmarks: make model init more general and support weights loading (#10562)	2026-01-13 19:17:03 +08:00
trt	[TRTLLM-8682][chore] Remove auto_parallel module (#8329)	2025-10-22 20:53:08 -04:00
utils	[TRTLLM-9771][feat] Support partial update weight for fp8 (#10456)	2026-01-22 14:46:05 +08:00
conftest.py	[TRTLLM-9737][chore] Add rl perf reproduce script and enhance the robustness of Ray tests (#9939)	2025-12-24 15:27:01 +08:00
dump_checkpoint_stats.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
gc_utils.py	[nvbug 5273941] fix: broken cyclic reference detect (#5417)	2025-07-01 20:12:55 +08:00
profile_utils.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
pytest.ini	[TRTLLM-9181][feat] improve disagg-server prometheus metrics; synchronize workers' clocks when workers are dynamic (#9726)	2025-12-16 05:16:32 -08:00
test_model_runner_cpp.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
test_pip_install.py	[https://nvbugs/5616189][fix] Make more cases use local cached models (#8935)	2025-11-11 03:14:05 -08:00