TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-12 14:03:48 +08:00

Yiqing Yan 72dd6b1929 [None][chore] Bump version to 1.1.0rc2.post2 (#7582) Signed-off-by: Yiqing Yan <yiqingy@nvidia.com>		2025-09-07 23:09:48 +08:00
apps
auto_deploy	[None][doc] Update autodeploy README.md, deprecate lm_eval in examples folder (#7233)	2025-08-26 10:47:57 -07:00
bindings/executor
cpp/executor	[TRTLLM-7030][fix] Refactor the example doc of dist-serving (#6766)	2025-08-13 17:39:27 +08:00
cpp_library
disaggregated	[None][fix] Minor fixes to slurm and benchmark scripts (#7453)	2025-09-02 01:57:03 -04:00
dora
draft_target_model
eagle	doc: remove the outdated features which marked as Experimental (#5995)	2025-08-06 22:01:42 -04:00
infinitebench
language_adapter
llm-api	[None][feat] KV Cache Connector API (#7228)	2025-08-28 23:09:27 -04:00
llm-eval/lm-eval-harness
lookahead
medusa
models	[TRTLLM-7207][feat] Chat completions API for gpt-oss (#7261)	2025-08-28 10:22:06 +08:00
ngram
openai_triton
python_plugin
quantization	[#6530][fix] Fix script when using calibration tensors from modelopt (#6803)	2025-08-12 20:41:10 -07:00
redrafter
sample_weight_stripping	doc: remove the outdated features which marked as Experimental (#5995)	2025-08-06 22:01:42 -04:00
scaffolding	[None][fix] fix scaffolding dynasor test (#7070)	2025-08-20 15:20:46 +08:00
serve
trtllm-eval
wide_ep	[TRTLLM-7008][fix] Add automatic shared memory delete if already exist (#7377)	2025-09-03 12:44:06 -04:00
constraints.txt	[None][chore] Bump version to 1.1.0rc2.post2 (#7582)	2025-09-07 23:09:48 +08:00
eval_long_context.py
generate_checkpoint_config.py
generate_xgrammar_tokenizer_info.py
hf_lora_convert.py
mmlu.py
run.py
summarize.py
utils.py