TensorRT-LLMs/examples
Yuxian Qiu 3b3069b390
[https://nvbugs/5747930][fix] Use offline tokenizer for whisper models. (#10121)
Signed-off-by: Yuxian Qiu <142763828+yuxianq@users.noreply.github.com>
2025-12-20 09:42:07 +08:00
apps
auto_deploy [#9640][feat] Migrate model registry to v2.0 format with composable configs (#9836) 2025-12-19 05:30:02 -08:00
bindings/executor
configs [None][fix] enable KV cache reuse for config database (#10094) 2025-12-19 15:16:56 -08:00
cpp/executor [TRTLLM-9197][infra] Move thirdparty stuff to it's own listfile (#8986) 2025-11-20 16:44:23 -08:00
cpp_library
disaggregated [TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests (#10005) 2025-12-19 13:48:43 -05:00
dora
draft_target_model
eagle
infinitebench
language_adapter
layer_wise_benchmarks [TRTLLM-9615][feat] Implement a distributed tuning system (#9621) 2025-12-15 21:08:53 +08:00
llm-api [TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests (#10005) 2025-12-19 13:48:43 -05:00
llm-eval/lm-eval-harness
longbench [None] [feat] Optimize the algorithm part of RocketKV (#9333) 2025-12-01 09:04:09 +08:00
lookahead
medusa [OMNIML-3036][doc] Re-branding TensorRT-Model-Optimizer as Nvidia Model-Optimizer (#9679) 2025-12-07 07:14:05 -08:00
models [TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests (#10005) 2025-12-19 13:48:43 -05:00
ngram
openai_triton
opentelemetry [None][chore] Change trt-server to trtlllm-server in opentelemetry readme (#9173) 2025-11-17 22:02:24 -08:00
python_plugin
quantization [OMNIML-3036][doc] Re-branding TensorRT-Model-Optimizer as Nvidia Model-Optimizer (#9679) 2025-12-07 07:14:05 -08:00
ray_orchestrator [TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests (#10005) 2025-12-19 13:48:43 -05:00
redrafter
sample_weight_stripping [None][chore] Weekly mass integration of release/1.1 -- rebase (#9522) 2025-11-29 21:48:48 +08:00
scaffolding
serve [TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests (#10005) 2025-12-19 13:48:43 -05:00
sparse_attention [TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests (#10005) 2025-12-19 13:48:43 -05:00
trtllm-eval
wide_ep [TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests (#10005) 2025-12-19 13:48:43 -05:00
__init__.py [TRTC-102][docs] --extra_llm_api_options->--config in docs/examples/tests (#10005) 2025-12-19 13:48:43 -05:00
constraints.txt [None][chore] bump version to 1.2.0rc6 (#9874) 2025-12-10 04:53:26 -08:00
eval_long_context.py
generate_checkpoint_config.py
generate_xgrammar_tokenizer_info.py
hf_lora_convert.py
mmlu.py [https://nvbugs/4141427][chore] Add more details to LICENSE file (#9881) 2025-12-13 08:35:31 +08:00
run.py
summarize.py [None][feat] Support ignored prompt length for penalties via new sampling config parameter (#8127) 2025-10-27 13:12:31 -04:00
utils.py [https://nvbugs/5747930][fix] Use offline tokenizer for whisper models. (#10121) 2025-12-20 09:42:07 +08:00