TensorRT-LLMs/examples
amirkl94 e04f6a1b9b
fix: Fix p-tuning test bug (#3326)
* fix: Fix p-tuning test bug

* A change to the vocab_size calculation for T5Tokenizer,
introduced in transformers version 4.34, caused incorrect vtokens to be added for p-tuning.
In short, tokens inside the vocabulary were added instead of tokens outside the vocabulary.

Signed-off-by: Amir Klein <203507526+amirkl94@users.noreply.github.com>
2025-04-08 17:14:00 +08:00
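
The commit message above describes virtual tokens (vtokens) for p-tuning landing inside the vocabulary instead of outside it. The following is a minimal sketch of that failure mode, not the actual change made to run.py: the helper names `vtoken_ids` and `colliding_ids` and both vocabulary sizes are hypothetical, chosen only for illustration. It assumes vtoken IDs are allocated starting at the vocabulary size, so an underestimated size produces IDs that collide with real tokens.

```python
from typing import List


def vtoken_ids(base_vocab_size: int, num_vtokens: int) -> List[int]:
    """Return IDs for p-tuning virtual tokens, starting at base_vocab_size.

    Hypothetical helper: it only illustrates the "outside the vocabulary" rule.
    """
    if base_vocab_size <= 0 or num_vtokens < 0:
        raise ValueError("base_vocab_size must be > 0 and num_vtokens >= 0")
    return list(range(base_vocab_size, base_vocab_size + num_vtokens))


def colliding_ids(ids: List[int], true_vocab_size: int) -> List[int]:
    """Return the IDs that fall inside the real vocabulary (each one is a bug)."""
    return [i for i in ids if i < true_vocab_size]


# Assumed numbers, for illustration only.
TRUE_VOCAB_SIZE = 32100      # number of real token IDs the model uses
REPORTED_VOCAB_SIZE = 32000  # a smaller size reported by some tokenizer attribute

good = vtoken_ids(TRUE_VOCAB_SIZE, 3)
bad = vtoken_ids(REPORTED_VOCAB_SIZE, 3)

print(good, colliding_ids(good, TRUE_VOCAB_SIZE))
print(bad, colliding_ids(bad, TRUE_VOCAB_SIZE))
```

A check like `colliding_ids` is cheap to run before inference, and an empty result means every vtoken ID sits past the end of the real vocabulary.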
apps doc: refactor trtllm-serve examples and doc (#3187) 2025-04-04 11:40:43 +08:00
auto_deploy chore: remove usernames from comments (#3291) 2025-04-05 13:44:28 +08:00
bert Update TensorRT-LLM (#2582) 2024-12-16 21:50:47 -08:00
bindings/executor Update TensorRT-LLM (#2582) 2024-12-16 21:50:47 -08:00
commandr Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00
cpp/executor Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
cpp_library Update TensorRT-LLM (#1274) 2024-03-12 18:15:52 +08:00
deepseek_v3 feat: enable DeepGEMM by default (#3341) 2025-04-08 13:58:57 +08:00
disaggregated update readme for disaggregated (#3323) 2025-04-07 21:29:15 +08:00
dora Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
draft_target_model Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00
eagle Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00
enc_dec doc: use alert formatting (#3153) 2025-03-31 07:30:52 +08:00
exaone Add EXAONE-Deep (#3054) 2025-03-26 14:24:04 +08:00
gemma Update (#2978) 2025-03-23 16:39:35 +08:00
glm-4-9b chore: clean some ci of qa test (#3083) 2025-03-31 14:30:41 +08:00
gpt fix: GPT-Next convert failure (#3220) 2025-04-02 17:14:39 +08:00
granite Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
infinitebench Update TensorRT-LLM (#1725) 2024-06-04 20:26:32 +08:00
internlm2 Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
language_adapter Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
llama fix: fix for cp > kvHeadNum (#3002) 2025-03-26 12:39:02 +08:00
llm-api feat: use cudaMalloc to allocate kvCache (#3303) 2025-04-08 10:59:14 +08:00
llm-eval/lm-eval-harness Update (#2978) 2025-03-23 16:39:35 +08:00
lookahead Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00
mamba Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00
medusa Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00
mixtral Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00
mllama Update TensorRT-LLM (#2582) 2024-12-16 21:50:47 -08:00
models/contrib chore: clean some ci of qa test (#3083) 2025-03-31 14:30:41 +08:00
multimodal test: add random image test for llama-3.2-11b-vision (#3055) 2025-03-26 15:38:16 +08:00
nemotron Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00
nemotron_nas Update TensorRT-LLM (#2562) 2024-12-11 00:31:05 -08:00
openai_triton Update TensorRT-LLM (#2792) 2025-02-18 21:27:39 +08:00
phi Add support for Phi-4-mini (#2990) 2025-04-02 08:34:39 +08:00
prompt_lookup Update TensorRT-LLM (#2873) 2025-03-11 21:13:42 +08:00
python_plugin Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
pytorch chore: refactor the LlmArgs with Pydantic and migrate remaining pybinding configs to python (#3025) 2025-04-05 13:31:48 +08:00
quantization Update README.md (#2862) 2025-03-24 13:46:09 +08:00
qwen fix: fix for cp > kvHeadNum (#3002) 2025-03-26 12:39:02 +08:00
qwen2audio chore: Handle qwen2audio inputs ids expansion during processing (#3080) 2025-03-26 15:00:27 +08:00
qwenvl Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00
recurrentgemma Fix .gitmodules (#2852) 2025-03-04 22:34:09 +08:00
redrafter Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00
sample_weight_stripping Update (#2978) 2025-03-23 16:39:35 +08:00
scaffolding doc: add a directory for scaffolding contributors (#3224) 2025-04-02 16:08:00 +08:00
serve doc: refactor trtllm-serve examples and doc (#3187) 2025-04-04 11:40:43 +08:00
vit Update (#2978) 2025-03-23 16:39:35 +08:00
whisper Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00
constraints.txt chore: bump version to 0.19.0.dev2025040800 (#3171) 2025-04-02 08:21:55 +08:00
eval_long_context.py test: Accuracy test improvement (Part 2): Incorporate mmlu to accuracy test suite (#2982) 2025-03-25 07:34:10 +08:00
generate_checkpoint_config.py Update TensorRT-LLM (#2562) 2024-12-11 00:31:05 -08:00
generate_xgrammar_tokenizer_info.py Update TensorRT-LLM (#2783) 2025-02-13 18:40:22 +08:00
gpqa_llmapi.py test: Add gpqa tests for DeepSeek models (#3063) 2025-03-27 19:47:06 +08:00
hf_lora_convert.py Update TensorRT-LLM (#2755) 2025-02-11 03:01:00 +00:00
mmlu_llmapi.py test: Accuracy test improvement (Part 2): Incorporate mmlu to accuracy test suite (#2982) 2025-03-25 07:34:10 +08:00
mmlu.py test: Accuracy test improvement (Part 2): Incorporate mmlu to accuracy test suite (#2982) 2025-03-25 07:34:10 +08:00
run.py fix: Fix p-tuning test bug (#3326) 2025-04-08 17:14:00 +08:00
summarize.py test: Accuracy test improvement (Part 3.1): Extend accuracy test suite with LLM API and initial implementation of trtllm-eval (#3167) 2025-04-01 22:20:29 +08:00
utils.py Update TensorRT-LLM (#2849) 2025-03-04 18:44:00 +08:00