TensorRT-LLMs/tests/unittest/llmapi
Yechan Kim c6e2111f4e
feat: enhance trtllm serve multimodal (#3757)
* feat: enhance trtllm serve multimodal

1. make load_image and load_video asynchronous
2. add image_encoded input support to be compatible with genai-perf
3. support text-only input on multimodal models (currently Qwen2-VL and Qwen2.5-VL)

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

* add test

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

* fix bandit

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

* trimming utils

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

* trimming for test

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

* genai perf command fix

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

* command fix

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

* refactor chat_utils

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

* stress test genai-perf command

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>

---------

Signed-off-by: yechank <161688079+yechank-nvidia@users.noreply.github.com>
2025-05-15 16:16:31 -07:00
..
apps feat: enhance trtllm serve multimodal (#3757) 2025-05-15 16:16:31 -07:00
__init__.py test: reorganize tests folder hierarchy (#2996) 2025-03-27 12:07:53 +08:00
_run_mpi_comm_task.py fix: trtllm-bench build trt engine on slurm (#3825) 2025-04-27 22:26:23 +08:00
fake.sh doc: fix path after examples migration (#3814) 2025-04-24 02:36:45 +08:00
grid_searcher.py Update TensorRT-LLM (#2936) 2025-03-18 21:25:19 +08:00
run_llm_exit.py Update TensorRT-LLM (#2936) 2025-03-18 21:25:19 +08:00
run_llm_with_postproc.py Update TensorRT-LLM (#2936) 2025-03-18 21:25:19 +08:00
run_llm.py Update (#2978) 2025-03-23 16:39:35 +08:00
test_build_cache.py Update TensorRT-LLM (#2936) 2025-03-18 21:25:19 +08:00
test_executor.py chore: Cleanup deprecated APIs from LLM-API (part 1/2) (#3732) 2025-05-07 13:20:25 +08:00
test_llm_args.py test: reorganize tests folder hierarchy (#2996) 2025-03-27 12:07:53 +08:00
test_llm_download.py test: reorganize tests folder hierarchy (#2996) 2025-03-27 12:07:53 +08:00
test_llm_kv_cache_events.py test: add kv cache event tests for disagg workers (#3602) 2025-04-18 18:30:19 +08:00
test_llm_models.py move the rest of the models into examples/models/core directory (#3555) 2025-04-19 20:48:59 -07:00
test_llm_multi_gpu_pytorch.py feat: support multi lora adapters and TP (#3885) 2025-05-08 23:45:45 +08:00
test_llm_multi_gpu.py [CI] waive two multi-gpu test cases (#4206) 2025-05-12 08:04:48 +08:00
test_llm_perf_evaluator.py test: reorganize tests folder hierarchy (#2996) 2025-03-27 12:07:53 +08:00
test_llm_pytorch.py Breaking change: perf: Enable scheduling overlap by default (#4174) 2025-05-15 14:27:36 +08:00
test_llm_quant.py test: reorganize tests folder hierarchy (#2996) 2025-03-27 12:07:53 +08:00
test_llm_utils.py chore: refactor the LlmArgs with Pydantic and migrate remaining pybinding configs to python (#3025) 2025-04-05 13:31:48 +08:00
test_llm.py Breaking change: perf: Enable scheduling overlap by default (#4174) 2025-05-15 14:27:36 +08:00
test_mpi_session.py fix: trtllm-bench build trt engine on slurm (#3825) 2025-04-27 22:26:23 +08:00
test_reasoning_parser.py feat: add deepseek-r1 reasoning parser to trtllm-serve (#3354) 2025-05-06 08:13:04 +08:00