TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-26 13:43:38 +08:00


Yixin Dong c90ebadd84 feat: Support the Structural Tag in guided decoding (#4066)	2025-05-12 17:24:50 +08:00

* finish
Signed-off-by: Ubospica <ubospica@gmail.com>
* update
Signed-off-by: Ubospica <ubospica@gmail.com>
* update
Signed-off-by: Ubospica <ubospica@gmail.com>
* fix
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* exc overlap scheduler
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* add test
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
* fix api ref
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>

---------
Signed-off-by: Ubospica <ubospica@gmail.com>
Signed-off-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
Co-authored-by: Enwei Zhu <21126786+syuoni@users.noreply.github.com>
apps	feat: Support the Structural Tag in guided decoding (#4066)	2025-05-12 17:24:50 +08:00
__init__.py	test: reorganize tests folder hierarchy (#2996)	2025-03-27 12:07:53 +08:00
_run_mpi_comm_task.py	fix: trtllm-bench build trt engine on slurm (#3825)	2025-04-27 22:26:23 +08:00
fake.sh	doc: fix path after examples migration (#3814)	2025-04-24 02:36:45 +08:00
grid_searcher.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
run_llm_exit.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
run_llm_with_postproc.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
run_llm.py	Update (#2978)	2025-03-23 16:39:35 +08:00
test_build_cache.py	Update TensorRT-LLM (#2936)	2025-03-18 21:25:19 +08:00
test_executor.py	chore: Cleanup deprecated APIs from LLM-API (part 1/2) (#3732)	2025-05-07 13:20:25 +08:00
test_llm_args.py	test: reorganize tests folder hierarchy (#2996)	2025-03-27 12:07:53 +08:00
test_llm_download.py	test: reorganize tests folder hierarchy (#2996)	2025-03-27 12:07:53 +08:00
test_llm_kv_cache_events.py	test: add kv cache event tests for disagg workers (#3602)	2025-04-18 18:30:19 +08:00
test_llm_models.py	move the reset models into `examples/models/core` directory (#3555)	2025-04-19 20:48:59 -07:00
test_llm_multi_gpu_pytorch.py	feat: support multi lora adapters and TP (#3885)	2025-05-08 23:45:45 +08:00
test_llm_multi_gpu.py	[CI] waive two multi-gpu test cases (#4206)	2025-05-12 08:04:48 +08:00
test_llm_perf_evaluator.py	test: reorganize tests folder hierarchy (#2996)	2025-03-27 12:07:53 +08:00
test_llm_pytorch.py	feat: support multi lora adapters and TP (#3885)	2025-05-08 23:45:45 +08:00
test_llm_quant.py	test: reorganize tests folder hierarchy (#2996)	2025-03-27 12:07:53 +08:00
test_llm_utils.py	chore: refactor the LlmArgs with Pydantic and migrate remaining pybinding configs to python (#3025)	2025-04-05 13:31:48 +08:00
test_llm.py	feat: support multi lora adapters and TP (#3885)	2025-05-08 23:45:45 +08:00
test_mpi_session.py	fix: trtllm-bench build trt engine on slurm (#3825)	2025-04-27 22:26:23 +08:00
test_reasoning_parser.py	feat: add deepseek-r1 reasoning parser to trtllm-serve (#3354)	2025-05-06 08:13:04 +08:00