TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-28 22:56:13 +08:00

Iman Tabrizian af04b6f6aa bug: Fix hang bug when context server doesn't have enough capacity for KV Cache (#3095)	2025-04-21 15:16:55 +08:00

* Fix hang bug when KV cache is low
  Signed-off-by: Iman Tabrizian <itabrizian@nvidia.com>
* Review comments
  Signed-off-by: Iman Tabrizian <itabrizian@nvidia.com>
* Fix attentiondp typo
  Signed-off-by: Iman Tabrizian <itabrizian@nvidia.com>
* Add CI test for this case
  Signed-off-by: Iman Tabrizian <itabrizian@nvidia.com>
* fix: Fix the insertion order for responder futures
  Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>
* fix: Fix disagg CPP
  Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>

---------

Signed-off-by: Iman Tabrizian <itabrizian@nvidia.com>
Signed-off-by: Iman Tabrizian <10105175+tabrizian@users.noreply.github.com>

_llmapi_perf_evaluator	Update (#2978)	2025-03-23 16:39:35 +08:00
accuracy	move the reset models into `examples/models/core` directory (#3555)	2025-04-19 20:48:59 -07:00
deterministic	Update (#2978)	2025-03-23 16:39:35 +08:00
disaggregated	bug: Fix hang bug when context server doesn't have enough capacity for KV Cache (#3095)	2025-04-21 15:16:55 +08:00
examples	move the reset models into `examples/models/core` directory (#3555)	2025-04-19 20:48:59 -07:00
perf	tests: change qa perf test to trtllm-bench (#3189)	2025-04-17 09:53:32 +08:00
stress_test	feat: Add stress test for TRT-LLM (#3250)	2025-04-13 10:24:25 +08:00
sysinfo	Update (#2978)	2025-03-23 16:39:35 +08:00
__init__.py	Update (#2978)	2025-03-23 16:39:35 +08:00
_run_llmapi_llm.py	Update (#2978)	2025-03-23 16:39:35 +08:00
.test_durations	infra: Add step to generate new duration file (#3298)	2025-04-18 12:56:31 +08:00
agg_unit_mem_df.csv	test: reorganize tests folder hierarchy (#2996)	2025-03-27 12:07:53 +08:00
ci_profiler.py	Update (#2978)	2025-03-23 16:39:35 +08:00
common.py	move the reset models into `examples/models/core` directory (#3555)	2025-04-19 20:48:59 -07:00
conftest.py	move the reset models into `examples/models/core` directory (#3555)	2025-04-19 20:48:59 -07:00
cpp_common.py	chore : Split more tests out of gpt tests (#3524)	2025-04-18 12:04:57 +08:00
local_venv.py	test: Automatically clean checkpoints and engines (#3468)	2025-04-12 09:56:29 +08:00
pytest.ini	chore: Refine attention backend interface. (#3271)	2025-04-09 02:34:53 +08:00
runner_interface.py	Update (#2978)	2025-03-23 16:39:35 +08:00
test_cache.py	chore: clean some ci of qa test (#3083)	2025-03-31 14:30:41 +08:00
test_cases.yml	Update (#2978)	2025-03-23 16:39:35 +08:00
test_cpp.py	chore : Split more tests out of gpt tests (#3524)	2025-04-18 12:04:57 +08:00
test_e2e.py	test: add llama3.2 ptp test case (#3363)	2025-04-21 15:15:45 +08:00
test_list_parser.py	infra: Add test list name check (#3097)	2025-04-20 23:02:16 +08:00
test_list_validation.py	Update (#2978)	2025-03-23 16:39:35 +08:00
test_mlpf_results.py	Update (#2978)	2025-03-23 16:39:35 +08:00
test_sanity.py	Update (#2978)	2025-03-23 16:39:35 +08:00
test_unittests.py	test: reorganize tests folder hierarchy (#2996)	2025-03-27 12:07:53 +08:00
trt_test_alternative.py	Add thread leak check and fix thread/memory leak issues. (#3270)	2025-04-08 19:03:18 +08:00
turtle_defs.json	Update (#2978)	2025-03-23 16:39:35 +08:00