TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-02-11 13:33:40 +08:00

pcastonguay ae5671644a feat: Disaggregated router class (#3584)		2025-04-19 00:34:12 +08:00

* Add draft scheduler class
  Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
* Refactor the design
  Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
* feat: Introduce router class for disaggregated server
  Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
* Add unit tests for router class
  Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
* Adding tests for disagg_utils
  Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
* Fixing missing import
  Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
* Fixing disagg integration tests
  Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
* Addressing MR review comments
  Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>

---------

Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Signed-off-by: Patrice Castonguay <55748270+pcastonguay@users.noreply.github.com>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>

_llmapi_perf_evaluator	Update (#2978)	2025-03-23 16:39:35 +08:00
accuracy	fix: Fix PP for llama. (#3449)	2025-04-12 17:20:27 +08:00
deterministic	Update (#2978)	2025-03-23 16:39:35 +08:00
disaggregated	feat: Disaggregated router class (#3584)	2025-04-19 00:34:12 +08:00
examples	feat: allocate minimal blocks per window size (#3028)	2025-04-17 16:04:57 +08:00
perf	tests: change qa perf test to trtllm-bench (#3189)	2025-04-17 09:53:32 +08:00
stress_test	feat: Add stress test for TRT-LLM (#3250)	2025-04-13 10:24:25 +08:00
sysinfo	Update (#2978)	2025-03-23 16:39:35 +08:00
__init__.py	Update (#2978)	2025-03-23 16:39:35 +08:00
_run_llmapi_llm.py	Update (#2978)	2025-03-23 16:39:35 +08:00
.test_durations	infra: Add step to generate new duration file (#3298)	2025-04-18 12:56:31 +08:00
agg_unit_mem_df.csv	test: reorganize tests folder hierarchy (#2996)	2025-03-27 12:07:53 +08:00
ci_profiler.py	Update (#2978)	2025-03-23 16:39:35 +08:00
common.py	feat: allocate minimal blocks per window size (#3028)	2025-04-17 16:04:57 +08:00
conftest.py	feat: Add FP8 support for SM 120 (#3248)	2025-04-14 16:05:41 -07:00
cpp_common.py	chore : Split more tests out of gpt tests (#3524)	2025-04-18 12:04:57 +08:00
local_venv.py	test: Automatically clean checkpoints and engines (#3468)	2025-04-12 09:56:29 +08:00
pytest.ini	chore: Refine attention backend interface. (#3271)	2025-04-09 02:34:53 +08:00
runner_interface.py	Update (#2978)	2025-03-23 16:39:35 +08:00
test_cache.py	chore: clean some ci of qa test (#3083)	2025-03-31 14:30:41 +08:00
test_cases.yml	Update (#2978)	2025-03-23 16:39:35 +08:00
test_cpp.py	chore : Split more tests out of gpt tests (#3524)	2025-04-18 12:04:57 +08:00
test_e2e.py	test: add quickstart test for nemotron-ultra (#3596)	2025-04-17 11:16:41 +08:00
test_list_parser.py	Feat: Variable-Beam-Width-Search (VBWS) part3 (#3338)	2025-04-08 23:51:27 +08:00
test_list_validation.py	Update (#2978)	2025-03-23 16:39:35 +08:00
test_mlpf_results.py	Update (#2978)	2025-03-23 16:39:35 +08:00
test_sanity.py	Update (#2978)	2025-03-23 16:39:35 +08:00
test_unittests.py	test: reorganize tests folder hierarchy (#2996)	2025-03-27 12:07:53 +08:00
trt_test_alternative.py	Add thread leak check and fix thread/memory leak issues. (#3270)	2025-04-08 19:03:18 +08:00
turtle_defs.json	Update (#2978)	2025-03-23 16:39:35 +08:00