TensorRT-LLMs

mirror of https://github.com/NVIDIA/TensorRT-LLM.git synced 2026-01-22 19:52:38 +08:00

Yibin Li d7581bb551 [TRTLLM-8031][feat] Add chunked return_generation_logits logic (#7831) Signed-off-by: Yibin Li <109242046+yibinl-nvidia@users.noreply.github.com>	2025-10-01 12:47:07 -04:00
test_chunked_logits.py	[TRTLLM-8031][feat] Add chunked return_generation_logits logic (#7831)	2025-10-01 12:47:07 -04:00
test_executor_request_queue.py	[None][opt] Balance the request based on number of tokens in AttentionDP (#7183)	2025-08-27 11:16:12 +08:00
test_overlap_scheduler_input.json	[None][ci] move unittests to sub-directories (#6635)	2025-08-20 05:42:22 -04:00
test_overlap_scheduler.py	[None][ci] move unittests to sub-directories (#6635)	2025-08-20 05:42:22 -04:00
test_pytorch_model_engine.py	[None][chore] extract weights loading related logic to model loader (#7579)	2025-09-25 10:19:22 -07:00
test_resource_manager.py	fix/improve kvcache allocation in PyTorch runtime (#5933)	2025-08-26 12:40:22 +08:00
test_router_dealer_ipc.py	[https://nvbugs/5503440][fix] Fix potential hang due to wrong type of ZMQ socket and protocol for worker_init_status_queue (#7646)	2025-09-19 18:13:33 +08:00