Jonas Yang CN
|
88ea2c4ee9
|
[TRTLLM-7349][feat] Adding new orchestrator type -- ray (#7520)
Signed-off-by: Erin Ho <14718778+hchings@users.noreply.github.com>
Co-authored-by: Yuan Tong <13075180+tongyuantongyu@users.noreply.github.com>
Co-authored-by: Erin Ho <14718778+hchings@users.noreply.github.com>
|
2025-10-04 08:12:24 +08:00 |
|
brb-nv
|
bd3d0ad233
|
[TRTLLM-7733][feat] Executor changes to support helix parallelism (#7972)
Signed-off-by: Balaram Buddharaju <169953907+brb-nv@users.noreply.github.com>
|
2025-10-01 22:13:03 -04:00 |
|
Yibin Li
|
d7581bb551
|
[TRTLLM-8031][feat] Add chunked return_generation_logits logic (#7831)
Signed-off-by: Yibin Li <109242046+yibinl-nvidia@users.noreply.github.com>
|
2025-10-01 12:47:07 -04:00 |
|
QI JUN
|
1529a6f22d
|
[None][chore] extract weights loading related logic to model loader (#7579)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-09-25 10:19:22 -07:00 |
|
Liao Lanyu
|
18095a7cb8
|
[https://nvbugs/5503440][fix] Fix potential hang due to wrong type of ZMQ socket and protocol for worker_init_status_queue (#7646)
Signed-off-by: Lanyu Liao <lancelly@users.noreply.github.com>
|
2025-09-19 18:13:33 +08:00 |
|
Mike Iovine
|
b3c57a7042
|
[TRTLLM-7353][feat] Implement capturable drafting loops for speculation (#7100)
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
|
2025-09-01 14:37:44 -04:00 |
|
Shunkangz
|
ff4047414b
|
[None][opt] Balance the request based on number of tokens in AttentionDP (#7183)
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
|
2025-08-27 11:16:12 +08:00 |
|
qixiang-99
|
b165f8bc97
|
fix/improve kvcache allocation in PyTorch runtime (#5933)
Signed-off-by: qixiang-99 <203170375+qixiang-99@users.noreply.github.com>
|
2025-08-26 12:40:22 +08:00 |
|
QI JUN
|
bea5e07fb7
|
[None][refactor] refactor the CUDA graph runner to manage all CUDA graphs (#6846)
Signed-off-by: junq <22017000+QiJune@users.noreply.github.com>
|
2025-08-25 20:52:05 +08:00 |
|
Robin Kobus
|
b95cab2a7c
|
[None][ci] move unittests to sub-directories (#6635)
Signed-off-by: Robin Kobus <19427718+Funatiq@users.noreply.github.com>
|
2025-08-20 05:42:22 -04:00 |
|