Chuang Zhu
|
bc5811da65
|
chore: Ucx ip port remove mpi depend (#3101)
* initial ucx support
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* fixes to support dynloading and ucx connection establishment - not stable yet
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* update
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* more connection bringup fixes - failing on connection vector build
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* executor test pass
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* update
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* passed full benchmark
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* changing to TLLM_THROW and removing cout
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* stopping progress thread at ucxComm destructor
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* fixing build with ENABLE_UCX=0 to not build ucx target at all and removing includes for ucxConnection for cache transceiver, also delete commented cold code
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* fix copyrights
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* adding ucx flavor to cache transceiver test and insert to the CI pipeline
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* allowing sending non ib interfaces IPs
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* setting UCX port reuse for the tests in pipeline
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* code review fixes
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* querying ep after GID message is sent to avoid UCX Errors
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* fixing more CR issues
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* querying ep to not fail if ep_not_connected yet
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
* remove mpi dependency and debug
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
* debug to info
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
* mpirun n 2
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
* remove mpi comm split when disaggOrchestrator mode
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
* waive disagg_mtp test
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
* use future instead of thread
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
* use future_promise instead of cv wait
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
* connectionId type
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
* improve test
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
* improve test 2
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
* gtest_skip
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
---------
Signed-off-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
Signed-off-by: Chuang Zhu <111838961+chuangz0@users.noreply.github.com>
Co-authored-by: roeya <165803633+RoeyAzran1992@users.noreply.github.com>
|
2025-04-02 09:42:29 +08:00 |
|
BatshevaBlack
|
3e37531c6a
|
feat: Add BW measurement (#3070)
|
2025-03-28 10:53:00 +08:00 |
|
Shunkangz
|
8ee840159b
|
Add updateKVCacheTransfer (#2984)
Add kv cache transfer measurement
Signed-off-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
Co-authored-by: Shunkang <182541032+Shunkangz@users.noreply.github.co>
|
2025-03-25 21:45:35 +08:00 |
|
Kaiyu Xie
|
3aa6b11d13
|
Update TensorRT-LLM (#2936)
* Update TensorRT-LLM
---------
Co-authored-by: changcui <cuichang147@gmail.com>
|
2025-03-18 21:25:19 +08:00 |
|
Kaiyu Xie
|
9b931c0f63
|
Update TensorRT-LLM (#2873)
|
2025-03-11 21:13:42 +08:00 |
|