Yan Chunwei
b5f9fff1c1
[https://nvbugs/5569754][fix] trtllm-llmapi-launch port conflict (#8582)
Signed-off-by: Yan Chunwei <328693+Superjomn@users.noreply.github.com>
Signed-off-by: Mike Iovine <6158008+mikeiovine@users.noreply.github.com>
Signed-off-by: Mike Iovine <miovine@nvidia.com>
2025-11-20 12:43:13 -05:00
hvagadia
6ff82ea24e
[None][feat] Allow env variable to specify spawn process IPC address (#8922)
Signed-off-by: hvagadia <hvagadia@nvidia.com>
2025-11-07 15:45:57 -08:00
Yanchao Lu
e5cead1eb9
[TRTLLM-6295][test] Exit as early as possible and propagate exit status correctly for multi-node testing (#7739)
Signed-off-by: Yanchao Lu <yanchaol@nvidia.com>
2025-09-16 09:59:18 +08:00
Yan Chunwei
3946e798db
fix[nvbug5298640]: trtllm-llmapi-launch multiple LLM instances (#4727)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-06-19 06:13:53 +08:00
Yi Zhang
1fca654bfd
tests: Update gb200 test case (#4754)
Signed-off-by: Yi Zhang <187001205+yizhang-nv@users.noreply.github.com>
2025-06-04 18:49:20 +08:00
Yan Chunwei
2a09826ec4
fix hmac in remote mpi session (#3649)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
Co-authored-by: Tao Li @ NVIDIA <tali@nvidia.com>
2025-04-18 17:47:51 +08:00
bhsueh_NV
3aa37e6b72
fix bug (#3570)
Signed-off-by: bhsueh <11360707+byshiue@users.noreply.github.com>
2025-04-15 16:50:22 +08:00
Yan Chunwei
b37c5c0a4d
make LLM-API slurm examples executable (#3402)
Signed-off-by: chunweiy <328693+Superjomn@users.noreply.github.com>
2025-04-13 21:42:45 +08:00
Yan Chunwei
74850c61e9
fix: switch ZMQ from file socket to tcp socket in RemoteMpiCommSession (#3462)
* switch ZMQ from file socket to tcp
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
* fix comment
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
---------
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-04-13 09:15:55 +08:00
Yan Chunwei
deb876ecdb
clean up trtllm-llmapi-launch logs (#3358)
Signed-off-by: Superjomn <328693+Superjomn@users.noreply.github.com>
2025-04-08 16:00:59 +08:00
Kaiyu Xie
2631f21089
Update (#2978)
Signed-off-by: Kaiyu Xie <26294424+kaiyux@users.noreply.github.com>
2025-03-23 16:39:35 +08:00
Kaiyu Xie
75057cd036
Update TensorRT-LLM (#2333)
* Update TensorRT-LLM
---------
Co-authored-by: Puneesh Khanna <puneesh.khanna@tii.ae>
Co-authored-by: Ethan Zhang <26497102+ethnzhng@users.noreply.github.com>
2024-10-15 15:28:40 +08:00