Mirror of https://github.com/jingyaogong/minimind.git, synced 2026-06-06 00:04:50 +00:00
[update] readme
@@ -1203,12 +1203,21 @@ $$
**Training method**:

```bash
# ① Default: use torch for rollout
# Method 1: multi-process launch via torchrun (replace N with the number of processes per node)
torchrun --nproc_per_node N train_agent.py
# Method 2: single process
python train_agent.py
```

```bash
# ② Use sglang for rollout
# Start the sglang server first:
python -m sglang.launch_server --model-path ./minimind-3 --attention-backend triton --host 0.0.0.0 --port 8998
# Example training command:
python train_agent.py --rollout_engine sglang --sglang_base_url http://localhost:8998 --sglang_shared_path ./ckpt_mm --data_path ../dataset/agent_rl_math.jsonl --use_wandb
```

> By default, the trained model weights are saved every `save_interval` steps as `agent_*.pth`.

@@ -1202,12 +1202,21 @@ Here, tool call legality, `gt` hits, format closure, unfinished penalty, and Rew
**Training method**:

```bash
# ① Default: use torch for rollout
# Method 1: multi-process launch via torchrun (replace N with the number of processes per node)
torchrun --nproc_per_node N train_agent.py
# Method 2: single process
python train_agent.py
```
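
The choice between the two launch methods can be scripted. The following is a minimal sketch, assuming one process per GPU and that `nvidia-smi` is available on the machine; `build_train_cmd` and `count_gpus` are hypothetical helper names, not part of the repository.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of minimind): print the launch command
# for a given process count. Defaults to a single process.
build_train_cmd() {
  local n="${1:-1}"
  if [ "$n" -gt 1 ]; then
    # More than one process: use torchrun (Method 1)
    echo "torchrun --nproc_per_node $n train_agent.py"
  else
    # Zero or one process: plain python (Method 2)
    echo "python train_agent.py"
  fi
}

# Assumption: `nvidia-smi -L` prints one line starting with "GPU " per device.
# Prints 0 when nvidia-smi is missing or reports no devices.
count_gpus() {
  if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi -L 2>/dev/null | grep -c '^GPU '
  else
    echo 0
  fi
}
```

Running `build_train_cmd "$(count_gpus)"` prints the command rather than executing it, so it can be inspected before launch.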

```bash
# ② Use sglang for rollout
# Start the sglang server first:
python -m sglang.launch_server --model-path ./minimind-3 --attention-backend triton --host 0.0.0.0 --port 8998
# Example training command:
python train_agent.py --rollout_engine sglang --sglang_base_url http://localhost:8998 --sglang_shared_path ./ckpt_mm --data_path ../dataset/agent_rl_math.jsonl --use_wandb
```
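
Because the sglang server has to be up before training starts, a script that runs both steps may want to wait for the port first. The sketch below is one possible approach, assuming bash (it relies on bash's `/dev/tcp` redirection); `wait_for_port` is a hypothetical helper, not part of minimind or sglang. An open port only shows that something is listening, not that the model has finished loading.

```bash
#!/usr/bin/env bash

# Hypothetical helper: poll a TCP port once per second until it accepts
# a connection or the timeout (in seconds, default 60) is reached.
# Returns 0 if the port opened, 1 if the timeout expired.
wait_for_port() {
  local host="$1" port="$2" timeout="${3:-60}" waited=0
  # The subshell opens the connection on fd 3 and closes it on exit.
  until (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}
```

With the server from the block above started in the background, `wait_for_port localhost 8998 300 && python train_agent.py --rollout_engine sglang --sglang_base_url http://localhost:8998` (followed by the remaining flags shown above) would start training only once port 8998 accepts connections.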

> By default, the trained model weights are saved every `save_interval` steps as `agent_*.pth`.

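
To pick up the most recent checkpoint after a run, something like the following could be used. This is a sketch only: the source does not say which directory the files are written to, so the directory is passed in as an argument, and `latest_agent_ckpt` is a hypothetical helper name.

```bash
#!/usr/bin/env bash

# Hypothetical helper: print the most recently modified agent_*.pth file
# in the given directory (default: current directory).
# Prints nothing if no matching file exists.
latest_agent_ckpt() {
  local dir="${1:-.}"
  # ls -t sorts by modification time, newest first
  ls -t "$dir"/agent_*.pth 2>/dev/null | head -n 1
}
```

The helper sorts by modification time rather than by file name, since the exact naming after the `agent_` prefix is not specified here.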