jingyaogong
|
3f1a7cc25b
|
[fix] ddp exit hang
|
2026-06-01 17:50:39 +08:00 |
|
jingyaogong
|
4a68da72d5
|
[fix] lora dims
|
2026-05-31 13:38:51 +08:00 |
|
jingyaogong
|
4497610ec0
|
[fix] issue#771
|
2026-05-19 17:40:03 +08:00 |
|
jingyaogong
|
9da8e1ab18
|
[update] requirements
|
2026-05-16 18:08:37 +08:00 |
|
jingyaogong
|
dddedc6881
|
[fix] repetition_penalty
|
2026-05-07 19:08:52 +08:00 |
|
jingyaogong
|
802c15b2b4
|
[feat] reduce RL memory
|
2026-05-06 15:07:28 +08:00 |
|
jingyaogong
|
e73a407f7a
|
Merge pull request #759 from TKiteRunner/fix/grpo-oom-reorder-rewards
fix: resolve CUDA OOM in train_grpo.py on GPUs with <=8GB VRAM
|
2026-05-06 15:00:17 +08:00 |
|
TKiteRunner
|
10776417aa
|
fix: resolve CUDA OOM in train_grpo.py on GPUs with <=8GB VRAM
|
2026-05-06 14:14:15 +08:00 |
|
jingyaogong
|
5020dc9dd4
|
[update] readme
|
2026-05-06 13:41:46 +08:00 |
|
jingyaogong
|
bdee223036
|
[fix] inference bug
|
2026-05-03 20:48:48 +08:00 |
|
jingyaogong
|
06d882e4ef
|
[update] readme
|
2026-05-01 12:06:47 +08:00 |
|
jingyaogong
|
da865af63d
|
[update] readme
|
2026-04-28 17:22:25 +08:00 |
|
jingyaogong
|
773e451b11
|
[fix] bugs
|
2026-04-27 19:16:08 +08:00 |
|
jingyaogong
|
6361510016
|
[fix] rollout bugs
|
2026-04-27 17:54:09 +08:00 |
|
jingyaogong
|
d4c6bc5c7e
|
[update] readme
|
2026-04-27 10:59:41 +08:00 |
|
jingyaogong
|
24896cd2c4
|
[update] readme
|
2026-04-24 20:16:06 +08:00 |
|
jingyaogong
|
e2fc397176
|
[update] readme
|
2026-04-24 20:14:10 +08:00 |
|
jingyaogong
|
693fb1ccf1
|
[update] readme
|
2026-04-21 14:34:46 +08:00 |
|
jingyaogong
|
5416a44471
|
[fix] bugs
|
2026-04-21 13:03:34 +08:00 |
|
jingyaogong
|
1718e9a44d
|
[fix] transformers-5.x
|
2026-04-19 23:48:54 +08:00 |
|
jingyaogong
|
5704766352
|
[update] tie embedding
|
2026-04-19 21:57:28 +08:00 |
|
jingyaogong
|
1ea113ea2c
|
[update] readme
|
2026-04-19 14:51:47 +08:00 |
|
jingyaogong
|
487f78754d
|
[update] readme
|
2026-04-10 10:55:24 +08:00 |
|
jingyaogong
|
e796b8028a
|
[fix] lora compile
|
2026-04-10 00:01:21 +08:00 |
|
jingyaogong
|
b2488e6440
|
[update] readme
|
2026-04-09 19:00:41 +08:00 |
|
jingyaogong
|
939dc8ff42
|
[update] readme
|
2026-04-09 18:54:17 +08:00 |
|
jingyaogong
|
cadacabecb
|
[update] bench
|
2026-04-09 17:12:48 +08:00 |
|
jingyaogong
|
5351424bf0
|
[fix] lora moe
|
2026-04-09 16:36:48 +08:00 |
|
jingyaogong
|
aa3e6affa1
|
[update] add a comment
|
2026-04-09 15:09:58 +08:00 |
|
jingyaogong
|
25a7edcd6f
|
[update] image
|
2026-04-04 14:54:13 +08:00 |
|
jingyaogong
|
cacf1d4cd0
|
[update] readme
|
2026-04-04 11:25:21 +08:00 |
|
jingyaogong
|
2ab6455d9d
|
[update] open causal
|
2026-04-02 15:28:58 +08:00 |
|
jingyaogong
|
9348fde743
|
[update] readme
|
2026-04-02 15:28:29 +08:00 |
|
jingyaogong
|
90cd275524
|
[update] readme
|
2026-04-01 14:00:21 +08:00 |
|
jingyaogong
|
b7e0ae21d6
|
[update] default model
|
2026-03-31 13:40:16 +08:00 |
|
jingyaogong
|
b1865f75c2
|
[update] random seed
|
2026-03-27 21:20:02 +08:00 |
|
jingyaogong
|
6b0b0c5e2f
|
[update] fp16 inference
|
2026-03-27 16:29:46 +08:00 |
|
jingyaogong
|
88e675dc2c
|
[update] image
|
2026-03-26 15:35:42 +08:00 |
|
jingyaogong
|
b8b3d35257
|
[update] change default seq_len
|
2026-03-26 10:09:06 +08:00 |
|
jingyaogong
|
101d7df2da
|
[update] minimind-3
|
2026-03-25 23:57:45 +08:00 |
|
jingyaogong
|
83e52f6a27
|
Merge pull request #698 from readlnh/master
[fix] Fix mixed use of 1-indexed step and 0-indexed logic in the training scripts
|
2026-03-24 13:41:20 +08:00 |
|
readlnh
|
cf4b49a348
|
[fix] align log/save last-step check and ETA with 1-indexed step
|
2026-03-24 02:01:40 +01:00 |
|
readlnh
|
d25500d363
|
[fix] gradient accumulation step alignment
|
2026-03-24 01:45:04 +01:00 |
|
jingyaogong
|
349e74ec7b
|
[update] empty_think_ratio
|
2026-02-06 19:15:21 +08:00 |
|
jingyaogong
|
288e1ac02a
|
[update] empty_think_ratio
|
2026-02-06 01:36:02 +08:00 |
|
jingyaogong
|
ccc190da05
|
[feat] data process
|
2026-02-06 01:17:57 +08:00 |
|
jingyaogong
|
11a44340ba
|
[update] save interval
|
2026-01-30 20:30:50 +08:00 |
|
jingyaogong
|
04616c41a5
|
[update] safe half
|
2026-01-30 20:29:31 +08:00 |
|
jingyaogong
|
fea69cf338
|
[fix] data skip
|
2026-01-18 16:56:29 +08:00 |
|
jingyaogong
|
f7ffdf1fdb
|
[update] shuffle data
|
2026-01-18 16:39:34 +08:00 |
|