58 Commits

Author SHA1 Message Date
jingyaogong 3f1a7cc25b [fix] ddp exit hang 2026-06-01 17:50:39 +08:00
jingyaogong 4497610ec0 [fix] issue#771 2026-05-19 17:40:03 +08:00
jingyaogong 802c15b2b4 [feat] reduce RL memory 2026-05-06 15:07:28 +08:00
TKiteRunner 10776417aa fix: resolve CUDA OOM in train_grpo.py on GPUs with <=8GB VRAM 2026-05-06 14:14:15 +08:00
jingyaogong bdee223036 [fix] inference bug 2026-05-03 20:48:48 +08:00
jingyaogong 773e451b11 [fix] bugs 2026-04-27 19:16:08 +08:00
jingyaogong 6361510016 [fix] rollout bugs 2026-04-27 17:54:09 +08:00
jingyaogong 5416a44471 [fix] bugs 2026-04-21 13:03:34 +08:00
jingyaogong e796b8028a [fix] lora compile 2026-04-10 00:01:21 +08:00
jingyaogong 5351424bf0 [fix] lora moe 2026-04-09 16:36:48 +08:00
jingyaogong aa3e6affa1 [update] add a comment 2026-04-09 15:09:58 +08:00
jingyaogong b8b3d35257 [update] change default seq_len 2026-03-26 10:09:06 +08:00
jingyaogong 101d7df2da [update] minimind-3 2026-03-25 23:57:45 +08:00
readlnh cf4b49a348 [fix] align log/save last-step check and ETA with 1-indexed step 2026-03-24 02:01:40 +01:00
readlnh d25500d363 [fix] gradient accumulation step alignment 2026-03-24 01:45:04 +01:00
jingyaogong 11a44340ba [update] save interval 2026-01-30 20:30:50 +08:00
jingyaogong fea69cf338 [fix] data skip 2026-01-18 16:56:29 +08:00
jingyaogong f7ffdf1fdb [update] shuffle data 2026-01-18 16:39:34 +08:00
jingyaogong c090b69c4d [update] align loss 2026-01-15 00:56:32 +08:00
jingyaogong e119db8478 [fix] compile unpack 2026-01-14 20:13:32 +08:00
jingyaogong 81d24a4f16 [feat] add compile 2026-01-14 14:42:30 +08:00
jingyaogong df89069362 [update] params log 2026-01-07 23:08:45 +08:00
jingyaogong 07364c3fbe [update] rename train tokenizer 2026-01-06 01:17:33 +08:00
jingyaogong 4e73f34823 [update] rename reason 2026-01-05 23:12:29 +08:00
jingyaogong 42a4e8c86a [fix] dist cleanup 2026-01-02 22:25:55 +08:00
jingyaogong 9d898576ac [update] aux loss 2026-01-01 22:41:46 +08:00
jingyaogong c65335b56f [fix] experts unused 2025-12-31 21:47:04 +08:00
jingyaogong bc8fd82166 [fix] layers set 8 2025-12-31 21:06:37 +08:00
jingyaogong 5dd4df7e18 [fix] moe unused 2025-12-31 21:00:06 +08:00
jingyaogong 9236260a4a [feat] get params 2025-12-31 20:46:59 +08:00
jingyaogong 288a1d7212 [feat] get params 2025-12-31 20:44:34 +08:00
jingyaogong 6242980917 [feat] update lr 2025-12-31 10:27:09 +08:00
jingyaogong 7eae14f3ce [feat] remove empty_cache 2025-12-27 07:14:36 +08:00
jingyaogong 11b962da06 [feat] explicit left padding 2025-12-23 18:59:48 +08:00
jingyaogong fe24501602 [feat] adjust seq length 2025-12-14 20:41:58 +08:00
jingyaogong 5129f0e2a2 [fix] dtype & lr 2025-12-09 13:01:38 +08:00
dyhuachi bf3878ace8 [fix] Refactor get_lr function to include min_lr calculation
The annealing schedule here makes the starting lr 1.1 times the lr passed in the arguments; the following change was made to fix that.
2025-12-06 17:09:51 +08:00
jingyaogong 5e1447b913 [fix] cuda memory #559 2025-12-01 16:17:43 +08:00
jingyaogong 6b86ea399a [feat] release memory 2025-11-27 19:39:49 +08:00
jingyaogong d7f4f4eab8 [fix] ppo mask 2025-11-19 23:39:02 +08:00
jingyaogong 9c98cabc9a [fix] prompt length calculate 2025-11-15 18:25:37 +08:00
jingyaogong 509d8dacf1 [feat] clear cache 2025-11-06 13:12:28 +08:00
jingyaogong 0323815729 [feat] update import 2025-10-31 23:45:55 +08:00
jingyaogong bf123b585d [feat] add args 2025-10-30 10:05:12 +08:00
jingyaogong 1713c24114 [fix] model device 2025-10-29 10:36:28 +08:00
jingyaogong acd5925193 [feat] update trainer 2025-10-29 00:52:37 +08:00
jingyaogong 8f7e07b8ef [feat] update trainer 2025-10-28 23:30:10 +08:00
jingyaogong e8484874f5 [feat] pause-training 2025-10-26 18:49:52 +08:00
jingyaogong a82526da11 [feat] shuffle data 2025-10-23 20:13:28 +08:00
jingyaogong 805744e60a [fix] loss-issues-430 2025-10-23 19:08:42 +08:00
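
Note on bf3878ace8: the message reports that the annealing schedule started at 1.1 times the configured lr and that the fix adds a min_lr calculation. The diff is not shown in this listing, so the sketch below is only an illustration under assumptions: the pre-fix formula (a fixed lr/10 floor added to a full-amplitude cosine term), the `min_lr_ratio` parameter, and both function names are hypothetical, chosen because that form reproduces the 1.1x starting value described.

```python
import math


def get_lr_old(step, total_steps, lr):
    # Assumed pre-fix form: a floor of lr/10 is ADDED to a cosine term
    # whose amplitude is the full lr, so step 0 yields lr/10 + lr = 1.1 * lr.
    return lr / 10 + 0.5 * lr * (1 + math.cos(math.pi * step / total_steps))


def get_lr_new(step, total_steps, lr, min_lr_ratio=0.1):
    # Hypothetical fix: anneal between lr and min_lr, so the cosine term
    # only spans (lr - min_lr) and step 0 yields exactly lr.
    min_lr = lr * min_lr_ratio
    cosine = 0.5 * (1 + math.cos(math.pi * step / total_steps))
    return min_lr + (lr - min_lr) * cosine


if __name__ == "__main__":
    base_lr, total = 1e-3, 100
    for step in (0, 50, 100):
        print(step, get_lr_old(step, total, base_lr), get_lr_new(step, total, base_lr))
```

Under these assumptions both versions end at the same floor (lr/10); only the starting value and the shape in between differ.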