Commit Graph

  • 15a968ecdb Merge 138c367ddb into 693fb1ccf1 Yixian Chen 2026-04-22 09:20:38 +0800
  • 00ec16c61f Merge 01c04f519b into 693fb1ccf1 guosj 2026-04-21 15:19:56 +0000
  • 693fb1ccf1 [update] readme master jingyaogong 2026-04-21 14:34:46 +0800
  • 5416a44471 [fix] bugs jingyaogong 2026-04-21 13:03:34 +0800
  • 1718e9a44d [fix] transformers-5.x jingyaogong 2026-04-19 23:48:54 +0800
  • 5704766352 [update] tie embedding jingyaogong 2026-04-19 21:57:28 +0800
  • 1ea113ea2c [update] readme jingyaogong 2026-04-19 14:51:47 +0800
  • 48ed5ec8bc fix: OpenAI API SSE compatibility and stream reliability voidborne-d 2026-04-17 00:56:36 +0000
  • ef40a1f271 Some minor changes 翟锦洋 2026-04-12 18:06:52 +0800
  • 487f78754d [update] readme jingyaogong 2026-04-10 10:55:24 +0800
  • e796b8028a [fix] lora compile jingyaogong 2026-04-10 00:01:21 +0800
  • b2488e6440 [update] readme jingyaogong 2026-04-09 19:00:41 +0800
  • 939dc8ff42 [update] readme jingyaogong 2026-04-09 18:54:17 +0800
  • cadacabecb [update] bench jingyaogong 2026-04-09 17:12:48 +0800
  • 5351424bf0 [fix] lora moe jingyaogong 2026-04-09 16:36:48 +0800
  • aa3e6affa1 [update] add a comment jingyaogong 2026-04-09 15:09:58 +0800
  • d399015c04 [update] ignore local datasets and model artifacts wzz 2026-04-05 18:18:48 +0000
  • cfa8247f20 test wzz 2026-04-05 18:17:05 +0000
  • d37bfa9d75 [update] harden training and inference reliability root 2026-04-05 18:11:37 +0000
  • 8432408c70 [update] image docs jingyaogong 2026-04-04 15:01:11 +0800
  • 299facca84 [update] image website jingyaogong 2026-04-04 14:55:03 +0800
  • 25a7edcd6f [update] image jingyaogong 2026-04-04 14:54:13 +0800
  • cacf1d4cd0 [update] readme jingyaogong 2026-04-04 11:25:21 +0800
  • 367838379a fix: repetition_penalty boosts negative-logit tokens instead of suppressing them d 🔹 2026-04-03 12:11:16 +0000
  • 2ab6455d9d [update] open causal jingyaogong 2026-04-02 15:28:58 +0800
  • 9348fde743 [update] readme jingyaogong 2026-04-02 15:28:29 +0800
  • 90cd275524 [update] readme jingyaogong 2026-04-01 14:00:21 +0800
  • b7e0ae21d6 [update] default model jingyaogong 2026-03-31 13:40:16 +0800
  • b1865f75c2 [update] random seed jingyaogong 2026-03-27 21:20:02 +0800
  • 40358702ab Use einops to further improve code readability wizardforcel 2026-03-27 19:22:53 +0800
  • 6b0b0c5e2f [update] fp16 inference jingyaogong 2026-03-27 16:29:46 +0800
  • 138c367ddb [feat] add dapo algorithm exian 2026-03-27 13:24:37 +0800
  • 88e675dc2c [update] image jingyaogong 2026-03-26 15:35:42 +0800
  • b8b3d35257 [update] change default seq_len jingyaogong 2026-03-26 10:09:06 +0800
  • 101d7df2da [update] minimind-3 jingyaogong 2026-03-24 15:59:39 +0800
  • fe7fc29435 Merge b3069d4743 into 83e52f6a27 yuyu5333 2026-03-25 13:08:38 +0800
  • b113b494cb merge redundant forward passes for logps and aux_loss (in train_grpo.py) Dxpsk 2026-03-24 17:15:55 +0800
  • 03a71c9463 Merge daf6cc0c2e into 83e52f6a27 李子浩 2026-03-24 14:22:25 +0800
  • 83e52f6a27 Merge pull request #698 from readlnh/master jingyaogong 2026-03-24 13:41:20 +0800
  • cf4b49a348 [fix] align log/save last-step check and ETA with 1-indexed step readlnh 2026-03-24 02:01:40 +0100
  • d25500d363 [fix] gradient accumulation step alignment readlnh 2026-03-24 01:45:04 +0100
  • f1c141dbdd [update] minimind intro jingyaogong 2026-03-24 00:39:58 +0800
  • c3e83db369 [update] minimind intro jingyaogong 2026-03-24 00:39:40 +0800
  • 0de02a3e6c [update] minimind intro jingyaogong 2026-03-24 00:35:33 +0800
  • a3ba20dc40 [update] minimind new docs jingyaogong 2026-03-24 00:33:58 +0800
  • b301f76f40 Create test branch upwardflow 2026-03-22 11:30:32 +0800
  • 8972dab6f5 fix sft resume with compile mode 王得利 2026-03-19 12:44:53 +0000
  • 50b039485f Add dynamic growth pipeline, eval tooling, and overnight runner Peter Clark 2026-02-23 04:55:45 +0800
  • 93803dfcb6 Merge f5079ce090 into 349e74ec7b LearnMan 2026-02-17 02:16:29 -0800
  • 0305628b3d Merge 22fa685cc4 into 349e74ec7b Bader 2026-02-07 05:40:43 +0800
  • 349e74ec7b [update] empty_think_ratio jingyaogong 2026-02-06 19:15:21 +0800
  • 288e1ac02a [update] empty_think_ratio jingyaogong 2026-02-06 01:36:02 +0800
  • ccc190da05 [feat] data process jingyaogong 2026-02-06 01:17:57 +0800
  • 22fa685cc4 update comments Bader 2026-02-04 23:12:22 +0800
  • db2d948f93 Update train_gated_ppo.py vanking 2026-02-03 10:34:20 +0800
  • 0b37f04f15 Update train_gated_grpo.py vanking 2026-02-03 10:33:27 +0800
  • e84437a8ca [add] Create train_gated_grpo.py vanking 2026-02-03 10:20:56 +0800
  • c540ea2537 [add] created train_gated_ppo.py vanking 2026-02-03 10:17:56 +0800
  • ce9ed24dcd [add] add CISPO (Clipped Importance Sampling Policy Optimization) algorithm vanking 2026-02-02 22:23:27 +0800
  • 7389f64dee [add] add DAPO algorithm (Decoupled Clip and Dynamic sAmpling Policy Optimization) vanking 2026-02-02 22:16:52 +0800
  • 35fe1399c8 Update README.md vanking 2026-02-02 16:33:02 +0800
  • 01c04f519b adjust nonhidden_params learning_rate to 5e-4 guo-sj 2026-02-02 14:48:22 +0800
  • 11a44340ba [update] save interval jingyaogong 2026-01-30 20:30:50 +0800
  • 04616c41a5 [update] safe half jingyaogong 2026-01-30 20:29:31 +0800
  • 020bd44f3f [mod] fix spo algorithm in RLAIF part Your Name 2026-01-30 11:03:35 +0800
  • a2e0fe710f add muon optimizer guo-sj 2026-01-29 10:10:39 +0800
  • c9545c502f Fix wording in RLHF section of README vanking 2026-01-27 20:21:58 +0800
  • 2db1238e31 modified: .gitignore FWJ321 2026-01-19 09:19:01 +0800
  • fea69cf338 [fix] data skip jingyaogong 2026-01-18 16:56:29 +0800
  • f7ffdf1fdb [update] shuffle data jingyaogong 2026-01-18 16:39:34 +0800
  • 3a5aba82db [fix] max length jingyaogong 2026-01-17 13:26:14 +0800
  • 714abcf802 [update] pretrain load jingyaogong 2026-01-17 12:00:17 +0800
  • aa539a824a [update] align mask jingyaogong 2026-01-15 11:20:41 +0800
  • c090b69c4d [update] align loss jingyaogong 2026-01-15 00:56:32 +0800
  • e119db8478 [fix] compile unpack jingyaogong 2026-01-14 20:13:32 +0800
  • 81d24a4f16 [feat] add compile jingyaogong 2026-01-14 14:42:30 +0800
  • 1279a61681 [update] prompt prefill jingyaogong 2026-01-13 17:46:54 +0800
  • 05d0b216f6 [update] show speed jingyaogong 2026-01-07 23:33:47 +0800
  • df89069362 [update] params log jingyaogong 2026-01-07 23:08:45 +0800
  • f55d4c32a0 [update] mask log jingyaogong 2026-01-07 22:12:26 +0800
  • c972c4e090 Fix DPO loss_mask boundary (include first assistant token) xiao-baia 2026-01-07 21:00:46 +0800
  • 20a43d7db0 [update] readme jingyaogong 2026-01-07 00:58:38 +0800
  • 7641985d14 [update] simplify loader jingyaogong 2026-01-06 01:20:52 +0800
  • 0b4a8ad4aa [update] readme jingyaogong 2026-01-06 01:18:10 +0800
  • 07364c3fbe [update] rename train tokenizer jingyaogong 2026-01-06 01:17:33 +0800
  • 9830915d87 [update] readme jingyaogong 2026-01-05 23:15:25 +0800
  • 4e73f34823 [update] rename reason jingyaogong 2026-01-05 23:11:49 +0800
  • a8455ca8a3 [fix] messages num jingyaogong 2026-01-04 11:03:16 +0800
  • 42a4e8c86a [fix] dist cleanup jingyaogong 2026-01-02 22:25:55 +0800
  • 9d898576ac [update] aux loss jingyaogong 2026-01-01 22:37:49 +0800
  • c65335b56f [fix] experts unused jingyaogong 2025-12-31 21:47:04 +0800
  • bc8fd82166 [fix] layers set 8 jingyaogong 2025-12-31 21:06:37 +0800
  • 5dd4df7e18 [fix] moe unused jingyaogong 2025-12-31 21:00:06 +0800
  • 9236260a4a [feat] get params jingyaogong 2025-12-31 20:46:59 +0800
  • 288a1d7212 [feat] get params jingyaogong 2025-12-31 20:44:34 +0800
  • e34bd5c90e docs: clarify pretraining data format in README dieu 2025-12-31 13:39:59 +0800
  • eead9538b2 [feat] update config jingyaogong 2025-12-31 10:29:13 +0800
  • 6242980917 [feat] update lr jingyaogong 2025-12-31 10:27:09 +0800
  • 936d105e9b [feat] compatible tokenizer jingyaogong 2025-12-31 10:26:46 +0800
  • 4a5c9f5ece [feat] stream load data jingyaogong 2025-12-28 16:58:52 +0800