diff --git a/images/LLM-structure-moe.jpg b/images/LLM-structure-moe.jpg
new file mode 100644
index 0000000..ac45ff1
Binary files /dev/null and b/images/LLM-structure-moe.jpg differ
diff --git a/images/LLM-structure-moe.png b/images/LLM-structure-moe.png
deleted file mode 100644
index 4588477..0000000
Binary files a/images/LLM-structure-moe.png and /dev/null differ
diff --git a/images/LLM-structure.jpg b/images/LLM-structure.jpg
new file mode 100644
index 0000000..2fe5a6b
Binary files /dev/null and b/images/LLM-structure.jpg differ
diff --git a/images/LLM-structure.png b/images/LLM-structure.png
deleted file mode 100755
index bbd93dd..0000000
Binary files a/images/LLM-structure.png and /dev/null differ
diff --git a/images/minimind-3.gif b/images/minimind-3.gif
new file mode 100644
index 0000000..131d762
Binary files /dev/null and b/images/minimind-3.gif differ
diff --git a/images/minimind2.gif b/images/minimind2.gif
deleted file mode 100644
index 43c9cd1..0000000
Binary files a/images/minimind2.gif and /dev/null differ
diff --git a/index.html b/index.html
index 2b43dbf..d415c5d 100644
--- a/index.html
+++ b/index.html
@@ -528,7 +528,7 @@

- Train a 26M ChatBot from zero.
+ Train a 64M ChatBot from zero.
2 hours. ¥3. One 3090.
That's it.

@@ -537,7 +537,7 @@
- 26M
+ 64M

Parameters

@@ -549,7 +549,7 @@

Cost

- 1/7000
+ 1/2700

vs GPT-3

@@ -563,11 +563,11 @@

📦 Full Stack

- Complete pipeline: Tokenizer → Pretrain → SFT → LoRA → PPO/GRPO/SPO
+ Complete pipeline: Tokenizer → Pretrain → SFT → LoRA → DPO → PPO/GRPO/CISPO → Agentic RL

🔬 Latest RL

- PPO, GRPO, SPO + YaRN length extrapolation. Native PyTorch implementation.
+ PPO, GRPO, CISPO + Agentic RL + YaRN length extrapolation. Native PyTorch.

📖 Learn by Reading

@@ -575,11 +575,11 @@

🔌 Plug & Play

- Compatible with vLLM, ollama, llama.cpp, transformers.
+ Compatible with vLLM, ollama, llama.cpp, SGLang, transformers.

⚡ OpenAI API

- Drop-in replacement for FastGPT, Open-WebUI, Dify.
+ Drop-in replacement for FastGPT, Open-WebUI, Dify. Tool Calling & Adaptive Thinking.

@@ -598,23 +598,16 @@
- MiniMind2-Small
- 26M
- 512
+ MiniMind-3
+ 64M
+ 768
8
~0.5 GB
- MiniMind2
- 104M
+ MiniMind-3-MoE
+ 198M / A64M
768
- 16
- ~1.0 GB
-
-
- MiniMind2-MoE
- 145M
- 640
8
~1.0 GB
@@ -627,20 +620,37 @@
- 🔥 2025-10-24 (Latest)
+ 🔥 2026-03-20 (Latest)
+
+
+ • 🔥 Release minimind-3 / minimind-3-moe: structure, tokenizer, training & inference fully updated
+ • Architecture aligned with Qwen3 / Qwen3-MoE: Dense ~64M, MoE ~198M/A64M
+ • New native Agentic RL script (train_agent.py): multi-turn Tool-Use with GRPO/CISPO
+ • RLAIF / Agentic RL rollout engine decoupled for flexible inference backends
+ • serve_openai_api.py & web_demo.py: reasoning_content / tool_calls / open_thinking
+ • Tokenizer updated (BPE + ByteLevel) with tool call & thinking tokens
+ • LoRA weight merge & export via scripts/convert_model.py
+ • README & architecture diagrams major update
+
+
+
+
+
+
+ ⚙️ 2025-10-24
+
+
+
  • 🔥 RLAIF algorithms: PPO, GRPO, SPO (native PyTorch)
  • Checkpoint resume training: auto-save & cross-GPU recovery
- • RLAIF dataset: rlaif-mini.jsonl (10K samples); Simplified DPO dataset with Chinese data
+ • RLAIF dataset: rlaif-mini.jsonl; Simplified DPO dataset with Chinese data
  • YaRN algorithm for RoPE length extrapolation
  • Adaptive Thinking in reasoning models
  • Tool Calling & Reasoning tags support
- • Complete RLAIF chapter with training curves
  • SwanLab integration (WandB alternative for China)
- • Code standardization & bug fixes
@@ -663,7 +673,7 @@
- 🎉 2025-02-09 (MiniMind2 Release)
+ 🎉 2025-02-09 (MiniMind2)
+
@@ -753,9 +763,9 @@
🎮 Inside MiniMind
- Streamlit Demo
- LLM Structure
- LLM Structure MOE
+ Streamlit Demo
+ LLM Structure
+ LLM Structure MOE
@@ -798,10 +808,10 @@
  • Pure PyTorch—no magic, no black boxes
  • Understand by building, not by reading docs
  • Works on your laptop—no cloud GPUs needed
- • 2025 RLAIF algorithms: PPO, GRPO, SPO
+ • RLAIF algorithms: PPO, GRPO, CISPO + Agentic RL
+ • Tool Calling & Adaptive Thinking built-in
  • OpenAI API compatible—plug into any UI
  • Vision support via MiniMind-V
- • Code you can actually read and modify
  • 💭 "Building a Lego plane beats flying first class."