mirror of https://github.com/jingyaogong/minimind.git (synced 2026-05-03 12:52:34 +00:00)
[update] readme
@@ -34,7 +34,7 @@
 * This open-source project aims to train an ultra-small language model, MiniMind (~64M parameters), entirely from scratch for only 3 CNY in cost and 2 hours of training time.
 * The MiniMind series is extremely lightweight; the smallest main-line version is roughly $\frac{1}{2700}$ the size of GPT-3, so that even an ordinary personal GPU can quickly complete training and reproduction.
 * The project also open-sources a minimalist large-model architecture and the complete training pipeline, with full-process code covering MoE, data cleaning, pretraining (Pretrain), supervised fine-tuning (SFT), LoRA, RLHF (DPO), RLAIF (PPO / GRPO / CISPO), Tool Use, Agentic RL, adaptive thinking, and model distillation.
-* MiniMind has also been extended with a visual multimodal version [MiniMind-V](https://github.com/jingyaogong/minimind-v), a diffusion language model (MiniMind-dLM), and a linear model (MiniMind-Linear); see [Discussion](https://github.com/jingyaogong/minimind/discussions) for details.
+* MiniMind has also been extended with a visual model [MiniMind-V](https://github.com/jingyaogong/minimind-v), a multimodal Omni model [MiniMind-O](https://github.com/jingyaogong/minimind-o), a diffusion language model (MiniMind-dLM), and a linear model (MiniMind-Linear); see [Discussion](https://github.com/jingyaogong/minimind/discussions) for details.
 * All of the project's core algorithm code is implemented from scratch in native PyTorch, without relying on high-level abstractions provided by third-party libraries.
 * This is not only a full-stage open-source LLM reproduction project, but also a tutorial for getting started with and practicing LLMs.
 * We hope this project can offer a reproducible, understandable, and extensible starting point for more people, to share the joy of creation and advance the broader AI community.
@@ -34,7 +34,7 @@
 * This open-source project aims to train an ultra-small language model MiniMind with approximately 64M parameters entirely from scratch, using only 3 CNY in cost and 2 hours of training time.
 * The MiniMind series is extremely lightweight, with the smallest version on the main branch being approximately $\frac{1}{2700}$ the size of GPT-3, striving to enable even ordinary personal GPUs to quickly complete training and reproduction.
 * The project also open-sources the minimalist structure and complete training pipeline of large models, covering the entire process code for MoE, data cleaning, Pretraining, Supervised Fine-Tuning (SFT), LoRA, RLHF (DPO), RLAIF (PPO / GRPO / CISPO), Tool Use, Agentic RL, Adaptive Thinking, and Model Distillation.
-* MiniMind has also been extended to a visual multimodal version [MiniMind-V](https://github.com/jingyaogong/minimind-v), a diffusion language model (MiniMind-dLM), and a linear attention model (MiniMind-Linear); see [Discussion](https://github.com/jingyaogong/minimind/discussions) for details.
+* MiniMind has also been extended to a visual model [MiniMind-V](https://github.com/jingyaogong/minimind-v), a multimodal Omni model [MiniMind-O](https://github.com/jingyaogong/minimind-o), a diffusion language model (MiniMind-dLM), and a linear attention model (MiniMind-Linear); see [Discussion](https://github.com/jingyaogong/minimind/discussions) for details.
 * All core algorithm code in the project is implemented from scratch using native PyTorch, without relying on high-level abstract interfaces provided by third-party libraries.
 * This is not only a full-stage open-source reproduction project for large language models, but also a tutorial oriented towards LLM introduction and practice.
 * We hope this project can provide a reproducible, understandable, and extensible starting point for more people, to share the joy of creation together and promote the progress of the broader AI community.
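The $\frac{1}{2700}$ size claim can be sanity-checked against the ~64M parameter count. A minimal sketch, assuming GPT-3's commonly cited 175B parameter figure (which is not stated in this diff):

```python
# Sanity check: is a 1/2700 slice of GPT-3 about 64M parameters?
gpt3_params = 175e9                      # commonly cited GPT-3 parameter count (assumption)
minimind_params = gpt3_params / 2700     # the README's stated size ratio

print(f"{minimind_params / 1e6:.1f}M")   # prints "64.8M", consistent with ~64M
```

The two figures agree to rounding, so the ratio and the parameter count in the bullets are mutually consistent.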