From 00d145c481c1c5b80c7c850fc7dc194e52cbf0dc Mon Sep 17 00:00:00 2001
From: jingyaogong
Date: Sat, 26 Apr 2025 10:21:34 +0800
Subject: [PATCH] update readme

---
 README.md    | 5 ++---
 README_en.md | 8 +++-----
 2 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index 21f949b..c92ac0f 100644
--- a/README.md
+++ b/README.md
@@ -130,7 +130,6 @@
 - Refactored the generate method to inherit from the GenerationMixin class.
 - 🔥 Supports popular third-party ecosystems such as llama.cpp, vllm, and ollama.
 - Standardized code and directory structure.
-- 🔥 Update: training code for PPO and GRPO implemented from scratch.
 - Changed vocabulary tokens: ``->`<|im_start|><|im_end|>`
 ```text
 To stay compatible with the third-party inference frameworks llama.cpp and vllm, this update comes at a considerable cost.
@@ -510,12 +509,12 @@ quality (of course it is still not truly high; improving data quality is a never-ending task).

 ---

-## Ⅷ Dataset Download
+## Ⅷ MiniMind Training Datasets

 > [!NOTE]
 > Since 2025-02-05, all datasets used for MiniMind's final training have been open-sourced, so there is no need to preprocess large-scale datasets yourself, which avoids repetitive data processing work.

-MiniMind training datasets ([ModelScope](https://www.modelscope.cn/datasets/gongjy/minimind_dataset/files) | [HuggingFace](https://huggingface.co/datasets/jingyaogong/minimind_dataset/tree/main))
+MiniMind training datasets download link: [HuggingFace](https://huggingface.co/datasets/jingyaogong/minimind_dataset/tree/main)

 > No need to clone everything; you can download only the files you need

diff --git a/README_en.md b/README_en.md
index 1412244..7550da2 100644
--- a/README_en.md
+++ b/README_en.md
@@ -145,9 +145,7 @@ We hope this open-source project can help LLM beginners quickly get started!

 • 🔥 Support for popular third-party ecosystems like llama.cpp, vllm, and ollama.

-• Standardized code and directory structure.
-
-• 🔥 New: Added training code for PPO and GRPO from scratch.
+• Standardized code and directory structure.

 • Updated vocabulary tokens: `` → `<|im_start|><|im_end|>`.

@@ -559,7 +557,7 @@ Big respect!

 ---

-## Ⅷ Dataset Download
+## Ⅷ MiniMind Dataset Download

 > [!NOTE]
 > After `2025-02-05`, MiniMind’s open-source datasets for final training are provided, so there is no need for
@@ -567,7 +565,7 @@ Big respect!

 MiniMind Training Datasets are available for download from:

-Dataset ([ModelScope](https://www.modelscope.cn/datasets/gongjy/minimind_dataset/files) | [HuggingFace](https://huggingface.co/datasets/jingyaogong/minimind_dataset/tree/main))
+MiniMind Dataset ([HuggingFace](https://huggingface.co/datasets/jingyaogong/minimind_dataset/tree/main))

 > You don’t need to clone everything, just download the necessary files.