From 1cc73836d4750272ed3be41ef474078fad5b7d26 Mon Sep 17 00:00:00 2001
From: gongjy <2474590974@qq.com>
Date: Fri, 27 Sep 2024 16:38:18 +0800
Subject: [PATCH] update readme info

---
 README.md    | 5 ++---
 README_en.md | 5 ++---
 2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index d57023a..5515b81 100644
--- a/README.md
+++ b/README.md
@@ -80,7 +80,7 @@ https://github.com/user-attachments/assets/88b98128-636e-43bc-a419-b1b1403c2055
 
 2024-09-27
 
-- 09-27更新pretrain数据集的预处理方式,为了保证文本完整性,放弃预处理成.bin训练的形式(轻微牺牲训练速度)。
+- 👉09-27更新pretrain数据集的预处理方式,为了保证文本完整性,放弃预处理成.bin训练的形式(轻微牺牲训练速度)。
 
 - 目前pretrain预处理后的文件命名为:pretrain_data.csv。
 
@@ -252,8 +252,7 @@ streamlit run fast_inference.py
 
 | minimind tokenizer | 6,400   | 自定义 |
 
-> [!TIP]
-> 2024-09-17更新:为了防止过去的版本歧义&控制体积,minimind所有模型均使用minimind_tokenizer分词,废弃所有mistral_tokenizer版本。
+> 👉2024-09-17更新:为了防止过去的版本歧义&控制体积,minimind所有模型均使用minimind_tokenizer分词,废弃所有mistral_tokenizer版本。
 
 > 尽管minimind_tokenizer长度很小,编解码效率弱于qwen2、glm等中文友好型分词器。
 > 但minimind模型选择了自己训练的minimind_tokenizer作为分词器,以保持整体参数轻量,避免编码层和计算层占比失衡,头重脚轻,因为minimind的词表大小只有6400。

diff --git a/README_en.md b/README_en.md
index fd0f86f..3fc8f35 100644
--- a/README_en.md
+++ b/README_en.md
@@ -87,7 +87,7 @@ We hope this open-source project helps LLM beginners get started quickly!
 
 2024-09-27
 
-- Updated the preprocessing method for the pretrain dataset on 09-27 to ensure text integrity, opting to abandon the preprocessing into .bin training format (slightly sacrificing training speed).
+- 👉Updated the preprocessing method for the pretrain dataset on 09-27 to ensure text integrity, opting to abandon the preprocessing into .bin training format (slightly sacrificing training speed).
 
 - The current filename for the pretrain data after preprocessing is: pretrain_data.csv.
 
@@ -282,8 +282,7 @@ git clone https://github.com/jingyaogong/minimind.git
 
 | minimind tokenizer | 6,400   | Custom |
 
-> [!IMPORTANT]
-> Update on 2024-09-17: To avoid ambiguity from previous versions and control the model size, all Minimind models now
+> 👉Update on 2024-09-17: To avoid ambiguity from previous versions and control the model size, all Minimind models now
   use the Minimind_tokenizer for tokenization, and all versions of the Mistral_tokenizer have been deprecated.
 
 > Although the Minimind_tokenizer has a small length and its encoding/decoding efficiency is weaker compared to
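Editor's note (not part of the patch): the 09-27 change above keeps pretrain text intact in pretrain_data.csv and gives up the pre-packed .bin format, which implies tokenizing during training instead of during preprocessing. Below is a minimal sketch of what such on-the-fly loading could look like; the file paths, the `text` column name, and the 512-token cutoff are assumptions for illustration, not values taken from this diff.

```python
import pandas as pd
from torch.utils.data import Dataset
from transformers import AutoTokenizer


class PretrainCSVDataset(Dataset):
    """Reads raw text from pretrain_data.csv and tokenizes each sample on the fly,
    rather than loading pre-packed token ids from a .bin file."""

    def __init__(self, csv_path="pretrain_data.csv",
                 tokenizer_dir="./model/minimind_tokenizer",  # assumed location
                 max_length=512):                              # assumed cutoff
        self.rows = pd.read_csv(csv_path)          # one document per row
        self.tokenizer = AutoTokenizer.from_pretrained(tokenizer_dir)
        self.max_length = max_length

    def __len__(self):
        return len(self.rows)

    def __getitem__(self, idx):
        text = str(self.rows.iloc[idx]["text"])    # assumed column name
        ids = self.tokenizer(text, truncation=True, max_length=self.max_length,
                             return_tensors="pt")["input_ids"].squeeze(0)
        # standard next-token objective: inputs are ids[:-1], targets are ids[1:]
        return ids[:-1], ids[1:]
```

This matches the trade-off the changelog names: tokenization is now paid per sample at training time (the slight speed loss), in exchange for keeping the original documents whole on disk.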
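The tokenizer note in both READMEs argues that the 6,400-entry minimind_tokenizer keeps the embedding layer from outweighing the compute layers of a very small model. A back-of-the-envelope check of that reasoning follows; the hidden size of 512 and the ~150k comparison vocabulary are assumed round numbers for a small MiniMind-style configuration, not figures stated in this patch.

```python
hidden_size = 512  # assumed embedding width for a small model

for name, vocab in [("minimind tokenizer", 6_400), ("~150k-vocab tokenizer", 150_000)]:
    embed_params = vocab * hidden_size  # token-embedding matrix only
    print(f"{name}: {embed_params / 1e6:.1f}M embedding parameters")

# minimind tokenizer: 3.3M embedding parameters
# ~150k-vocab tokenizer: 76.8M embedding parameters
# A ~150k vocabulary alone would dwarf the rest of a model in the
# tens-of-millions-of-parameters range — the "top-heavy" imbalance
# the README says it wants to avoid.
```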