Merge pull request THUDM#62 from GanymedeNil/main

Add some parameter support in web demo
kwang1971 · Mar 17, 2023 · ecd2857
2 parents fc4ac83 + 702c2ca
commit ecd2857
Showing 11 changed files with 135 additions and 8 deletions.
diff --git a/.github/ISSUE_TEMPLATE/bug_report.yaml b/.github/ISSUE_TEMPLATE/bug_report.yaml
@@ -0,0 +1,63 @@
+name: 🐞 Bug/Help
+description: File a bug/issue
+title: "[BUG/Help] <title>"
+labels: []
+body:
+- type: checkboxes
+  attributes:
+    label: Is there an existing issue for this?
+    description: Please search to see if an issue already exists for the bug you encountered.
+    options:
+    - label: I have searched the existing issues
+      required: true
+- type: textarea
+  attributes:
+    label: Current Behavior
+    description: | 
+      A concise description of what you're experiencing, with screenshot attached if possible.
+      Tip: You can attach images or log files by clicking this area to highlight it and then dragging files in.
+  validations:
+    required: true
+- type: textarea
+  attributes:
+    label: Expected Behavior
+    description: A concise description of what you expected to happen.
+  validations:
+    required: false
+- type: textarea
+  attributes:
+    label: Steps To Reproduce
+    description: Steps to reproduce the behavior.
+    placeholder: |
+      1. In this environment...
+      2. With this config...
+      3. Run '...'
+      4. See error...
+  validations:
+    required: true
+- type: textarea
+  attributes:
+    label: Environment
+    description: |
+      examples:
+        - **OS**: Ubuntu 20.04
+        - **Python**: 3.8
+        - **Transformers**: 4.26.1
+        - **PyTorch**: 1.12
+        - **CUDA Support**: True
+    value: |
+        - OS:
+        - Python:
+        - Transformers:
+        - PyTorch:
+        - CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :
+    render: markdown
+  validations:
+    required: true
+- type: textarea
+  attributes:
+    label: Anything else?
+    description: |
+      Links? References? Anything that will give us more context about the issue you are encountering!
+  validations:
+    required: false
diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml
@@ -0,0 +1 @@
+blank_issues_enabled: false
diff --git a/.github/ISSUE_TEMPLATE/feature_request.yml b/.github/ISSUE_TEMPLATE/feature_request.yml
@@ -0,0 +1,26 @@
+name: Feature request
+description: Suggest an idea for this project
+title: "[Feature] <title>"
+labels: []
+body:
+- type: textarea
+  attributes:
+    label: Is your feature request related to a problem? Please describe.
+    description: | 
+      A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
+  validations:
+    required: false
+- type: textarea
+  attributes:
+    label: Solutions
+    description: |
+      Describe the solution you'd like
+      A clear and concise description of what you want to happen.
+  validations:
+    required: true
+- type: textarea
+  attributes:
+    label: Additional context
+    description: Add any other context or screenshots about the feature request here.
+  validations:
+    required: false
diff --git a/README.md b/README.md
@@ -2,9 +2,10 @@
 
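 ## Introduction
 
-ChatGLM-6B is an open-source dialogue language model that supports both Chinese and English. It is based on the [General Language Model (GLM)](https://github.com/THUDM/GLM) architecture and has 6.2 billion parameters. With model quantization, users can deploy it locally on consumer-grade graphics cards (as little as 6GB of GPU memory at the INT4 quantization level). ChatGLM-6B uses technology similar to ChatGPT and is optimized for Chinese question answering and dialogue. After bilingual Chinese-English training on about 1T tokens, supplemented by supervised fine-tuning, feedback bootstrapping, and reinforcement learning from human feedback, the 6.2-billion-parameter ChatGLM-6B can already generate answers that align fairly well with human preferences. For more information, see our [blog](https://chatglm.cn/blog).
+ChatGLM-6B is an open-source dialogue language model that supports both Chinese and English. It is based on the [General Language Model (GLM)](https://github.com/THUDM/GLM) architecture and has 6.2 billion parameters. With model quantization, users can deploy it locally on consumer-grade graphics cards (as little as 6GB of GPU memory at the INT4 quantization level).
+ChatGLM-6B uses technology similar to ChatGPT and is optimized for Chinese question answering and dialogue. After bilingual Chinese-English training on about 1T tokens, supplemented by supervised fine-tuning, feedback bootstrapping, and reinforcement learning from human feedback, the 6.2-billion-parameter ChatGLM-6B can already generate answers that align fairly well with human preferences. For more information, see our [blog](https://chatglm.cn/blog).
 
-Meanwhile, the [ChatGLM model](https://chatglm.cn) built on our hundred-billion-parameter base model is in invitation-only internal testing. The test will be expanded gradually, and applications to join are welcome.
+However, because ChatGLM-6B is relatively small, it is currently known to have quite a few [**limitations**](#limitations), such as factual and mathematical-logic errors, possible generation of harmful or biased content, weak contextual ability, confused self-identity, and responses to English instructions that completely contradict those given for Chinese instructions. Please be aware of these issues before use to avoid misunderstandings.
 
 *Read this in [English](README_en.md).*
 
@@ -46,7 +47,7 @@ ChatGLM-6B is an open-source dialogue language model that supports both Chinese and English, based on
 
 If these methods do not help you fall asleep, consider consulting a doctor or sleep specialist for further advice.
 ```
-The full model implementation is available on the [Hugging Face Hub](https://huggingface.co/THUDM/chatglm-6b).
+The full model implementation is available on the [Hugging Face Hub](https://huggingface.co/THUDM/chatglm-6b). If downloading the checkpoint from the Hugging Face Hub is slow, you can also download it manually from [here](https://cloud.tsinghua.edu.cn/d/fb9f16d6dc8f482596c2/).
 
 ### Demo
 
@@ -107,6 +108,8 @@ model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).bf
 ```
 This requires close to 16 GB of free memory, and inference will be very slow.
 
+If macOS reports `RuntimeError: Unknown platform: darwin`, see this [Issue](https://github.com/THUDM/ChatGLM-6B/issues/6#issuecomment-1470060041).
+
 ## ChatGLM-6B Examples
 
 The following are some example screenshots obtained with `web_demo.py`. More possibilities of ChatGLM-6B are waiting for you to explore!
@@ -163,6 +166,34 @@ model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).bf
 
 </details>
 
+## Limitations
+
+Because of ChatGLM-6B's small scale, its capabilities still have many limitations. Below are some of the problems we have found so far:
+
+- Small model capacity: The small 6B capacity limits the model's memory and language ability. On many factual-knowledge tasks, ChatGLM-6B may generate incorrect information; it is also weak at logic problems (such as math and programming).
+    <details><summary><b>Click to view examples</b></summary>
+
+    ![](limitations/factual_error.png)
+
+    ![](limitations/math_error.png)
+
+    </details>
+
+- Harmful instructions or biased content: ChatGLM-6B is only a language model preliminarily aligned with human intent, and it may generate harmful or biased content. (The content may be offensive, so it is not shown here.)
+
+- Insufficient English ability: Most of the instructions and responses used to train ChatGLM-6B are in Chinese, with only a very small portion in English. As a result, responses to English instructions are of much lower quality than responses to Chinese ones, may even contradict the content produced for Chinese instructions, and may mix Chinese and English.
+
+- Easily misled, with weak conversational ability: ChatGLM-6B's conversational ability is still fairly weak, its "self-identity" is unreliable, and it is easily misled into making incorrect statements. For example, the current version of the model can drift in its self-identity when misled.
+    <details><summary><b>Click to view examples</b></summary>
+
+    ![](limitations/self-confusion_google.jpg)
+
+    ![](limitations/self-confusion_openai.jpg)
+
+    ![](limitations/self-confusion_tencent.jpg)
+
+    </details>
+
 ## License
 
 The code in this repository is open-sourced under the [Apache-2.0](LICENSE) license, while use of the ChatGLM-6B model weights must comply with the [Model License](MODEL_LICENSE).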

diff --git a/limitations/factual_error.png b/limitations/factual_error.png
diff --git a/limitations/math_error.png b/limitations/math_error.png
diff --git a/limitations/self-confusion_google.jpg b/limitations/self-confusion_google.jpg
diff --git a/limitations/self-confusion_openai.jpg b/limitations/self-confusion_openai.jpg
diff --git a/limitations/self-confusion_tencent.jpg b/limitations/self-confusion_tencent.jpg
diff --git a/requirements.txt b/requirements.txt
@@ -2,3 +2,4 @@ protobuf>=3.19.5,<3.20.1
 transformers>=4.26.1
 icetk
 cpm_kernels
+torch>=1.10
diff --git a/web_demo.py b/web_demo.py
@@ -9,10 +9,11 @@
 MAX_BOXES = MAX_TURNS * 2
 
 
-def predict(input, history=None):
+def predict(input, max_length, top_p, temperature, history=None):
     if history is None:
         history = []
-    response, history = model.chat(tokenizer, input, history)
+    response, history = model.chat(tokenizer, input, history, max_length=max_length, top_p=top_p,
+                                   temperature=temperature)
     updates = []
     for query, response in history:
         updates.append(gr.update(visible=True, value="用户：" + query))
@@ -33,8 +34,12 @@ def predict(input, history=None):
 
     with gr.Row():
         with gr.Column(scale=4):
-            txt = gr.Textbox(show_label=False, placeholder="Enter text and press enter").style(container=False)
+            txt = gr.Textbox(show_label=False, placeholder="Enter text and press enter", lines=11).style(
+                container=False)
         with gr.Column(scale=1):
+            max_length = gr.Slider(0, 4096, value=2048, step=1.0, label="Maximum length", interactive=True)
+            top_p = gr.Slider(0, 1, value=0.7, step=0.01, label="Top P", interactive=True)
+            temperature = gr.Slider(0, 1, value=0.95, step=0.01, label="Temperature", interactive=True)
             button = gr.Button("Generate")
-    button.click(predict, [txt, state], [state] + text_boxes)
-demo.queue().launch(share=True)
+    button.click(predict, [txt, max_length, top_p, temperature, state], [state] + text_boxes)
+demo.queue().launch(share=True, inbrowser=True)
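
Review note on the `web_demo.py` hunk (a sketch, not part of this commit): the new sliders let a user pick `max_length = 0` or `temperature = 0`, and a slider may return an integer where a float is expected. Samplers commonly divide logits by the temperature, so a value of 0 is likely to be rejected or to misbehave; that is an assumption about the generation backend, not something this diff confirms. One possible guard is to normalise the three values before they reach `model.chat`. The helper name `sanitize_sampling_params` and the `min_temperature` floor of 0.01 are hypothetical choices made for illustration only.

```python
def sanitize_sampling_params(max_length, top_p, temperature, min_temperature=0.01):
    """Coerce raw slider values into ranges a sampler can plausibly accept.

    Hypothetical helper: it is not part of web_demo.py or of any library.
    """
    # Sliders may hand back ints or floats, so normalise the types first.
    max_length = max(1, int(max_length))
    # Keep top_p inside the closed interval [0, 1].
    top_p = min(1.0, max(0.0, float(top_p)))
    # Assumption: the sampler divides logits by temperature, so 0 is unusable.
    temperature = max(min_temperature, float(temperature))
    return max_length, top_p, temperature


if __name__ == "__main__":
    # Extreme slider positions: everything at its minimum or maximum.
    print(sanitize_sampling_params(0, 1, 0))  # (1, 1.0, 0.01)
```

Inside `predict`, the call would then look like `max_length, top_p, temperature = sanitize_sampling_params(max_length, top_p, temperature)` placed before `model.chat(...)`. Raising the slider minimums in the UI would be an alternative; the helper has the advantage of also covering callers that bypass the sliders.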