
Commit

Merge branch 'main' of https://github.com/BlinkDL/RWKV-LM into main
www committed Apr 11, 2023
2 parents 55d9aeb + f29e01f commit 11b606f
Showing 2 changed files with 14 additions and 9 deletions.
1 change: 1 addition & 0 deletions .github/FUNDING.yml
@@ -0,0 +1 @@
+ko_fi: rwkv_lm
22 changes: 13 additions & 9 deletions README.md
@@ -6,9 +6,9 @@ RWKV is an RNN with Transformer-level LLM performance, which can also be directl

So it combines the best of RNNs and transformers - **great performance, fast inference, low VRAM use, fast training, "infinite" ctx_len, and free sentence embeddings** (using the final hidden state).
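
As a sketch of that last point, assuming the `rwkv` pip package from the ChatRWKV repo (the model path and token ids below are placeholders):
```
from rwkv.model import RWKV

# Placeholder path: any RWKV-4 Pile checkpoint from https://huggingface.co/BlinkDL
model = RWKV(model='RWKV-4-Pile-430M-20220808-8066', strategy='cpu fp32')

tokens = [510, 3158, 8516]  # placeholder token ids for a sentence
logits, state = model.forward(tokens, None)

# After the final token, `state` (a list of per-layer hidden-state tensors)
# is the "free sentence embedding" - pool or concatenate it as needed.
```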

-**HuggingFace Gradio demo (14B ctx8192)**: https://huggingface.co/spaces/BlinkDL/ChatRWKV-gradio
+HuggingFace Gradio demo (14B ctx8192): https://huggingface.co/spaces/BlinkDL/ChatRWKV-gradio

-Raven (7B finetuned on Alpaca) Demo: https://huggingface.co/spaces/BlinkDL/Raven-RWKV-7B
+**Raven** (7B finetuned on Alpaca and more) Demo: https://huggingface.co/spaces/BlinkDL/Raven-RWKV-7B

**ChatRWKV:** with "stream" and "split" strategies and INT8 quantization. **3 GB of VRAM is enough to run RWKV 14B :)** https://github.com/BlinkDL/ChatRWKV
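
A sketch of how those strategies are selected at load time, assuming the `rwkv` pip package that ChatRWKV uses (the model path is a placeholder; the strategy string format is the one ChatRWKV documents):
```
from rwkv.model import RWKV

# 'fp16i8' keeps weights in INT8; '*10' pins the first 10 layers on the GPU;
# the trailing '+' streams the remaining layers through VRAM one by one.
model = RWKV(model='RWKV-4-Pile-14B-20230313-ctx8192-test1050',
             strategy='cuda fp16i8 *10+')

# A "split" strategy instead spreads fixed layer ranges across devices:
# strategy='cuda fp16i8 *10 -> cpu fp32'
```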

@@ -108,16 +108,18 @@ prompt = f'\nQ & A\n\nQuestion:\n{qq}\n\nDetailed Expert Answer:\n' # let the mo

https://pypi.org/project/rwkvstic/ a pip package (with 8bit & offload for low VRAM GPUs)

-https://github.com/harrisonvanderbyl/rwkv_chatbot a chatbot
+**https://github.com/saharNooby/rwkv.cpp rwkv.cpp for fast CPU inference**

https://github.com/wfox4/WebChatRWKVv2 WebUI

-https://github.com/Blealtan/RWKV-LM-LoRA LoRA fine-tuning

-https://github.com/hizkifw/WebChatRWKVstic WebUI (WIP)
+https://github.com/harrisonvanderbyl/rwkv_chatbot a chatbot

https://github.com/gururise/rwkv_gradio RWKV Gradio

https://github.com/cryscan/eloise RWKV QQ bot

+https://github.com/Blealtan/RWKV-LM-LoRA LoRA fine-tuning

https://github.com/mrsteyk/RWKV-LM-jax

https://github.com/wozeparrot/tinyrwkv RWKV in tinygrad (nice simple DL framework)
@@ -146,6 +148,8 @@ https://github.com/Pathos14489/RWKVDistributedInference RWKV Distributed Inferen

https://github.com/AXKuhta/rwkv-onnx-dml RWKV ONNX

+https://github.com/saharNooby/rwkv.cpp FP32, FP16 and quantized INT4 inference for CPU using [ggml](https://github.com/ggerganov/ggml)

### Inference

**Run RWKV-4 Pile models:** Download models from https://huggingface.co/BlinkDL. Set TOKEN_MODE = 'pile' in run.py and run it. It's fast even on CPU (the default mode).
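
Concretely, that is a small edit near the top of run.py (MODEL_NAME below is an illustrative placeholder; only TOKEN_MODE is specified above):
```
# in run.py, per the step above
TOKEN_MODE = 'pile'
MODEL_NAME = 'RWKV-4-Pile-430M-20220808-8066'  # hypothetical: a checkpoint from https://huggingface.co/BlinkDL
```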
@@ -176,9 +180,9 @@ python tools/preprocess_data.py --input ./my_data.jsonl --output-prefix ./data/m
```
The jsonl format sample (one line per document):
```
{"meta": {"ID": 101}, "text": "This is the first document."}
{"meta": {"ID": 102}, "text": "Hello\nWorld"}
{"meta": {"ID": 103}, "text": "1+1=2\n1+2=3\n2+2=4"}
{"text": "This is the first document."}
{"text": "Hello\nWorld"}
{"text": "1+1=2\n1+2=3\n2+2=4"}
```
generated by code like this:
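(A minimal sketch: the output file name is a placeholder, and the texts are the sample documents shown above.)
```
import json

texts = ['This is the first document.', 'Hello\nWorld', '1+1=2\n1+2=3\n2+2=4']

# One JSON object per line, matching the jsonl sample above
with open('my_data.jsonl', 'w', encoding='utf-8') as out:
    for text in texts:
        out.write(json.dumps({'text': text}, ensure_ascii=False) + '\n')
```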
