Skip to content

Commit

Permalink
Update README.md
Browse files Browse the repository at this point in the history
  • Loading branch information
BlinkDL authored Apr 29, 2023
1 parent 2f57660 commit 91f8c0a
Showing 1 changed file with 10 additions and 10 deletions.
20 changes: 10 additions & 10 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -40,6 +40,16 @@ A cool paper (Spiking Neural Network) using RWKV: https://github.com/ridgerchu/S

**RWKV in 150 lines** (model, inference, text generation): https://github.com/BlinkDL/ChatRWKV/blob/main/RWKV_in_150_lines.py

**Cool Community RWKV Projects (check them!)**:

https://github.com/saharNooby/rwkv.cpp INT4 INT8 FP16 FP32 inference for CPU using [ggml](https://github.com/ggerganov/ggml)

https://github.com/harrisonvanderbyl/rwkv-cpp-cuda pure CUDA RWKV (no need for python & pytorch)

https://github.com/Blealtan/RWKV-LM-LoRA LoRA fine-tuning

More RWKV projects: https://github.com/search?o=desc&q=rwkv&s=updated&type=Repositories

ChatRWKV with RWKV 14B ctx8192:

![RWKV-chat](RWKV-chat.png)
Expand Down Expand Up @@ -104,16 +114,6 @@ Here is a great prompt for testing Q&A of LLMs. Works for any model: (found by m
prompt = f'\nQ & A\n\nQuestion:\n{qq}\n\nDetailed Expert Answer:\n' # let the model generate after this
```

**Cool Community RWKV Projects (check them!)**:

https://github.com/saharNooby/rwkv.cpp FP32, FP16 and quantized INT4 inference for CPU using [ggml](https://github.com/ggerganov/ggml)

https://github.com/harrisonvanderbyl/rwkv-cpp-cuda pure CUDA RWKV (no need for python & pytorch)

https://github.com/Blealtan/RWKV-LM-LoRA LoRA fine-tuning

More RWKV projects: https://github.com/search?o=desc&q=rwkv&s=updated&type=Repositories

### Inference

**Run RWKV-4 Pile models:** Download models from https://huggingface.co/BlinkDL. Set TOKEN_MODE = 'pile' in run.py and run it. It's fast even on CPU (the default mode).
Expand Down

0 comments on commit 91f8c0a

Please sign in to comment.