Let's talk about converting and runtime #42

Open
80Builder80 opened this issue May 11, 2024 · 0 comments

It seems that some important information has been left out of the documentation.

Apparently, changes need to be made to both tokenizer_config.json and main.cpp.

Before converting the model, tokenizer_config.json needs to be modified. Specifically, "eos_token": "<|end_of_text|>" should be changed to "eos_token": "<|end|>".
What about "bos_token": "<|begin_of_text|>"? Should it be changed to "bos_token": "<|start|>"? Are there any other parts of the tokenizer configuration that need to be modified?
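The eos_token change described above can be sketched as a plain text substitution. This is only an illustration: `replace_first` and `patch_eos_token` are hypothetical helper names, and the code assumes the file contains the key and value with exactly the spacing quoted in this issue. It does not answer whether bos_token or other entries also need to change.

```cpp
#include <string>

// Replace the first occurrence of `from` in `text` with `to`.
// Returns true when a replacement was made, false when `from` was not found.
bool replace_first(std::string& text, const std::string& from,
                   const std::string& to) {
    const std::string::size_type pos = text.find(from);
    if (pos == std::string::npos) return false;
    text.replace(pos, from.size(), to);
    return true;
}

// Hypothetical helper: applies the eos_token change quoted in this issue to
// the raw text of tokenizer_config.json. Assumes the exact spacing
// `"eos_token": "<|end_of_text|>"`; other formatting is left untouched.
bool patch_eos_token(std::string& config_text) {
    return replace_first(config_text,
                         "\"eos_token\": \"<|end_of_text|>\"",
                         "\"eos_token\": \"<|end|>\"");
}
```

A caller would read the file into a string, call `patch_eos_token`, and write the result back only when it returns true. Editing the file by hand is equivalent.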

The rkllm-runtime/examples/rkllm_api_demo/src/main.cpp file also needs to be modified. The definitions
#define PROMPT_TEXT_PREFIX "<|im_start|>system You are a helpful assistant. <|im_end|> <|im_start|>user"
#define PROMPT_TEXT_POSTFIX "<|im_end|><|im_start|>assistant"
should be changed to
#define PROMPT_TEXT_PREFIX "<|user|>"
#define PROMPT_TEXT_POSTFIX "<|end|><|assistant|>"
What about the system prompt?
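To make the system prompt question concrete, here is a minimal sketch of how the new markers would wrap a user turn, assuming the demo concatenates prefix, input, and postfix in that order. `build_prompt` and `build_prompt_with_system` are hypothetical helpers, and the `<|system|>` marker is an assumption that mirrors the style of the other markers. Neither the repo nor this issue confirms it, so it should be checked against the model's chat template.

```cpp
#include <string>

// Markers proposed in this issue.
#define PROMPT_TEXT_PREFIX "<|user|>"
#define PROMPT_TEXT_POSTFIX "<|end|><|assistant|>"

// ASSUMPTION: "<|system|>" is a guess modeled on the markers above; verify it
// against the model's chat template before relying on it.
#define PROMPT_SYSTEM_PREFIX "<|system|>"
#define PROMPT_TURN_END "<|end|>"

// Wrap one user turn as prefix + input + postfix.
std::string build_prompt(const std::string& user_input) {
    return std::string(PROMPT_TEXT_PREFIX) + user_input + PROMPT_TEXT_POSTFIX;
}

// Hypothetical variant: prepend a system turn when one is supplied,
// otherwise fall back to the plain user-turn prompt.
std::string build_prompt_with_system(const std::string& system_text,
                                     const std::string& user_input) {
    if (system_text.empty()) return build_prompt(user_input);
    return std::string(PROMPT_SYSTEM_PREFIX) + system_text + PROMPT_TURN_END +
           build_prompt(user_input);
}
```

With these definitions, the old default system text ("You are a helpful assistant.") could be passed as `system_text` instead of being hard-coded into the prefix macro.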

If the tokenizer needs to be modified, shouldn't the README say so? And if main.cpp is not correct, shouldn't it be fixed in the repo?

There is an ongoing discussion about this topic on Reddit:
https://www.reddit.com/r/RockchipNPU/comments/1cpngku/rknnllm_v101_lets_talk_about_converting_and/
