forked from npuichigo/openai_trtllm
Commit
And use it in Triton chat completions and legacy completions.

To activate, configure a prompt_format for the chat_completions model:

```
[models.chat_completions."Mistral-7B-Instruct-v0.2"]
...
prompt_format = "mistral"
```

This will look for templates in /etc/ai-router/templates. The template for chat completions should go in the chat subdirectory, and the template for legacy completions should go in the completions subdirectory.

Example templates for Mistral-7B-Instruct-v0.2 (exclude the ```):

Chat, based on the template from the Hugging Face Hub, which only supports the user and assistant roles, to be placed in /etc/ai-router/templates/chat/mistral.j2:

```
{%- set bos_token = '<s>' -%}
{% set eos_token = '</s>' -%}
{{ bos_token -}}
{%- for message in messages -%}
{% if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}
{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}
{% endif -%}
{% if message['role'] == 'user' -%}
{{ '[INST] ' + message['content'] + ' [/INST]' -}}
{% elif message['role'] == 'assistant' -%}
{{ ' ' + message['content'] + eos_token -}}
{% else -%}
{{ raise_exception('Only user and assistant roles are supported!') }}
{% endif -%}
{% endfor %}
```

Modified version of the above template that injects a system prompt before the first user prompt:

```
{%- set bos_token = '<s>' -%}
{% set eos_token = '</s>' -%}
{% set mod = 0 -%}
{% set system = '' -%}
{{ bos_token -}}
{%- for message in messages -%}
{% if (message['role'] == 'system' and loop.index0 == 0) -%}
{% set mod = 1 -%}
{% set system = message['content'] %}
{% else -%}
{% if (message['role'] == 'user') != (loop.index0 % 2 == mod) -%}
{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}
{% endif -%}
{% if message['role'] == 'user' -%}
{% if system and system | length > 0 and loop.index0 == 1 -%}
{{ '[INST] ' + system + '\n' + message['content'] + ' [/INST]' -}}
{% else -%}
{{ '[INST] ' + message['content'] + ' [/INST]' -}}
{% endif %}
{% elif message['role'] == 'assistant' -%}
{{ ' ' + message['content'] + eos_token -}}
{% else -%}
{{ raise_exception('Only user and assistant roles are supported!') }}
{% endif -%}
{% endif -%}
{% endfor -%}
```

Legacy completions do not support roles, so a much simpler template can be used, in /etc/ai-router/templates/completions/mistral.j2:

```
[INST]
{% for message in messages -%}
{{ message -}}
{% endfor %}
[/INST]
```

As we use chat_completions models in the config for both chat completions and legacy completions, configuring a prompt_format for a model requires you to place a template file for both chat completions and legacy completions in the expected location. If one of them is missing, ai-router will not start. The error message should point out why:

    Error: config file validation failed: model `meta-llama/Llama-2-70b-chat-hf` has prompt_format configured but template legacy completions (/etc/ai-router/templates/completions/llama.j2) is missing

If you wish to enable only chat completions for a model and disable legacy completions, you can do so by raising an exception in the legacy completions template:

```
{{ raise_exception('Legacy completions are disabled for this model') }}
```

Closes: #4
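The lookup rule described in the commit message (one template per prompt_format in each of the chat and completions subdirectories, with startup failing if either is missing) can be sketched in Rust using only the standard library. This is an illustrative sketch, not the code from this commit: the function names `expected_templates` and `validate_templates` are hypothetical, and the error text is only modeled on the message quoted above.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper (not from the commit): builds the two template paths
/// that a given `prompt_format` is expected to resolve to.
fn expected_templates(template_dir: &Path, prompt_format: &str) -> [(&'static str, PathBuf); 2] {
    let file = format!("{prompt_format}.j2");
    [
        ("chat completions", template_dir.join("chat").join(&file)),
        ("legacy completions", template_dir.join("completions").join(&file)),
    ]
}

/// Hypothetical validation step: returns an error naming the first template
/// file that does not exist, in the style of the message quoted above.
fn validate_templates(model: &str, template_dir: &Path, prompt_format: &str) -> Result<(), String> {
    for (kind, path) in expected_templates(template_dir, prompt_format) {
        if !path.is_file() {
            return Err(format!(
                "model `{model}` has prompt_format configured but template {kind} ({}) is missing",
                path.display()
            ));
        }
    }
    Ok(())
}

fn main() {
    // The directory is the one named in the commit message.
    let dir = Path::new("/etc/ai-router/templates");
    match validate_templates("Mistral-7B-Instruct-v0.2", dir, "mistral") {
        Ok(()) => println!("templates found"),
        Err(e) => eprintln!("Error: config file validation failed: {e}"),
    }
}
```

Checking both paths at config validation time, rather than on the first request, matches the behavior described above: a missing template stops the router from starting instead of surfacing later as a request failure.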
Showing 13 changed files with 160 additions and 62 deletions.
@@ -6,5 +6,6 @@ pub mod routes;
 pub mod startup;
 mod state;
 pub mod telemetry;
+mod templater;
 mod tokenizers;
 mod utils;