WIP: Prep 0.21.0 (vegu-ai#83)
* cleanup

* refactor clean_dialogue

* prompt fixes

* prompt fixes

* conversation format types - movie script and chat (legacy)

* stopping strings updated

* mistral.ai client

* prompt tweaks

* mistral client return token counts

* anthropic client

* archive history emits whole object so we can inspect time stamps

* show timestamp in history dialog

* openai compat fixes: stop trying to coerce the openai url path schema and never attempt to retrieve the model name automatically, hopefully improving compatibility with the various openai api implementations across the board

* openai compat client let api control prompt template via config option

* fix custom client configs and implement max backscroll

* fix backscroll limit

* remove debug message

* prep 0.21.0

* include model name in prompt template selection label

* use tabs for side nav in app config modal

* readme / docs

* fix issue where "No API key set" could be persisted as the selected model name to the config

* deepinfra example

* linting
vegu-ai-tools authored Mar 10, 2024
1 parent 2f07248 commit abdfb1a
Showing 36 changed files with 1,415 additions and 715 deletions.
92 changes: 69 additions & 23 deletions README.md
@@ -7,16 +7,21 @@ Roleplay with AI with a focus on strong narration and consistent world and game
|![Screenshot 4](docs/img/0.17.0/ss-4.png)|![Screenshot 1](docs/img/0.19.0/Screenshot_15.png)|
|![Screenshot 2](docs/img/0.19.0/Screenshot_16.png)|![Screenshot 3](docs/img/0.19.0/Screenshot_17.png)|

> :warning: **It does not run any large language models itself but relies on existing APIs. Currently supports OpenAI, text-generation-webui and LMStudio. 0.18.0 also adds support for generic OpenAI api implementations, but generation quality on that will vary.**
> :warning: **It does not run any large language models itself but relies on existing APIs. Currently supports OpenAI, Anthropic, mistral.ai, and self-hosted text-generation-webui and LMStudio. 0.18.0 also adds support for generic OpenAI API implementations, but generation quality on those will vary.**
This means you need to either have:
- an [OpenAI](https://platform.openai.com/overview) api key
- setup local (or remote via runpod) LLM inference via:
- [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui)
- [LMStudio](https://lmstudio.ai/)
- Any other OpenAI api implementation that implements the v1/completions endpoint
- tested llamacpp with the `api_like_OAI.py` wrapper
- let me know if you have tested any other implementations and they failed / worked or landed somewhere in between
Officially supported APIs:
- [OpenAI](https://platform.openai.com/overview)
- [Anthropic](https://www.anthropic.com/)
- [mistral.ai](https://mistral.ai/)

Officially supported self-hosted APIs:
- [oobabooga/text-generation-webui](https://github.com/oobabooga/text-generation-webui) (local or with runpod support)
- [LMStudio](https://lmstudio.ai/)

Generic OpenAI API implementations (tested and confirmed working; a minimal connection sketch follows this list):
- [DeepInfra](https://deepinfra.com/) - see [instructions](https://github.com/vegu-ai/talemate/issues/78#issuecomment-1986884304)
- [llamacpp](https://github.com/ggerganov/llama.cpp) with the `api_like_OAI.py` wrapper
- Let me know if you have tested any other implementations and whether they worked, failed, or landed somewhere in between
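
If you want to verify that a generic OpenAI-compatible endpoint responds before wiring it into Talemate, a minimal sketch with the official `openai` Python package looks roughly like this. The URL, key, and model name are placeholders for whatever your server expects:

```python
# Minimal sketch: query any OpenAI-compatible completions endpoint outside of Talemate.
# Assumptions (adjust to your setup): the server listens on http://localhost:8081/v1,
# exposes v1/completions, and does not require a real API key.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8081/v1",  # e.g. llamacpp's api_like_OAI.py wrapper
    api_key="not-needed",                 # most local servers ignore the key
)

response = client.completions.create(
    model="local-model",                  # many local servers ignore the model name
    prompt="Say hello in one short sentence.",
    max_tokens=32,
)

print(response.choices[0].text)
```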

## Current features

@@ -78,8 +83,9 @@ Please read the documents in the `docs` folder for more advanced configuration a
- [Installation](#installation)
- [Connecting to an LLM](#connecting-to-an-llm)
- [Text-generation-webui](#text-generation-webui)
- [Recommended Models](#recommended-models)
- [OpenAI](#openai)
- [Recommended Models](#recommended-models)
- [OpenAI / mistral.ai / Anthropic](#openai)
- [DeepInfra via OpenAI Compatible client](#deepinfra-via-openai-compatible-client)
- [Ready to go](#ready-to-go)
- [Load the introductory scenario "Infinity Quest"](#load-the-introductory-scenario-infinity-quest)
- [Loading character cards](#loading-character-cards)
@@ -118,61 +124,101 @@ There is also a [troubleshooting guide](docs/troubleshoot.md) that might help.
1. Start the backend: `python src/talemate/server/run.py runserver --host 0.0.0.0 --port 5050`.
1. Open a new terminal, navigate to the `talemate_frontend` directory, and start the frontend server by running `npm run serve`.

## Connecting to an LLM
# Connecting to an LLM

On the right-hand side, click the "Add Client" button. If there is no button, you may need to toggle the client options by clicking this button:

![Client options](docs/img/client-options-toggle.png)

### Text-generation-webui
![No clients](docs/img/0.21.0/no-clients.png)

## Text-generation-webui

> :warning: As of version 0.13.0 the legacy text-generation-webui API `--extension api` is no longer supported; please use their new `--extension openai` API implementation instead.

In the modal, if you're planning to connect to text-generation-webui, you can likely leave everything as is and just click Save.

![Add client modal](docs/img/client-setup-0.13.png)
![Add client modal](docs/img/0.21.0/text-gen-webui-setup.png)

### Specifying the correct prompt template

For good results it is **vital** that the correct prompt template is specified for whichever model you have loaded.

Talemate does come with a set of pre-defined templates for some popular models, but going forward, due to the sheer number of models released every day, understanding and specifying the correct prompt template is something you should familiarize yourself with.

If the text-gen-webui client shows a yellow triangle next to it, it means that the prompt template is not set, and it is currently using the default `VICUNA` style prompt template.

![Default prompt template](docs/img/0.21.0/prompt-template-default.png)

Click the two cogwheels to the right of the triangle to open the client settings.

![Client settings](docs/img/0.21.0/select-prompt-template.png)

You can first try clicking the `DETERMINE VIA HUGGINGFACE` button; depending on the model's README file, it may be able to determine the correct prompt template for you (basically, the README needs to contain an example of the template).

#### Recommended Models
If that doesn't work, you can manually select the prompt template from the dropdown.

As of 2024.02.06 my personal regular drivers (the ones i test with) are:
For `bartowski_Nous-Hermes-2-Mistral-7B-DPO-exl2_8_0`, that is `ChatML`: select it from the dropdown and click `Save`.

![Client settings](docs/img/0.21.0/selected-prompt-template.png)
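
For reference, ChatML-style prompts wrap each turn in special tokens, which is quite different from the default `VICUNA` style. A rough illustration of the format is shown below; Talemate applies the actual template for you once it is selected, so you never write this by hand:

```
<|im_start|>system
You are a narrator for an interactive story.<|im_end|>
<|im_start|>user
Describe the bridge of the starship.<|im_end|>
<|im_start|>assistant
```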

### Recommended Models

As of 2024.03.07, my personal regular drivers (the ones I test with) are:

- Kunoichi-7B
- sparsetral-16x7B
- Nous-Hermes-2-SOLAR-10.7B
- Nous-Hermes-2-Mistral-7B-DPO
- brucethemoose_Yi-34B-200K-RPMerge
- dolphin-2.7-mixtral-8x7b
- rAIfle_Verdict-8x7B
- Mixtral-8x7B-instruct
- GPT-3.5-turbo 0125
- GPT-4-turbo 0116

That said, any of the top models in any of the size classes here should work well (I wouldn't recommend going lower than 7B):

https://www.reddit.com/r/LocalLLaMA/comments/18yp9u4/llm_comparisontest_api_edition_gpt4_vs_gemini_vs/

### OpenAI
## OpenAI / mistral.ai / Anthropic

The setup is the same for all three; the example below is for OpenAI.

If you want to add an OpenAI client, just change the client type and select the appropriate model.

![Add client modal](docs/img/add-client-modal-openai.png)
![Add client modal](docs/img/0.21.0/openai-setup.png)

If you are setting this up for the first time, you should now see the client, but it will have a red dot next to it, indicating that it requires an API key.

![OpenAI API Key missing](docs/img/0.18.0/openai-api-key-1.png)

Click the `SET API KEY` button. This will open a modal where you can enter your API key.

![OpenAI API Key missing](docs/img/0.18.0/openai-api-key-2.png)
![OpenAI API Key missing](docs/img/0.21.0/openai-add-api-key.png)

Click `Save` and after a moment the client should have a green dot next to it, indicating that it is ready to go.

![OpenAI API Key set](docs/img/0.18.0/openai-api-key-3.png)

## DeepInfra via OpenAI Compatible client

You can use the OpenAI compatible client to connect to [DeepInfra](https://deepinfra.com/).

![DeepInfra](docs/img/0.21.0/deepinfra-setup.png)

```
API URL: https://api.deepinfra.com/v1/openai
```

Models on DeepInfra that work well with Talemate:

- [mistralai/Mixtral-8x7B-Instruct-v0.1](https://deepinfra.com/mistralai/Mixtral-8x7B-Instruct-v0.1) (max context 32k, 8k recommended)
- [cognitivecomputations/dolphin-2.6-mixtral-8x7b](https://deepinfra.com/cognitivecomputations/dolphin-2.6-mixtral-8x7b) (max context 32k, 8k recommended)
- [lizpreciatior/lzlv_70b_fp16_hf](https://deepinfra.com/lizpreciatior/lzlv_70b_fp16_hf) (max context 4k)
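
If you'd like to confirm your DeepInfra token and model choice outside of Talemate first, a minimal sketch with the official `openai` Python package against the same base URL might look like this. It assumes you have exported your token in the `DEEPINFRA_API_KEY` environment variable:

```python
# Minimal sketch: query DeepInfra's OpenAI-compatible endpoint directly.
# Assumes DEEPINFRA_API_KEY is set in your environment.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepinfra.com/v1/openai",
    api_key=os.environ["DEEPINFRA_API_KEY"],
)

response = client.chat.completions.create(
    model="mistralai/Mixtral-8x7B-Instruct-v0.1",
    messages=[{"role": "user", "content": "Give me a one-sentence story hook."}],
    max_tokens=64,
)

print(response.choices[0].message.content)
```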

## Ready to go

You will know you are good to go when the client and all the agents have a green dot next to them.

![Ready to go](docs/img/client-setup-complete.png)
![Ready to go](docs/img/0.21.0/ready-to-go.png)

## Load the introductory scenario "Infinity Quest"

Binary file added docs/img/0.21.0/deepinfra-setup.png
Binary file added docs/img/0.21.0/no-clients.png
Binary file added docs/img/0.21.0/openai-add-api-key.png
Binary file added docs/img/0.21.0/openai-setup.png
Binary file added docs/img/0.21.0/prompt-template-default.png
Binary file added docs/img/0.21.0/ready-to-go.png
Binary file added docs/img/0.21.0/select-prompt-template.png
Binary file added docs/img/0.21.0/selected-prompt-template.png
Binary file added docs/img/0.21.0/text-gen-webui-setup.png
