Skip to content

Commit

Permalink
reinitialize git history
Browse files Browse the repository at this point in the history
  • Loading branch information
richarddwang committed Nov 24, 2023
0 parents commit cdb4533
Show file tree
Hide file tree
Showing 46 changed files with 10,578 additions and 0 deletions.
243 changes: 243 additions & 0 deletions Modules/0. Utility/LangServe & Langchain Templates.ipynb

Large diffs are not rendered by default.

258 changes: 258 additions & 0 deletions Modules/0. Utility/Langsmith.ipynb

Large diffs are not rendered by default.

331 changes: 331 additions & 0 deletions Modules/0. Utility/utilities.ipynb
Original file line number Diff line number Diff line change
@@ -0,0 +1,331 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 基礎設置\n",
"`langchain_setup` 包含了使用 langchain 開發所需的環境和我寫的一些工具。\n",
"\n",
"透過 import 這個套件,會自動讀取 `~/.cache/.env` 或 `langcahin_setup/.env` 中,設定好環境變數,並做好網路連接的設定。"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import langchain_setup"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"另外這個套件可以根據環境變數 `OPENAI_API_TYPE` 的有無或其值來自動切換使用`OpenAI` 和 `AzureOpenAI`,並且依環境變數 `DEFAULT_...MODEL` 來自動設定模型名稱。讓我們私人或公用時都可以用同一份程式碼不用改。"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"<class 'langchain.llms.openai.AzureOpenAI'>\n",
"text-davinci-003\n",
"vin-test\n"
]
}
],
"source": [
"from langchain_setup import OpenAI\n",
"model = OpenAI()\n",
"print(type(model))\n",
"print(model.model_name)\n",
"print(model.deployment_name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 計算成本\n",
"Langchain 有提供一個 context manager `get_openai_callback`。在這個 context manager 的範圍內,其會插入一個 callback 計算使用的 token 數。"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tokens Used: 1340\n",
"\tPrompt Tokens: 27\n",
"\tCompletion Tokens: 1313\n",
"Successful Requests: 1\n",
"Total Cost (USD): $0.0026665\n"
]
}
],
"source": [
"from langchain_setup import ChatOpenAI\n",
"from langchain.callbacks import get_openai_callback\n",
"\n",
"model = ChatOpenAI(n=2)\n",
"\n",
"with get_openai_callback() as cb:\n",
" result = model.predict(\"教練,我想學飛天御劍流。\")\n",
" print(cb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"其設有一個預先寫死的價目表 (USD per one thousand tokens),而我們可以手動調整這個價目表。"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{'ada': 0.0004,\n",
" 'ada-finetuned-legacy': 0.0016,\n",
" 'babbage': 0.0005,\n",
" 'babbage-002-azure-finetuned': 0.0004,\n",
" 'babbage-002-azure-finetuned-completion': 0.0004,\n",
" 'babbage-002-finetuned': 0.0016,\n",
" 'babbage-002-finetuned-completion': 0.0016,\n",
" 'babbage-finetuned-legacy': 0.0024,\n",
" 'code-davinci-002': 0.02,\n",
" 'curie': 0.002,\n",
" 'curie-finetuned-legacy': 0.012,\n",
" 'davinci-002-azure-finetuned': 0.002,\n",
" 'davinci-002-azure-finetuned-completion': 0.002,\n",
" 'davinci-002-finetuned': 0.012,\n",
" 'davinci-002-finetuned-completion': 0.012,\n",
" 'davinci-finetuned-legacy': 0.12,\n",
" 'gpt-3.5-turbo': 0.0015,\n",
" 'gpt-3.5-turbo-0301': 0.0015,\n",
" 'gpt-3.5-turbo-0301-completion': 0.002,\n",
" 'gpt-3.5-turbo-0613': 0.0015,\n",
" 'gpt-3.5-turbo-0613-completion': 0.002,\n",
" 'gpt-3.5-turbo-0613-finetuned': 0.012,\n",
" 'gpt-3.5-turbo-0613-finetuned-completion': 0.016,\n",
" 'gpt-3.5-turbo-16k': 0.003,\n",
" 'gpt-3.5-turbo-16k-0613': 0.003,\n",
" 'gpt-3.5-turbo-16k-0613-completion': 0.004,\n",
" 'gpt-3.5-turbo-16k-completion': 0.004,\n",
" 'gpt-3.5-turbo-completion': 0.002,\n",
" 'gpt-3.5-turbo-instruct': 0.0015,\n",
" 'gpt-3.5-turbo-instruct-completion': 0.002,\n",
" 'gpt-35-turbo': 0.0015,\n",
" 'gpt-35-turbo-0301': 0.0015,\n",
" 'gpt-35-turbo-0301-completion': 0.002,\n",
" 'gpt-35-turbo-0613': 0.0015,\n",
" 'gpt-35-turbo-0613-azure-finetuned': 0.0015,\n",
" 'gpt-35-turbo-0613-azure-finetuned-completion': 0.002,\n",
" 'gpt-35-turbo-0613-completion': 0.002,\n",
" 'gpt-35-turbo-16k': 0.003,\n",
" 'gpt-35-turbo-16k-0613': 0.003,\n",
" 'gpt-35-turbo-16k-0613-completion': 0.004,\n",
" 'gpt-35-turbo-16k-completion': 0.004,\n",
" 'gpt-35-turbo-completion': 0.002,\n",
" 'gpt-35-turbo-instruct': 0.0015,\n",
" 'gpt-35-turbo-instruct-completion': 0.002,\n",
" 'gpt-4': 0.03,\n",
" 'gpt-4-0314': 0.03,\n",
" 'gpt-4-0314-completion': 0.06,\n",
" 'gpt-4-0613': 0.03,\n",
" 'gpt-4-0613-completion': 0.06,\n",
" 'gpt-4-32k': 0.06,\n",
" 'gpt-4-32k-0314': 0.06,\n",
" 'gpt-4-32k-0314-completion': 0.12,\n",
" 'gpt-4-32k-0613': 0.06,\n",
" 'gpt-4-32k-0613-completion': 0.12,\n",
" 'gpt-4-32k-completion': 0.12,\n",
" 'gpt-4-completion': 0.06,\n",
" 'text-ada-001': 0.0004,\n",
" 'text-babbage-001': 0.0005,\n",
" 'text-curie-001': 0.002,\n",
" 'text-davinci-002': 0.02,\n",
" 'text-davinci-003': 0.02}\n"
]
}
],
"source": [
"from langchain.callbacks.openai_info import MODEL_COST_PER_1K_TOKENS\n",
"from pprint import pprint\n",
"\n",
"pprint(MODEL_COST_PER_1K_TOKENS)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Tokens Used: 1433\n",
"\tPrompt Tokens: 27\n",
"\tCompletion Tokens: 1406\n",
"Successful Requests: 1\n",
"Total Cost (USD): $0.0028525\n"
]
}
],
"source": [
"MODEL_COST_PER_1K_TOKENS[model.model_name] = 0.0001\n",
"with get_openai_callback() as cb:\n",
" result = model.predict(\"教練,我想學飛天御劍流。\")\n",
" print(cb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Cache"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"在某些情況,我們會發現自己重複在用同樣的 input 做同樣生成 (generation),這時候我們就可以用到快取 (caching) 來節省我們的成本。\n",
"\n",
"譬如下面舉個例子,要對兩篇很長的文章做一個摘要,這兩篇長到塞不下一個 prompt (超過模型可處理的方法),於是我們想出一個辦法,先對這兩篇文章各自先產出一個摘要,再總結這兩個摘要成為最終的摘要。在下面這個例子中,針對文章個別產出摘要的第一階段可能就會有重複做的問題。"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"摘要:這個摘要描述了一種強化版的術式「蒼」,它能夠從正無限轉變為負無限,並利用強勁的吸引力引起物體之間的碰撞,同時實現瞬間移動。通過結合術式順轉「蒼」和術式反轉「赫」,虛式「茈」能夠發射出具有假想質量和黑洞般吞噬力量的能量。\n",
"Tokens Used: 743\n",
"\tPrompt Tokens: 431\n",
"\tCompletion Tokens: 312\n",
"Successful Requests: 3\n",
"Total Cost (USD): $0.0012705\n",
"===============================================================================\n",
"這些描述涉及超越現有物理學理論的概念,無法用正統物理學的觀點做出總結。\n",
"Tokens Used: 245\n",
"\tPrompt Tokens: 194\n",
"\tCompletion Tokens: 51\n",
"Successful Requests: 1\n",
"Total Cost (USD): $0.000393\n"
]
}
],
"source": [
"import langchain\n",
"from langchain_setup import ChatOpenAI\n",
"from langchain.callbacks import get_openai_callback\n",
"from langchain.cache import InMemoryCache, SQLiteCache\n",
"\n",
"langchain.llm_cache = InMemoryCache()\n",
"\n",
"cache_llm = ChatOpenAI(cache=True)\n",
"non_cache_llm = ChatOpenAI(cache=False) # default\n",
"\n",
"texts = [\n",
" \"術式順轉「蒼」:強化版無下限術式,能夠從「正無限」變換為「負無限」,用強勁的吸引力讓物體和物體互相吸收產生碰撞,也可以用這招來實現瞬間移動。\",\n",
" \"虛式「茈」:結合術式順轉「蒼」和術式反轉「赫」,讓正負的無限之力進行衝突,再將其中產生的假想質量發射出去,力量強大到像是黑洞般能夠吞噬一切。\"\n",
"]\n",
"prompt_template1 = \"請將文章「{text}」摘要成一句話\"\n",
"prompt_template2 = \"請將摘要們「{text}」進一步摘要成一個摘要\"\n",
"\n",
"with get_openai_callback() as cb:\n",
" summary_1 = cache_llm.predict(prompt_template1.format(text=texts[0]))\n",
" summary_2 = cache_llm.predict(prompt_template1.format(text=texts[1]))\n",
"\n",
" summary_of_summary_1 = non_cache_llm.predict(prompt_template2.format(text=summary_1 + '\\n' + summary_2))\n",
" print(summary_of_summary_1)\n",
" print(cb)\n",
"\n",
"print(\"===============================================================================\")\n",
"\n",
"new_prompt_template2 = \"請將摘要們「{text}」以正統物理學的觀點用一句話做出總結\"\n",
"\n",
"with get_openai_callback() as cb:\n",
" summary_1 = cache_llm.predict(prompt_template1.format(text=texts[0]))\n",
" summary_2 = cache_llm.predict(prompt_template1.format(text=texts[1]))\n",
"\n",
" summary_of_summary_2 = non_cache_llm.predict(new_prompt_template2.format(text=summary_1 + '\\n' + summary_2))\n",
" print(summary_of_summary_2)\n",
" print(cb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"將文件初步摘要的步驟沒有變,只有將產生的摘要進行再摘要的 prompt 變了,因此可以看到第二次執行時,只有一個做再摘要的呼叫 (request),其它則是從快取 (cache) 取用。\n",
"\n",
"而其快取 (cache) 內部其實就是將該呼叫 (request) 的 input prompt, 模型, 設定等等作為一個 key,輸出的生成內容作為 value,的一個大型 `dict`\n",
"\n",
"這邊還有另外一個點在於,如果你是想要每次的生成內容有點隨機性 (temperature>0),caching 會直接無視你想增加隨機性的設定,"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{('[{\"lc\": 1, \"type\": \"constructor\", \"id\": [\"langchain\", \"schema\", \"messages\", \"HumanMessage\"], \"kwargs\": {\"content\": \"\\\\u8acb\\\\u5c07\\\\u6587\\\\u7ae0\\\\u300c\\\\u865b\\\\u5f0f\\\\u300c\\\\u8308\\\\u300d\\\\uff1a\\\\u7d50\\\\u5408\\\\u8853\\\\u5f0f\\\\u9806\\\\u8f49\\\\u300c\\\\u84bc\\\\u300d\\\\u548c\\\\u8853\\\\u5f0f\\\\u53cd\\\\u8f49\\\\u300c\\\\u8d6b\\\\u300d\\\\uff0c\\\\u8b93\\\\u6b63\\\\u8ca0\\\\u7684\\\\u7121\\\\u9650\\\\u4e4b\\\\u529b\\\\u9032\\\\u884c\\\\u885d\\\\u7a81\\\\uff0c\\\\u518d\\\\u5c07\\\\u5176\\\\u4e2d\\\\u7522\\\\u751f\\\\u7684\\\\u5047\\\\u60f3\\\\u8cea\\\\u91cf\\\\u767c\\\\u5c04\\\\u51fa\\\\u53bb\\\\uff0c\\\\u529b\\\\u91cf\\\\u5f37\\\\u5927\\\\u5230\\\\u50cf\\\\u662f\\\\u9ed1\\\\u6d1e\\\\u822c\\\\u80fd\\\\u5920\\\\u541e\\\\u566c\\\\u4e00\\\\u5207\\\\u3002\\\\u300d\\\\u6458\\\\u8981\\\\u6210\\\\u4e00\\\\u53e5\\\\u8a71\"}}]', '{\"lc\": 1, \"type\": \"constructor\", \"id\": [\"langchain\", \"chat_models\", \"azure_openai\", \"AzureChatOpenAI\"], \"kwargs\": {\"deployment_name\": \"gpt-turbo-test-20230330\", \"cache\": true, \"openai_api_type\": \"azure\", \"openai_api_version\": \"2023-07-01-preview\", \"openai_api_base\": \"https://oai-it-test.privatelink.openai.azure.com\", \"openai_api_key\": {\"lc\": 1, \"type\": \"secret\", \"id\": [\"OPENAI_API_KEY\"]}}}---[(\\'stop\\', None)]'): [ChatGeneration(text='透過結合術式順轉「蒼」和術式反轉「赫」,虛式「茈」能夠發射出假想質量並具有黑洞般的吞噬力量。', generation_info={'finish_reason': 'stop'}, message=AIMessage(content='透過結合術式順轉「蒼」和術式反轉「赫」,虛式「茈」能夠發射出假想質量並具有黑洞般的吞噬力量。'))],\n",
" ('[{\"lc\": 1, \"type\": \"constructor\", \"id\": [\"langchain\", \"schema\", \"messages\", \"HumanMessage\"], \"kwargs\": {\"content\": \"\\\\u8acb\\\\u5c07\\\\u6587\\\\u7ae0\\\\u300c\\\\u8853\\\\u5f0f\\\\u9806\\\\u8f49\\\\u300c\\\\u84bc\\\\u300d\\\\uff1a\\\\u5f37\\\\u5316\\\\u7248\\\\u7121\\\\u4e0b\\\\u9650\\\\u8853\\\\u5f0f\\\\uff0c\\\\u80fd\\\\u5920\\\\u5f9e\\\\u300c\\\\u6b63\\\\u7121\\\\u9650\\\\u300d\\\\u8b8a\\\\u63db\\\\u70ba\\\\u300c\\\\u8ca0\\\\u7121\\\\u9650\\\\u300d\\\\uff0c\\\\u7528\\\\u5f37\\\\u52c1\\\\u7684\\\\u5438\\\\u5f15\\\\u529b\\\\u8b93\\\\u7269\\\\u9ad4\\\\u548c\\\\u7269\\\\u9ad4\\\\u4e92\\\\u76f8\\\\u5438\\\\u6536\\\\u7522\\\\u751f\\\\u78b0\\\\u649e\\\\uff0c\\\\u4e5f\\\\u53ef\\\\u4ee5\\\\u7528\\\\u9019\\\\u62db\\\\u4f86\\\\u5be6\\\\u73fe\\\\u77ac\\\\u9593\\\\u79fb\\\\u52d5\\\\u3002\\\\u300d\\\\u6458\\\\u8981\\\\u6210\\\\u4e00\\\\u53e5\\\\u8a71\"}}]', '{\"lc\": 1, \"type\": \"constructor\", \"id\": [\"langchain\", \"chat_models\", \"azure_openai\", \"AzureChatOpenAI\"], \"kwargs\": {\"deployment_name\": \"gpt-turbo-test-20230330\", \"cache\": true, \"openai_api_type\": \"azure\", \"openai_api_version\": \"2023-07-01-preview\", \"openai_api_base\": \"https://oai-it-test.privatelink.openai.azure.com\", \"openai_api_key\": {\"lc\": 1, \"type\": \"secret\", \"id\": [\"OPENAI_API_KEY\"]}}}---[(\\'stop\\', None)]'): [ChatGeneration(text='強化版的術式「蒼」能夠從正無限轉變為負無限,利用強勁的吸引力讓物體互相吸收產生碰撞,並能實現瞬間移動。', generation_info={'finish_reason': 'stop'}, message=AIMessage(content='強化版的術式「蒼」能夠從正無限轉變為負無限,利用強勁的吸引力讓物體互相吸收產生碰撞,並能實現瞬間移動。'))]}\n"
]
}
],
"source": [
"pprint(langchain.llm_cache._cache)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.12"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}
Loading

0 comments on commit cdb4533

Please sign in to comment.