Commit

Created using Colaboratory
sixhj committed Feb 2, 2024
1 parent dd3f70d commit bfbf72c
Showing 1 changed file with 30 additions and 2 deletions.
32 changes: 30 additions & 2 deletions hf_1.ipynb
@@ -4,7 +4,7 @@
"metadata": {
"colab": {
"provenance": [],
"authorship_tag": "ABX9TyNFVzh/5zbgq1bUOenZGdND",
"authorship_tag": "ABX9TyMu5NbqFVzEvM+nty/tjPA9",
"include_colab_link": true
},
"kernelspec": {
@@ -1398,10 +1398,38 @@
"<a href=\"https://colab.research.google.com/github/sixhj/read/blob/main/hf_1.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
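{
"cell_type": "markdown",
"source": [
"# Hugging Face notes\n",
"pipeline:\n",
"\n",
"## AutoTokenizer\n",
"\n",
"A tokenizer preprocesses text, converting it into an array of numbers that is fed to the model. Several rules govern the tokenization process, including how to split words and at what level to split them (see the tokenizer summary to learn more about tokenization). The most important thing to remember is that the tokenizer must be instantiated with the same model name as the model, so that it uses the same tokenization rules the model was trained with.\n",
"\n",
"Load the tokenizer:\n",
"```\n",
"from transformers import AutoTokenizer\n",
"tokenizer = AutoTokenizer.from_pretrained(model_name)\n",
"```\n",
"\n",
"## AutoModel\n",
"Transformers provides a simple, unified way to load pretrained instances. This means you can load an [`AutoModel`] the same way you load an [`AutoTokenizer`]. The only difference is choosing the correct [`AutoModel`] for your task. For text (or sequence) classification, you should load [`AutoModelForSequenceClassification`]:\n",
"\n",
"```\n",
"from transformers import AutoModelForSequenceClassification\n",
"pt_model = AutoModelForSequenceClassification.from_pretrained(model_name)\n",
"```\n",
"\n"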
],
"metadata": {
"id": "eEqxUAQAosBc"
}
},
{
"cell_type": "code",
"source": [
"!pip install transformers"
"!pip install transformers\n"
],
"metadata": {
"id": "htx_GWM7n-kU"