Commit 1a6d1b0

New Blog post and updates for Sagemaker inference (huggingface#122)
* added new documentation links and quote for new inference post
* added deploy Hugging Face models easily with Amazon SageMaker blog
* Update the-partnership-amazon-sagemaker-and-hugging-face.md
  Co-authored-by: Jeff Boudier <[email protected]>
* Update the-partnership-amazon-sagemaker-and-hugging-face.md
  Co-authored-by: Jeff Boudier <[email protected]>
* Update deploy-hugging-face-models-easily-with-amazon-sagemaker.md
  Co-authored-by: Jeff Boudier <[email protected]>
* Update deploy-hugging-face-models-easily-with-amazon-sagemaker.md
  Co-authored-by: Jeff Boudier <[email protected]>
* Update deploy-hugging-face-models-easily-with-amazon-sagemaker.md
  Co-authored-by: Jeff Boudier <[email protected]>
* comment post in and changed pip version

Co-authored-by: Jeff Boudier <[email protected]>
1 parent 3ef29d5 commit 1a6d1b0

3 files changed

+331
-1
lines changed

_blog.yml

+4
@@ -160,3 +160,7 @@
160160
- local: sentence-transformers-in-the-hub
161161
title: "Sentence Transformers in the 🤗 Hub"
162162
date: June 28, 2021
163+
164+
- local: deploy-hugging-face-models-easily-with-amazon-sagemaker
165+
title: "Deploy Hugging Face models easily with Amazon SageMaker"
166+
date: July 8, 2021
@@ -0,0 +1,313 @@
1+
---
2+
title: 'Deploy Hugging Face models easily with Amazon SageMaker'
3+
thumbnail: /blog/assets/17_the_partnership_amazon_sagemaker_and_hugging_face/thumbnail.png
4+
---
5+
6+
<img src="/blog/assets/17_the_partnership_amazon_sagemaker_and_hugging_face/cover.png" alt="hugging-face-and-aws-logo" class="w-full">
7+
8+
9+
# **Deploy Hugging Face models easily with Amazon SageMaker 🏎**
10+
11+
Earlier this year, [we announced a strategic collaboration with Amazon](https://huggingface.co/blog/the-partnership-amazon-sagemaker-and-hugging-face) to make it easier for companies to use Hugging Face in Amazon SageMaker and ship cutting-edge Machine Learning features faster. We introduced new Hugging Face Deep Learning Containers (DLCs) to [train Hugging Face Transformer models in Amazon SageMaker](https://huggingface.co/transformers/sagemaker.html#getting-started-train-a-transformers-model).
12+
13+
Today, we are excited to share a new inference solution with you that makes it easier than ever to deploy Hugging Face Transformers with Amazon SageMaker! With the new Hugging Face Inference DLCs, you can deploy your trained models for inference with just one more line of code, or select any of the 10,000+ publicly available models from the [Model Hub](https://huggingface.co/models) and deploy them with Amazon SageMaker.
14+
15+
Deploying models in SageMaker provides you with production-ready endpoints that scale easily within your AWS environment, with built-in monitoring and a ton of enterprise features. It's been an amazing collaboration and we hope you will take advantage of it!
16+
17+
Here's how to use the new [SageMaker Hugging Face Inference Toolkit](https://github.com/aws/sagemaker-huggingface-inference-toolkit) to deploy Transformers-based models:
18+
19+
20+
```python
from sagemaker.huggingface import HuggingFaceModel

# create Hugging Face Model Class and deploy it as SageMaker Endpoint
huggingface_model = HuggingFaceModel(...).deploy()
```
26+
27+
28+
That's it! 🚀
29+
30+
To learn more about accessing and using the new Hugging Face DLCs with the Amazon SageMaker Python SDK, check out the guides and resources below.
31+
32+
33+
34+
---
35+
36+
37+
38+
# **Resources, Documentation & Samples 📄**
39+
40+
Below you can find all the important resources for deploying your models to Amazon SageMaker.
41+
42+
43+
## **Blog/Video**
44+
45+
- [Video: Deploy a Hugging Face Transformers Model from S3 to Amazon SageMaker](https://youtu.be/pfBGgSGnYLs)
46+
- [Video: Deploy a Hugging Face Transformers Model from the Model Hub to Amazon SageMaker](https://youtu.be/l9QZuazbzWM)
47+
48+
49+
## **Samples/Documentation**
50+
51+
- [Hugging Face documentation for Amazon SageMaker](https://huggingface.co/docs/sagemaker/main)
52+
- [Deploy models to Amazon SageMaker](https://huggingface.co/docs/sagemaker/inference)
53+
- [Amazon SageMaker documentation for Hugging Face](https://docs.aws.amazon.com/sagemaker/latest/dg/hugging-face.html)
54+
- [Python SDK SageMaker documentation for Hugging Face](https://sagemaker.readthedocs.io/en/stable/frameworks/huggingface/index.html)
55+
- [Deep Learning Container](https://github.com/aws/deep-learning-containers/blob/master/available_images.md#huggingface-training-containers)
56+
- [Notebook: Deploy one of the 10 000+ Hugging Face Transformers to Amazon SageMaker for Inference](https://github.com/huggingface/notebooks/blob/master/sagemaker/11_deploy_model_from_hf_hub/deploy_transformer_model_from_hf_hub.ipynb)
57+
- [Notebook: Deploy a Hugging Face Transformer model from S3 to SageMaker for inference](https://github.com/huggingface/notebooks/blob/master/sagemaker/10_deploy_model_from_s3/deploy_transformer_model_from_s3.ipynb)
58+
59+
60+
---
61+
62+
63+
# **SageMaker Hugging Face Inference Toolkit ⚙️**
64+
65+
In addition to the Hugging Face Transformers-optimized Deep Learning Containers for inference, we have created a new [Inference Toolkit](https://github.com/aws/sagemaker-huggingface-inference-toolkit) for Amazon SageMaker. This new Inference Toolkit leverages the `pipelines` from the `transformers` library to allow zero-code deployments of models, with no code needed for pre- or post-processing. In the "Getting started" section below, you will find two examples of how to deploy your models to Amazon SageMaker.
66+
67+
In addition to the zero-code deployment, the Inference Toolkit supports "bring your own code" methods, where you can override the default methods. You can learn more about "bring your own code" in the documentation [here](https://github.com/aws/sagemaker-huggingface-inference-toolkit#-user-defined-codemodules), or you can check out the sample notebook "deploy custom inference code to Amazon SageMaker".
68+
69+
70+
## **API - Inference Toolkit Description**
71+
72+
Using the `transformers` `pipelines`, we designed an API that makes it easy for you to benefit from all `pipelines` features. The API has an interface similar to the [🤗 Accelerated Inference API](https://api-inference.huggingface.co/docs/python/html/detailed_parameters.html): your inputs need to be defined in the `inputs` key, and any additional supported `pipelines` parameters go in the `parameters` key. Below you can find example requests.
73+
74+
75+
```python
# text-classification request body
{
    "inputs": "Camera - You are awarded a SiPix Digital Camera! call 09061221066 fromm landline. Delivery within 28 days."
}
# question-answering request body
{
    "inputs": {
        "question": "What is used for inference?",
        "context": "My Name is Philipp and I live in Nuremberg. This model is used with sagemaker for inference."
    }
}
# zero-shot classification request body
{
    "inputs": "Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!",
    "parameters": {
        "candidate_labels": [
            "refund",
            "legal",
            "faq"
        ]
    }
}
```
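
As a rough illustration of the request format described above, the following sketch builds a body with the mandatory `inputs` key and an optional `parameters` key, then serializes it to JSON. The helper name `build_request` is hypothetical and not part of the Inference Toolkit; it only mirrors the shape of the example requests.

```python
import json
from typing import Any, Dict, Optional


def build_request(inputs: Any, parameters: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return a request body with the mandatory `inputs` key and optional `parameters`."""
    if inputs is None:
        raise ValueError("`inputs` is required")
    body: Dict[str, Any] = {"inputs": inputs}
    if parameters:
        # Extra pipeline arguments, e.g. candidate labels for zero-shot classification.
        body["parameters"] = parameters
    return body


zero_shot = build_request(
    "I would like to get reimbursed!",
    parameters={"candidate_labels": ["refund", "legal", "faq"]},
)
payload = json.dumps(zero_shot)  # JSON string that could be sent as the request body
print(payload)
```

Keeping the body construction in one place makes it harder to forget the `inputs` key, which every request needs.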
99+
100+
# **Getting started 🧭**
101+
102+
In this guide we will use the new Hugging Face Inference DLCs and Amazon SageMaker Python SDK to deploy two transformer models for inference.
103+
104+
In the first example, we deploy a Hugging Face Transformer model trained in Amazon SageMaker for inference.
105+
106+
In the second example, we directly deploy one of the 10,000+ publicly available Hugging Face Transformers models from the [Model Hub](https://huggingface.co/models) to Amazon SageMaker for inference.
107+
108+
109+
## **Setting up the environment**
110+
111+
We will use an Amazon SageMaker Notebook Instance for the example. You can learn [here how to set up a Notebook Instance](https://docs.aws.amazon.com/sagemaker/latest/dg/nbi.html). To get started, jump into your Jupyter Notebook or JupyterLab and create a new Notebook with the `conda_pytorch_p36` kernel.
112+
113+
**_Note: The use of Jupyter is optional: We could also launch SageMaker API calls from anywhere we have an SDK installed, connectivity to the cloud, and appropriate permissions, such as a Laptop, another IDE, or a task scheduler like Airflow or AWS Step Functions._**
114+
115+
After that we can install the required dependencies.
116+
117+
118+
```bash
pip install "sagemaker>=2.48.0" --upgrade
```
121+
122+
123+
To deploy a model on SageMaker, we need to create a `sagemaker` Session and provide an IAM role with the right permissions. The `get_execution_role` method is provided by the SageMaker SDK as an optional convenience. You can also specify the role by writing the specific role ARN you want your endpoint to use. This IAM role will later be attached to the Endpoint and is used, for example, to download the model from Amazon S3.
124+
125+
126+
```python
import sagemaker

sess = sagemaker.Session()
role = sagemaker.get_execution_role()
```
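
If you write the role ARN by hand instead of calling `get_execution_role`, a quick format check can catch copy-paste mistakes early. The sketch below is only a rough sanity check under the assumption that the value is a standard IAM role ARN; the account id and role name are hypothetical, and the pattern is not an authoritative validator.

```python
import re

# IAM role ARNs look like: arn:<partition>:iam::<12-digit account id>:role/<optional path/><role name>
ROLE_ARN_PATTERN = re.compile(r"^arn:aws[a-z-]*:iam::\d{12}:role/[\w+=,.@/-]+$")


def is_role_arn(value: str) -> bool:
    """Return True if `value` has the general shape of an IAM role ARN."""
    return bool(ROLE_ARN_PATTERN.match(value))


# Hypothetical account id and role name, for illustration only.
role = "arn:aws:iam::111122223333:role/my-sagemaker-execution-role"
print(is_role_arn(role))
```

The check says nothing about whether the role exists or has the permissions the endpoint needs; it only rejects values that are clearly not role ARNs.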
132+
133+
---
134+
135+
## **Deploy a trained Hugging Face Transformer model to SageMaker for inference**
136+
137+
There are two ways to deploy your SageMaker-trained Hugging Face model. You can either deploy it right after your training is finished, or you can deploy it later, using `model_data` to point to your saved model on Amazon S3. In addition to these two options, you can also instantiate Hugging Face endpoints with lower-level tools such as `boto3`, the `AWS CLI`, `Terraform`, and CloudFormation templates.
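
For the lower-level route, a minimal `boto3` sketch could look like the following. It assumes the three standard SageMaker API calls `create_model`, `create_endpoint_config`, and `create_endpoint`; the helper names, the resource name, and all argument values are hypothetical, and the container image URI has to be looked up for your region and framework versions.

```python
from typing import Any, Dict, Optional


def build_deploy_requests(
    name: str,
    image_uri: str,
    model_data_url: str,
    role_arn: str,
    instance_type: str = "ml.m5.xlarge",
    env: Optional[Dict[str, str]] = None,
) -> Dict[str, Dict[str, Any]]:
    """Build the keyword arguments for the three SageMaker API calls."""
    container: Dict[str, Any] = {"Image": image_uri, "ModelDataUrl": model_data_url}
    if env:
        container["Environment"] = env  # environment variables passed to the container
    return {
        "create_model": {
            "ModelName": name,
            "ExecutionRoleArn": role_arn,
            "PrimaryContainer": container,
        },
        "create_endpoint_config": {
            "EndpointConfigName": name,
            "ProductionVariants": [
                {
                    "VariantName": "AllTraffic",
                    "ModelName": name,
                    "InitialInstanceCount": 1,
                    "InstanceType": instance_type,
                }
            ],
        },
        "create_endpoint": {"EndpointName": name, "EndpointConfigName": name},
    }


def deploy_with_boto3(requests: Dict[str, Dict[str, Any]]) -> None:
    """Send the requests; needs boto3, AWS credentials, and SageMaker permissions."""
    import boto3  # imported lazily so the builder above works without AWS access

    client = boto3.client("sagemaker")
    client.create_model(**requests["create_model"])
    client.create_endpoint_config(**requests["create_endpoint_config"])
    client.create_endpoint(**requests["create_endpoint"])
```

Separating the request construction from the API calls lets you review or log exactly what will be created before anything is billed.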
138+
139+
140+
### **Deploy the model directly after training with the Estimator class**
141+
142+
If you deploy your model directly after training, you need to ensure that all required model artifacts are saved in your training script, including the tokenizer and the model. A benefit of deploying directly after training is that SageMaker model container metadata will contain the source training job, providing lineage from training job to deployed model.
143+
144+
145+
```python
from sagemaker.huggingface import HuggingFace

############ pseudo code start ############

# create HuggingFace estimator for running training
huggingface_estimator = HuggingFace(...)

# starting the train job with our uploaded datasets as input
huggingface_estimator.fit(...)

############ pseudo code end ############

# deploy model to SageMaker Inference (same estimator object that ran the training job)
predictor = huggingface_estimator.deploy(initial_instance_count=1, instance_type="ml.m5.xlarge")

# example request, you always need to define "inputs"
data = {
    "inputs": "Camera - You are awarded a SiPix Digital Camera! call 09061221066 fromm landline. Delivery within 28 days."
}

# request
predictor.predict(data)
```
168+
169+
170+
After we run our request, we can delete the endpoint again with:
171+
172+
173+
```python
# delete endpoint
predictor.delete_endpoint()
```
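
Because the tokenizer and the model both have to be present among the saved artifacts, it can be useful to inspect a `model.tar.gz` archive locally before pointing an endpoint at it. The sketch below lists which expected files are missing; the file names in `EXPECTED_FILES` are an assumption about a typical PyTorch checkpoint saved with `save_pretrained()` and should be adjusted to your model, and the helper itself is hypothetical.

```python
import tarfile
from typing import Iterable, List

# Assumed file names for a PyTorch checkpoint and its tokenizer; adjust to your model.
EXPECTED_FILES = ("config.json", "pytorch_model.bin", "tokenizer_config.json")


def missing_artifacts(archive_path: str, expected: Iterable[str] = EXPECTED_FILES) -> List[str]:
    """Return the expected file names that are not found in the archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        # Compare base names only, so files inside a sub-directory are also counted.
        present = {name.split("/")[-1] for name in archive.getnames()}
    return [name for name in expected if name not in present]
```

An empty list means every expected file name was found somewhere in the archive; it does not verify that the files are valid or located where the container expects them.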
177+
178+
### **Deploy the model from pre-trained checkpoints using the `HuggingFaceModel` class**
179+
180+
If you've already trained your model and want to deploy it at some later time, you can use the `model_data` argument to specify the location of your tokenizer and model weights.
181+
182+
183+
```python
from sagemaker.huggingface.model import HuggingFaceModel

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    model_data="s3://models/my-bert-model/model.tar.gz",  # path to your trained sagemaker model
    role=role,  # iam role with permissions to create an Endpoint
    transformers_version="4.6",  # transformers version used
    pytorch_version="1.7",  # pytorch version used
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge"
)

# example request, you always need to define "inputs"
data = {
    "inputs": "Camera - You are awarded a SiPix Digital Camera! call 09061221066 fromm landline. Delivery within 28 days."
}

# request
predictor.predict(data)
```
207+
208+
After we run our request, we can delete the endpoint again with:
209+
210+
211+
```python
# delete endpoint
predictor.delete_endpoint()
```
215+
216+
217+
218+
## **Deploy one of the 10,000+ Hugging Face Transformers to Amazon SageMaker for Inference**
219+
220+
To deploy a model directly from the Hugging Face Model Hub to Amazon SageMaker, we need to define two environment variables when creating the `HuggingFaceModel`. We need to define:
221+
222+
* `HF_MODEL_ID`: defines the model id, which will be automatically loaded from [huggingface.co/models](http://huggingface.co/models) when creating the SageMaker Endpoint. The 🤗 Hub provides 10,000+ models, all available through this environment variable.
223+
* `HF_TASK`: defines the task for the 🤗 Transformers pipeline to use. A full list of tasks can be found [here](https://huggingface.co/transformers/main_classes/pipelines.html).
224+
225+
```python
from sagemaker.huggingface.model import HuggingFaceModel

# Hub Model configuration. <https://huggingface.co/models>
hub = {
    'HF_MODEL_ID': 'distilbert-base-uncased-distilled-squad',  # model_id from hf.co/models
    'HF_TASK': 'question-answering'  # NLP task you want to use for predictions
}

# create Hugging Face Model Class
huggingface_model = HuggingFaceModel(
    env=hub,  # configuration for loading model from Hub
    role=role,  # iam role with permissions to create an Endpoint
    transformers_version="4.6",  # transformers version used
    pytorch_version="1.7",  # pytorch version used
)

# deploy model to SageMaker Inference
predictor = huggingface_model.deploy(
    initial_instance_count=1,
    instance_type="ml.m5.xlarge"
)

# example request, you always need to define "inputs"
data = {
    "inputs": {
        "question": "What is used for inference?",
        "context": "My Name is Philipp and I live in Nuremberg. This model is used with sagemaker for inference."
    }
}

# request
predictor.predict(data)
```
259+
260+
After we run our request, we can delete the endpoint again with:
261+
262+
263+
```python
# delete endpoint
predictor.delete_endpoint()
```
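
Since the Hub deployment depends entirely on the two environment variables `HF_MODEL_ID` and `HF_TASK`, a small helper can guard against typos before an endpoint is created. This is only a sketch: `hub_config` is a hypothetical helper, and `KNOWN_TASKS` is a partial, assumed list of pipeline task names rather than the full list from the `transformers` documentation.

```python
from typing import Dict

# Partial list of pipeline task names, included as an assumption for illustration.
KNOWN_TASKS = {
    "text-classification",
    "question-answering",
    "zero-shot-classification",
    "summarization",
    "text-generation",
    "fill-mask",
    "feature-extraction",
}


def hub_config(model_id: str, task: str) -> Dict[str, str]:
    """Return the env dict for a Hub deployment after basic validation."""
    if not model_id:
        raise ValueError("model_id must not be empty")
    if task not in KNOWN_TASKS:
        raise ValueError(f"unknown task {task!r}; extend KNOWN_TASKS if it is valid")
    return {"HF_MODEL_ID": model_id, "HF_TASK": task}


hub = hub_config("distilbert-base-uncased-distilled-squad", "question-answering")
print(hub)
```

The resulting dictionary has the same shape as the `hub` configuration passed through the `env` argument of `HuggingFaceModel`.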
267+
268+
---
269+
270+
# **FAQ 🎯**
271+
272+
You can find the complete [Frequently Asked Questions](https://huggingface.co/docs/sagemaker/faq) in the [documentation](https://huggingface.co/docs/sagemaker/faq).
273+
274+
_Q: Which models can I deploy for Inference?_
275+
276+
A: You can deploy:
277+
* any 🤗 Transformers model trained in Amazon SageMaker or on other compatible platforms, as long as it can accommodate the SageMaker Hosting design,
278+
* any of the 10,000+ publicly available Transformer models from the Hugging Face [Model Hub](https://huggingface.co/models), or
279+
* your private models hosted in your Hugging Face premium account!
280+
281+
_Q: Which pipelines and tasks are supported by the Inference Toolkit?_
282+
283+
A: The Inference Toolkit and DLC support any of the `transformers` `pipelines`. You can find the full list [here](https://huggingface.co/transformers/main_classes/pipelines.html).
284+
285+
_Q: Do I have to use the `transformers pipelines` when hosting SageMaker endpoints?_
286+
287+
A: No, you can also write your custom inference code to serve your own models and logic, documented [here](add-link-here).
288+
289+
_Q: Do I have to use the SageMaker Python SDK to use the Hugging Face Deep Learning Containers (DLCs)?_
290+
291+
A: No. You can use the Hugging Face DLCs without the SageMaker Python SDK and deploy your models to SageMaker with other tools, such as the [AWS CLI](https://docs.aws.amazon.com/cli/latest/reference/sagemaker/create-training-job.html), [boto3](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/sagemaker.html#SageMaker.Client.create_training_job), or [CloudFormation](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-resource-sagemaker-endpoint.html). The DLCs are also available through Amazon ECR and can be pulled and used in any environment of your choice.
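
As an illustration of working without the SageMaker Python SDK, the sketch below sends a request to an already-deployed endpoint through the `boto3` `sagemaker-runtime` client and its `invoke_endpoint` call. The helper names and the endpoint name are hypothetical; it assumes the endpoint accepts and returns `application/json`.

```python
import json
from typing import Any, Dict, Optional


def build_invoke_args(
    endpoint_name: str, inputs: Any, parameters: Optional[Dict[str, Any]] = None
) -> Dict[str, Any]:
    """Build keyword arguments for `invoke_endpoint` with a JSON request body."""
    body: Dict[str, Any] = {"inputs": inputs}
    if parameters:
        body["parameters"] = parameters
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "Accept": "application/json",
        "Body": json.dumps(body).encode("utf-8"),
    }


def invoke(endpoint_name: str, inputs: Any, parameters: Optional[Dict[str, Any]] = None) -> Any:
    """Call the endpoint; needs boto3, AWS credentials, and a running endpoint."""
    import boto3  # imported lazily so build_invoke_args works without AWS access

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(**build_invoke_args(endpoint_name, inputs, parameters))
    return json.loads(response["Body"].read())
```

For example, `invoke("my-endpoint", "Some text to classify")` would send a text-classification style request to a hypothetical endpoint named `my-endpoint`.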
292+
293+
_Q: Why should I use the Hugging Face Deep Learning Containers?_
294+
295+
A: The DLCs are fully tested, maintained, optimized deep learning environments that require no installation, configuration, or maintenance. In particular, our inference DLC comes with a pre-written serving stack, which drastically lowers the technical bar of DL serving.
296+
297+
_Q: How are my data and code secured by Amazon SageMaker?_
298+
299+
A: Amazon SageMaker provides numerous security mechanisms including **[encryption at rest](https://docs.aws.amazon.com/sagemaker/latest/dg/encryption-at-rest-nbi.html)** and **[in transit](https://docs.aws.amazon.com/sagemaker/latest/dg/encryption-in-transit.html)**, **[Virtual Private Cloud (VPC) connectivity](https://docs.aws.amazon.com/sagemaker/latest/dg/interface-vpc-endpoint.html),** and **[Identity and Access Management (IAM)](https://docs.aws.amazon.com/sagemaker/latest/dg/security_iam_service-with-iam.html)**. To learn more about security in the AWS cloud and with Amazon SageMaker, you can visit **[Security in Amazon SageMaker](https://docs.aws.amazon.com/sagemaker/latest/dg/security_iam_service-with-iam.html)** and **[AWS Cloud Security](https://docs.aws.amazon.com/sagemaker/latest/dg/security_iam_service-with-iam.html)**.
300+
301+
_Q: Is this available in my region?_
302+
303+
A: For a list of the supported regions, please visit the **[AWS region table](https://aws.amazon.com/about-aws/global-infrastructure/regional-product-services/)** for all AWS global infrastructure.
304+
305+
_Q: Do you offer premium support or support SLAs for this solution?_
306+
307+
A: AWS Technical Support tiers are available from AWS and cover development and production issues for AWS products and services - please refer to AWS Support for specifics and scope.
308+
309+
If you have questions which the Hugging Face community can help answer and/or benefit from, please **[post them in the Hugging Face forum](https://discuss.huggingface.co/c/sagemaker/17)**.
310+
311+
---
312+
313+
If you need premium support from the Hugging Face team to accelerate your NLP roadmap, our [Expert Acceleration Program](https://huggingface.co/support) offers direct guidance from our open-source, science, and ML Engineering teams.
