title: Move Data to and from Azure Blob Storage using Python | Microsoft Docs
description: Move Data to and from Azure Blob Storage using Python
services: machine-learning,storage
documentationcenter:
author: bradsev
manager: jhubbard
editor: cgronlun
ms.assetid: 24276252-b3dd-4edf-9e5d-f6803f8ccccc
ms.service: machine-learning
ms.workload: data-services
ms.tgt_pltfrm: na
ms.devlang: na
ms.topic: article
ms.date: 09/14/2016
ms.author: bradsev

Move Data to and from Azure Blob Storage using Python

This topic describes how to list, upload, and download blobs using the Python API. With the Python API provided in the Azure SDK, you can:

  • Create a container
  • Upload a blob into a container
  • Download blobs
  • List the blobs in a container
  • Delete a blob

For more information about using the Python API, see How to Use the Blob Storage Service from Python.

[!INCLUDE blob-storage-tool-selector]

Note

If you are using a VM that was set up with the scripts provided by Data Science Virtual Machines in Azure, AzCopy is already installed on the VM.

[!NOTE] For a complete introduction to Azure blob storage, refer to Azure Blob Basics and to Azure Blob Service.

Prerequisites

This document assumes that you have an Azure subscription, a storage account, and the corresponding storage key for that account. Before uploading/downloading data, you must know your Azure storage account name and account key.
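One way to keep the account name and key out of source files is to read them from environment variables. The following is a minimal sketch under that assumption: the helper `load_storage_credentials` and the variable names `AZURE_STORAGE_ACCOUNT` and `AZURE_STORAGE_KEY` are illustrative choices, not something this topic or the SDK requires.

```python
import os


def load_storage_credentials(environ=None):
    """Return (account_name, account_key) read from environment variables.

    The variable names used here are illustrative assumptions.
    """
    environ = os.environ if environ is None else environ
    required = ("AZURE_STORAGE_ACCOUNT", "AZURE_STORAGE_KEY")
    missing = [name for name in required if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return environ["AZURE_STORAGE_ACCOUNT"], environ["AZURE_STORAGE_KEY"]
```

The returned pair can then be passed as the `account_name` and `account_key` arguments wherever the samples in this topic use placeholders.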

Upload Data to Blob

Add the following snippet near the top of any Python code in which you wish to programmatically access Azure Storage:

from azure.storage.blob import BlobService

The BlobService object lets you work with containers and blobs. The following code creates a BlobService object using the storage account name and account key. Replace the placeholders with your actual storage account name and account key.

blob_service = BlobService(account_name="<your_account_name>", account_key="<your_account_key>")

Use the following methods to upload data to a blob:

  1. put_block_blob_from_path (uploads the contents of a file from the specified path)
  2. put_block_blob_from_file (uploads the contents from an already opened file/stream)
  3. put_block_blob_from_bytes (uploads an array of bytes)
  4. put_block_blob_from_text (uploads the specified text value using the specified encoding)

The following sample code uploads a local file to a container:

blob_service.put_block_blob_from_path("<your_container_name>", "<your_blob_name>", "<your_local_file_name>")

The following sample code uploads all the files (excluding directories) in a local directory to blob storage:

from azure.storage.blob import BlobService
from os import listdir
from os.path import isfile, join

# Set parameters here
ACCOUNT_NAME = "<your_account_name>"
ACCOUNT_KEY = "<your_account_key>"
CONTAINER_NAME = "<your_container_name>"
LOCAL_DIRECT = "<your_local_directory>"        

blob_service = BlobService(account_name=ACCOUNT_NAME, account_key=ACCOUNT_KEY)
# Find all files in LOCAL_DIRECT (excluding directories)
local_file_list = [f for f in listdir(LOCAL_DIRECT) if isfile(join(LOCAL_DIRECT, f))]

# Upload each file, using the file name as the blob name
for blob_name in local_file_list:
    local_file = join(LOCAL_DIRECT, blob_name)
    try:
        blob_service.put_block_blob_from_path(CONTAINER_NAME, blob_name, local_file)
    except Exception as err:
        print("Something went wrong when uploading %s: %s" % (blob_name, err))
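The directory upload above can also be wrapped in a reusable function that reports which files succeeded and which failed. This is a sketch, not part of the SDK: `upload_directory` and its return shape are hypothetical, and `blob_service` is assumed to be any object exposing `put_block_blob_from_path` with the argument order used in this topic.

```python
import os


def upload_directory(blob_service, container_name, local_directory):
    """Upload every regular file in local_directory.

    Returns (uploaded, failed), where failed holds (name, error message) pairs.
    """
    uploaded, failed = [], []
    for name in sorted(os.listdir(local_directory)):
        local_file = os.path.join(local_directory, name)
        if not os.path.isfile(local_file):
            continue  # skip subdirectories, as the sample above does
        try:
            blob_service.put_block_blob_from_path(container_name, name, local_file)
            uploaded.append(name)
        except Exception as err:
            failed.append((name, str(err)))
    return uploaded, failed
```

Returning the failures instead of only printing them lets the caller decide whether to retry or abort.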

Download Data from Blob

Use the following methods to download data from a blob:

  1. get_blob_to_path
  2. get_blob_to_file
  3. get_blob_to_bytes
  4. get_blob_to_text

These methods perform the necessary chunking when the size of the data exceeds 64 MB.
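To illustrate what chunking means in general, the sketch below reads a binary stream in fixed-size pieces so the whole payload is never held in memory at once. It is not the SDK's implementation: `iter_chunks` is a hypothetical helper, and the default chunk size is an arbitrary illustrative value rather than a documented SDK setting.

```python
import io


def iter_chunks(stream, chunk_size=4 * 1024 * 1024):
    """Yield successive byte chunks of at most chunk_size from a binary stream."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # an empty read means the stream is exhausted
        yield chunk


# Usage with a small in-memory stream and a tiny chunk size for illustration
sizes = [len(c) for c in iter_chunks(io.BytesIO(b"x" * 10), chunk_size=4)]
print(sizes)  # [4, 4, 2]
```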

The following sample code downloads the contents of a blob in a container to a local file:

blob_service.get_blob_to_path("<your_container_name>", "<your_blob_name>", "<your_local_file_name>")

The following sample code downloads all blobs from a container. It uses list_blobs to get the list of available blobs in the container and downloads them to a local directory.

from azure.storage.blob import BlobService
from os.path import join

# Set parameters here
ACCOUNT_NAME = "<your_account_name>"
ACCOUNT_KEY = "<your_account_key>"
CONTAINER_NAME = "<your_container_name>"
LOCAL_DIRECT = "<your_local_directory>"        

blob_service = BlobService(account_name=ACCOUNT_NAME, account_key=ACCOUNT_KEY)

# List all blobs and download them one by one
blobs = blob_service.list_blobs(CONTAINER_NAME)
for blob in blobs:
    local_file = join(LOCAL_DIRECT, blob.name)
    try:
        blob_service.get_blob_to_path(CONTAINER_NAME, blob.name, local_file)
    except Exception as err:
        print("Something went wrong when downloading %s: %s" % (blob.name, err))
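Blob names can contain "/" separators, in which case the joined local path points into a subdirectory that may not exist yet. The sketch below creates the missing directories before each download. `download_container` is a hypothetical helper, and `blob_service` is assumed to be any object exposing `list_blobs` and `get_blob_to_path` as used in this topic.

```python
import os


def download_container(blob_service, container_name, local_directory):
    """Download every blob in the container; return the local paths written."""
    written = []
    for blob in blob_service.list_blobs(container_name):
        # Map "/" separators in the blob name onto local subdirectories
        local_file = os.path.join(local_directory, *blob.name.split("/"))
        parent = os.path.dirname(local_file)
        if not os.path.isdir(parent):
            os.makedirs(parent)
        blob_service.get_blob_to_path(container_name, blob.name, local_file)
        written.append(local_file)
    return written
```

This sketch trusts the blob names it receives; names containing ".." segments would need extra validation before being used as local paths.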