---
title: Move data to SQL Server on an Azure virtual machine | Microsoft Docs
description: Move data from flat files or from an on-premises SQL Server to SQL Server on an Azure VM.
services: machine-learning
documentationcenter: ''
author: deguhath
manager: cgronlun
editor: cgronlun
ms.assetid: 2c9ef1d3-4f5c-4b1f-bf06-223646c8af06
ms.service: machine-learning
ms.component: team-data-science-process
ms.workload: data-services
ms.tgt_pltfrm: na
ms.devlang: na
ms.topic: article
ms.date: 11/04/2017
ms.author: deguhath
---

# Move data to SQL Server on an Azure virtual machine
This article outlines the options for moving data either from flat files (CSV or TSV formats) or from an on-premises SQL Server to SQL Server on an Azure virtual machine. These tasks for moving data to the cloud are part of the Team Data Science Process.
For a topic that outlines the options for moving data to an Azure SQL Database for Machine Learning, see Move data to an Azure SQL Database for Azure Machine Learning.
The following table summarizes the options for moving data to SQL Server on an Azure virtual machine.
| Source | Destination: SQL Server on Azure VM |
| --- | --- |
| Flat file | 1. Command-line bulk copy utility (BCP)<br>2. Bulk Insert SQL query<br>3. Graphical built-in utilities in SQL Server |
| On-premises SQL Server | 1. Deploy a SQL Server Database to a Microsoft Azure VM wizard<br>2. Export to a flat file<br>3. SQL Database Migration Wizard<br>4. Database back up and restore |
Note that this document assumes that SQL commands are executed from SQL Server Management Studio or Visual Studio Database Explorer.
> [!TIP]
> As an alternative, you can use Azure Data Factory to create and schedule a pipeline that moves data to a SQL Server VM on Azure. For more information, see Copy data with Azure Data Factory (Copy Activity).
## Prerequisites

This tutorial assumes you have:
- An Azure subscription. If you do not have a subscription, you can sign up for a free trial.
- An Azure storage account. You will use an Azure storage account for storing the data in this tutorial. If you don't have an Azure storage account, see the Create a storage account article. After you have created the storage account, you will need to obtain the account key used to access the storage. See Manage your storage access keys.
- Provisioned SQL Server on an Azure VM. For instructions, see Set up an Azure SQL Server virtual machine as an IPython Notebook server for advanced analytics.
- Installed and configured Azure PowerShell locally. For instructions, see How to install and configure Azure PowerShell.
## Moving data from a flat file source to SQL Server on an Azure VM

If your data is in a flat file (arranged in a row/column format), it can be moved to a SQL Server VM on Azure via the following methods:
- Command-line bulk copy utility (BCP)
- Bulk Insert SQL Query
- Graphical Built-in Utilities in SQL Server (Import/Export, SSIS)
### Command-line bulk copy utility (BCP)

BCP is a command-line utility installed with SQL Server and is one of the quickest ways to move data. It works across all three SQL Server variants: on-premises SQL Server, SQL Azure, and SQL Server VM on Azure.
> [!NOTE]
> **Where should my data be for BCP?**
>
> While it is not required, having the files that contain the source data on the same machine as the target SQL Server allows for faster transfers (network speed vs. local disk I/O speed). You can move the flat files containing data to the machine where SQL Server is installed by using various file-copying tools, such as AzCopy, Azure Storage Explorer, or Windows copy/paste via Remote Desktop Protocol (RDP).
1. Ensure that the database and the tables are created on the target SQL Server database. Here is an example of how to do that using the `Create Database` and `Create Table` commands:

    ```sql
    CREATE DATABASE <database_name>

    CREATE TABLE <tablename>
    (
        <columnname1> <datatype> <constraint>,
        <columnname2> <datatype> <constraint>,
        <columnname3> <datatype> <constraint>
    )
    ```

2. Generate the format file that describes the schema for the table by issuing the following command from the command line of the machine where bcp is installed:

    ```console
    bcp dbname..tablename format nul -c -x -f exportformatfilename.xml -S servername\sqlinstance -T -t \t -r \n
    ```

3. Insert the data into the database using the bcp command as follows. This should work from the command line, assuming that SQL Server is installed on the same machine:

    ```console
    bcp dbname..tablename in datafilename.tsv -f exportformatfilename.xml -S servername\sqlinstancename -U username -P password -b block_size_to_move_in_single_attempt -t \t -r \n
    ```
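If you script bcp invocations, passing the arguments as a list avoids shell-quoting problems with passwords or paths that contain spaces. The sketch below is a minimal, hypothetical Python helper (the database, server, and file names are placeholders, not values from this article):

```python
import subprocess

def build_bcp_import_command(database, table, data_file, format_file,
                             server, user, password, batch_size=10000):
    """Compose a bcp 'in' invocation as an argument list.

    All names here are placeholders; substitute your own database,
    server, and credential values. The flags mirror the command shown
    above: -f format file, -b batch size, -t/-r field and row
    terminators.
    """
    return [
        "bcp", f"{database}..{table}", "in", data_file,
        "-f", format_file,           # XML format file from the previous step
        "-S", server,                # e.g. r"servername\sqlinstancename"
        "-U", user, "-P", password,
        "-b", str(batch_size),       # rows committed per batch
        "-t", "\t", "-r", "\n",      # tab field / newline row terminators
    ]

cmd = build_bcp_import_command("mydb", "mytable", "data.tsv",
                               "format.xml", r"server\inst", "user", "password")
# To actually run it (requires bcp installed on the machine):
# subprocess.run(cmd, check=True)
```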
**Optimizing BCP inserts:** Refer to the article Guidelines for Optimizing Bulk Import to optimize such inserts.
If the data you are moving is large, you can speed things up by simultaneously executing multiple BCP commands in parallel in a PowerShell Script.
> [!NOTE]
> **Big data ingestion:** To optimize data loading for large and very large datasets, partition your logical and physical database tables using multiple filegroups and partition tables. For more information about creating and loading data into partition tables, see Parallel Load SQL Partition Tables.
The sample PowerShell script below demonstrates parallel inserts using bcp:

```powershell
$NO_OF_PARALLEL_JOBS=2

Set-ExecutionPolicy RemoteSigned #set execution policy for the script to execute

# Define what each job does
$ScriptBlock = {
    param($partitionnumber)

    #Explicitly using SQL username and password
    bcp database..tablename in datafile_path.csv -F 2 -f format_file_path.xml -U username@servername -S tcp:servername -P password -b block_size_to_move_in_single_attempt -t "," -r \n -o path_to_outputfile.$partitionnumber.txt

    #Trusted connection without username and password (if you are using Windows authentication and are signed in with those credentials)
    #bcp database..tablename in datafile_path.csv -o path_to_outputfile.$partitionnumber.txt -h "TABLOCK" -F 2 -f format_file_path.xml -T -b block_size_to_move_in_single_attempt -t "," -r \n
}

# Background processing of all partitions
for ($i=1; $i -le $NO_OF_PARALLEL_JOBS; $i++)
{
    Write-Debug "Submit loading partition # $i"
    Start-Job $ScriptBlock -Arg $i
}

# Wait for it all to complete
While (Get-Job -State "Running")
{
    Start-Sleep 10
    Get-Job
}

# Getting the information back from the jobs
Get-Job | Receive-Job

Set-ExecutionPolicy Restricted #reset the execution policy
```
### Bulk Insert SQL Query

The Bulk Insert SQL query can be used to import data into the database from row/column-based files (the supported types are covered in the Prepare Data for Bulk Export or Import (SQL Server) topic).

Here are some sample commands for Bulk Insert:
1. Analyze your data and set any custom options before importing to make sure that the SQL Server database assumes the same format for any special fields such as dates. Here is an example of how to set the date format as year-month-day (if your data contains the date in year-month-day format):

    ```sql
    SET DATEFORMAT ymd;
    ```

2. Import data using bulk import statements:

    ```sql
    BULK INSERT <tablename>
    FROM '<datafilename>'
    WITH
    (
        FirstRow = 2,
        FIELDTERMINATOR = ',', -- this should be the column separator in your data
        ROWTERMINATOR = '\n'   -- this should be the row separator in your data
    )
    ```
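A date value that does not match the server's `DATEFORMAT` setting will fail the bulk insert, so it can pay to validate the file first. A minimal Python sketch, assuming a year-month-day format and a hypothetical row layout:

```python
from datetime import datetime

def check_ymd_dates(rows, date_col):
    """Verify every value in date_col parses as year-month-day,
    matching SET DATEFORMAT ymd on the server. Returns a list of
    (row_index, value) pairs for entries that would fail the import."""
    bad = []
    for i, row in enumerate(rows):
        try:
            datetime.strptime(row[date_col], "%Y-%m-%d")
        except ValueError:
            bad.append((i, row[date_col]))
    return bad

# Example: the second row uses day/month/year and would be flagged.
problems = check_ymd_dates([["2017-11-04"], ["04/11/2017"]], date_col=0)
```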
### Graphical built-in utilities in SQL Server

You can use SQL Server Integration Services (SSIS) to import data into a SQL Server VM on Azure from a flat file. SSIS is available in two studio environments. For details, see Integration Services (SSIS) and Studio Environments:
- For details on SQL Server Data Tools, see Microsoft SQL Server Data Tools
- For details on the Import/Export Wizard, see SQL Server Import and Export Wizard
## Moving data from an on-premises SQL Server to SQL Server on an Azure VM

You can also use the following migration strategies:
- Deploy a SQL Server Database to a Microsoft Azure VM wizard
- Export to Flat File
- SQL Database Migration Wizard
- Database back up and restore
We describe each of these below:
### Deploy a SQL Server Database to a Microsoft Azure VM wizard

The **Deploy a SQL Server Database to a Microsoft Azure VM** wizard is a simple and recommended way to move data from an on-premises SQL Server instance to SQL Server on an Azure VM. For detailed steps, as well as a discussion of other alternatives, see Migrate a database to SQL Server on an Azure VM.
### Export to a flat file

Various methods can be used to bulk export data from an on-premises SQL Server, as documented in the Bulk Import and Export of Data (SQL Server) topic. This document covers the Bulk Copy Program (BCP) as an example. Once data is exported into a flat file, it can be imported into another SQL Server using bulk import.
1. Export the data from the on-premises SQL Server to a file using the bcp utility as follows:

    ```console
    bcp dbname..tablename out datafile.tsv -S servername\sqlinstancename -T -t \t -r \n -c
    ```

2. Create the database and the table on the SQL Server VM on Azure using the `create database` and `create table` commands for the table schema exported in step 1.

3. Create a format file describing the table schema of the data being exported/imported. Details of the format file are described in Create a Format File (SQL Server).

   Format file generation when running BCP from the SQL Server machine:

    ```console
    bcp dbname..tablename format nul -c -x -f exportformatfilename.xml -S servername\sqlinstance -T -t \t -r \n
    ```

   Format file generation when running BCP remotely against a SQL Server:

    ```console
    bcp dbname..tablename format nul -c -x -f exportformatfilename.xml -U username@servername -S tcp:servername -P password -t \t -r \n
    ```

4. Use any of the methods described in the section Moving data from a flat file source to move the data in flat files to a SQL Server.
### SQL Database Migration Wizard

SQL Server Database Migration Wizard provides a user-friendly way to move data between two SQL Server instances. It allows the user to map the data schema between source and destination tables, choose column types, and use various other functionalities. It uses bulk copy (BCP) under the covers. A screenshot of the welcome screen for the SQL Database Migration Wizard is shown below.
### Database back up and restore

SQL Server supports:

- Database back up and restore functionality (both to a local file or a bacpac export to blob) and data-tier applications (using bacpac).
- Ability to directly create SQL Server VMs on Azure with a copied database or copy to an existing SQL Azure database. For more details, see Use the Copy Database Wizard.
A screenshot of the Database back up/restore options from SQL Server Management Studio is shown below.