Tutorial: Copy data from Blob Storage to SQL Database using Data Factory
This article applies to version 1 of Data Factory. If you are using the current version of the Data Factory service, see copy activity tutorial.
In this tutorial, you create a data factory with a pipeline to copy data from Blob storage to SQL database.
The Copy Activity performs the data movement in Azure Data Factory. It is powered by a globally available service that can copy data between various data stores in a secure, reliable, and scalable way. See Data Movement Activities article for details about the Copy Activity.
For a detailed overview of the Data Factory service, see the Introduction to Azure Data Factory article.
Prerequisites for the tutorial
Before you begin this tutorial, you must have the following prerequisites:
- Azure subscription. If you don't have a subscription, you can create a free trial account in just a couple of minutes. See the Free Trial article for details.
- Azure Storage Account. You use the blob storage as a source data store in this tutorial. if you don't have an Azure storage account, see the Create a storage account article for steps to create one.
- Azure SQL Database. You use an Azure SQL database as a destination data store in this tutorial. If you don't have an Azure SQL database that you can use in the tutorial, See How to create and configure an Azure SQL Database to create one.
- SQL Server 2012/2014 or Visual Studio 2013. You use SQL Server Management Studio or Visual Studio to create a sample database and to view the result data in the database.
Collect blob storage account name and key
You need the account name and account key of your Azure storage account to do this tutorial. Note down account name and account key for your Azure storage account.
Log in to the Azure portal.
Click All services on the left menu and select Storage Accounts.
In the Storage Accounts blade, select the Azure storage account that you want to use in this tutorial.
Select Access keys link under SETTINGS.
Click copy (image) button next to Storage account name text box and save/paste it somewhere (for example: in a text file).
Repeat the previous step to copy or note down the key1.
Close all the blades by clicking X.
Collect SQL server, database, user names
You need the names of Azure SQL server, database, and user to do this tutorial. Note down names of server, database, and user for your Azure SQL database.
- In the Azure portal, click All services on the left and select SQL databases.
- In the SQL databases blade, select the database that you want to use in this tutorial. Note down the database name.
- In the SQL database blade, click Properties under SETTINGS.
- Note down the values for SERVER NAME and SERVER ADMIN LOGIN.
- Close all the blades by clicking X.
Allow Azure services to access SQL server
Ensure that Allow access to Azure services setting turned ON for your Azure SQL server so that the Data Factory service can access your Azure SQL server. To verify and turn on this setting, do the following steps:
- Click All services hub on the left and click SQL servers.
- Select your server, and click Firewall under SETTINGS.
- In the Firewall settings blade, click ON for Allow access to Azure services.
- Close all the blades by clicking X.
Prepare Blob Storage and SQL Database
Now, prepare your Azure blob storage and Azure SQL database for the tutorial by performing the following steps:
Launch Notepad. Copy the following text and save it as emp.txt to C:\ADFGetStarted folder on your hard drive.
John, Doe Jane, Doe
Use tools such as Azure Storage Explorer to create the adftutorial container and to upload the emp.txt file to the container.
Use the following SQL script to create the emp table in your Azure SQL Database.
CREATE TABLE dbo.emp ( ID int IDENTITY(1,1) NOT NULL, FirstName varchar(50), LastName varchar(50), ) GO CREATE CLUSTERED INDEX IX_emp_ID ON dbo.emp (ID);
If you have SQL Server 2012/2014 installed on your computer: follow instructions from Managing Azure SQL Database using SQL Server Management Studio to connect to your Azure SQL server and run the SQL script.
If your client is not allowed to access the Azure SQL server, you need to configure firewall for your Azure SQL server to allow access from your machine (IP Address). See this article for steps to configure the firewall for your Azure SQL server.
Create a data factory
You have completed the prerequisites. You can create a data factory using one of the following ways. Click one of the options in the drop-down list at the top or the following links to perform the tutorial.
The data pipeline in this tutorial copies data from a source data store to a destination data store. It does not transform input data to produce output data. For a tutorial on how to transform data using Azure Data Factory, see Tutorial: Build your first pipeline to transform data using Hadoop cluster.
You can chain two activities (run one activity after another) by setting the output dataset of one activity as the input dataset of the other activity. See Scheduling and execution in Data Factory for detailed information.