Copy data from a SQL Server database to Azure Blob storage by using the Copy Data tool

APPLIES TO: Azure Data Factory Azure Synapse Analytics (Preview)

In this tutorial, you use the Azure portal to create a data factory. Then, you use the Copy Data tool to create a pipeline that copies data from a SQL Server database to Azure Blob storage.

Note

In this tutorial, you perform the following steps:

  • Create a data factory.
  • Use the Copy Data tool to create a pipeline.
  • Monitor the pipeline and activity runs.

Prerequisites

Azure subscription

Before you begin, if you don't already have an Azure subscription, create a free account.

Azure roles

To create data factory instances, the user account you use to log in to Azure must be assigned a Contributor or Owner role or must be an administrator of the Azure subscription.

To view the permissions you have in the subscription, go to the Azure portal. Select your user name in the upper-right corner, and then select Permissions. If you have access to multiple subscriptions, select the appropriate subscription. For sample instructions on how to add a user to a role, see Add or remove Azure role assignments using the Azure portal.

SQL Server 2014, 2016, and 2017

In this tutorial, you use a SQL Server database as a source data store. The pipeline in the data factory you create in this tutorial copies data from this SQL Server database (source) to Blob storage (sink). You then create a table named emp in your SQL Server database and insert a couple of sample entries into the table.

  1. Start SQL Server Management Studio. If it's not already installed on your machine, go to Download SQL Server Management Studio.

  2. Connect to your SQL Server instance by using your credentials.

  3. Create a sample database. In the tree view, right-click Databases, and then select New Database.

  4. In the New Database window, enter a name for the database, and then select OK.

  5. To create the emp table and insert some sample data into it, run the following query script against the database. In the tree view, right-click the database that you created, and then select New Query.

    CREATE TABLE dbo.emp
    (
        ID int IDENTITY(1,1) NOT NULL,
        FirstName varchar(50),
        LastName varchar(50)
    )
    GO
    
    INSERT INTO emp (FirstName, LastName) VALUES ('John', 'Doe')
    INSERT INTO emp (FirstName, LastName) VALUES ('Jane', 'Doe')
    GO
    

Azure storage account

In this tutorial, you use a general-purpose Azure storage account (specifically, Blob storage) as a destination/sink data store. If you don't have a general-purpose storage account, see Create a storage account for instructions to create one. The pipeline in the data factory you that create in this tutorial copies data from the SQL Server database (source) to this Blob storage (sink).

Get the storage account name and account key

You use the name and key of your storage account in this tutorial. To get the name and key of your storage account, take the following steps:

  1. Sign in to the Azure portal with your Azure user name and password.

  2. In the left pane, select All services. Filter by using the Storage keyword, and then select Storage accounts.

    Storage account search

  3. In the list of storage accounts, filter for your storage account, if needed. Then select your storage account.

  4. In the Storage account window, select Access keys.

  5. In the Storage account name and key1 boxes, copy the values, and then paste them into Notepad or another editor for later use in the tutorial.

Create the adftutorial container

In this section, you create a blob container named adftutorial in your Blob storage.

  1. In the Storage account window, switch to Overview, and then select Blobs.

  2. In the Blobs window, select + Container.

  3. In the New container window, under Name, enter adftutorial, and then select OK.

  4. In the list of containers, select adftutorial.

  5. Keep the Container window for adftutorial open. You use it to verify the output at the end of the tutorial. Data Factory automatically creates the output folder in this container, so you don't need to create one.

Create a data factory

  1. On the menu on the left, select Create a resource > Integration > Data Factory.

    New data factory creation

  2. On the New data factory page, under Name, enter ADFTutorialDataFactory.

    The name of the data factory must be globally unique. If you see the following error message for the name field, change the name of the data factory (for example, yournameADFTutorialDataFactory). For naming rules for Data Factory artifacts, see Data Factory naming rules.

    New data factory name

  3. Select the Azure subscription in which you want to create the data factory.

  4. For Resource Group, take one of the following steps:

  5. Under Version, select V2.

  6. Under Location, select the location for the data factory. Only locations that are supported are displayed in the drop-down list. The data stores (for example, Azure Storage and SQL Database) and computes (for example, Azure HDInsight) used by Data Factory can be in other locations/regions.

  7. Select Create.

  8. After the creation is finished, you see the Data Factory page as shown in the image.

    Data factory home page

  9. Select Author & Monitor to launch the Data Factory user interface in a separate tab.

Use the Copy Data tool to create a pipeline

  1. On the Let's get started page, select Copy Data to launch the Copy Data tool.

    Get started page

  2. On the Properties page of the Copy Data tool, under Task name, enter CopyFromOnPremSqlToAzureBlobPipeline. Then select Next. The Copy Data tool creates a pipeline with the name you specify for this field. Task name

  3. On the Source data store page, click on Create new connection.

  4. Under New Linked Service, search for SQL Server, and then select Continue.

  5. In the New Linked Service (SQL Server) dialog box, under Name, enter SqlServerLinkedService. Select +New under Connect via integration runtime. You must create a self-hosted integration runtime, download it to your machine, and register it with Data Factory. The self-hosted integration runtime copies data between your on-premises environment and the cloud.

  6. In the Integration Runtime Setup dialog box, Select Self-Hosted. Then select Continue.

    Create integration runtime

  7. In the Integration Runtime Setup dialog box, under Name, enter TutorialIntegrationRuntime. Then select Create.

  8. In the Integration Runtime Setup dialog box, select Click here to launch the express setup for this computer. This action installs the integration runtime on your machine and registers it with Data Factory. Alternatively, you can use the manual setup option to download the installation file, run it, and use the key to register the integration runtime.

  9. Run the downloaded application. You see the status of the express setup in the window.

    Express setup status

  10. In the New Linked Service (SQL Server) dialog box, confirm that TutorialIntegrationRuntime is selected for the Integration Runtime field. Then, take the following steps:

    a. Under Name, enter SqlServerLinkedService.

    b. Under Server name, enter the name of your SQL Server instance.

    c. Under Database name, enter the name of your on-premises database.

    d. Under Authentication type, select appropriate authentication.

    e. Under User name, enter the name of user with access to SQL Server.

    f. Enter the password for the user.

    g. Test connection and select Finish.

    Integration runtime selected

  11. On the Source data store page, select Next.

  12. On the Select tables from which to copy the data or use a custom query page, select the [dbo].[emp] table in the list, and select Next. You can select any other table based on your database.

  13. On the Destination data store page, select Create new connection

  14. In New Linked Service, Search and Select Azure Blob, and then select Continue.

    Blob storage selection

  15. On the New Linked Service (Azure Blob Storage) dialog, take the following steps:

    a. Under Name, enter AzureStorageLinkedService.

    b. Under Connect via integration runtime, select TutorialIntegrationRuntime

    c. Under Storage account name, select your storage account from the drop-down list.

    d. Select Finish.

  16. In Destination data store dialog, make sure that Azure Blob Storage is selected. Then select Next.

  17. In the Choose the output file or folder dialog, under Folder path, enter adftutorial/fromonprem. You created the adftutorial container as part of the prerequisites. If the output folder doesn't exist (in this case fromonprem), Data Factory automatically creates it. You can also use the Browse button to browse the blob storage and its containers/folders. If you do not specify any value under File name, by default the name from the source would be used (in this case dbo.emp).

    Choose the output file or folder

  18. On the File format settings dialog, select Next.

  19. On the Settings dialog, select Next.

  20. On the Summary dialog, review values for all the settings, and select Next.

  21. On the Deployment page, select Monitor to monitor the pipeline (task).

  22. When the pipeline run completes, you can view the status of the pipeline you created.

  23. On the Pipeline runs page, select Refresh to refresh the list. Click the link under PIPELINE NAME to view activity run details or rerun the pipeline.

  24. On the Activity runs page, select the Details link (eyeglasses icon) under the ACTIVITY NAME column for more details about copy operation. To go back to the Pipeline Runs view, select the ALL pipeline runs link in the breadcrumb menu. To refresh the view, select Refresh.

  25. Confirm that you see the output file in the fromonprem folder of the adftutorial container.

  26. Select the Edit tab on the left to switch to the editor mode. You can update the linked services, datasets, and pipelines created by the tool by using the editor. Select Code to view the JSON code associated with the entity opened in the editor. For details on how to edit these entities in the Data Factory UI, see the Azure portal version of this tutorial.

Next steps

The pipeline in this sample copies data from a SQL Server database to Blob storage. You learned how to:

  • Create a data factory.
  • Use the Copy Data tool to create a pipeline.
  • Monitor the pipeline and activity runs.

For a list of data stores that are supported by Data Factory, see Supported data stores.

To learn about how to copy data in bulk from a source to a destination, advance to the following tutorial: