Use the Copy Data tool to copy data

In this quickstart, you use the Azure portal to create a data factory. Then, you use the Copy Data tool to create a pipeline that copies data from a folder in Azure Blob storage to another folder.


If you are new to Azure Data Factory, see Introduction to Azure Data Factory before doing this quickstart.

This article applies to version 2 of Data Factory, which is currently in preview. If you are using version 1 of the service, which is in general availability (GA), see Get started with Data Factory version 1.

Prerequisites


Azure subscription

If you don't have an Azure subscription, create a free account before you begin.

Azure roles

To create Data Factory instances, the user account that you use to sign in to Azure must be a member of the contributor or owner role, or an administrator of the Azure subscription. In the Azure portal, select your username in the upper-right corner, and then select Permissions to view your permissions in the subscription. If you have access to multiple subscriptions, select the appropriate subscription. For sample instructions on adding a user to a role, see the Add roles article.

Azure storage account

You use a general-purpose Azure storage account (specifically Blob storage) as both source and destination data stores in this quickstart. If you don't have a general-purpose Azure storage account, see Create a storage account to create one.

Get the storage account name and account key

You use the name and key of your Azure storage account in this quickstart. The following steps show how to get the name and key of your storage account:

  1. In a web browser, go to the Azure portal. Sign in by using your Azure username and password.
  2. Select More services on the left menu, filter with the Storage keyword, and select Storage accounts.

    Search for a storage account

  3. In the list of storage accounts, filter for your storage account (if needed), and then select your storage account.
  4. On the Storage account page, select Access keys on the menu.

    Get storage account name and key

  5. Copy the values in the Storage account name and key1 boxes to the clipboard. Paste them into Notepad or another editor, and save the file. You use these values later in this quickstart.
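If you later script against the account instead of using the portal, the name and key you just copied combine into a standard Storage connection string. A minimal sketch, assuming the standard connection string format (the build_connection_string helper is ours, not part of any Azure SDK):

```python
def build_connection_string(account_name: str, account_key: str) -> str:
    """Assemble a standard Azure Storage connection string from a name and key."""
    return (
        "DefaultEndpointsProtocol=https;"
        f"AccountName={account_name};"
        f"AccountKey={account_key};"
        "EndpointSuffix=core.windows.net"
    )

# Substitute the values you saved from the portal.
print(build_connection_string("mystorageacct", "<key1-from-portal>"))
```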

Create the input folder and files

In this section, you create a blob container named adftutorial in Azure Blob storage. You create a folder named input in the container, and then upload a sample file to the input folder.

  1. On the Storage account page, switch to Overview, and then select Blobs.

    Select Blobs option

  2. On the Blob service page, select + Container on the toolbar.

    Add container button

  3. In the New container dialog box, enter adftutorial for the name, and then select OK.

    Enter container name

  4. Select adftutorial in the list of containers.

    Select the container

  5. On the Container page, select Upload on the toolbar.

    Upload button

  6. On the Upload blob page, select Advanced.

    Select Advanced link

  7. Start Notepad and create a file named emp.txt with the following content. Save it in the c:\ADFv2QuickStartPSH folder. Create the ADFv2QuickStartPSH folder if it does not already exist.

    John, Doe
    Jane, Doe
  8. In the Azure portal, on the Upload blob page, browse to and select the emp.txt file for the Files box.
  9. Enter input as a value for the Upload to folder box.

    Upload blob settings

  10. Confirm that the folder is input and the file is emp.txt, and select Upload.

    You should see the emp.txt file and the status of the upload in the list.

  11. Close the Upload blob page by clicking X in the corner.

    Close upload blob page

  12. Keep the Container page open. You use it to verify the output at the end of this quickstart.

Create a data factory

  1. Select New on the left menu, select Data + Analytics, and then select Data Factory.

    Data Factory selection in the "New" pane

  2. On the New data factory page, enter ADFTutorialDataFactory for Name.

    "New data factory" page

    The name of the Azure data factory must be globally unique. If you see the following error, change the name of the data factory (for example, to <yourname>ADFTutorialDataFactory) and try creating it again. For naming rules for Data Factory artifacts, see the Data Factory - naming rules article.

    Error when a name is not available

  3. For Subscription, select the Azure subscription in which you want to create the data factory.
  4. For Resource Group, use one of the following steps:

    • Select Use existing, and select an existing resource group from the list.
    • Select Create new, and enter the name of a resource group.

    To learn about resource groups, see Using resource groups to manage your Azure resources.

  5. For Version, select V2 (Preview).
  6. For Location, select the location for the data factory.

    The list shows only supported locations. The data stores (like Azure Storage and Azure SQL Database) and computes (like Azure HDInsight) that Data Factory uses can be in other regions.

  7. Select Pin to dashboard.

  8. Select Create.
  9. On the dashboard, you see the following tile with the status Deploying Data Factory:

    "Deploying Data Factory" tile

  10. After the creation is complete, you see the Data Factory page. Select the Author & Monitor tile to start the Azure Data Factory user interface (UI) application on a separate tab.

    Home page for the data factory, with the "Author & Monitor" tile
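Before submitting the form, you can catch locally detectable naming mistakes with a quick check. This sketch assumes our reading of the Data Factory naming rules article (3 to 63 characters; letters, numbers, and hyphens only; starting and ending with a letter or number); the service's own validation, including the global-uniqueness check, remains authoritative:

```python
import re

# Assumed rules: 3-63 characters, letters, numbers, and hyphens only,
# starting and ending with a letter or number. This only catches locally
# detectable mistakes; only the service can check global uniqueness.
NAME_PATTERN = re.compile(r"^[A-Za-z0-9][A-Za-z0-9-]{1,61}[A-Za-z0-9]$")

def is_plausible_factory_name(name: str) -> bool:
    """Return True if the name passes the assumed local format rules."""
    return bool(NAME_PATTERN.match(name))

print(is_plausible_factory_name("ADFTutorialDataFactory"))  # True
print(is_plausible_factory_name("my factory"))              # False (space)
```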

Start the Copy Data tool

  1. On the Let's get started page, select the Copy Data tile to start the Copy Data tool.

    "Copy Data" tile

  2. On the Properties page of the Copy Data tool, you can optionally specify a name and description for the pipeline. Select Next.

    "Properties" page

  3. On the Source data store page, select Azure Blob Storage, and then select Next.

    "Source data store" page

  4. On the Specify the Azure Blob storage account page, select your storage account from the Storage account name list, and then select Next.

    "Specify the Azure Blob storage account" page

  5. On the Choose the input file or folder page, complete the following steps:

    a. Browse to the adftutorial/input folder.

    b. Select the emp.txt file.

    c. Select Choose. You can double-click emp.txt to skip this step.

    d. Select Next.

    "Choose the input file or folder" page

  6. On the File format settings page, notice that the tool automatically detects the column and row delimiters. On this page, you can also preview the data and view the schema of the input data. Select Next.

    "File format settings" page

  7. On the Destination data store page, select Azure Blob Storage, and then select Next.

    "Destination data store" page

  8. On the Specify the Azure Blob storage account page, select your Azure Blob storage account, and then select Next.

    "Specify the Azure Blob storage account" page

  9. On the Choose the output file or folder page, complete the following steps:

    a. Enter adftutorial/output for the folder path.

    b. Enter emp.txt for the file name.

    c. Select Next.

    "Choose the output file or folder" page

  10. On the File format settings page, select Next.

    "File format settings" page

  11. On the Settings page, select Next.

    "Settings" page

  12. Review all settings on the Summary page, and select Next.

    "Summary" page

  13. On the Deployment complete page, select Monitor to monitor the pipeline that you created.

    "Deployment complete" page

  14. The application switches to the Monitor tab. You see the status of the pipeline on this tab. Select Refresh to refresh the list.

    Tab for monitoring pipeline runs, with "Refresh" button

  15. Select the View Activity Runs link in the Actions column. The pipeline has only one activity of type Copy.

    List of activity runs

  16. To view details about the copy operation, select the Details (eyeglasses image) link in the Actions column. For details about the properties, see Copy Activity overview.

    Copy operation details

  17. Verify that the emp.txt file is created in the output folder of the adftutorial container. If the output folder does not exist, the Data Factory service automatically creates it.
  18. Switch to the Edit tab so that you can edit linked services, datasets, and pipelines. To learn about editing them in the Data Factory UI, see Create a data factory by using the Azure portal.

    Edit tab
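The automatic delimiter detection you saw on the File format settings page (step 6) can be approximated locally with Python's csv.Sniffer. This is only an illustration of the idea, not the Copy Data tool's actual implementation:

```python
import csv
import io

sample = "John, Doe\nJane, Doe\n"

# csv.Sniffer plays the same role as the tool's detection in step 6: it
# inspects a sample of the text and infers the column delimiter. Restricting
# the candidate delimiters makes the guess more reliable.
dialect = csv.Sniffer().sniff(sample, delimiters=",;\t|")
rows = list(csv.reader(io.StringIO(sample), dialect))

print(dialect.delimiter)  # ","
print(len(rows))          # 2
```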

Next steps

The pipeline in this sample copies data from one location to another location in Azure Blob storage. To learn about using Data Factory in more scenarios, go through the tutorials.