Quickstart: Use the Copy Data tool to copy data

In this quickstart, you use the Azure portal to create a data factory. Then, you use the Copy Data tool to create a pipeline that copies data from a folder in Azure Blob storage to another folder.

Note

If you are new to Azure Data Factory, see Introduction to Azure Data Factory before doing this quickstart.

Prerequisites

Azure subscription

If you don't have an Azure subscription, create a free account before you begin.

Azure roles

To create Data Factory instances, the user account that you use to sign in to Azure must be a member of the contributor or owner role, or an administrator of the Azure subscription. To view the permissions that you have in the subscription, go to the Azure portal, select your username in the upper-right corner, select More options (...), and then select My permissions. If you have access to multiple subscriptions, select the appropriate subscription.

To create and manage child resources for Data Factory - including datasets, linked services, pipelines, triggers, and integration runtimes - the following requirements apply:

  • To create and manage child resources in the Azure portal, you must belong to the Data Factory Contributor role at the resource group level or above.
  • To create and manage child resources with PowerShell or the SDK, the contributor role at the resource level or above is sufficient.

For sample instructions about how to add a user to a role, see the Add roles article.

Azure storage account

You use a general-purpose Azure storage account (specifically Blob storage) as both source and destination data stores in this quickstart. If you don't have a general-purpose Azure storage account, see Create a storage account to create one.

Get the storage account name

You need the name of your Azure storage account for this quickstart. To get it, follow these steps:

  1. In a web browser, go to the Azure portal and sign in using your Azure username and password.
  2. From the Azure portal menu, select All services, then select Storage > Storage accounts. You can also search for and select Storage accounts from any page.
  3. On the Storage accounts page, filter for your storage account if needed, and then select it.

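If you'd rather look the name up programmatically, the Azure SDK for Python can list the storage accounts in a subscription. The following is a minimal sketch, assuming the azure-identity and azure-mgmt-storage packages are installed, you're already signed in (for example, via az login), and <your-subscription-id> is a placeholder for your own subscription ID:

# Minimal sketch: list storage account names with the Azure SDK for Python.
from azure.identity import DefaultAzureCredential
from azure.mgmt.storage import StorageManagementClient

subscription_id = "<your-subscription-id>"  # placeholder: use your own ID

client = StorageManagementClient(DefaultAzureCredential(), subscription_id)

# Print the name and region of every storage account in the subscription.
for account in client.storage_accounts.list():
    print(account.name, account.location)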

Create a blob container

In this section, you create a blob container named adftutorial in Azure Blob storage.

  1. From the storage account page, select Overview > Blobs.

  2. On the <Account name> - Blobs page's toolbar, select Container.

  3. In the New container dialog box, enter adftutorial for the name, and then select OK. The <Account name> - Blobs page is updated to include adftutorial in the list of containers.

    (Screenshot: List of containers)
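
You can also create the container with the Azure SDK for Python instead of the portal. This is a minimal sketch, assuming the azure-storage-blob package is installed; the connection string is a placeholder that you copy from the storage account's Access keys page:

from azure.storage.blob import BlobServiceClient

# Placeholder: copy the real connection string from Access keys in the portal.
conn_str = "<your-storage-connection-string>"

service = BlobServiceClient.from_connection_string(conn_str)

# Create the adftutorial container; this raises ResourceExistsError
# if a container with that name already exists.
service.create_container("adftutorial")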

Add an input folder and file for the blob container

In this section, you create a folder named input in the container you just created, and then upload a sample file to the input folder. Before you begin, open a text editor such as Notepad, and create a file named emp.txt with the following content:

John, Doe
Jane, Doe

Save the file in the C:\ADFv2QuickStartPSH folder. (If the folder doesn't already exist, create it.) Then return to the Azure portal and follow these steps:

  1. In the <Account name> - Blobs page where you left off, select adftutorial from the updated list of containers.

    1. If you closed the window or went to another page, sign in to the Azure portal again.
    2. From the Azure portal menu, select All services, then select Storage > Storage accounts. You can also search for and select Storage accounts from any page.
    3. Select your storage account, and then select Blobs > adftutorial.
  2. On the adftutorial container page's toolbar, select Upload.

  3. On the Upload blob page, select the Files box, and then browse to and select the emp.txt file.

  4. Expand the Advanced heading. The page now displays as shown:

    (Screenshot: Select Advanced link)

  5. In the Upload to folder box, enter input.

  6. Select the Upload button. You should see the emp.txt file and the status of the upload in the list.

  7. Select the Close icon (an X) to close the Upload blob page.

Keep the adftutorial container page open. You use it to verify the output at the end of this quickstart.
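
As an alternative to the portal upload, the same azure-storage-blob package can place emp.txt into the input folder. A minimal sketch, assuming the adftutorial container already exists and emp.txt was saved to C:\ADFv2QuickStartPSH as described above:

from azure.storage.blob import BlobServiceClient

conn_str = "<your-storage-connection-string>"  # placeholder
service = BlobServiceClient.from_connection_string(conn_str)
container = service.get_container_client("adftutorial")

# Blob storage has no true folders; the "input/" prefix in the blob name
# is what the portal displays as the input folder.
with open(r"C:\ADFv2QuickStartPSH\emp.txt", "rb") as data:
    container.upload_blob(name="input/emp.txt", data=data, overwrite=True)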

Create a data factory

  1. Launch the Microsoft Edge or Google Chrome web browser. Currently, the Data Factory UI is supported only in Microsoft Edge and Google Chrome.

  2. Go to the Azure portal.

  3. From the Azure portal menu, select Create a resource.

    (Screenshot: Create a resource from the Azure portal menu)

  4. Select Analytics, and then select Data Factory.

    (Screenshot: Data Factory selection in the "New" pane)

  5. On the New data factory page, enter ADFTutorialDataFactory for Name.

    The name of the Azure data factory must be globally unique. If you see the following error, change the name of the data factory (for example, <yourname>ADFTutorialDataFactory), and then try creating it again. For naming rules for Data Factory artifacts, see the Data Factory - naming rules article.

    (Screenshot: Error when a name is not available)

  6. For Subscription, select the Azure subscription in which you want to create the data factory.

  7. For Resource Group, take one of the following steps:

    • Select Use existing, and select an existing resource group from the list.
    • Select Create new, and enter the name of a resource group.

    To learn about resource groups, see Using resource groups to manage your Azure resources.

  8. For Version, select V2.

  9. For Location, select the location for the data factory.

    The list shows only the locations that Data Factory supports, and where your Azure Data Factory metadata will be stored. The associated data stores (like Azure Storage and Azure SQL Database) and computes (like Azure HDInsight) that Data Factory uses can run in other regions.

  10. Select Create.

  11. After the creation is complete, you see the Data Factory page. Select the Author & Monitor tile to start the Azure Data Factory user interface (UI) application on a separate tab.

    (Screenshot: Home page for the data factory, with the "Author & Monitor" tile)
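
The factory itself can also be created from code. The following is a minimal sketch using the azure-identity and azure-mgmt-datafactory packages; the subscription ID and resource group are placeholders, "eastus" stands in for whatever location you selected, and the factory name must be globally unique just as in the portal:

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import Factory

subscription_id = "<your-subscription-id>"  # placeholder
resource_group = "<your-resource-group>"    # placeholder: an existing group

client = DataFactoryManagementClient(DefaultAzureCredential(), subscription_id)

# Create (or update) the V2 data factory; pick the location you chose above.
factory = client.factories.create_or_update(
    resource_group, "ADFTutorialDataFactory", Factory(location="eastus"))
print(factory.provisioning_state)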

Start the Copy Data tool

  1. On the Let's get started page, select the Copy Data tile to start the Copy Data tool.

    "Copy Data" tile

  2. On the Properties page of the Copy Data tool, optionally specify a name and description for the pipeline, and then select Next.

    "Properties" page

  3. On the Source data store page, complete the following steps:

    a. Click + Create new connection to add a connection.

    b. Select Azure Blob Storage from the gallery, and then select Continue.

    c. On the New Linked Service (Azure Blob Storage) page, specify a name for your linked service. Select your storage account from the Storage account name list, select Test connection to validate the settings, and then select Finish.

    (Screenshot: Configure the Azure Blob storage account)

    d. Select the newly created linked service as the source, and then click Next.

  4. On the Choose the input file or folder page, complete the following steps:

    a. Click Browse to navigate to the adftutorial/input folder, select the emp.txt file, and then click Choose.

    b. Select the Binary copy checkbox to copy the file as-is, and then select Next.

    "Choose the input file or folder" page

  5. On the Destination data store page, select the Azure Blob Storage linked service you created, and then select Next.

  6. On the Choose the output file or folder page, enter adftutorial/output for the folder path, and then select Next.

    "Choose the output file or folder" page

  7. On the Settings page, select Next to use the default configurations.

  8. On the Summary page, review all settings, and select Next.

  9. On the Deployment complete page, select Monitor to monitor the pipeline that you created.

    "Deployment complete" page

  10. The application switches to the Monitor tab. You see the status of the pipeline on this tab. Select Refresh to refresh the list.

  11. Select the View Activity Runs link in the Actions column. The pipeline has only one activity of type Copy.

  12. To view details about the copy operation, select the Details (eyeglasses image) link in the Actions column. For details about the properties, see Copy Activity overview.

  13. Verify that the emp.txt file is created in the output folder of the adftutorial container. If the output folder doesn't exist, the Data Factory service automatically creates it.

  14. Switch to the Author tab above the Monitor tab on the left panel so that you can edit linked services, datasets, and pipelines. To learn about editing them in the Data Factory UI, see Create a data factory by using the Azure portal.
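
Everything the Copy Data tool just did - the linked service, the input and output datasets, the pipeline with its single Copy activity, the run, and the monitoring - can also be expressed with the azure-mgmt-datafactory package. The following is a minimal sketch rather than a definitive implementation: the subscription ID, resource group, and connection string are placeholders, the resource names mirror this quickstart, and the fixed 30-second wait is a simplification rather than proper polling:

import time
from datetime import datetime, timedelta

from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureBlobDataset, AzureStorageLinkedService, BlobSink, BlobSource,
    CopyActivity, DatasetReference, DatasetResource, LinkedServiceReference,
    LinkedServiceResource, PipelineResource, RunFilterParameters, SecureString)

sub_id = "<your-subscription-id>"  # placeholder
rg = "<your-resource-group>"       # placeholder
df = "ADFTutorialDataFactory"

client = DataFactoryManagementClient(DefaultAzureCredential(), sub_id)

# Linked service: plays the role of the New Linked Service page above.
ls = LinkedServiceResource(properties=AzureStorageLinkedService(
    connection_string=SecureString(value="<your-storage-connection-string>")))
client.linked_services.create_or_update(rg, df, "AzureStorageLinkedService", ls)
ls_ref = LinkedServiceReference(
    type="LinkedServiceReference", reference_name="AzureStorageLinkedService")

# Input and output datasets for adftutorial/input/emp.txt and adftutorial/output.
client.datasets.create_or_update(rg, df, "InputDataset", DatasetResource(
    properties=AzureBlobDataset(linked_service_name=ls_ref,
                                folder_path="adftutorial/input",
                                file_name="emp.txt")))
client.datasets.create_or_update(rg, df, "OutputDataset", DatasetResource(
    properties=AzureBlobDataset(linked_service_name=ls_ref,
                                folder_path="adftutorial/output")))

# Pipeline with one Copy activity; BlobSource/BlobSink copy the file as-is,
# like the Binary copy option in the tool.
copy = CopyActivity(
    name="CopyFromBlobToBlob",
    inputs=[DatasetReference(type="DatasetReference",
                             reference_name="InputDataset")],
    outputs=[DatasetReference(type="DatasetReference",
                              reference_name="OutputDataset")],
    source=BlobSource(), sink=BlobSink())
client.pipelines.create_or_update(rg, df, "CopyPipeline",
                                  PipelineResource(activities=[copy]))

# Trigger a run, then check it the way the Monitor tab does.
run = client.pipelines.create_run(rg, df, "CopyPipeline", parameters={})
time.sleep(30)  # simplification: a real script would poll until completion
print("Pipeline run status:",
      client.pipeline_runs.get(rg, df, run.run_id).status)

filters = RunFilterParameters(
    last_updated_after=datetime.utcnow() - timedelta(hours=1),
    last_updated_before=datetime.utcnow() + timedelta(hours=1))
for activity in client.activity_runs.query_by_pipeline_run(
        rg, df, run.run_id, filters).value:
    print(activity.activity_name, activity.status)

As in the portal flow, the output folder doesn't need to exist beforehand; the Copy activity creates it when it writes emp.txt.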

Next steps

The pipeline in this sample copies data from one location to another location in Azure Blob storage. To learn about using Data Factory in more scenarios, go through the tutorials.