How to access on-premises data sources in Data Factory for Microsoft Fabric

Data Factory for Microsoft Fabric is a powerful cloud-based data integration service that allows you to create, schedule, and manage workflows for various data sources. In scenarios where your data sources are located on-premises, Microsoft provides the on-premises data gateway to securely bridge the gap between your on-premises environment and the cloud. This document guides you through the process of accessing on-premises data sources within Data Factory for Microsoft Fabric using the on-premises data gateway.

Create an on-premises data gateway

  1. An on-premises data gateway is a software application that you install within your local network environment. It provides a secure channel between your on-premises data sources and Microsoft cloud services. For detailed instructions on how to download and install the gateway, refer to Install an on-premises data gateway.

    Screenshot showing the on-premises data gateway setup.

  2. Sign in with your user account to access the on-premises data gateway. After you sign in, the gateway is ready to use; a quick way to confirm that the gateway service is running is sketched after these steps.

    Screenshot showing the on-premises data gateway setup after the user signed in.
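If you want to confirm from a script that the gateway is installed and running on the local machine, a minimal check like the following can help. This is a sketch only: the Windows service name PBIEgwService is an assumption based on the standard-mode gateway, so verify the actual service name in services.msc on your gateway machine.

```python
import subprocess

# Assumed service name for the on-premises data gateway (standard mode);
# confirm the real name in services.msc before relying on this check.
GATEWAY_SERVICE = "PBIEgwService"


def gateway_service_running(service_name: str = GATEWAY_SERVICE) -> bool:
    """Return True if the Windows service reports a RUNNING state."""
    result = subprocess.run(
        ["sc", "query", service_name],
        capture_output=True,
        text=True,
    )
    return "RUNNING" in result.stdout


if __name__ == "__main__":
    state = "running" if gateway_service_running() else "not running (or not installed)"
    print(f"On-premises data gateway service is {state}.")
```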

Create a connection for your on-premises data source

  1. Navigate to the admin portal and select the settings button (an icon that looks like a gear) at the top right of the page. Then choose Manage connections and gateways from the dropdown menu that appears.

    Screenshot showing the Settings menu with Manage connections and gateways highlighted.

  2. In the New connection dialog that appears, select On-premises, and then provide your gateway cluster along with the associated resource type and the relevant connection details. If you want to confirm that your gateway is registered with the service before creating the connection, see the sketch after these steps.

    Screenshot showing the New connection dialog with On-premises selected.
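Gateways can also be inspected programmatically. The following is a minimal sketch that lists the gateways registered for your organization through the Power BI REST API; it assumes you already hold a Microsoft Entra ID access token with the appropriate Power BI scope (for example, acquired with the msal library), and the token value shown is a placeholder.

```python
import requests

# Placeholder token: acquire a real Microsoft Entra ID token with the
# Power BI scope (for example, via the msal library) before running this.
ACCESS_TOKEN = "<access-token>"
GATEWAYS_URL = "https://api.powerbi.com/v1.0/myorg/gateways"

response = requests.get(
    GATEWAYS_URL,
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    timeout=30,
)
response.raise_for_status()

# Each entry includes the gateway id and name; confirm that the gateway you
# installed earlier appears here before you create connections against it.
for gateway in response.json().get("value", []):
    print(gateway.get("id"), gateway.get("name"))
```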

Connect your on-premises data source to a Dataflow Gen2 in Data Factory for Microsoft Fabric

  1. Go to your workspace and create a Dataflow Gen2.

    Screenshot showing a demo workspace with the new Dataflow Gen2 option highlighted.

  2. Add a new source to the dataflow and select the connection established in the previous step.

    Screenshot showing the Connect to data source dialog in a Dataflow Gen2 with an on-premises source selected.

  3. Use the Dataflow Gen2 to perform any data transformations that your scenario requires.

    Screenshot showing the Power Query editor with some transformations applied to the sample data source.

  4. Use the Add data destination button on the Home tab of the Power Query editor to add a destination for your data from the on-premises source.

    Screenshot showing the Power Query editor with the Add data destination button selected, showing the available destination types.

  5. Publish the Dataflow Gen2.

    Screenshot showing the Power Query editor with the Publish button highlighted.

Now you've created a Dataflow Gen2 to load data from an on-premises data source into a cloud destination.

Using on-premises data in a pipeline (Preview)

  1. Go to your workspace and create a data pipeline.

    Screenshot showing how to create a new data pipeline.

Note

You need to configure the firewall to allow outbound connections to *.frontend.clouddatahub.net from the gateway in order to use Fabric pipeline capabilities.
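To verify that outbound rule from the gateway machine, a quick TCP connectivity check can help. The following is a minimal sketch: the hostname shown is a placeholder only, because the concrete endpoints under *.frontend.clouddatahub.net vary, so substitute a hostname observed in your gateway logs or network traces.

```python
import socket

# Placeholder hostname: substitute a concrete endpoint under
# *.frontend.clouddatahub.net observed in your gateway logs or traces.
HOST = "example.frontend.clouddatahub.net"
PORT = 443  # outbound HTTPS


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print(f"{HOST}:{PORT} reachable: {can_reach(HOST, PORT)}")
```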

  2. On the Home tab of the pipeline editor, select Copy data and then Use copy assistant. On the assistant's Choose data source page, add a new source to the activity and select the connection you established earlier.

    Screenshot showing where to choose a new data source from the Copy data activity.

  3. Select a destination for your data from the on-premises data source.

    Screenshot showing where to choose the data destination in the Copy activity.

  4. Run the pipeline. If you'd prefer to trigger the run programmatically instead, see the sketch at the end of this article.

    Screenshot showing where to run the pipeline in the pipeline editor window.

Now you've created and run a pipeline to load data from an on-premises data source into a cloud destination.
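If you prefer to trigger the pipeline run from a script rather than the editor, the Fabric REST API exposes an on-demand job endpoint for workspace items. The following is a minimal sketch under that assumption; the workspace ID, pipeline item ID, and token are placeholders, and you should confirm the current endpoint shape and jobType value in the Fabric REST API reference before relying on it.

```python
import requests

# Placeholders: supply a real Microsoft Entra ID token with the Fabric API
# scope, plus the workspace and pipeline item IDs from your tenant.
ACCESS_TOKEN = "<access-token>"
WORKSPACE_ID = "<workspace-id>"
PIPELINE_ITEM_ID = "<pipeline-item-id>"

# Assumed "run on demand item job" endpoint; verify against the Fabric REST
# API reference.
url = (
    "https://api.fabric.microsoft.com/v1/"
    f"workspaces/{WORKSPACE_ID}/items/{PIPELINE_ITEM_ID}/jobs/instances"
)

response = requests.post(
    url,
    params={"jobType": "Pipeline"},
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    timeout=30,
)
response.raise_for_status()

# The request is accepted asynchronously; the Location header points to the
# job instance that you can poll for the run status.
print(response.status_code, response.headers.get("Location"))
```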