Extend your on-premises big data investments with HDInsight

Extend your on-premises big data investments to the cloud and transform your business using the advanced analytics capabilities of HDInsight. The integration of WANdisco Fusion with Azure HDInsight presents an enterprise solution that enables organizations to meet stringent data availability and compliance requirements while seamlessly moving production data at petabyte scale from on-premises big data deployments to Microsoft Azure.
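To give a feel for what "petabyte scale" means in practice, the following back-of-the-envelope sketch estimates how long an initial bulk copy would take over a dedicated link. The link speeds and the 70% sustained utilization figure are illustrative assumptions, not values from this solution, and the helper name `transfer_days` is hypothetical.

```python
def transfer_days(data_bytes: float, link_gbps: float, utilization: float = 0.7) -> float:
    """Estimate the days needed to move data_bytes over a link.

    link_gbps is the nominal link speed in gigabits per second.
    utilization is the assumed fraction of that speed sustained in practice.
    """
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    usable_bits_per_second = link_gbps * 1e9 * utilization
    seconds = data_bytes * 8 / usable_bits_per_second
    return seconds / 86_400  # seconds per day


PETABYTE = 1e15  # decimal petabyte, in bytes

if __name__ == "__main__":
    for gbps in (1, 10):  # assumed link speeds, for illustration only
        days = transfer_days(PETABYTE, gbps)
        print(f"{gbps:>2} Gbps link: about {days:.1f} days per petabyte")
```

Even on a fast link the first copy can take days or weeks, which is one reason continuous replication of a live cluster is attractive compared with a one-time offline migration.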

Architecture

[Figure: Architecture diagram]

  1. Establish ExpressRoute between the on-premises infrastructure and Microsoft datacenters to provide a private connection for reliable, fast, and secure data replication from an on-premises Hadoop setup to an Azure HDInsight cluster.
  2. Install the WANdisco Fusion server in the same Azure Virtual Network as the HDInsight cluster. This allows the server to access the cluster in a secure manner.
  3. Install the WANdisco Fusion app on an HDInsight cluster (new or existing). In the License key field, enter the public IP address of the Fusion server.
  4. Configure the Fusion app on the HDInsight cluster to set up continuous active replication from on-premises big data/Hadoop deployments to Azure HDInsight, along with multi-region replication, backup and restore, and more.
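The prerequisites in the steps above can be sketched as a small pre-flight check to run before configuring replication. This is a minimal illustration, not part of WANdisco Fusion or any Azure SDK: the `ReplicationPlan` class, its field names, and the example addresses are all hypothetical, and the check only validates values you supply by hand.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import List


@dataclass
class ReplicationPlan:
    """Hand-entered description of a planned deployment (hypothetical model)."""
    expressroute_established: bool   # step 1
    vnet_cidr: str                   # address space of the Azure Virtual Network
    fusion_server_private_ip: str    # step 2
    cluster_private_ip: str          # step 2, e.g. a cluster head node
    license_key_field: str           # step 3, public IP of the Fusion server


def preflight(plan: ReplicationPlan) -> List[str]:
    """Return a list of problems; an empty list means the plan looks consistent."""
    problems = []

    # Step 1: the private connection must exist before replication starts.
    if not plan.expressroute_established:
        problems.append("step 1: ExpressRoute is not established")

    # Step 2: the Fusion server and the cluster must share one virtual network.
    vnet = ip_network(plan.vnet_cidr)
    for label, addr in (("Fusion server", plan.fusion_server_private_ip),
                        ("HDInsight cluster", plan.cluster_private_ip)):
        if ip_address(addr) not in vnet:
            problems.append(f"step 2: {label} {addr} is outside {plan.vnet_cidr}")

    # Step 3: the License key field should hold a well-formed IP address.
    try:
        ip_address(plan.license_key_field)
    except ValueError:
        problems.append("step 3: License key field is not a valid IP address")

    return problems


if __name__ == "__main__":
    plan = ReplicationPlan(
        expressroute_established=True,
        vnet_cidr="10.0.0.0/16",              # assumed example range
        fusion_server_private_ip="10.0.1.4",  # assumed example address
        cluster_private_ip="10.0.2.5",        # assumed example address
        license_key_field="203.0.113.10",     # documentation-range address
    )
    for problem in preflight(plan) or ["no problems found"]:
        print(problem)
```

Keeping the checks in one function that returns a list of messages makes it easy to report every issue at once rather than stopping at the first failure.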

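Components

  1. Azure ExpressRoute: the private connection between the on-premises infrastructure and Microsoft datacenters.
  2. Azure Virtual Network: the network that hosts both the WANdisco Fusion server and the HDInsight cluster.
  3. Azure HDInsight: the cluster that receives the replicated data.
  4. WANdisco Fusion: the server and cluster app that perform the replication.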
