Solution Idea
Extend your on-premises big data investments to the cloud and transform your business using the advanced analytics capabilities of HDInsight. The integration of WANdisco Fusion with Azure HDInsight presents an enterprise solution that enables organizations to meet stringent data availability and compliance requirements while seamlessly moving production data at petabyte scale from on-premises big data deployments to Microsoft Azure.
Architecture
- Establish an ExpressRoute connection between the on-premises infrastructure and Microsoft datacenters. This private connection allows reliable, fast, and secure data replication from the on-premises Hadoop deployment to an Azure HDInsight cluster.
- Install the WANdisco Fusion server in the same Azure Virtual Network as the HDInsight cluster. This allows the server to access the cluster in a secure manner.
- Install the WANdisco Fusion app on an HDInsight cluster (new or existing). In the License key field, enter the public IP address of the Fusion server.
- Configure the Fusion app on the HDInsight cluster to set up continuous active replication from on-premises big data or Hadoop deployments to Azure HDInsight, as well as multi-region replication and backup and restore.
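
The requirement that the Fusion server sit in the same virtual network as the HDInsight cluster can be sanity-checked before installation. The following is a minimal sketch, not part of the WANdisco or Azure tooling: it uses only Python's standard `ipaddress` module, and the address prefix, host IPs, and function names are hypothetical values chosen for illustration. It assumes you already know the virtual network's address space and the private IPs of the hosts involved.

```python
import ipaddress


def in_address_space(ip, prefixes):
    """Return True if ip falls inside any of the given CIDR prefixes."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(p) for p in prefixes)


def check_placement(vnet_prefixes, fusion_server_ip, cluster_node_ips):
    """Return the names of hosts whose private IP lies outside the virtual network."""
    hosts = {"fusion-server": fusion_server_ip}
    hosts.update(
        {f"cluster-node-{i}": ip for i, ip in enumerate(cluster_node_ips)}
    )
    return [
        name for name, ip in hosts.items()
        if not in_address_space(ip, vnet_prefixes)
    ]


# Hypothetical address space and private IPs; replace with your own values.
VNET_PREFIXES = ["10.1.0.0/16"]
outside = check_placement(VNET_PREFIXES, "10.1.2.4", ["10.1.1.10", "10.1.1.11"])
print(outside)  # an empty list means every host is inside the address space
```

An empty result only shows that the addresses are consistent with a single virtual network. It does not verify network security group rules, peering, or the ExpressRoute circuit itself.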
Components
- Apache Hadoop or Apache Spark
- Metadata store
- Local edge router
- Azure ExpressRoute circuit
- Microsoft Edge router
- Data replication (WANdisco's LiveData Migrator for Azure and LiveData Plane for Azure)
- Azure HDInsight
- Azure Virtual Network
Next steps
Learn more about the component technologies:
- What is Azure ExpressRoute?
- Migrate your Hadoop data lakes with WANdisco LiveData Platform for Azure
- What is Azure HDInsight?
- What is Azure Virtual Network?
Related resources
Explore related architectures: