This feature is in Public Preview.
Qlik Replicate helps you pull data from multiple data sources (Oracle, Microsoft SQL Server, SAP, mainframe, and more) into Delta Lake. Replicate's automated change data capture (CDC) spares you the heavy lifting of manually extracting data, transferring it with an API script, and then chopping, staging, and importing it. Qlik Compose automates the CDC into Delta Lake.
For information about Qlik Sense, a solution that helps you analyze data in Delta Lake, see Qlik Sense.
For a general demonstration of Qlik Replicate, watch the following YouTube video (14 minutes).
For a demonstration of data pipelines with Qlik Replicate, see the following YouTube video (6 minutes).
Here are the steps for using Qlik Replicate with Azure Databricks.
Step 1: Generate an Azure Databricks personal access token
Qlik Replicate authenticates with Azure Databricks using an Azure Databricks personal access token. To generate a personal access token, follow the instructions in Generate a personal access token.
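Before entering the token in Qlik Replicate, it can help to confirm that the token is accepted by the workspace. The following is a minimal sketch, not part of the Qlik setup itself: it builds (but does not send) a request to the Clusters API `clusters/list` endpoint with the token as a bearer credential. The workspace URL and token values are hypothetical placeholders, and the helper name `build_clusters_list_request` is introduced here only for illustration.

```python
import urllib.request


def build_clusters_list_request(workspace_url: str, token: str) -> urllib.request.Request:
    """Build a GET request to the Clusters API, authenticated with a personal access token.

    The request is only constructed here; nothing is sent over the network.
    """
    url = workspace_url.rstrip("/") + "/api/2.0/clusters/list"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


# Hypothetical placeholder values; substitute your own workspace URL and token.
req = build_clusters_list_request(
    "https://adb-1234567890123456.7.azuredatabricks.net/",
    "dapiXXXXXXXXXXXXXXXX",
)
print(req.get_method(), req.full_url)
```

To actually send the request, pass `req` to `urllib.request.urlopen`; an HTTP 200 response would suggest that the token is valid for that workspace.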
Step 2: Set up a cluster to support integration needs
Qlik Replicate writes data to an Azure Data Lake Storage path, and the Azure Databricks integration cluster reads data from that location. Therefore, the integration cluster requires secure access to the Azure Data Lake Storage path.
Secure access to an Azure Data Lake Storage path
To secure access to data in Azure Data Lake Storage (ADLS), you can use either an Azure storage account access key (recommended) or an Azure service principal.
Use an Azure storage account access key
You can configure a storage account access key on the integration cluster as part of the Spark configuration. Ensure that the storage account has access to the ADLS container and file system used for staging data and the ADLS container and file system where you want to write the Delta Lake tables. To configure the integration cluster to use the key, follow the steps in Get started with Azure Data Lake Storage Gen2.
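As a rough illustration of what the access key setting might look like in the cluster's Spark configuration, the sketch below assembles the ABFS account-key property for an ADLS Gen2 account. It assumes the key is stored in a Databricks secret scope and referenced with the `{{secrets/<scope>/<key>}}` syntax rather than pasted in plain text. The storage account, scope, and secret names are hypothetical, and depending on your setup the property may need a `spark.hadoop.` prefix, so check the linked instructions for the exact form.

```python
def access_key_spark_conf(storage_account: str, secret_scope: str, secret_name: str) -> dict:
    """Return the Spark config entry that supplies an ADLS Gen2 account key.

    The value is a secret reference, so the key itself never appears in the config.
    """
    prop = f"fs.azure.account.key.{storage_account}.dfs.core.windows.net"
    secret_ref = "{{secrets/" + secret_scope + "/" + secret_name + "}}"
    return {prop: secret_ref}


def render_spark_conf(conf: dict) -> str:
    """Render config entries as 'key value' lines, as used in the cluster Spark config box."""
    return "\n".join(f"{key} {value}" for key, value in conf.items())


# Hypothetical names for illustration only.
key_conf = access_key_spark_conf("mystorageacct", "qlik-scope", "adls-key")
print(render_spark_conf(key_conf))
```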
Use an Azure service principal
You can configure a service principal on the Azure Databricks integration cluster as part of the Spark configuration. Ensure that the service principal has access to the ADLS container used for staging data and the ADLS container where you want to write the Delta tables. To configure the integration cluster to use the service principal, follow the steps in Access ADLS Gen2 with service principal or Access ADLS Gen1 with service principal.
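For the service principal route on ADLS Gen2, the Spark configuration typically consists of a group of OAuth properties for the ABFS driver. The sketch below assembles them under stated assumptions: the storage account, application (client) ID, directory (tenant) ID, and secret scope names are all hypothetical placeholders, and the client secret is referenced from a secret scope instead of being written inline. Treat it as an illustration of the shape of the configuration, not as the exact settings from the linked instructions.

```python
def service_principal_spark_conf(
    storage_account: str,
    application_id: str,
    directory_id: str,
    secret_scope: str,
    secret_name: str,
) -> dict:
    """Return the ABFS OAuth properties for accessing ADLS Gen2 with a service principal."""
    suffix = f"{storage_account}.dfs.core.windows.net"
    secret_ref = "{{secrets/" + secret_scope + "/" + secret_name + "}}"
    return {
        f"fs.azure.account.auth.type.{suffix}": "OAuth",
        f"fs.azure.account.oauth.provider.type.{suffix}":
            "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
        f"fs.azure.account.oauth2.client.id.{suffix}": application_id,
        f"fs.azure.account.oauth2.client.secret.{suffix}": secret_ref,
        f"fs.azure.account.oauth2.client.endpoint.{suffix}":
            f"https://login.microsoftonline.com/{directory_id}/oauth2/token",
    }


# Hypothetical identifiers for illustration only.
sp_conf = service_principal_spark_conf(
    storage_account="mystorageacct",
    application_id="00000000-0000-0000-0000-000000000000",
    directory_id="11111111-1111-1111-1111-111111111111",
    secret_scope="qlik-scope",
    secret_name="sp-client-secret",
)
for key, value in sp_conf.items():
    print(key, value)
```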
Specify the cluster configuration
Set Cluster Mode to Standard.
Set Databricks Runtime Version to a Databricks runtime version.
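Enable optimized writes and auto compaction by adding the following properties to the Spark configuration:
spark.databricks.delta.optimizeWrite.enabled true
spark.databricks.delta.autoCompact.enabled true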
Configure your cluster depending on your integration and scaling needs.
For cluster configuration details, see Configure clusters.
Step 3: Obtain JDBC and ODBC connection details to connect to a cluster
To connect an Azure Databricks cluster to Qlik Replicate, you need the following JDBC/ODBC connection properties:
- JDBC URL
- HTTP Path
To obtain these values, follow the steps in Retrieve the connection details.
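The HTTP path is usually also embedded in the JDBC URL as an `httpPath` property, so one value can be cross-checked against the other. The sketch below parses a JDBC URL into its host, port, and HTTP path. The URL layout shown (semicolon-separated `key=value` properties after the host and database) and every value in the example are assumptions for illustration; compare them with what your cluster's connection details page actually displays. The fallback to port 443 when no port is given is also an assumption of this sketch.

```python
def parse_jdbc_url(jdbc_url: str) -> dict:
    """Extract host, port, and httpPath from a semicolon-delimited JDBC URL."""
    _, _, rest = jdbc_url.partition("://")
    authority_and_db, *params = rest.split(";")
    host_port = authority_and_db.split("/", 1)[0]
    host, _, port = host_port.partition(":")
    # Collect key=value properties such as httpPath, ssl, and AuthMech.
    props = dict(p.split("=", 1) for p in params if "=" in p)
    return {
        "host": host,
        "port": int(port) if port else 443,  # assumed default when the port is omitted
        "http_path": props.get("httpPath"),
    }


# Hypothetical URL for illustration only.
example_url = (
    "jdbc:spark://adb-1234567890123456.7.azuredatabricks.net:443/default;"
    "transportMode=http;ssl=1;"
    "httpPath=sql/protocolv1/o/1234567890123456/0123-456789-abcde123;"
    "AuthMech=3;UID=token;PWD=<personal-access-token>"
)
details = parse_jdbc_url(example_url)
print(details["host"], details["port"], details["http_path"])
```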
Step 4: Configure Qlik Replicate with Azure Databricks
Go to the Qlik login page and follow the instructions.