I am trying to deploy a real-time ETL solution on Microsoft Azure, but I am running into challenges.
My data source is an Oracle database, with a self-hosted Integration Runtime installed on the same server as the database.
I want to pick up data from the Oracle database as it arrives (in real time) and send it to Google BigQuery and the Azure managed Elasticsearch service, using Azure Data Lake as a staging area.
What I have done currently:
- I have the Integration Runtime installed on the server with the Oracle database, and a copy activity in my ADF pipeline that picks up yesterday's data from Oracle (the date filter is configured in my select script).
- Then a Python script picks up the data from the data lake, transforms it, and sends it to Google BigQuery and Elasticsearch. The script runs on Azure Databricks, and I run the Databricks notebook from an Azure Data Factory pipeline.
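For reference, the "yesterday" filter in my select script boils down to a one-day window like the sketch below (`SALES` and `CREATED_AT` are placeholder names, not my real schema):

```python
from datetime import date, timedelta

def yesterday_window(today: date):
    """Return the [start, end) date window covering all of yesterday."""
    start = today - timedelta(days=1)
    return start, today

def build_select(today: date) -> str:
    """Build the Oracle query used by the ADF copy activity source.
    Table and column names are placeholders for illustration."""
    start, end = yesterday_window(today)
    return (
        "SELECT * FROM SALES "
        f"WHERE CREATED_AT >= DATE '{start.isoformat()}' "
        f"AND CREATED_AT < DATE '{end.isoformat()}'"
    )

print(build_select(date(2023, 5, 2)))
```

So each daily run copies exactly one day's worth of rows into the data lake.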
So my ADF pipeline has one activity that copies data from Oracle to the data lake, and another activity that runs the Databricks notebook.
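A stripped-down sketch of what the notebook does (credentials, connection settings, and my real schema are omitted; field names, the table ID, and the index name are placeholders):

```python
import json

def transform(record: dict) -> dict:
    """Normalize one raw record read from the data lake before loading.
    Field names here are placeholders for illustration."""
    return {
        "id": str(record["ID"]),
        "amount": float(record["AMOUNT"]),
        "created_at": record["CREATED_AT"],
    }

def load_to_bigquery(rows, table_id: str):
    """Stream rows into BigQuery; requires the google-cloud-bigquery package."""
    from google.cloud import bigquery  # lazy import: keeps transform() usable without it
    client = bigquery.Client()
    errors = client.insert_rows_json(table_id, rows)
    if errors:
        raise RuntimeError(f"BigQuery insert errors: {errors}")

def load_to_elasticsearch(rows, index: str):
    """Bulk-index rows into Elasticsearch; requires the elasticsearch package."""
    from elasticsearch import Elasticsearch, helpers
    es = Elasticsearch()  # connection details omitted
    helpers.bulk(
        es,
        ({"_index": index, "_id": r["id"], "_source": r} for r in rows),
    )

# Demo of the transform step only (the loaders need live credentials).
raw = [{"ID": 1, "AMOUNT": "9.50", "CREATED_AT": "2023-05-01T10:00:00"}]
print(json.dumps([transform(r) for r in raw]))
```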
But I am having trouble configuring streaming from the Oracle database. How do I tell ADF to trigger the pipeline in real time when there is new data in the Oracle database? How do I go about this?