question

0 Votes
Raja-1642 asked · Raja-1642 answered

Coding alternative to Azure Data Factory

Hi Team,
I have been using Azure Data Factory for various ETL activities: copying from various data sources, transforming the data, and writing it to other destinations. Sometimes the pipeline I'm creating becomes very complex (it's not a simple transformation). For example, I need to connect to an external server (using REST), get data, perform many steps, and finally write to different files. This requires a good amount of logic, which makes the Data Factory pipeline look very complex and difficult to read. Is there a better alternative where I can code all of this instead of using the predefined Azure Data Factory activities?

I can write a Python/Java program and, instead of using those predefined boxes, write my own custom code. Maybe Synapse or Databricks or something else can be used. Which is the better alternative (at similar cost)?
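For illustration, this is the kind of custom code I have in mind. Everything below is a made-up sketch: the endpoint URL, field names, and output files are placeholders, not a real API.

```python
# Sketch only: placeholder endpoint, field names, and file layout.
import csv
import json
import urllib.request

API_URL = "https://example.com/api/orders"  # made-up REST endpoint


def fetch_orders(url=API_URL):
    """Connect to the external server over REST and decode the JSON payload."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def transform(orders):
    """The 'many steps' part: filter rows, derive a column, reshape."""
    rows = []
    for o in orders:
        if o.get("status") == "cancelled":       # step 1: drop cancelled orders
            continue
        total = o["quantity"] * o["unit_price"]  # step 2: derive a total
        rows.append({"id": o["id"], "total": round(total, 2)})  # step 3: reshape
    return rows


def write_outputs(rows, json_path, csv_path):
    """Finally, write the same result to different files."""
    with open(json_path, "w") as f:
        json.dump(rows, f, indent=2)
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "total"])
        writer.writeheader()
        writer.writerows(rows)


# Typical run (not executed here, since the endpooint is fictional):
#   write_outputs(transform(fetch_orders()), "orders.json", "orders.csv")
```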

Thanks,
Raja

azure-data-factory

@Raja-1642 did any of the responses help? If one solved your issue, please mark as accepted answer. If not, please let us know how better to assist.

1 Vote
MartinJaffer-MSFT answered

Hello @Raja-1642 and welcome to Microsoft Q&A.

In addition to @KuldeepChitrakar-0966's suggestion of Logic Apps and Azure Automate, there is also Azure Functions, which is purely code.

Within Azure Data Factory itself there is the Custom Activity, which uses Azure Batch to run custom code you provide.

For big data, Azure Databricks, Azure Synapse Analytics, and HDInsight are better suited. HDInsight is heavyweight and always-on, so it is probably the most expensive of the three.
Databricks runs on Spark and provides notebooks where you can write PySpark, Scala, and SQL.
Azure Synapse is the most diverse: it is the confluence of Data Factory, Spark (like Databricks), SQL (both on-demand and dedicated), and more than I can reliably remember.
I think both Databricks and Synapse also support Java; I know they accept .jar files.
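To make the "purely code" route concrete, here is a minimal sketch of how such logic could sit inside an Azure Function. The route name and payload shape are illustrative assumptions, and the actual Functions binding is left in comments so the core stays plain Python:

```python
# Illustrative sketch: the transformation is plain Python; the Azure
# Functions wiring (commented out below) is an assumption about how it
# would be hosted, not code taken from the question.
import json


def run_etl(payload):
    """Core step: keep only the active records from the fetched JSON payload."""
    return json.dumps([r for r in payload if r.get("active")])


# In a Function App (Python v2 programming model) this could be wired up
# roughly as follows:
#
#   import azure.functions as func
#   app = func.FunctionApp()
#
#   @app.route(route="etl", auth_level=func.AuthLevel.FUNCTION)
#   def etl(req: func.HttpRequest) -> func.HttpResponse:
#       return func.HttpResponse(run_etl(req.get_json()),
#                                mimetype="application/json")
```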

Does this help?

0 Votes
KuldeepChitrakar-0966 answered

You can think of orchestrating the pipelines using Logic Apps or Azure Automate.

0 Votes
Raja-1642 answered

Hi @KuldeepChitrakar-0966 ,
Thanks for the prompt response. As far as I understand, Data Factory is good when you want a simple ETL job and don't want to code, but would rather use simple drag-and-drop in the UI.

Out of all the options you have provided, it seems Azure Functions is a good choice, as I don't need to run a cluster.
