Azure runbooks or Azure Batch for using Python 3 as part of data pipelines in Azure Data Factory?

Luuk van Vliet 276 Reputation points
2021-01-10T15:26:43.367+00:00

Hello,

We have some scenarios where we need to include Python 3 in our ADF pipelines. Workloads are small (under 1 GB of data).

Now I have a few requirements.

  • From within the Python script I must be able to access my Azure SQL database (so does either option support ODBC, e.g. pyodbc?).
  • Packages (and their dependencies) should be easy to manage. So far I have tried adding packages to an Azure runbook (Python 3 preview), and I ended up downloading them manually and adding them one by one, which is a poor solution.

What would you advise? Solutions beyond runbooks and the Batch service are also very welcome, as long as they are supported in Azure Data Factory.

Azure Batch
An Azure service that provides cloud-scale job scheduling and compute management.
Azure Data Factory
An Azure service for ingesting, preparing, and transforming data at scale.
Azure Automation
An Azure service that is used to automate, configure, and install updates across hybrid environments.

1 answer

  1. HimanshuSinha-msft 19,376 Reputation points Microsoft Employee
    2021-01-12T03:41:50.567+00:00

    Hello @Luuk van Vliet,
    Thanks for the question and for using the forum.
    If the intent is to execute a Python script from your pipeline, I think you can use Azure Databricks. You can create a cluster, install the packages on it, and keep the Python code in a notebook. You can initialize the cluster from ADF. I have tried this and it worked for me.

    Coming back to the options you mentioned, I think Azure Batch should also work.
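
    On the pyodbc requirement, here is a minimal sketch of what the Python side could look like on either option. It assumes that pyodbc and a Microsoft ODBC driver (here "ODBC Driver 17 for SQL Server") are installed on the compute, and that SQL authentication is used. The helper names, server, database, and credentials are all hypothetical placeholders, not something taken from your environment.

```python
def build_connection_string(server, database, user, password,
                            driver="ODBC Driver 17 for SQL Server"):
    """Build an ODBC connection string for an Azure SQL database.

    Assumption: the named ODBC driver is installed on the machine or
    cluster node that runs this script. Values containing ';' or '}'
    would need extra escaping, which this sketch does not handle.
    """
    return (
        f"DRIVER={{{driver}}};"
        f"SERVER=tcp:{server},1433;"
        f"DATABASE={database};"
        f"UID={user};PWD={password};"
        "Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;"
    )


def fetch_rows(conn_str, query):
    """Run a query and return all rows (hypothetical helper)."""
    # Imported here so the rest of the module loads even where
    # pyodbc is not installed yet.
    import pyodbc

    conn = pyodbc.connect(conn_str)
    try:
        cursor = conn.cursor()
        cursor.execute(query)
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    # Placeholder values; in practice read secrets from a secure store
    # rather than hard-coding them in the script.
    conn_str = build_connection_string(
        "myserver.database.windows.net", "mydb", "myuser", "mypassword"
    )
    print(conn_str.split("PWD=")[0])  # print without the password
```

    The connection string is built separately from the connection itself so that you can check it without a live database; `fetch_rows` is the only part that actually needs pyodbc and the driver.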

    Thanks
    Himanshu