question

vijie-3055 asked MilesCole-2434 edited

Single Node Databricks Job cluster from Azure Data Factory

I need to create a Single Node Databricks job cluster from Azure Data Factory. Currently, Azure Data Factory has no option to choose the cluster mode (Standard or Single Node). We can't set the number of workers to 0, since a Standard cluster needs at least one worker node to execute Spark commands, whereas a Single Node cluster does not. An alternative is to use a Single Node cluster via the Interactive Cluster option, but we want a job cluster so that it gets deleted automatically after the process completes.

[Screenshot: 90835-adf-job-cluster-adb.png]

Will this feature be added to ADF in the near future?

azure-data-factory, azure-databricks

This is possible. The site below explains how to get ADF to spin up a Single Node cluster. It is not hard, just not straightforward from the UI.

https://methodidacte-org.translate.goog/2020/10/utiliser-un-automated-cluster-single-node/?_x_tr_sch=http&_x_tr_sl=fr&_x_tr_tl=en&_x_tr_hl=en&_x_tr_pto=nui,sc




From the linked service JSON:



 "newClusterSparkConf": {
                 "spark.master": "local[*, 4]",
                 "spark.databricks.cluster.profile": "singleNode"
             }


PRADEEPCHEEKATLA-MSFT answered vijie-3055 commented

Hello @vijie-3055,

Thanks for the question and for using the Microsoft Q&A platform.

Unfortunately, you cannot select the Single Node option when using a new job cluster from Azure Data Factory.

I suggest that you vote for an idea submitted by another Azure customer.

https://feedback.azure.com/forums/270578-data-factory/suggestions/42777137-add-to-databricks-linked-service-the-cluster-mode

All of the feedback you share in these forums will be monitored and reviewed by the Microsoft engineering teams responsible for building Azure.

If you want to use a single node cluster, you can create one from the Azure Databricks portal and select it by choosing Existing interactive cluster while creating a new linked service.
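For reference, a linked service set up this way points at the cluster by ID instead of defining a new one. Below is a rough Python sketch that builds such a definition. existingClusterId is the linked service property for an existing interactive cluster; the function name and all argument values are placeholders made up for illustration, and authentication settings are left out.

```python
import json


def interactive_cluster_linked_service(name: str, domain: str, cluster_id: str) -> dict:
    """Build a linked service definition that reuses an existing cluster.

    Hypothetical helper: authentication settings are omitted and must be added.
    """
    return {
        "name": name,
        "properties": {
            "type": "AzureDatabricks",
            "typeProperties": {
                "domain": domain,
                # ID of the single node cluster created in the Databricks portal.
                "existingClusterId": cluster_id,
            },
        },
    }


if __name__ == "__main__":
    # Placeholder values; replace with your workspace URL and cluster ID.
    definition = interactive_cluster_linked_service(
        "DatabricksInteractive", "<Domain>", "<clusterID>"
    )
    print(json.dumps(definition, indent=4))
```

Keep in mind that, unlike a job cluster, an interactive cluster is not deleted when the pipeline run finishes.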


Hope this helps. Do let us know if you have any further queries.

Please don’t forget to Accept Answer and Up-Vote wherever the information provided helps you; this can be beneficial to other community members.



Thank you @PRADEEPCHEEKATLA-MSFT, I have voted for the feature support in ADF. Yes, we already have the alternative option of using an Interactive cluster.

MilesCole-2434 answered MilesCole-2434 edited

This is possible. The site below explains how to get ADF to spin up a Single Node cluster. It is not hard, just not straightforward from the UI.

Credit to the site below, where I found this:
utiliser-un-automated-cluster-single-node



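Linked Service JSON. The settings that make the cluster single node are newClusterNumOfWorker, spark.master, and spark.databricks.cluster.profile: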

 {
     "name": "Databricks",
     "properties": {
         "annotations": [],
         "type": "AzureDatabricks",
         "typeProperties": {
             "domain": "<Domain>",
             "authentication": "MSI",
             "workspaceResourceId": "<resourceID>",
             "instancePoolId": "<poolID>",
             "newClusterNodeType": "Standard_DS12_v2",
             "newClusterNumOfWorker": "0",
             "newClusterSparkConf": {
                 "spark.master": "local[*, 4]",
                 "spark.databricks.cluster.profile": "singleNode",
                 "spark.databricks.delta.preview.enabled": "true"
             },
             "newClusterSparkEnvVars": {
                 "PYSPARK_PYTHON": "/databricks/python3/bin/python3"
             },
             "newClusterVersion": "9.1.x-scala2.12",
             "newClusterInitScripts": []
         },
         "connectVia": {
             "referenceName": "AutoResolveIntegrationRuntime",
             "type": "IntegrationRuntimeReference"
         }
     }
 }
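If you would rather script the change than edit the JSON by hand, the same settings can be applied to an exported linked service definition. The following is a minimal Python sketch, not an official ADF tool. The function name to_single_node and the sample input are hypothetical; the property names and values are the ones from the JSON above.

```python
import copy
import json

# Settings taken from the linked service JSON in this answer.
SINGLE_NODE_SPARK_CONF = {
    "spark.master": "local[*, 4]",
    "spark.databricks.cluster.profile": "singleNode",
}


def to_single_node(linked_service: dict) -> dict:
    """Return a copy of an AzureDatabricks linked service set to single node.

    Hypothetical helper: it only edits the dictionary, it does not call ADF.
    """
    result = copy.deepcopy(linked_service)
    props = result["properties"]["typeProperties"]
    # Zero workers, as in the JSON above.
    props["newClusterNumOfWorker"] = "0"
    # Keep any existing Spark settings and add the single node ones.
    spark_conf = props.setdefault("newClusterSparkConf", {})
    spark_conf.update(SINGLE_NODE_SPARK_CONF)
    return result


if __name__ == "__main__":
    # Sample input with illustrative values; replace with your exported JSON.
    standard = {
        "name": "Databricks",
        "properties": {
            "type": "AzureDatabricks",
            "typeProperties": {
                "newClusterNodeType": "Standard_DS12_v2",
                "newClusterNumOfWorker": "2",
                "newClusterVersion": "9.1.x-scala2.12",
            },
        },
    }
    print(json.dumps(to_single_node(standard), indent=4))
```

The helper works on a copy, so the original definition stays unchanged and the two can be compared before publishing.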


