Databricks support redirects to Azure support: unexpected internal error when spinning up a Databricks all-purpose cluster

ADM.Susana Domingos 0 Reputation points
2024-05-02T13:44:55.72+00:00

Hello,
What should we do when we get the following error while spinning up a Databricks all-purpose cluster?
{
  "reason": {
    "code": "CONTAINER_LAUNCH_FAILURE",
    "type": "SERVICE_FAULT",
    "parameters": {
      "instance_id": "d0...redacted...4a",
      "databricks_error_message": "Failed to launch spark container on instance d0...redacted...4a. Exception: Unexpected internal error, please contact Databricks support"
    }
  },
  "add_node_failure_details": {
    "failure_count": 4,
    "resource_type": "container",
    "will_retry": true
  }
}

Databricks support redirects queries to Azure support.

Thanks

Azure Databricks
An Apache Spark-based analytics platform optimized for Azure.

2 answers

  1. Amira Bedhiafi 16,071 Reputation points
    2024-05-02T20:49:11.4233333+00:00

    The error indicates that Spark containers failed to launch on the worker instances in your cluster.
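To make the payload easier to triage, here is a minimal sketch that extracts the relevant fields from it. The function name `summarize_failure` and the `platform_side` flag are hypothetical, introduced only for illustration; reading `SERVICE_FAULT` as "the fault is on the platform side" is an assumption based on the value shown in the question, not a documented rule.

```python
import json


def summarize_failure(raw: str) -> dict:
    """Extract the triage-relevant fields from a cluster failure payload."""
    payload = json.loads(raw)
    reason = payload.get("reason", {})
    details = payload.get("add_node_failure_details", {})
    return {
        "code": reason.get("code"),
        "type": reason.get("type"),
        # Assumption: a SERVICE_FAULT type points at the platform rather
        # than at the cluster configuration.
        "platform_side": reason.get("type") == "SERVICE_FAULT",
        "failure_count": details.get("failure_count"),
        "will_retry": details.get("will_retry"),
        "message": reason.get("parameters", {}).get("databricks_error_message"),
    }


# Payload copied from the question; the instance ID stays redacted.
example = """
{
  "reason": {
    "code": "CONTAINER_LAUNCH_FAILURE",
    "type": "SERVICE_FAULT",
    "parameters": {
      "instance_id": "d0...redacted...4a",
      "databricks_error_message": "Failed to launch spark container on instance d0...redacted...4a. Exception: Unexpected internal error, please contact Databricks support"
    }
  },
  "add_node_failure_details": {
    "failure_count": 4,
    "resource_type": "container",
    "will_retry": true
  }
}
"""

summary = summarize_failure(example)
print(summary["code"], summary["type"], summary["failure_count"], summary["will_retry"])
```

If `platform_side` is true and `will_retry` is true, the cluster is still retrying on its own, so it may be worth waiting before changing the configuration.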

    Based on an older thread, you can try the following:

    1. Check the status of the worker instances in your cluster. Make sure that they are all up and running and that there are no issues with the underlying infrastructure. You can also check the logs of the worker instances to see if there are any errors or issues that might be causing the problem.
    2. Check the configuration of your cluster. Make sure that the configuration is correct and that there are no errors or inconsistencies. You can also try changing the configuration and see if that resolves the issue.
    3. Try restarting the cluster. Sometimes, restarting the cluster can resolve the issue. Make sure to save any important data before restarting the cluster.
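Steps 1 and 3 above can also be scripted. The sketch below builds requests for the Databricks Clusters REST API (`/api/2.0/clusters/events` to list recent cluster events and `/api/2.0/clusters/restart` to restart the cluster). It assumes a personal access token and uses only the Python standard library. The helper names, the workspace URL, and the cluster ID in the usage comments are placeholders, not values from this thread.

```python
import json
import urllib.request


def build_request(host: str, token: str, endpoint: str, body: dict) -> urllib.request.Request:
    """Build an authenticated POST request for a Clusters API endpoint."""
    return urllib.request.Request(
        url=f"{host.rstrip('/')}/api/2.0/clusters/{endpoint}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def events_request(host: str, token: str, cluster_id: str, limit: int = 25) -> urllib.request.Request:
    """Request the most recent cluster events, newest first."""
    body = {"cluster_id": cluster_id, "order": "DESC", "limit": limit}
    return build_request(host, token, "events", body)


def restart_request(host: str, token: str, cluster_id: str) -> urllib.request.Request:
    """Request a restart of the given cluster."""
    return build_request(host, token, "restart", {"cluster_id": cluster_id})


def send(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON response (an empty body becomes {})."""
    with urllib.request.urlopen(request, timeout=30) as response:
        text = response.read().decode("utf-8")
    return json.loads(text) if text.strip() else {}


# Usage (placeholders; replace with your own workspace URL, token, and cluster ID):
#   host = "https://<your-workspace>.azuredatabricks.net"
#   events = send(events_request(host, token, "<cluster-id>")).get("events", [])
#   for event in events:
#       print(event.get("timestamp"), event.get("type"))
#   send(restart_request(host, token, "<cluster-id>"))
```

Building the request separately from sending it lets you inspect the URL and body before anything is sent to the workspace.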

    More links:

    https://stackoverflow.com/questions/75865288/spark-container-launch-failed

    https://community.databricks.com/t5/data-engineering/cluster-occasionally-fails-to-launch/td-p/7803


  2. BhargavaGunnam-MSFT 27,156 Reputation points Microsoft Employee
    2024-05-10T22:22:07.6033333+00:00

    Hello ADM.Susana Domingos,

    These errors originated from the Spark platform service. The product group has fixed the issue, so you shouldn't see these errors anymore. In case you still encounter them, please let me know.

    I hope this helps.
