We had multiple Data Factory pipelines fail over the last few days (4/11-4/13) with several intermittent errors.
One example of the Internal Server Error we saw:
The Internal Server Errors occurred on pipeline steps that transform data, not on steps that move data from a source to a sink. Is there a way to diagnose Internal Server Errors within Data Factory?
Another error we saw was an Azure Resource Provider throttling error. The following error message appeared:
Unexpected failure while waiting for the cluster (0412-081827-waxen351) to be ready.Cause Unexpected state for cluster (0412-081827-waxen351): AZURE_RESOURCE_PROVIDER_THROTTLING(CLOUD_FAILURE): azure_error_code:AzureResourceProviderThrottling,azure_error_message:Encountered Azure Resource Provider throttling. Please try again later. Details: ,databricks_error_message:Error code: AzureResourceProviderThrottling, error message: Encountered Azure Resource Provider throttling. Please try again later.
Is there a reason why we would be seeing an Azure Resource Provider throttling error? Was there an outage or planned maintenance with Databricks?
The errors were intermittent; however, we saw enough instances of them across our pipelines to be concerned.
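In case it helps with triage, here is a minimal sketch of how failure messages like the throttling one quoted above could be parsed to count how often each error code occurs. The helper names (`parse_failure`, `tally_error_codes`) are hypothetical, and the regular expressions assume only the message layout shown in that one example; other failures may be formatted differently.

```python
import re
from collections import Counter

# Assumption: patterns are inferred from the single error message quoted above.
CLUSTER_RE = re.compile(r"cluster \(([^)]+)\)")
ERROR_CODE_RE = re.compile(r"azure_error_code:(\w+)")


def parse_failure(message):
    """Return (cluster_id, azure_error_code); either is None if not present."""
    cluster = CLUSTER_RE.search(message)
    code = ERROR_CODE_RE.search(message)
    return (
        cluster.group(1) if cluster else None,
        code.group(1) if code else None,
    )


def tally_error_codes(messages):
    """Count occurrences of each Azure error code across failure messages."""
    return Counter(parse_failure(m)[1] for m in messages)
```

Messages that do not match the pattern (for example, a bare Internal Server Error) are counted under `None`, so they remain visible in the tally instead of being dropped.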