How to write to Data Lake Gen2 storage using Databricks in Delta format when connected using a SAS token

Sai Sunny Kothuri 0 Reputation points
2024-04-09T23:29:12.6666667+00:00

I converted my data from Parquet format to Delta format. Now I want to write the data to blob storage (Data Lake Gen2), but I am facing the error below while writing. I used the following command to write my data:

output_path = '/mnt/silver/SalesLT/Address.parquet'
df.write.format('delta').mode('overwrite').save(output_path)

org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 3.0 failed 4 times, most recent failure: Lost task 0.3 in stage 3.0 (TID 6) (10.139.64.4 executor driver): org.apache.spark.SparkException: [TASK_WRITE_FAILED] Task failed while writing rows to dbfs:/mnt/silver/SalesLT/Address.parquet.

I established the connection between Data Lake Gen2 and Databricks using a SAS token.
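For context, the mount was set up roughly along these lines (a minimal sketch: the storage account, container, and SAS token values are placeholders, and it assumes the ABFS FixedSASTokenProvider configuration keys):

# Sketch of a SAS-based ADLS Gen2 mount; replace the placeholder values
configs = {
    "fs.azure.account.auth.type.<storage-account>.dfs.core.windows.net": "SAS",
    "fs.azure.sas.token.provider.type.<storage-account>.dfs.core.windows.net": "org.apache.hadoop.fs.azurebfs.sas.FixedSASTokenProvider",
    "fs.azure.sas.fixed.token.<storage-account>.dfs.core.windows.net": "<sas-token>"
}

dbutils.fs.mount(
    source="abfss://<container>@<storage-account>.dfs.core.windows.net/",
    mount_point="/mnt/silver",
    extra_configs=configs
)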


1 answer

  1. KarishmaTiwari-MSFT 18,527 Reputation points Microsoft Employee
    2024-04-10T00:20:52.1+00:00

    This error is usually seen when the underlying data files are corrupted or otherwise problematic.

    • Ensure that the data types of the columns in your DataFrame (df) are compatible with the data types of the columns in the table you are writing to; mismatched data types can cause write failures (see the sketch after this list).
    • Verify that there are no incompatible data types or corrupted records in your data.
    • Make sure that all the files are in the same format.
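
    As a quick sanity check before writing, you can inspect the schema and cast any mismatched column. A minimal sketch (the "ModifiedDate" column name is illustrative only; adapt it to your data):

    from pyspark.sql.functions import col

    # Inspect the DataFrame schema for unexpected or mismatched types
    df.printSchema()

    # Example cast for a column whose type does not match the target table;
    # "ModifiedDate" is an illustrative column name
    df = df.withColumn("ModifiedDate", col("ModifiedDate").cast("timestamp"))

    # Retry the write once the schema looks correct
    df.write.format("delta").mode("overwrite").save(output_path)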