Scenario: IllegalArgumentException for Apache Spark activity in Azure HDInsight

This article describes troubleshooting steps and possible resolutions for issues when using Apache Spark components in Azure HDInsight clusters.

Issue

You receive the following exception when trying to execute a Spark activity in an Azure Data Factory pipeline:

Exception in thread "main" java.lang.IllegalArgumentException:
Wrong FS: wasbs://additional@xxx.blob.core.windows.net/spark-examples_2.11-2.1.0.jar, expected: wasbs://wasbsrepro-2017-11-07t00-59-42-722z@xxx.blob.core.windows.net

Cause

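A Spark job fails if the application jar file is not located in the Spark cluster’s default (primary) storage. In the exception above, the jar is in the container named additional, while the cluster's default file system is the container named wasbsrepro-2017-11-07t00-59-42-722z.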

This is a known issue in the open-source Apache Spark framework, tracked in the bug titled "Spark job fails if fs.defaultFS and application jar are different url".

This issue has been resolved in Spark 2.3.0.
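The mismatch behind the "Wrong FS" message can be illustrated with a small sketch. It is written in Python for illustration only and is not Spark's or Hadoop's actual code; the function name on_default_fs is hypothetical. It assumes that a jar counts as being on the default file system when its URI has the same scheme and the same authority (container@account) as the cluster's fs.defaultFS value.

```python
from urllib.parse import urlsplit


def on_default_fs(jar_uri: str, default_fs: str) -> bool:
    """Return True if jar_uri has the same scheme and authority as default_fs.

    Hypothetical helper: a simplified stand-in for the file system check
    that raises "Wrong FS", not the real implementation.
    """
    jar, fs = urlsplit(jar_uri), urlsplit(default_fs)
    # A path without a scheme is assumed to resolve against the default FS.
    if not jar.scheme:
        return True
    return (jar.scheme, jar.netloc.lower()) == (fs.scheme, fs.netloc.lower())


# Values taken from the exception message in this article.
default_fs = "wasbs://wasbsrepro-2017-11-07t00-59-42-722z@xxx.blob.core.windows.net"
jar = "wasbs://additional@xxx.blob.core.windows.net/spark-examples_2.11-2.1.0.jar"
print(on_default_fs(jar, default_fs))
```

With the two URIs from the exception, the containers differ (additional versus wasbsrepro-2017-11-07t00-59-42-722z), so the check reports that the jar is not on the default file system.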

Resolution

Make sure the application jar is stored on the default (primary) storage for the HDInsight cluster. In Azure Data Factory (ADF), make sure the linked service points to the HDInsight default container rather than to a secondary container.
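As a rough aid when relocating the jar, the following Python sketch computes the URI the jar would have on the default storage if its path is kept unchanged. The function name to_default_storage is hypothetical, and the sketch only builds the target URI; copying the file into the default container is a separate step that it does not perform.

```python
from urllib.parse import urlsplit, urlunsplit


def to_default_storage(jar_uri: str, default_fs: str) -> str:
    """Return the jar's URI re-rooted on default_fs, keeping the same path.

    Hypothetical helper for illustration; it does not copy any data.
    """
    jar, fs = urlsplit(jar_uri), urlsplit(default_fs)
    # Take scheme and authority from the default FS, and the path from the jar.
    return urlunsplit((fs.scheme, fs.netloc, jar.path, "", ""))


# Values taken from the exception message in this article.
default_fs = "wasbs://wasbsrepro-2017-11-07t00-59-42-722z@xxx.blob.core.windows.net"
jar = "wasbs://additional@xxx.blob.core.windows.net/spark-examples_2.11-2.1.0.jar"
print(to_default_storage(jar, default_fs))
```

Once the jar has been copied to that location, the Spark activity can reference the new URI, and the jar and fs.defaultFS then share the same scheme and authority.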

Next steps

If you didn't see your problem or are unable to solve your issue, visit one of the following channels for more support:

  • Get answers from Azure experts through Azure Community Support.

  • Connect with @AzureSupport, the official Microsoft Azure account for improving customer experience by connecting the Azure community to the right resources: answers, support, and experts.

  • If you need more help, you can submit a support request from the Azure portal. Select Support from the menu bar or open the Help + support hub. For more detailed information, review How to create an Azure support request. Access to Subscription Management and billing support is included with your Microsoft Azure subscription, and Technical Support is provided through one of the Azure Support Plans.