AnalysisException when dropping table on Azure-backed metastore
Problem
When you try to drop a table in an external Hive version 2.0 or 2.1 metastore that is deployed on Azure SQL Database, Azure Databricks throws the following exception:
com.databricks.backend.common.rpc.DatabricksExceptions$SQLExecutionException: org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:Exception thrown when executing query : SELECT 'org.apache.hadoop.hive.metastore.model.MStorageDescriptor' AS NUCLEUS_TYPE,A0.INPUT_FORMAT,A0.IS_COMPRESSED,A0.IS_STOREDASSUBDIRECTORIES,A0.LOCATION,A0.NUM_BUCKETS,A0.OUTPUT_FORMAT,A0.SD_ID FROM SDS A0 WHERE A0.CD_ID = ? OFFSET 0 ROWS FETCH NEXT ROW ONLY );
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:107)
at org.apache.spark.sql.hive.HiveExternalCatalog.doDropTable(HiveExternalCatalog.scala:483)
at org.apache.spark.sql.catalyst.catalog.ExternalCatalog.dropTable(ExternalCatalog.scala:122)
at org.apache.spark.sql.catalyst.catalog.SessionCatalog.dropTable(SessionCatalog.scala:638)
at org.apache.spark.sql.execution.command.DropTableCommand.run(ddl.scala:212)
Cause
This is a known Hive bug (HIVE-14698), caused by another known bug in the datanucleus-rdbms module that is packaged with Hive. The bug is fixed in datanucleus-rdbms 4.1.16. However, Hive 2.0 and 2.1 metastores use datanucleus-rdbms 4.1.7, which is affected.
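To check whether a given datanucleus-rdbms JAR predates the fix, the version in its file name can be compared against 4.1.16. The following is a minimal sketch, not part of the original article. The helper names (`parse_jar_version`, `is_affected`) are hypothetical, the file name pattern `datanucleus-rdbms-<version>.jar` is an assumption, and treating every version below 4.1.16 as affected is also an assumption (the article only confirms 4.1.7).

```python
import re

# First datanucleus-rdbms release that contains the fix, per the Cause section.
FIXED_VERSION = (4, 1, 16)


def parse_jar_version(jar_name):
    """Return the version in a name like 'datanucleus-rdbms-4.1.7.jar' as a tuple of ints.

    The naming pattern is an assumption; adjust the regex if your JARs differ.
    """
    match = re.fullmatch(r"datanucleus-rdbms-(\d+(?:\.\d+)*)\.jar", jar_name)
    if match is None:
        raise ValueError(f"not a datanucleus-rdbms JAR name: {jar_name!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_affected(jar_name):
    """Assume every version older than FIXED_VERSION carries the bug."""
    return parse_jar_version(jar_name) < FIXED_VERSION


if __name__ == "__main__":
    for name in ("datanucleus-rdbms-4.1.7.jar", "datanucleus-rdbms-4.1.16.jar"):
        print(name, "affected" if is_affected(name) else "not affected")
```

The versions are compared as integer tuples rather than strings, because a plain string comparison would rank "4.1.7" above "4.1.16".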
Solution
Do one of the following:
- Upgrade the Hive metastore to version 2.3.0. This also resolves problems due to any other Hive bug that is fixed in version 2.3.0.
- Import the following notebook to your workspace and follow the instructions to replace the datanucleus-rdbms JAR. This notebook is written to upgrade the metastore to version 2.1.1. You may want to run a similar version on the metastore server side.
External metastore upgrade notebook
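If you take the first option and upgrade the metastore to 2.3.0, the cluster's Hive client version generally needs to match. The sketch below is not from the original article. It renders the two standard Spark properties, `spark.sql.hive.metastore.version` and `spark.sql.hive.metastore.jars`, as cluster configuration lines. The helper name `hive_metastore_conf` is hypothetical, and the `maven` default for the JARs setting is an assumption; use whatever value suits your environment. It covers only the client-side settings, not the metastore schema upgrade itself.

```python
def hive_metastore_conf(version, jars="maven"):
    """Build Spark config lines that select the Hive metastore client version.

    `jars` is passed straight through to spark.sql.hive.metastore.jars;
    the "maven" default is an illustrative assumption.
    """
    return [
        f"spark.sql.hive.metastore.version {version}",
        f"spark.sql.hive.metastore.jars {jars}",
    ]


if __name__ == "__main__":
    # Print the lines so they can be pasted into a cluster's Spark config.
    print("\n".join(hive_metastore_conf("2.3.0")))
```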