How to set up an embedded Apache Hive metastore

You can set up an Azure Databricks cluster to use an embedded metastore. An embedded metastore is appropriate when you only need to retain table metadata for the life of the cluster. If the cluster is restarted, the metadata is lost.

If you need to persist the table metadata or other data after a cluster restart, use the default metastore or set up an external metastore.

This example uses the Apache Derby embedded metastore, which is a lightweight in-memory database. Follow the instructions in the notebook to install the metastore.
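For orientation, the kind of Spark configuration that points the Hive client at an in-memory Derby database can be sketched as follows. This is a minimal sketch and not the contents of the notebook: the helper names `embedded_derby_metastore_conf` and `to_spark_conf_lines` and the database name `metastore_db` are hypothetical, and the exact property set (in particular the DataNucleus schema options) depends on the Hive metastore version your cluster uses.

```python
def embedded_derby_metastore_conf(db_name="metastore_db"):
    """Build Spark config entries for an in-memory Derby-backed Hive metastore.

    Assumption: db_name is an illustrative name, not one taken from the notebook.
    """
    return {
        # Derby's in-memory JDBC subprotocol; create=true creates the database on first use.
        "spark.hadoop.javax.jdo.option.ConnectionURL":
            f"jdbc:derby:memory:{db_name};create=true",
        # Embedded Derby JDBC driver class.
        "spark.hadoop.javax.jdo.option.ConnectionDriverName":
            "org.apache.derby.jdbc.EmbeddedDriver",
        # Let DataNucleus create the metastore schema in the empty database.
        # Assumption: older Hive versions use different DataNucleus property names.
        "spark.hadoop.datanucleus.schema.autoCreateAll": "true",
        # Skip the schema version check, since the schema is created on the fly.
        "spark.hadoop.hive.metastore.schema.verification": "false",
    }


def to_spark_conf_lines(conf):
    """Render entries as 'key value' lines, one per line, as used in a cluster's Spark config."""
    return "\n".join(f"{key} {value}" for key, value in conf.items())


if __name__ == "__main__":
    print(to_spark_conf_lines(embedded_derby_metastore_conf()))
```

Because the database lives only in the driver's memory, any tables registered this way disappear when the cluster restarts, which matches the behavior described above.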

Always perform this procedure on a test cluster before applying it to other clusters.

Set up an embedded Hive metastore notebook

Get notebook