How to set up an embedded Apache Hive metastore

You can set up an Azure Databricks cluster to use an embedded metastore. Use an embedded metastore when you only need to retain table metadata for the life of the cluster. If the cluster is restarted, the metadata is lost.

If you need to persist table metadata or other data after a cluster restart, use the default metastore or set up an external metastore.

This example uses the Apache Derby embedded metastore, which is a lightweight in-memory database. Follow the instructions in the notebook to install the metastore.
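For orientation, here is a minimal sketch of the kind of Spark properties that point a Hive client at an in-memory Derby database. It is not taken from the notebook, which may configure the cluster differently. The helper names `embedded_derby_metastore_conf` and `render_spark_conf` and the database name `metastore_db` are hypothetical; the property keys are standard Hive and DataNucleus settings passed through Spark's `spark.hadoop.` prefix, and the URL uses Derby's `jdbc:derby:memory:` form.

```python
from typing import Dict


def embedded_derby_metastore_conf(db_name: str = "metastore_db") -> Dict[str, str]:
    """Build Spark properties for an in-memory Derby-backed Hive metastore.

    Assumption: these keys illustrate a typical embedded setup; the notebook
    referenced in this article may use a different set of properties.
    """
    # Reject characters that would break the JDBC URL or the conf line format.
    if not db_name or any(ch in db_name for ch in ";= \t\n"):
        raise ValueError("db_name must be non-empty and contain no ';', '=' or whitespace")
    return {
        # In-memory Derby database, created on first connection.
        "spark.hadoop.javax.jdo.option.ConnectionURL":
            f"jdbc:derby:memory:{db_name};create=true",
        # Derby's embedded JDBC driver class.
        "spark.hadoop.javax.jdo.option.ConnectionDriverName":
            "org.apache.derby.jdbc.EmbeddedDriver",
        # Let DataNucleus create the metastore schema in the empty database.
        "spark.hadoop.datanucleus.schema.autoCreateAll": "true",
        # Skip the schema version check, since the schema is created on the fly.
        "spark.hadoop.hive.metastore.schema.verification": "false",
    }


def render_spark_conf(conf: Dict[str, str]) -> str:
    """Render properties as 'key value' lines, the format of a Spark config box."""
    return "\n".join(f"{key} {value}" for key, value in conf.items())


if __name__ == "__main__":
    print(render_spark_conf(embedded_derby_metastore_conf()))
```

Because the database lives only in the driver's memory, any tables registered through this metastore disappear when the cluster restarts, which matches the behavior described above.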

Always perform this procedure on a test cluster before applying it to other clusters.

Set up an embedded Hive metastore notebook

Get notebook