缓存表Cache Table

CACHE [LAZY] TABLE [db_name.]table_name

使用 RDD 缓存将表的内容缓存在内存中。Cache the contents of the table in memory using the RDD cache. 这样,后续查询便可以避免尽可能多地扫描原始文件。This enables subsequent queries to avoid scanning the original files as much as possible.

LAZY

延迟缓存表,而不是积极扫描整个表。Cache the table lazily instead of eagerly scanning the entire table.

有关 RDD 缓存和 Databricks IO 缓存之间的差异,请参阅 增量和 Apache Spark 缓存See Delta and Apache Spark caching for the differences between the RDD cache and the Databricks IO cache.