Find the size of a table

This article explains how to find the size of a table.

The command to use depends on whether you want to find the size of a Delta table or a non-Delta table.

Size of a Delta table

To find the size of a Delta table, you can run the following Scala code in Apache Spark. It reads the total file size from the table's current snapshot.

import com.databricks.sql.transaction.tahoe._
val deltaLog = DeltaLog.forTable(spark, "dbfs:/<path-to-delta-table>")
val snapshot = deltaLog.snapshot               // the current Delta table snapshot
println(s"Total file size (bytes): ${snapshot.sizeInBytes}")

Size of a non-Delta table

You can determine the size of a non-Delta table by summing the sizes of the individual files in the table's underlying directory.
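One way to sum the file sizes is with the Hadoop FileSystem API, which is available on the Spark classpath. The following is a minimal sketch, not a definitive implementation. The helper name directorySizeInBytes and the placeholder path <path-to-non-delta-table> are hypothetical and do not come from the original article. The helper counts every file under the directory, including any metadata files stored there.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Hypothetical helper: returns the combined length, in bytes, of all files
// under `dir`, including files in nested subdirectories.
def directorySizeInBytes(dir: String, conf: Configuration): Long = {
  val path = new Path(dir)
  val fs = path.getFileSystem(conf)       // resolve the file system from the path's scheme
  fs.getContentSummary(path).getLength    // total bytes of all files under the path
}
```

In a notebook where spark is already defined, the helper could be called as directorySizeInBytes("dbfs:/<path-to-non-delta-table>", spark.sparkContext.hadoopConfiguration).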

You can also use queryExecution.analyzed.stats to return the size.

spark.read.table("<non-delta-table-name>").queryExecution.analyzed.stats