Listing table names
This article explains why spark.catalog.listTables() and %sql show tables have different performance characteristics.
Problem
To fetch all the table names from the metastore, you can use either spark.catalog.listTables() or %sql show tables.
If you compare how long each one takes to return, spark.catalog.listTables() usually takes longer than %sql show tables.
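One way to observe the difference is to time both calls from a Python notebook cell. The sketch below is illustrative only: it assumes an existing SparkSession named spark (as notebooks usually provide), and the helper names time_call and compare_listing, as well as the default database name, are hypothetical and not part of any Spark API.

```python
import time


def time_call(label, fn):
    """Run fn once and return (label, elapsed_seconds, result)."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    return label, elapsed, result


def compare_listing(spark, database="default"):
    """Time both ways of listing the tables in `database`.

    `spark` is assumed to be an existing SparkSession; `database` is
    interpolated into the SQL text as-is, so pass only trusted names.
    """
    timings = [
        # Catalog API: returns a list of Table objects.
        time_call(
            "spark.catalog.listTables",
            lambda: spark.catalog.listTables(database),
        ),
        # SQL command: collect() forces the query to actually run.
        time_call(
            "SHOW TABLES",
            lambda: spark.sql(f"SHOW TABLES IN {database}").collect(),
        ),
    ]
    return {label: elapsed for label, elapsed, _ in timings}
```

Calling compare_listing(spark) returns a dictionary of elapsed seconds keyed by method. A single run can be skewed by caching, so repeating the measurement a few times gives a fairer picture.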
Cause
spark.catalog.listTables() fetches every table’s metadata first and only then returns the requested table names. This process is slow when the tables have complex schemas or when there are large numbers of tables.
Solution
To get only the table names, use %sql show tables, which internally invokes SessionCatalog.listTables and fetches only the table names.
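If the names are needed in Python code rather than in a %sql cell, the same command can be issued through spark.sql. This is a minimal sketch, assuming an existing SparkSession named spark; the helper name table_names and the default database name are hypothetical.

```python
def table_names(spark, database="default"):
    """Return the table names in `database` using SHOW TABLES.

    `spark` is assumed to be an existing SparkSession. `database` is
    interpolated into the SQL text as-is, so pass only trusted names.
    """
    rows = spark.sql(f"SHOW TABLES IN {database}").collect()
    # Each result row of SHOW TABLES carries a tableName column.
    return [row["tableName"] for row in rows]
```

For example, table_names(spark, "default") returns a plain Python list of strings, one per table.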