pyspark.sql.Catalog

class pyspark.sql.Catalog(sparkSession: pyspark.sql.session.SparkSession)[source]
User-facing catalog API, accessible through SparkSession.catalog.

This is a thin wrapper around its Scala implementation, org.apache.spark.sql.catalog.Catalog.

Changed in version 3.4.0: Supports Spark Connect.

Methods

- cacheTable(tableName[, storageLevel]): Caches the specified table in-memory or with the given storage level.
- clearCache(): Removes all cached tables from the in-memory cache.
- createExternalTable(tableName[, path, …]): Creates a table based on the dataset in a data source.
- createTable(tableName[, path, source, …]): Creates a table based on the dataset in a data source.
- currentCatalog(): Returns the current default catalog in this session.
- currentDatabase(): Returns the current default database in this session.
- databaseExists(dbName): Checks if the database with the specified name exists.
- dropGlobalTempView(viewName): Drops the global temporary view with the given view name in the catalog.
- dropTempView(viewName): Drops the local temporary view with the given view name in the catalog.
- functionExists(functionName[, dbName]): Checks if the function with the specified name exists.
- getDatabase(dbName): Gets the database with the specified name.
- getFunction(functionName): Gets the function with the specified name.
- getTable(tableName): Gets the table or view with the specified name.
- isCached(tableName): Returns true if the table is currently cached in-memory.
- listCatalogs([pattern]): Returns a list of catalogs in this session.
- listColumns(tableName[, dbName]): Returns a list of columns for the given table/view in the specified database.
- listDatabases([pattern]): Returns a list of databases available across all sessions.
- listFunctions([dbName, pattern]): Returns a list of functions registered in the specified database.
- listTables([dbName, pattern]): Returns a list of tables/views in the specified database.
- recoverPartitions(tableName): Recovers all the partitions of the given table and updates the catalog.
- refreshByPath(path): Invalidates and refreshes all the cached data (and the associated metadata) for any DataFrame that contains the given data source path.
- refreshTable(tableName): Invalidates and refreshes all the cached data and metadata of the given table.
- registerFunction(name, f[, returnType]): An alias for spark.udf.register().
- setCurrentCatalog(catalogName): Sets the current default catalog in this session.
- setCurrentDatabase(dbName): Sets the current default database in this session.
- tableExists(tableName[, dbName]): Checks if the table or view with the specified name exists.
- uncacheTable(tableName): Removes the specified table from the in-memory cache.