Extends QueryExecution with Hive-specific features.
Caches the specified table in-memory.
:: Experimental ::
Creates an empty parquet file with the schema of class A, which can be registered as a table.
This registered table can be used as the target of future insertInto operations.
val sqlContext = new SQLContext(...)
import sqlContext._
case class Person(name: String, age: Int)
createParquetFile[Person]("path/to/file.parquet").registerAsTable("people")
sql("INSERT INTO people SELECT 'michael', 29")
A case class type that describes the desired schema of the parquet file to be created.
The path where the directory containing parquet metadata should be created. Data inserted into this table will also be stored at this location.
When false, an exception will be thrown if this directory already exists.
A Hadoop configuration object that can be used to specify options to the parquet output format.
Creates a SchemaRDD from an RDD of case classes.
Creates a table using the schema of the given class.
A case class that is used to describe the schema of the table to be created.
The name of the table to create.
When false, an exception will be thrown if the table already exists.
SQLConf and HiveConf contract: when the Hive session is first initialized, parameters in the HiveConf are picked up by the SQLConf. Additionally, any properties set by set() or by a SET command inside hql() or sql() are set in the SQLConf as well as in the HiveConf.
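As a rough illustration of the second half of this contract, the sketch below issues a SET command through hql(). It assumes a Spark build with Hive support and an already-constructed SparkContext; the helper name setViaHql and the chosen Hive property are illustrative assumptions, not part of the documented API.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.hive.HiveContext

// Hypothetical helper; `sc` is assumed to be an existing SparkContext.
def setViaHql(sc: SparkContext): Unit = {
  val hiveContext = new HiveContext(sc)
  // Per the contract described above, a SET command issued through hql()
  // is recorded in the SQLConf and also in the HiveConf.
  hiveContext.hql("SET hive.exec.dynamic.partition.mode=nonstrict")
}
```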
Executes a query expressed in HiveQL using Spark, returning the result as a SchemaRDD.
An alias for hiveql.
Returns true if the table is currently cached in-memory.
:: Experimental ::
Loads a JSON file (one object per line), returning the result as a SchemaRDD. It goes through the entire dataset once to determine the schema.
:: Experimental ::
Loads an RDD[String] storing JSON objects (one object per record), returning the result as a SchemaRDD. It goes through the entire dataset once to determine the schema.
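A minimal sketch of loading JSON records this way, assuming an existing SparkContext and a Spark build with Hive support. The helper name jsonExample, the sample records, and the table name "people" are made up for illustration.

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.hive.HiveContext

// Hypothetical helper; `sc` is assumed to be an existing SparkContext.
def jsonExample(sc: SparkContext): Unit = {
  val hiveContext = new HiveContext(sc)
  // One JSON object per record; the second record has no "age" field.
  val records = sc.parallelize(Seq(
    """{"name": "michael", "age": 29}""",
    """{"name": "andy"}"""))
  // jsonRDD goes through the whole dataset once to determine the schema.
  val people = hiveContext.jsonRDD(records)
  // Register the result as a temporary table so it can be queried by name.
  hiveContext.registerRDDAsTable(people, "people")
  hiveContext.sql("SELECT name FROM people").collect().foreach(println)
}
```

Because the schema is inferred from the data, the extra pass over the input can be costly for large datasets.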
:: DeveloperApi :: Allows catalyst LogicalPlans to be executed as a SchemaRDD. Note that the LogicalPlan interface is considered internal, and thus not guaranteed to be stable. As a result, using them directly is not recommended.
Loads a Parquet file, returning the result as a SchemaRDD.
Prepares a planned SparkPlan for execution by binding references to specific ordinals, and inserting shuffle operations as needed.
Registers the given RDD as a temporary table in the catalog. Temporary tables exist only during the lifetime of this instance of SQLContext.
Execute the command using Hive and return the results as a sequence. Each element in the sequence is one row.
Runs the specified SQL query using Hive.
Executes a SQL query using Spark, returning the result as a SchemaRDD.
Returns the specified table as a SchemaRDD.
Removes the specified table from the in-memory cache.
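The caching members described on this page (cacheTable, isCached, and uncacheTable) might be combined as in the sketch below. The helper name cacheExample is hypothetical, and the table named by tableName is assumed to be already registered in the catalog.

```scala
import org.apache.spark.sql.hive.HiveContext

// Hypothetical helper; `tableName` is assumed to refer to a registered table.
def cacheExample(hiveContext: HiveContext, tableName: String): Unit = {
  // Cache the table in-memory.
  hiveContext.cacheTable(tableName)
  println(s"$tableName cached: ${hiveContext.isCached(tableName)}")
  // Remove the table from the in-memory cache once it is no longer needed.
  hiveContext.uncacheTable(tableName)
}
```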
An instance of the Spark SQL execution engine that integrates with data stored in Hive. Configuration for Hive is read from hive-site.xml on the classpath.