public class SparkLauncher
extends java.lang.Object
Use this class to start Spark applications programmatically. The class uses a builder pattern to allow clients to configure the Spark application and launch it as a child process.
Modifier and Type | Field and Description |
---|---|
static java.lang.String | CHILD_CONNECTION_TIMEOUT - Maximum time (in ms) to wait for a child process to connect back to the launcher server when using startApplication(SparkAppHandle.Listener...). |
static java.lang.String | CHILD_PROCESS_LOGGER_NAME - Logger name to use when launching a child process. |
static java.lang.String | DEPLOY_MODE - The Spark deploy mode. |
static java.lang.String | DRIVER_EXTRA_CLASSPATH - Configuration key for the driver class path. |
static java.lang.String | DRIVER_EXTRA_JAVA_OPTIONS - Configuration key for the driver VM options. |
static java.lang.String | DRIVER_EXTRA_LIBRARY_PATH - Configuration key for the driver native library path. |
static java.lang.String | DRIVER_MEMORY - Configuration key for the driver memory. |
static java.lang.String | EXECUTOR_CORES - Configuration key for the number of executor CPU cores. |
static java.lang.String | EXECUTOR_EXTRA_CLASSPATH - Configuration key for the executor class path. |
static java.lang.String | EXECUTOR_EXTRA_JAVA_OPTIONS - Configuration key for the executor VM options. |
static java.lang.String | EXECUTOR_EXTRA_LIBRARY_PATH - Configuration key for the executor native library path. |
static java.lang.String | EXECUTOR_MEMORY - Configuration key for the executor memory. |
static java.lang.String | NO_RESOURCE - A special value for the resource that tells Spark to not try to process the app resource as a file. |
static java.lang.String | SPARK_MASTER - The Spark master. |
Constructor and Description |
---|
SparkLauncher() |
SparkLauncher(java.util.Map<java.lang.String,java.lang.String> env) - Creates a launcher that will set the given environment variables in the child. |
Modifier and Type | Method and Description |
---|---|
SparkLauncher | addAppArgs(java.lang.String... args) - Adds command line arguments for the application. |
SparkLauncher | addFile(java.lang.String file) - Adds a file to be submitted with the application. |
SparkLauncher | addJar(java.lang.String jar) - Adds a jar file to be submitted with the application. |
SparkLauncher | addPyFile(java.lang.String file) - Adds a python file / zip / egg to be submitted with the application. |
SparkLauncher | addSparkArg(java.lang.String arg) - Adds a no-value argument to the Spark invocation. |
SparkLauncher | addSparkArg(java.lang.String name, java.lang.String value) - Adds an argument with a value to the Spark invocation. |
java.lang.Process | launch() - Launches a sub-process that will start the configured Spark application. |
SparkLauncher | setAppName(java.lang.String appName) - Set the application name. |
SparkLauncher | setAppResource(java.lang.String resource) - Set the main application resource. |
SparkLauncher | setConf(java.lang.String key, java.lang.String value) - Set a single configuration value for the application. |
static void | setConfig(java.lang.String name, java.lang.String value) - Set a configuration value for the launcher library. |
SparkLauncher | setDeployMode(java.lang.String mode) - Set the deploy mode for the application. |
SparkLauncher | setJavaHome(java.lang.String javaHome) - Set a custom JAVA_HOME for launching the Spark application. |
SparkLauncher | setMainClass(java.lang.String mainClass) - Sets the application class name for Java/Scala applications. |
SparkLauncher | setMaster(java.lang.String master) - Set the Spark master for the application. |
SparkLauncher | setPropertiesFile(java.lang.String path) - Set a custom properties file with Spark configuration for the application. |
SparkLauncher | setSparkHome(java.lang.String sparkHome) - Set a custom Spark installation location for the application. |
SparkLauncher | setVerbose(boolean verbose) - Enables verbose reporting for SparkSubmit. |
SparkAppHandle | startApplication(SparkAppHandle.Listener... listeners) - Starts a Spark application. |
public static final java.lang.String SPARK_MASTER
public static final java.lang.String DEPLOY_MODE
public static final java.lang.String DRIVER_MEMORY
public static final java.lang.String DRIVER_EXTRA_CLASSPATH
public static final java.lang.String DRIVER_EXTRA_JAVA_OPTIONS
public static final java.lang.String DRIVER_EXTRA_LIBRARY_PATH
public static final java.lang.String EXECUTOR_MEMORY
public static final java.lang.String EXECUTOR_EXTRA_CLASSPATH
public static final java.lang.String EXECUTOR_EXTRA_JAVA_OPTIONS
public static final java.lang.String EXECUTOR_EXTRA_LIBRARY_PATH
public static final java.lang.String EXECUTOR_CORES
public static final java.lang.String CHILD_PROCESS_LOGGER_NAME
public static final java.lang.String NO_RESOURCE
public static final java.lang.String CHILD_CONNECTION_TIMEOUT
public SparkLauncher()
public SparkLauncher(java.util.Map<java.lang.String,java.lang.String> env)
env - Environment variables to set.
public static void setConfig(java.lang.String name, java.lang.String value)
name - Config name.
value - Config value.
public SparkLauncher setJavaHome(java.lang.String javaHome)
javaHome - Path to the JAVA_HOME to use.
public SparkLauncher setSparkHome(java.lang.String sparkHome)
sparkHome - Path to the Spark installation to use.
public SparkLauncher setPropertiesFile(java.lang.String path)
path - Path to custom properties file to use.
public SparkLauncher setConf(java.lang.String key, java.lang.String value)
key - Configuration key.
value - The value to use.
public SparkLauncher setAppName(java.lang.String appName)
appName - Application name.
public SparkLauncher setMaster(java.lang.String master)
master - Spark master.
public SparkLauncher setDeployMode(java.lang.String mode)
mode - Deploy mode.
public SparkLauncher setAppResource(java.lang.String resource)
resource - Path to the main application resource.
public SparkLauncher setMainClass(java.lang.String mainClass)
mainClass - Application's main class.
public SparkLauncher addSparkArg(java.lang.String arg)
Use this method with caution. It is possible to create an invalid Spark command by passing unknown arguments to this method, since those are allowed for forward compatibility.
arg - Argument to add.
public SparkLauncher addSparkArg(java.lang.String name, java.lang.String value)
It is safe to add arguments modified by other methods in this class (such as setMaster(String)); the last invocation will be the one to take effect.
Use this method with caution. It is possible to create an invalid Spark command by passing unknown arguments to this method, since those are allowed for forward compatibility.
name - Name of argument to add.
value - Value of the argument.
public SparkLauncher addAppArgs(java.lang.String... args)
args - Arguments to pass to the application's main class.
public SparkLauncher addJar(java.lang.String jar)
jar - Path to the jar file.
public SparkLauncher addFile(java.lang.String file)
file - Path to the file.
public SparkLauncher addPyFile(java.lang.String file)
file - Path to the file.
public SparkLauncher setVerbose(boolean verbose)
verbose - Whether to enable verbose output.
public java.lang.Process launch() throws java.io.IOException
The startApplication(SparkAppHandle.Listener...) method is preferred when launching Spark, since it provides better control of the child application.
java.io.IOException
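Because launch() returns a plain java.lang.Process, the caller is left to consume the child's output and wait for it to exit; a child whose output pipes fill up can block. The following is a minimal sketch of one way to do that using only the JDK. The class ChildProcessDrain and its methods are hypothetical names, not part of the launcher library, and the `java -version` child in main is only a stand-in for the Process that a configured SparkLauncher's launch() call would return. UTF-8 decoding of the child's output is also an assumption.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the Spark launcher library.
final class ChildProcessDrain {

    // Reads every line from the stream into the sink until end of stream.
    // Assumes the child writes UTF-8 text.
    static void readLines(InputStream in, List<String> sink) throws IOException {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                sink.add(line);
            }
        }
    }

    // Drains stdout on the calling thread and stderr on a helper thread,
    // then returns the child's exit code.
    static int drainAndWait(Process child, List<String> out, List<String> err)
            throws IOException, InterruptedException {
        Thread errReader = new Thread(() -> {
            try {
                readLines(child.getErrorStream(), err);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
        errReader.start();
        readLines(child.getInputStream(), out);
        errReader.join(); // makes the stderr lines visible to the caller
        return child.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in child process. With the launcher, the Process would instead
        // come from a configured SparkLauncher's launch() call.
        String javaBin = System.getProperty("java.home")
            + File.separator + "bin" + File.separator + "java";
        Process child = new ProcessBuilder(javaBin, "-version").start();

        List<String> out = new ArrayList<>();
        List<String> err = new ArrayList<>();
        int exit = drainAndWait(child, out, err);
        System.out.println("exit=" + exit
            + " stdoutLines=" + out.size() + " stderrLines=" + err.size());
    }
}
```

Reading the two streams on separate threads avoids the case where the child blocks writing to one pipe while the parent is still waiting on the other.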
public SparkAppHandle startApplication(SparkAppHandle.Listener... listeners) throws java.io.IOException
This method returns a handle that provides information about the running application and can be used to do basic interaction with it.
The returned handle assumes that the application will instantiate a single SparkContext during its lifetime. Once that context reports a final state (one that indicates the SparkContext has stopped), the handle will not perform new state transitions, so anything that happens after that cannot be monitored. If the underlying application is launched as a child process, SparkAppHandle.kill() can still be used to kill the child process.
Currently, all applications are launched as child processes. The child's stdout and stderr are merged and written to a logger (see java.util.logging). The logger's name can be defined by setting CHILD_PROCESS_LOGGER_NAME in the app's configuration. If that option is not set, the code will try to derive a name from the application's name or main class / script file. If those cannot be determined, an internal, unique name will be used. In all cases, the logger name will start with "org.apache.spark.launcher.app", to fit more easily into the configuration of commonly-used logging systems.
listeners - Listeners to add to the handle before the app is launched.
java.io.IOException
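Since every child logger name starts with "org.apache.spark.launcher.app", a handler attached to a java.util.logging logger with exactly that name should receive the child's output through the normal parent-logger chain. The following is a minimal sketch of that idea using only the JDK. LauncherLogCapture, CollectingHandler, and the "MyApp" logger suffix are hypothetical names, and the INFO-level message in main only stands in for a line of child output; the level the launcher actually logs at is not stated in this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical helper, not part of the Spark launcher library.
final class LauncherLogCapture {

    // Prefix documented for startApplication's child loggers.
    static final String PREFIX = "org.apache.spark.launcher.app";

    // Keep a strong reference: java.util.logging holds loggers weakly, so an
    // unreferenced logger (and its handlers) could be garbage collected.
    static final Logger PARENT = Logger.getLogger(PREFIX);

    // Collects "loggerName: message" for every record it receives.
    static final class CollectingHandler extends Handler {
        final List<String> lines = new ArrayList<>();

        @Override
        public synchronized void publish(LogRecord record) {
            lines.add(record.getLoggerName() + ": " + record.getMessage());
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    // Attaches a collecting handler to the shared parent logger.
    static CollectingHandler attach() {
        CollectingHandler handler = new CollectingHandler();
        handler.setLevel(Level.ALL);
        PARENT.setLevel(Level.ALL);
        PARENT.addHandler(handler);
        return handler;
    }

    public static void main(String[] args) {
        CollectingHandler handler = attach();
        // Stand-in for a child logger; "MyApp" is an assumed, illustrative name.
        Logger child = Logger.getLogger(PREFIX + ".MyApp");
        child.info("hello from the child process");
        System.out.println(handler.lines);
    }
}
```

Attaching at the shared prefix rather than at a specific application logger means the handler does not need to know which name was derived for a given application.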