Motivation
Spark load currently uses the hadoop-yarn-client API to get the status of, and to kill, applications running on a YARN cluster. However, this approach does not work in every environment.
For example, the KILL operation requires authentication. If users have their own security authentication system that differs from the official Hadoop authentication methods (simple, Kerberos), the KILL operation will fail.
Therefore, I suggest adding a configurable YARN environment and using the yarn command to get the status of an application and to kill it. By default, the official YARN environment is used, but users can configure their own.
Description
The format of the yarn command is generally as follows:
yarn --config <confdir> application <-kill | -status> <Application ID>
We can control which directory holds the YARN configuration files with the --config option, and generate the configuration files (e.g. core-site.xml) into that directory.
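As an illustration only, the command format above could be assembled as follows. This is a minimal sketch, not the proposed implementation: the function name `build_yarn_command`, its parameters, and the `yarn_bin` default are hypothetical, and it only builds the argument list without running anything.

```python
from typing import List

# Actions taken from the command format above: -status and -kill.
VALID_ACTIONS = ("status", "kill")


def build_yarn_command(conf_dir: str, action: str, app_id: str,
                       yarn_bin: str = "yarn") -> List[str]:
    """Build the argument list for `yarn --config <confdir> application ...`.

    `yarn_bin` is a hypothetical parameter so that users could point at
    the yarn executable of their own environment.
    """
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    # Each argument is its own list element, so no shell quoting is needed.
    return [yarn_bin, "--config", conf_dir, "application", f"-{action}", app_id]
```

The resulting list can be passed to something like `subprocess.run(cmd, capture_output=True, text=True)`; how the status output is parsed is left open here.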
Furthermore, I plan to use a script to generate the configuration files.
The generated files will look like this:
core-site.xml
<configuration>
  <property>
    <name>hadoop.job.ugi</name>
    <value>user,password</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>simple</value>
  </property>
</configuration>
yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>host:port</value>
  </property>
</configuration>
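A generation script for these two files might look roughly like the sketch below. It is only an assumption about how such a script could work: the function names `write_site_xml` and `generate_yarn_conf` and their parameters are hypothetical, and the property names are the ones shown in the files above.

```python
import os
import xml.etree.ElementTree as ET
from typing import Dict


def write_site_xml(path: str, properties: Dict[str, str]) -> None:
    """Write a Hadoop-style <configuration> file with one <property> per entry."""
    root = ET.Element("configuration")
    for name, value in properties.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def generate_yarn_conf(conf_dir: str, ugi: str, rm_address: str) -> None:
    """Generate core-site.xml and yarn-site.xml into conf_dir (hypothetical helper)."""
    os.makedirs(conf_dir, exist_ok=True)
    write_site_xml(os.path.join(conf_dir, "core-site.xml"), {
        "hadoop.job.ugi": ugi,                       # e.g. "user,password"
        "hadoop.security.authentication": "simple",
    })
    write_site_xml(os.path.join(conf_dir, "yarn-site.xml"), {
        "yarn.resourcemanager.address": rm_address,  # e.g. "host:port"
    })
```

The directory passed as `conf_dir` is the same one that would then be given to the yarn command through --config.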