
Conversation

@jodersky
Member

What changes were proposed in this pull request?

Improve signal handling to allow interrupting running tasks from the REPL (with Ctrl+C).
If no tasks are running or Ctrl+C is pressed twice, the signal is forwarded to the default handler resulting in the usual termination of the application.

This PR is a rewrite of #8216 (and therefore closes it), as per @piaozhexiu's request.

How was this patch tested?

Signal handling is not easily testable, so no unit tests were added. Nevertheless, the new functionality is implemented in a best-effort manner, soft-failing in case signals aren't available on a particular OS.
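As a rough standalone sketch of the idea described above (illustrative only; `hasActiveJobs` and `cancelAllJobs` are hypothetical stand-ins for the SparkContext status-tracker check and job cancellation, not the PR's actual code):

```scala
import sun.misc.{Signal, SignalHandler}

// Sketch: on SIGINT, cancel active jobs if there are any; otherwise delegate
// to the previously installed handler (normally the default one, which
// terminates the application).
object InterruptSketch {
  def makeHandler(
      hasActiveJobs: () => Boolean, // hypothetical job-status check
      cancelAllJobs: () => Unit,    // hypothetical job cancellation
      prev: SignalHandler): SignalHandler =
    new SignalHandler {
      def handle(sig: Signal): Unit =
        if (hasActiveJobs()) cancelAllJobs()
        else prev.handle(sig) // nothing to cancel: terminate as usual
    }

  def install(hasActiveJobs: () => Boolean, cancelAllJobs: () => Unit): Unit = {
    val sigint = new Signal("INT")
    var prev: SignalHandler = null
    // Indirection so the handler can delegate to whatever was installed before.
    val fallthrough = new SignalHandler {
      def handle(sig: Signal): Unit = prev.handle(sig)
    }
    // Signal.handle returns the handler that was installed previously.
    prev = Signal.handle(sigint, makeHandler(hasActiveJobs, cancelAllJobs, fallthrough))
  }
}
```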

@jodersky jodersky changed the title Allow interrupting tasks in repl with Ctrl+C [SPARK-10001][Core] Allow interrupting tasks in repl with Ctrl+C Apr 21, 2016
@SparkQA

SparkQA commented Apr 21, 2016

Test build #56469 has finished for PR 12557 at commit 0142eb1.

  • This patch fails RAT tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 21, 2016

Test build #56471 has finished for PR 12557 at commit 0539450.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@jodersky jodersky changed the title [SPARK-10001][Core] Allow interrupting tasks in repl with Ctrl+C [SPARK-10001][Core] Interrupt tasks in repl with Ctrl+C Apr 21, 2016
@SparkQA

SparkQA commented Apr 21, 2016

Test build #56480 has finished for PR 12557 at commit 94323b9.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

* This makes it possible to interrupt a running shell job by pressing Ctrl+C.
*/
def cancelOnInterrupt(ctx: SparkContext): Unit = USignaling.register("INT") {
if (!ctx.statusTracker.getActiveJobIds().isEmpty) {
Contributor


So if the user presses Ctrl+C with no running jobs, the shell exits, but if there are running jobs, it cancels the currently running jobs?

Would pressing Ctrl+C twice actually exit if the current jobs aren't all finished being cancelled?

Also, this behaviour maybe seems a bit confusing (not the double Ctrl+C to cancel, but Ctrl+C with no jobs to exit versus Ctrl+C with a job to cancel).

Member Author

@jodersky jodersky Apr 21, 2016


That's correct. However, I personally do not find it confusing at all. I would argue that killing blocking jobs fulfills the SIGINT description of interrupting. If you want to kill a shell externally with a single signal, you can always send a SIGTERM.
Moreover, as Ryan Blue commented in the related JIRA ticket, the Node.js, Python and Ruby shells have the same behaviour (and sbt kind of does too, if you enable the cancelable setting).

Contributor


So I don't see this behaviour in my Python shell or my regular bash shell:

holden@yogapanda:~/repos/spark$ python
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.

KeyboardInterrupt

KeyboardInterrupt

But if it matches other interactive shells than that seems fine.

Member Author


I can't vouch for Python, but what do you mean by your "regular bash shell"? I would not expect my bash shell to exit when getting a SIGINT; rather, it should kill the process that is using the current pty. In this analogy, an interrupt to the Spark shell should kill its "child processes" first.

@holdenk
Contributor

holdenk commented Apr 21, 2016

Thanks for continuing #8216 :) cc @davies to take a look maybe?

@SparkQA

SparkQA commented Apr 21, 2016

Test build #56482 has finished for PR 12557 at commit 4f9bf69.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@syepes

syepes commented Apr 21, 2016

@jodersky Thanks for this addition, it's going to be really useful.
I can't count how many times I have pressed Ctrl+C and had to restart all over again.

@jodersky
Member Author

Test failure is from hive, seems unrelated.
jenkins, retest this please

@SparkQA

SparkQA commented Apr 21, 2016

Test build #2843 has finished for PR 12557 at commit 4f9bf69.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Apr 21, 2016

Test build #2844 has finished for PR 12557 at commit 4f9bf69.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

def cancelOnInterrupt(ctx: SparkContext): Unit = USignaling.register("INT") {
if (!ctx.statusTracker.getActiveJobIds().isEmpty) {
logWarning("Cancelling all active jobs, this can take a while. " +
"Press Ctrl+C again to exit now.")
Contributor


We can type Ctrl+D to quit the shell, should we always use Ctrl+C to cancel the running job and clear the current line input?

Member Author


C-d is slightly different in that it just emits an end-of-transmission (EOT) character to the active terminal. When the shell is running a job, it cannot listen to input, and hence C-d has no effect.

C-c, on the other hand, causes the host operating system to send an asynchronous signal, which is handled by the signal handler in a separate thread and is thus able to stop running jobs.
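To illustrate that last point, here is a small self-contained sketch (not Spark code): it registers a handler for SIGINT, raises the signal against the current process, and records which thread ran the handler.

```scala
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.AtomicReference
import sun.misc.{Signal, SignalHandler}

// Demonstrates that sun.misc.Signal handlers run on their own JVM thread,
// which is why Ctrl+C can act even while the shell's main thread is busy.
object HandlerThreadDemo {
  def handlerThreadName(): String = {
    val done = new CountDownLatch(1)
    val name = new AtomicReference[String]
    Signal.handle(new Signal("INT"), new SignalHandler {
      def handle(sig: Signal): Unit = {
        name.set(Thread.currentThread().getName)
        done.countDown()
      }
    })
    Signal.raise(new Signal("INT")) // deliver SIGINT to this very process
    done.await()                    // wait for the asynchronous handler
    name.get
  }

  def main(args: Array[String]): Unit =
    println(s"handler ran on thread '${handlerThreadName()}', " +
      s"main thread is '${Thread.currentThread().getName}'")
}
```

The handler thread is spawned by the JVM's signal dispatcher, so it is distinct from the thread that was busy when the signal arrived.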

Member Author


Oh, I think I misunderstood your comment: are you proposing dropping the in-job condition and always catching C-c? In that case I'm not sure that I agree, since the main purpose of a SIGINT is to interrupt a process. Killing jobs first and catching the signal is a compromise for convenience. Imagine a Spark shell that is stuck in some arbitrary user code (not running a Spark job); in that case, C-c should in my opinion still exit the process.

Contributor


That makes sense, looks good to me.

@holdenk
Contributor

holdenk commented Apr 21, 2016

LGTM as well. (Note: from chatting with @jodersky, it doesn't depend on the conditional check for empty jobs if a second Ctrl+C is issued; rather, it only re-registers the signal handler after it has completed killing all of the jobs, and if another signal is received before then, it uses the default handler.)

*/
override def handle(sig: Signal): Unit = {
// register old handler, will receive incoming signals while this handler is running
Signal.handle(signal, prevHandler)
Member Author


I should have mentioned this in the PR description. When a signal is caught, the first thing the handler does is de-register itself. That way it only ever accepts the first signal and subsequent signals are escalated.
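A minimal standalone sketch of that de-register-first pattern (assumed helper names, not the exact code from this PR):

```scala
import sun.misc.{Signal, SignalHandler}

// Sketch: the handler's first action is to restore the previous handler, so
// a second signal arriving while `action` runs is escalated (e.g. to the
// default terminating handler). Once `action` completes, the handler
// re-registers itself.
object FirstSignalOnly {
  def install(signalName: String)(action: => Unit): Unit = {
    val signal = new Signal(signalName)
    var prevHandler: SignalHandler = null
    val handler: SignalHandler = new SignalHandler {
      def handle(sig: Signal): Unit = {
        Signal.handle(signal, prevHandler) // de-register first: escalate repeats
        try action
        finally Signal.handle(signal, this) // accept signals again when done
      }
    }
    prevHandler = Signal.handle(signal, handler)
  }
}
```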

@davies
Contributor

davies commented Apr 22, 2016

Merging this into master, thanks!

@asfgit asfgit closed this in 8012793 Apr 22, 2016
}
log.info("Registered signal handlers for [" + signals.mkString(", ") + "]")
def register(log: Logger): Unit = Seq("TERM", "HUP", "INT").foreach{ sig =>
Signaling.register(sig) {
Contributor


Did you remove the double registration check?

Member Author


o.a.s.util.Signaling takes care of enforcing single instantiation of a handler per JVM. Do you mean the specific case of double-registering the logging?
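For illustration, enforcing a single handler instantiation per signal per JVM can be sketched with an atomic map (hypothetical names; this is not o.a.s.util.Signaling itself):

```scala
import java.util.concurrent.ConcurrentHashMap
import sun.misc.{Signal, SignalHandler}

// Sketch: install at most one handler per signal name in this JVM.
object OncePerJvm {
  private val installed = new ConcurrentHashMap[String, SignalHandler]()

  /** Returns true if `action` was newly installed for `signalName`. */
  def registerOnce(signalName: String)(action: => Unit): Boolean = {
    val handler: SignalHandler = new SignalHandler {
      def handle(sig: Signal): Unit = action
    }
    // putIfAbsent is atomic, so concurrent callers cannot double-register.
    if (installed.putIfAbsent(signalName, handler) == null) {
      Signal.handle(new Signal(signalName), handler)
      true
    } else false
  }
}
```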

asfgit pushed a commit that referenced this pull request Apr 22, 2016
## What changes were proposed in this pull request?
This is a follow-up to #12557, with the following changes:

1. Fixes some of the style issues.
2. Merges Signaling and SignalLogger into a new class called SignalUtils. It was pretty confusing to have Signaling and Signal in one file, and it was also confusing to have two classes named Signaling and one called the other.
3. Made logging registration idempotent.

## How was this patch tested?
N/A.

Author: Reynold Xin <rxin@databricks.com>

Closes #12605 from rxin/SPARK-10001.