Call stopGracefully() when batch ingestion is killed explicitly #8075

Closed
ankit0811 wants to merge 1 commit into apache:master from ankit0811:mr_kill_fix

@ankit0811
Contributor

Fixes #7962.

Fixed by calling task.stopGracefully() from ForkingTaskRunner when a Hadoop ingestion task is killed, because in that case control goes to ForkingTaskRunner::shutdown and not to SingleTaskBackgroundRunner::stop.
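
To illustrate the proposed control flow, here is a minimal, self-contained sketch. It is not Druid's code: `KillFlowSketch`, `HadoopLikeTask`, and `ForkingRunnerSketch` are hypothetical stand-ins, and the only names borrowed from this PR are `stopGracefully` and `shutdown`. The sketch assumes the runner still holds a reference to the task object at kill time, and it only shows the ordering being proposed: run the task's graceful-stop hook first, then tear down the forked process.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for illustration; NOT Druid's real classes or signatures.
public class KillFlowSketch {
    /** Minimal task abstraction with a graceful-stop hook. */
    interface Task {
        String getId();
        void stopGracefully();
    }

    /** Simulated Hadoop-style task whose graceful stop cleans up an external job. */
    static class HadoopLikeTask implements Task {
        private final String id;
        private final List<String> events;

        HadoopLikeTask(String id, List<String> events) {
            this.id = id;
            this.events = events;
        }

        @Override
        public String getId() {
            return id;
        }

        @Override
        public void stopGracefully() {
            // In the real task this is where the spawned MR job would be killed.
            events.add("killed external job for " + id);
        }
    }

    /** Simulated forking runner; shutdown() models the path taken on an explicit kill. */
    static class ForkingRunnerSketch {
        private final Map<String, Task> running = new ConcurrentHashMap<>();
        private final List<String> events;

        ForkingRunnerSketch(List<String> events) {
            this.events = events;
        }

        void run(Task task) {
            running.put(task.getId(), task);
        }

        /** Returns true if the task was known and has been shut down. */
        boolean shutdown(String taskId) {
            Task task = running.remove(taskId);
            if (task == null) {
                return false;
            }
            try {
                // The proposed change: give the task a chance to clean up first.
                task.stopGracefully();
            } catch (RuntimeException e) {
                events.add("stopGracefully failed: " + e.getMessage());
            }
            // Stand-in for destroying the forked child process.
            events.add("destroyed process for " + taskId);
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        ForkingRunnerSketch runner = new ForkingRunnerSketch(events);
        runner.run(new HadoopLikeTask("index_hadoop_1", events));
        runner.shutdown("index_hadoop_1");
        events.forEach(System.out::println);
    }
}
```

In this sketch a failure inside the hook is recorded but does not prevent the process from being destroyed, which is one possible design choice rather than something this PR specifies.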


This PR has:

  • been self-reviewed.

  • been tested in a test Druid cluster.


…estion task is killed, the control does to the ForkingTaskRunner::shutdown
@jihoonson
Contributor

Hmm, @ankit0811 do you know why stopGracefully didn't get called by ForkingTaskRunner? It's supposed to be called when the hadoop index task is killed. And how did you kill the hadoop index task?

@ankit0811
Contributor Author

The ingestion task was issued a kill command by clicking the kill link in the UI.
This did not trigger a call to the stopGracefully method.

@jihoonson
Contributor

> The ingestion task was issued a kill command by clicking the kill link on the UI

Do you mean the Druid task issued a kill command to kill the Hadoop job, or that you issued a kill command to kill the Druid task? If it's the latter case, then it sounds like something went wrong with the kill command issued to the Druid task. How did you know stopGracefully wasn't called? I guess "Tried killing job: [%s], status: [%s]" wasn't printed?

@ankit0811
Contributor Author

ankit0811 commented Jul 15, 2019

Yes, it is the latter case. And yes, the log line is not printed.

@clintropolis
Member

clintropolis commented Jul 15, 2019

I looked into this a bit over the weekend, and have what I think is an alternative fix to this (that also fixes a logging issue). It seemed easier to just open a separate PR to capture all of the logs and details from the change I made there than to hijack this PR.

#8085 fixes a logging issue so we can be sure that lifecycle stop is happening, but I don't believe it fixes the issue with the Hadoop task not getting killed, see here for more details.

ankit0811 closed this Aug 1, 2019

Development

Successfully merging this pull request may close these issues.

Killing hadoop ingestion task does not kill spawned Hadoop MR task

3 participants