Grandchild processes ignoring SIGTERM are not killed #2

@oleg-vinted

Repro: https://gist.github.com/oleg-vinted/97ee727e8d0b1050f40fc3ec9f940281

Even with the process group change, in some cases processes can still be orphaned.

In that repro, the process hierarchy is: overman -> service -> stubborn-process -> sleep.

What I think is happening:

  • On Ctrl-C, SIGINT is sent only to overman, because its children are running in a different process group.
  • overman correctly sends SIGTERM to the child process group, which reaches service, stubborn-process, and sleep.
  • service and sleep terminate upon receiving SIGTERM.
  • stubborn-process ignores SIGTERM, spawns another instance of sleep, and keeps running.
  • overman sees that its immediate child (service) has exited and removes its PID from the running list: through here, here and here.
  • overman thinks its children have exited, so it does not send SIGKILL to the process group, even though stubborn-process is still running and SIGKILL is still needed. (Edit: not true; sending SIGKILL to the process group doesn't work at this point. We need some other way to detect whether the children have exited.)
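
One possible direction for the "other way to detect" mentioned above, as a rough sketch and not overman's actual code: instead of tracking only the immediate child's PID, probe the whole process group with signal 0 and escalate to SIGKILL if anything is still there after a grace period. The helper names (`spawn_in_new_group`, `group_alive`, `stop_group`) and the timing parameters are hypothetical. It assumes each service is started as the leader of its own group, so the group ID equals the service's PID, and that stubborn descendants stay in that group. Whether that holds in the repro is exactly what the edit above leaves open, so treat this as something to try rather than a confirmed fix.

```python
import os
import signal
import subprocess
import time


def spawn_in_new_group(argv):
    """Start argv as the leader of a new session, so its PGID equals its PID."""
    return subprocess.Popen(argv, start_new_session=True)


def group_alive(pgid):
    """Report whether any process (unreaped zombies included) is in group pgid."""
    try:
        os.killpg(pgid, 0)  # signal 0 only checks existence and permissions
    except ProcessLookupError:
        return False  # ESRCH: the group has no members left
    except PermissionError:
        return True  # EPERM: members exist, but we may not signal them
    return True


def stop_group(proc, grace=5.0, poll=0.05):
    """SIGTERM the whole group, then SIGKILL whatever outlives the grace period."""
    pgid = proc.pid  # assumption: proc was started via spawn_in_new_group
    try:
        os.killpg(pgid, signal.SIGTERM)
    except ProcessLookupError:
        proc.poll()
        return
    deadline = time.monotonic() + grace
    while time.monotonic() < deadline:
        proc.poll()  # reap the direct child so its zombie does not look alive
        if not group_alive(pgid):
            return
        time.sleep(poll)
    try:
        os.killpg(pgid, signal.SIGKILL)
    except ProcessLookupError:
        pass
    proc.wait()  # reap the direct child if SIGKILL just ended it
```

The point of the sketch is that the decision to escalate depends on whether the group is empty, not on whether the immediate child has exited. A descendant that moves itself into a different group or session would not be seen by this probe.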

Labels: bug, help wanted