Refactor interrupting blocking threads #1948

Merged
wenyongh merged 6 commits into bytecodealliance:dev/interrupt_block_insn from wenyongh:dev/interrupt_block_insn
Feb 13, 2023

Collaborator

@wenyongh wenyongh commented Feb 9, 2023

  • Disable the feature by default; use cmake -DWAMR_BUILD_INTERRUPT_BLOCK_INSN to enable it
  • Reuse the original signal handler
  • Move killing threads to after notifying the threads waiting on atomics
  • Fix some issues and refine the code

bool ret;

wasm_runtime_set_exec_env_tls(exec_env);
wasm_exec_env_push_jmpbuf(exec_env, &jmpbuf_node);
Contributor

Do we handle the case in which the jmpbuf is pushed here but we then receive a signal before the os_setjmp on the next line? In that case the signal handler would use an uninitialized jmpbuf.

Collaborator Author

Yes, that's an issue. Some operations should not be interrupted by the signal; here we had better disable the signal before pushing the jmpbuf and enable it again after setjmp.

Contributor

That's why I moved wasm_exec_env_push_jmpbuf after os_setjmp in my previous implementation.
The approach described at https://notes.shichao.io/apue/ch10/, using a static volatile sig_atomic_t canjump flag, is a possible solution too.

Collaborator Author

Thanks, the static volatile sig_atomic_t canjump approach seems better: moving wasm_exec_env_push_jmpbuf after os_setjmp still cannot protect the window between os_setjmp and pushing the jmpbuf.
How about adding a volatile sig_atomic_t canjump field to exec_env and changing the process to:

    exec_env->canjump = 0;
    wasm_exec_env_push_jmpbuf(exec_env, &jmpbuf_node);
    wasm_runtime_set_exec_env_tls(exec_env);

    if (os_setjmp(jmpbuf_node.jmpbuf) == 0) {
        exec_env->canjump = 1;
        ret = invoke_native_block_insn_interrupt(exec_env, func_ptr, func_type,
                                                 signature, attachment, argv,
                                                 argc, argv_ret);
    }
    else {
        /* Exception has been set in signal handler before calling longjmp */
        ret = false;
    }

    exec_env->canjump = 0;
    jmpbuf_node_pop = wasm_exec_env_pop_jmpbuf(exec_env);
    exec_env->canjump = 1;

Contributor

Yes, that should fix the problem.

Collaborator Author

Done. BTW, any other comments? Shall we merge it after some basic tests, so that we can start fixing the memory allocation issue?

Contributor

@eloparco eloparco Feb 10, 2023

What's still missing is to use the canjump variable inside the signal handler interrupt_block_insn_sig_handler to decide if we can jump or not, right?
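
For reference, a minimal standalone sketch of the check being asked for here. It is not the WAMR code: the global canjump and demo_jmpbuf stand in for the per-exec_env flag and jmpbuf node, every demo_* name is hypothetical, and sigsetjmp/siglongjmp are used in place of os_setjmp/os_longjmp so that the signal mask is restored after jumping out of the handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>

/* Stand-ins (assumptions) for exec_env->canjump and the top jmpbuf node. */
static volatile sig_atomic_t canjump = 0;
static sigjmp_buf demo_jmpbuf;

/* Jump only when the jmpbuf is known to be initialized. */
static void
demo_sig_handler(int sig)
{
    (void)sig;
    if (!canjump)
        return; /* signal arrived outside the protected window: ignore it */
    canjump = 0;
    siglongjmp(demo_jmpbuf, 1);
}

/* Install the handler for SIGUSR1; returns 0 on success. */
static int
demo_install_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = demo_sig_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}

/* Hypothetical operations: one returns normally, one gets "interrupted". */
static void
demo_quick_op(void)
{}

static void
demo_interrupted_op(void)
{
    raise(SIGUSR1); /* simulates another thread signalling this one */
}

/* Returns true if op completed, false if the handler jumped out of it. */
static bool
demo_call(void (*op)(void))
{
    bool ret;

    canjump = 0;
    if (sigsetjmp(demo_jmpbuf, 1) == 0) {
        canjump = 1; /* demo_jmpbuf is valid from here on */
        op();
        ret = true;
    }
    else {
        ret = false; /* reached via siglongjmp from the handler */
    }
    canjump = 0;
    return ret;
}
```

With the handler installed, demo_call(demo_quick_op) completes normally, demo_call(demo_interrupted_op) returns false, and a SIGUSR1 delivered while canjump is 0 simply returns from the handler without jumping.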

Collaborator Author

Oh, I forgot to do that. Done.

Contributor

I tried the changes and it seems to be working.

Collaborator Author

The canjump flag only controls whether the SIGUSR1 signal handler is allowed to jump back to the setjmp site. Another issue is whether a thread should be allowed to receive SIGUSR1 and trigger its signal handler at all. Since the handler may be triggered at any point in the thread (at any machine instruction), this is very dangerous, so I tend to limit it to a few specific places. I uploaded a patch set that enables it only while calling the callback of pthread_create; maybe we can narrow it further, e.g. enable it only while calling native APIs.
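
One possible way to narrow the window, sketched with pthread_sigmask. This is an illustration under assumptions rather than what the patch set does: the helper names are hypothetical, and it assumes the thread normally runs with SIGUSR1 blocked and unblocks it only for the duration of a native call.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* Block (blocked != 0) or unblock SIGUSR1 for the calling thread only. */
static int
set_sigusr1_blocked(int blocked)
{
    sigset_t set;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    return pthread_sigmask(blocked ? SIG_BLOCK : SIG_UNBLOCK, &set, NULL);
}

/* Returns 1 if SIGUSR1 is blocked in this thread, 0 if not, -1 on error. */
static int
sigusr1_is_blocked(void)
{
    sigset_t cur;

    /* A NULL new set leaves the mask unchanged and only queries it. */
    if (pthread_sigmask(SIG_SETMASK, NULL, &cur) != 0)
        return -1;
    return sigismember(&cur, SIGUSR1);
}

/* Hypothetical native function: reports whether it ran inside the window. */
static int
demo_native(void)
{
    return sigusr1_is_blocked() == 0 ? 42 : -1;
}

/* SIGUSR1 can be delivered to this thread only while native_fn runs. */
static int
call_native_interruptible(int (*native_fn)(void))
{
    int ret;

    set_sigusr1_blocked(0);
    ret = native_fn();
    set_sigusr1_blocked(1);
    return ret;
}
```

A signal sent while SIGUSR1 is blocked stays pending and is delivered once the thread unblocks it, so the handler still has to check canjump before jumping.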

@wenyongh wenyongh changed the title from "[WIP] Refactor interrupting blocking threads" to "Refactor interrupting blocking threads" Feb 10, 2023
Contributor

@eloparco eloparco left a comment

LGTM

@wenyongh wenyongh merged commit ed98d88 into bytecodealliance:dev/interrupt_block_insn Feb 13, 2023
@wenyongh wenyongh deleted the dev/interrupt_block_insn branch February 16, 2023 09:07