Refactor interrupting blocking threads#1948
Refactor interrupting blocking threads#1948wenyongh merged 6 commits intobytecodealliance:dev/interrupt_block_insnfrom
Conversation
| bool ret; | ||
|
|
||
| wasm_runtime_set_exec_env_tls(exec_env); | ||
| wasm_exec_env_push_jmpbuf(exec_env, &jmpbuf_node); |
There was a problem hiding this comment.
Do we handle the case in which the jmpbuf is pushed here but then we receive a signal before the os_setjmp in the next line? In that case the signal handler would use an uninitialized jmpbuf
There was a problem hiding this comment.
Yes, that's an issue, some operations should not be interrupted by the signal, here we had better disable signal before pushing jmpbuf and enable again after setjmp.
There was a problem hiding this comment.
that's why I moved wasm_exec_env_push_jmpbuf after os_setjmp in my previous implementation;
the approach described here https://notes.shichao.io/apue/ch10/ with static volatile sig_atomic_t canjump; can be a possible solution too
There was a problem hiding this comment.
Thanks, seems the static volatile sig_atomic_t canjump is better: moving wasm_exec_env_push_jmpbuf after os_setjmp also cannot protect the operations of os_setjmp and pushing jmpbuf.
How about adding field volatile sig_atomic_t canjump in exec_env, and changing the process to:
exec_env->canjump = 0;
wasm_exec_env_push_jmpbuf(exec_env, &jmpbuf_node);
wasm_runtime_set_exec_env_tls(exec_env);
if (os_setjmp(jmpbuf_node.jmpbuf) == 0) {
exec_env->canjump = 1;
ret = invoke_native_block_insn_interrupt(exec_env, func_ptr, func_type,
signature, attachment, argv,
argc, argv_ret);
}
else {
/* Exception has been set in signal handler before calling longjmp */
ret = false;
}
exec_env->canjump = 0;
jmpbuf_node_pop = wasm_exec_env_pop_jmpbuf(exec_env);
exec_env->canjump = 1;There was a problem hiding this comment.
Yes, that should fix the problem
There was a problem hiding this comment.
Done. BTW, any other comments? Shall we merge it after some basic tests? So that we can start to fix the memory allocation issue.
There was a problem hiding this comment.
What's still missing is to use the canjump variable inside the signal handler interrupt_block_insn_sig_handler to decide if we can jump or not, right?
There was a problem hiding this comment.
Oh, forgot to do it, done.
There was a problem hiding this comment.
I tried the changes and it seems to be working.
There was a problem hiding this comment.
The canjump flag only controls whether it is allowed to jump to the place of setjmp inside signal handler of SIGUSR1, but another issue is that: whether it is allowed to receive the signal SIGUSR1 in a thread to trigger its signal handler? Since the signal handler may triggered in each place (each machine instruction) of the thread, it very dangerous, I tend to limit it only in some special places. I uploaded a patch set to enable it only when calling the callback API of pthread_create, maybe we can narrow it more, e.g. enable it only when calling native APIs.
cmake -DWAMR_BUILD_INTERRUPT_BLOCK_INSNto enable it