Fix misuse of is_main_runtime_thread over is_main_browser_thread #15503
This is probably a valid change, but just to be sure we're covering the other aspect here: the literal main browser thread is special in that we can't block there, and a lot of the "if main thread then do complex thing" is to work around that. If the main runtime thread is not the main browser thread then we don't need those hacks? But OTOH the main runtime thread is special in that lots of stuff ends up proxied to it, so perhaps we do want to not block there either. I think this can make sense to do, but given the above I think it is sort of a policy change as well. If so, the changelog might be worth updating.
I don't think it's a policy change, but more of a bug fix. In both these busy loops we call
If there was a case where
Is it ever possible to run emscripten code on the main browser thread that is not also the main runtime thread? I think not, right? There would be no way to start a thread on the main browser thread if the application itself was started in a worker.
Let me try to make the argument another way: there are currently four different places where we avoid blocking so we can call
Here we are using
Here we are using
Here we are using
emscripten/src/library_pthread.js Line 906 in 4b9a0d0
Here we are using
I believe that in all of these cases the busy loop is intended for the runtime thread, but in two of them we mistakenly check for the browser thread. The net result is that programs that start in a worker will not work, but since that is not common we have not noticed this bug.
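To make the failure mode described above concrete, here is a minimal model in plain C. It is not Emscripten code: the `thread_info` struct, the two `services_queue_*` predicates, `count_servicing`, and the two sample thread tables are hypothetical names invented for illustration. Only the two concepts (main browser thread, main runtime thread) and the claim that the browser check is false on every thread of a worker-started program come from the discussion.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a thread's identity; not an Emscripten type. */
typedef struct {
  bool is_main_browser_thread; /* the page's own main thread */
  bool is_main_runtime_thread; /* the thread where the program was first loaded */
} thread_info;

/* Old check: only the main browser thread takes the queue-servicing busy loop. */
static bool services_queue_old(thread_info t) {
  return t.is_main_browser_thread;
}

/* New check: the main runtime thread takes the queue-servicing busy loop. */
static bool services_queue_new(thread_info t) {
  return t.is_main_runtime_thread;
}

/* Count how many threads of a program would service proxied calls. */
static int count_servicing(const thread_info *threads, int n,
                           bool (*check)(thread_info)) {
  int count = 0;
  for (int i = 0; i < n; i++) {
    if (check(threads[i])) count++;
  }
  return count;
}

/* Program started on the page: one thread is both browser main and runtime main. */
static const thread_info page_start_threads[2] = {
  { true, true },   /* main thread */
  { false, false }, /* secondary pthread */
};

/* Program started in a worker: no thread is the main browser thread. */
static const thread_info worker_start_threads[2] = {
  { false, true },  /* worker acting as the main runtime thread */
  { false, false }, /* secondary pthread */
};
```

Under this model, `count_servicing(worker_start_threads, 2, services_queue_old)` is 0, meaning nobody would drain the proxy queue, while the same call with `services_queue_new` is 1. For `page_start_threads` both checks agree, which matches the observation that the mistake goes unnoticed in the common case.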
Perhaps the main runtime thread of Node.js counts as non-main browser thread? IIUC, blocking on the main runtime thread in Node.js is fine, see for example this check:
emscripten/src/library_pthread.js Lines 967 to 968 in 4b9a0d0
Currently however, both

    #include <stdio.h>
    #include <emscripten/threading.h>
    int main() {
      printf("emscripten_is_main_runtime_thread() = %d\n", emscripten_is_main_runtime_thread());
      printf("emscripten_is_main_browser_thread() = %d\n", emscripten_is_main_browser_thread());
      return 0;
    }

    $ emcc test.c -o test.js
    $ node test.js
    emscripten_is_main_runtime_thread() = 1
    emscripten_is_main_browser_thread() = 1

(I've tried to change this behavior in the past with commit kleisauke@36236cf but unfortunately it didn't work out)
The history here is that running pthreads enabled builds in PROXY_TO_WORKER mode (or manually launching an Emscripten pthreads compiled app in a Worker) has not been supported. To recall,
main_browser thread cannot Atomics.wait(), so it must use a busy-spin hack. Very originally, in ~2015 or so when this was written, pthreads build mode had its own proxy-to-worker mode that was a bit similar to PROXY_TO_WORKER, but implemented independently to avoid complicating the existing non-pthreads worker builds mode. That original build mode would have the main runtime thread be different from the main browser thread. That mode cared about distinguishing main_browser & main_runtime threads, which would be separate threads. Some time after, in 2015-2016, that mode was dropped and replaced with the current PROXY_TO_PTHREAD build mode as it is considerably simpler (just one pass-through main() wrapper to compile in between the code). In that build mode, the main_browser thread will be the same as the main_runtime thread, but the main_browser thread will only launch user

In 2019-2020 or so Alon commented that pthreads + PROXY_TO_WORKER should be supported, so that support was opened up. However there are limitations to that, e.g. it requires nested Worker support, and causes limitations to MAIN_THREAD_EM_ASM and other proxying architecture: since those use postMessage(), they will always post to their parent thread, which in PROXY_TO_WORKER builds will not be the main_browser thread, but the proxy owning Worker thread. MAIN_THREAD_EM_ASM() would be more precisely called MAIN_RUNTIME_THREAD_EM_ASM(), which might not be the main browser thread in this kind of complex PROXY_TO_WORKER build mode.
There are two reasons that the spinloop exists there: main_browser thread cannot Atomics.wait(), so it must fully busyspin, and main_runtime thread must stay responsive to process incoming proxy requests, so it can at best sleep in small slices.
Agreed. This is a fallout of the relaxed pthreads + PROXY_TO_WORKER build mode compatibility. If we did not support that, the concepts of the two threads main_browser and main_runtime could be fused just to one. Given the subtlety here, it would be good to have a test for this in the suite if possible?
This does not sound like something we need to support. If user is doing this kind of more complex pthreads + PROXY_TO_WORKER build, we can assume they are sensible, and don't then try to post the module back to main browser thread to attempt to have that act as some kind of extra worker thread that would not be the main runtime thread. So
I haven't closely followed the Node.js threading side, though if Node.js differs from the browser by allowing blocking on the main Node thread, then that would mean that we can Atomics.wait() to sleep there, but will still need to wait at most small time slices on the runtime thread to stay responsive to handle incoming proxy requests.
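One way to read the two constraints in this comment (a thread that cannot Atomics.wait() must spin, and the main runtime thread must stay responsive to proxy requests) is as a small decision function. The sketch below is plain C and purely illustrative: the `wait_strategy` enum and `choose_wait_strategy` are hypothetical names, and the function is an interpretation of the comment, not code from Emscripten.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the three behaviours discussed in the comment. */
typedef enum {
  WAIT_BLOCK,        /* ordinary pthread: may block for the full timeout */
  WAIT_SLEEP_SLICES, /* may sleep, but only in small slices to stay responsive */
  WAIT_BUSY_SPIN     /* cannot sleep at all, must spin */
} wait_strategy;

/*
 * can_atomics_wait:       false on the main browser thread; assumed true
 *                         elsewhere, including (per the comment) the main
 *                         Node.js thread.
 * is_main_runtime_thread: true on the thread that receives proxied calls.
 */
static wait_strategy choose_wait_strategy(bool can_atomics_wait,
                                          bool is_main_runtime_thread) {
  if (!can_atomics_wait) return WAIT_BUSY_SPIN;
  if (is_main_runtime_thread) return WAIT_SLEEP_SLICES;
  return WAIT_BLOCK;
}
```

With these assumptions, the main browser thread gets `WAIT_BUSY_SPIN`, a main runtime thread that lives in a worker or on the main Node.js thread gets `WAIT_SLEEP_SLICES`, and every other pthread gets `WAIT_BLOCK`.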
  #ifdef __EMSCRIPTEN__
  double msecsToSleep = top ? (top->tv_sec * 1000 + top->tv_nsec / 1000000.0) : INFINITY;
- int is_main_thread = emscripten_is_main_browser_thread();
+ int is_main_thread = emscripten_is_main_runtime_thread();
To be most accurate, this code should read
int can_futex_wait_on_current_thread = !emscripten_is_main_browser_thread() || current_environment_is_node/*node.js main thread can do Atomics.wait() */;
int is_main_runtime_thread = emscripten_is_main_runtime_thread();
...
if ((!can_futex_wait_on_current_thread || is_main_runtime_thread) || ...) {
...
// Assist other threads by executing proxied operations that are effectively singlethreaded.
if (is_main_runtime_thread) emscripten_main_thread_process_queued_calls();
...
I don't think that should be necessary because emscripten_futex_wait is already safe to call on the main thread, where it will busy loop in a tight loop. There should be no need to also have this outer busy loop that calls emscripten_futex_wait with smaller values.
In fact, I think we can remove the is_main_thread part of this condition completely, but I'll leave that as a followup.
Another way of putting it: both code paths here call emscripten_futex_wait. Avoiding calling emscripten_futex_wait is not what this code does; it instead relies on being able to call emscripten_futex_wait regardless.
Oh, good observation - I did not recall that emscripten_futex_wait was already guarded for main thread to be able to call it. lgtm then.
  a_inc(&inst->finished);
  #ifdef __EMSCRIPTEN__
- int is_main_thread = emscripten_is_main_browser_thread();
+ int is_main_thread = emscripten_is_main_runtime_thread();
The same applies here as for the code above.
  // main thread may need to run proxied calls, so sleep in very small slices to be responsive.
- const double maxMsecsSliceToSleep = emscripten_is_main_browser_thread() ? 1 : 100;
+ const double maxMsecsSliceToSleep = emscripten_is_main_runtime_thread() ? 1 : 100;
This looks good - might be good to change the comment to say // main runtime thread may need to ...
  // Ensure that files are preloaded from the main thread.
- assert(emscripten_is_main_browser_thread());
+ assert(emscripten_is_main_runtime_thread());
These two wasmfs changes lgtm.
In all these places the code is really attempting to figure out if it is the main runtime thread, i.e. the first place the program is loaded and the place that runs the callbacks and async events sent from secondary threads.

The reason this mistake often goes unnoticed is that in almost all cases the main runtime thread is also running on the main browser thread.

One easy way to see that `is_main_browser_thread` is the wrong question to be asking in many of these cases is to remember that when emscripten is started in a worker there is no main browser thread involved and so this function will return false on *all* threads.

In the case of `__timedwait.c` and `pthread_barrier_wait.c` the desire is to avoid blocking the main runtime thread so that calls from other threads can be processed by `emscripten_main_thread_process_queued_calls`.

`emscripten_main_thread_process_queued_calls` is expected to always run on the main runtime thread, and not necessarily on the main browser thread. Indeed its first line is: `assert(emscripten_is_main_runtime_thread());`
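The pattern described here, waiting in short slices and draining proxied calls between slices on the main runtime thread, can be sketched in plain C as follows. This is an illustration under stated assumptions, not the contents of `__timedwait.c`: every name below (`g_is_main_runtime_thread`, `process_queued_calls_stub`, `futex_wait_stub`, `sliced_wait`) is a hypothetical stand-in, and the stubs only record what happened instead of really waiting. The 1 ms and 100 ms slice sizes are taken from the `maxMsecsSliceToSleep` line in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for runtime state; real code would query the runtime. */
static bool g_is_main_runtime_thread = true;
static int g_queue_drains = 0;        /* how often proxied calls were processed */
static double g_time_waited_ms = 0.0; /* total time the stub pretended to wait */

/* Stub for the queue-processing call; like the real one, it asserts its thread. */
static void process_queued_calls_stub(void) {
  assert(g_is_main_runtime_thread);
  g_queue_drains++;
}

/* Stub for a futex wait: pretends the slice elapsed and returns false (timed out). */
static bool futex_wait_stub(double ms) {
  g_time_waited_ms += ms;
  return false;
}

/* Wait up to total_ms, in slices. Returns true if woken, false on timeout. */
static bool sliced_wait(double total_ms) {
  /* The main runtime thread sleeps in 1 ms slices so it stays responsive. */
  const double max_slice_ms = g_is_main_runtime_thread ? 1 : 100;
  double remaining = total_ms;
  while (remaining > 0) {
    double slice = remaining < max_slice_ms ? remaining : max_slice_ms;
    /* Only the main runtime thread services calls proxied from other threads. */
    if (g_is_main_runtime_thread) process_queued_calls_stub();
    if (futex_wait_stub(slice)) return true;
    remaining -= slice;
  }
  return false;
}
```

With `g_is_main_runtime_thread` set to true, `sliced_wait(5)` drains the queue once per 1 ms slice. With it set to false, `sliced_wait(250)` waits in slices of 100, 100 and 50 ms and never touches the queue, so the assertion inside the stub is never reached from a thread that is not the main runtime thread.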