Skip to content

Module.pthreadPoolSize#10263

Merged
kripken merged 4 commits intomasterfrom
pool
Feb 5, 2020
Merged

Module.pthreadPoolSize#10263
kripken merged 4 commits intomasterfrom
pool

Conversation

@kripken
Copy link
Copy Markdown
Member

@kripken kripken commented Jan 22, 2020

This does the same as PTHREAD_POOL_SIZE but can be specified
at runtime. While the compile-time setting is great for testing, in
practice I think the Module option is probably more useful, since lots
of projects want to use a pool equal to the number of actual cores
on the user's machine.

fixes #10231

@kripken kripken requested a review from juj January 22, 2020 18:09
@juj
Copy link
Copy Markdown
Collaborator

juj commented Jan 22, 2020

Hmm, this is tricky, because it adds un-DCEable code that cannot be optimized away.

How about instead:

  1. use -s PTHREAD_POOL_SIZE=-1 to mean a compile time decision "give me navigator.hardwareConcurrency threads in the pool" (or "give me Module['pthreadPoolSize'] || navigator.hardwareConcurrency` threads in the pool?), and/or
  2. use -s PTHREAD_POOL_SIZE=2*#+1 to mean a compile time decision "give me 2*navigator.hardwareConcurrency + 1 threads in the pool, and/or
  3. document a code snippet that let's users create an appropriate preRun/--pre-js code that initializes a pool?

That way we would not build more unoptimizable runtime bits. I know the increase in this PR does not add much, but it feels to me that we should strongly prefer to add only DCEable runtime elements.

Or I suppose if the references to Module['pthreadPoolSize'] are gated behind -s PTHREAD_POOL_SIZE setting, then the added size is not bad.

Basically the pthread pool is intended to be an emulation setting for use cases where sync pthread_create() is absolutely required, so I would not want to see all users have to pay bytes for that emulation setting.

@kripken
Copy link
Copy Markdown
Member Author

kripken commented Jan 22, 2020

Fair point about code size. I feel it's a little clunky to need to set an option in order to get Module.pthreadsPoolSize to be noticed, but I guess it's not that bad. Changed it that way.

I did consider having special values for "number of cores" or such, but there are going to be more general cases that that doesn't handle, so the Module option seems best.

Comment thread src/preamble.js Outdated
addOnPreRun(function() {
if (typeof SharedArrayBuffer !== 'undefined') {
addRunDependency('pthreads');
PThread.allocateUnusedWorkers(pthreadPoolSize, function() {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh hmm we have now both PThread.allocateUnusedWorkers and PThread.createNewWorkers(pthreadPoolSize);, that seems very wrong. To me this looks like we are allocating 2x the amount of Workers that are needed? Or perhaps I am very confused here. The intent of #9394 where this came into existence does not make sense to me. hmm...

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe the idea is that creating and instantiating the workers is a separate thing. One can create the Worker objects without blocking startup on being able to instantiate them with the info they require etc. But we can discuss in your comments over there.

However, all of that is separate from this PR, and should not block it, unless I'm missing something?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can relayer on top of that, but that one seems to need discussion (so we figure out the history there) so unclear how long it will take to land, while this one is ready to go, isn't it? 😄 Or is there some reason you'd prefer to delay landing this?

Comment thread src/preamble.js Outdated
if (!ENVIRONMENT_IS_PTHREAD) addOnPreRun(function() { if (typeof SharedArrayBuffer !== 'undefined') { addRunDependency('pthreads'); PThread.allocateUnusedWorkers({{{PTHREAD_POOL_SIZE}}}, function() { removeRunDependency('pthreads'); }); }});
#if USE_PTHREADS && PTHREAD_POOL_SIZE

var pthreadPoolSize = Module['pthreadPoolSize'] || {{{ PTHREAD_POOL_SIZE }}};
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If one wanted to set Module['pthreadPoolSize'] = 0; to disable pthread pool, this has the effect of using {{{PTHREAD_POOL_SIZE}}} instead?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good point. Perhaps this should be more careful then.

Another thought I had meanwhile, if we're considering delaying this PR anyhow as discussed above - perhaps PTHREAD_POOL_SIZE could be a string - then the user could just set it to navigator.hardwareConcurrency directly, what do you think?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that idea - they could then set it to Module['pthreadPoolSize'] as well to customize.

Comment thread src/library_pthread.js Outdated
#endif
PThread.preallocatedWorkers = PThread.createNewWorkers(pthreadPoolSize);
}
#endif // PTHREAD_POOL_SIZE
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure why this code block even exists.. this seems to double allocate a new pool of Workers that has already been created in the code below in src/preamble.js?

@kripken
Copy link
Copy Markdown
Member Author

kripken commented Feb 4, 2020

Ok, rewritten to simply make PTHREAD_POOL_SIZE a string,
PTAL @juj

This makes PTHREAD_POOL_SIZE a string, which allows stuff
like -s PTHREAD_POOL_SIZE=navigator.hardwareConcurrency
in addition to the old -s PTHREAD_POOL_SIZE=8 etc. (This works
because our parser is smart and doesn't require unnecessary
quotes for string values.) The value the user provided is simply
emitted in the JS, and used, so it could be anything that is valid
JS.

Copy link
Copy Markdown
Collaborator

@juj juj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm!

@kripken kripken merged commit dfc667c into master Feb 5, 2020
@kripken kripken deleted the pool branch February 5, 2020 22:33
mattsoulanille added a commit to tensorflow/tfjs that referenced this pull request Apr 22, 2021
Without the linker option `PTHREAD_POOL_SIZE` set, the wasm backend does not create any webworkers and spins forever waiting for them to be created (crashing the browser). This PR sets `PTHREAD_POOL_SIZE` to `Math.min(4, Math.max(1, (navigator.hardwareConcurrency || 1) / 2))`, meaning it will use half of the logical cores available up to a max of 4 (or only 1 if `navigator.hardwareConcurrency` is not available). See [this comment](emscripten-core/emscripten#10263 (comment)) for details on why this works. This option apparently supersedes the value passed to [`pthreadpool_create` in `backend.cc`](https://github.com/tensorflow/tfjs/blob/c9dfebddfa34f531d2f0c363b6ea574a9fe9745d/tfjs-backend-wasm/src/cc/backend.cc#L65-L66), but its current setting should match the value computed in the `.cc` file. Fixes #4932 

This PR also fixes a bug where webworkers created by the wasm backend were not removed when `dispose` is called. Fixes #4796.

Fixes #4934 as well.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add an option to set the pthread pool size at startup

2 participants