[wasm] Fix wasm backend not starting any threads#4957
[wasm] Fix wasm backend not starting any threads#4957mattsoulanille merged 12 commits intotensorflow:masterfrom
Conversation
Add PTHREAD_POOL_SIZE=4 to linker options to ensure threaded wasm creates workers.
|
The thread count computed in |
| import {backend_util, BackendTimingInfo, DataStorage, DataType, deprecationWarn, engine, env, KernelBackend, TensorInfo, util} from '@tensorflow/tfjs-core'; | ||
|
|
||
| import {BackendWasmModule, WasmFactoryConfig} from '../wasm-out/tfjs-backend-wasm'; | ||
| import {BackendWasmThreadedSimdModule} from '../wasm-out/tfjs-backend-wasm-threaded-simd'; |
There was a problem hiding this comment.
Does the Simd module need a separate instance too or not?
There was a problem hiding this comment.
The simd module doesn't need a separate interface because its API is the same as the non-simd API. The threaded simd module, however, does need a separate interface because it now has the PThread.terminateAllThreads function exposed (via the type declarations).
lina128
left a comment
There was a problem hiding this comment.
Reviewable status: 0 of 1 approvals obtained (waiting on @mattsoulanille)
tfjs-backend-wasm/src/backend_wasm.ts, line 22 at r1 (raw file):
Previously, mattsoulanille (Matthew Soulanille) wrote…
The simd module doesn't need a separate interface because its API is the same as the non-simd API. The threaded simd module, however, does need a separate interface because it now has the
PThread.terminateAllThreadsfunction exposed (via the type declarations).
Got it,
tfjs-backend-wasm/src/backend_wasm.ts, line 164 at r1 (raw file):
this.wasm.tfjs.dispose(); if ('PThread' in this.wasm) { this.wasm.PThread.terminateAllThreads();
Is PThread in both wasm types? Is it better to use a type guard, like:
function isWasmThreaded(object: BackendWasmModule|BackendWasmThreadedSimdModule): object is BackendWasmThreadedSimdModule {
return object != null && 'PThread' in object;
}
if (isWasmThreaded(this.wasm)) {
this.wasm.PThread.terminateAllThreads();
}
| "-s PROXY_TO_PTHREAD=1", | ||
| # Many x86-64 processors have 2 threads per core, so we divide by 2. | ||
| "-s PTHREAD_POOL_SIZE=" + | ||
| "'Math.min(4, Math.max(1, (navigator.hardwareConcurrency || 1) / 2))'", |
There was a problem hiding this comment.
Is it doable that we register a flag WASM_PTHREAD_POOL_SIZE in flags_wasm.ts and initialized with 'Math.min(4, Math.max(1, (navigator.hardwareConcurrency || 1) / 2))'. Then in here and backend.cc, we both use the flag value?
There was a problem hiding this comment.
I think I've found the root cause of this bug, and I'm pretty sure it's caused by the PROXY_TO_PTHREADS link option not working. During my testing, I noticed that emscripten was running on the main thread, and that the browser freeze was due to the _emscripten_futex_wait function running a busy wait loop. This matches the behavior described in the second paragraph of Blocking on the Main Browser Thread.
I'll look into why PROXY_TO_PTHREADS isn't working as expected.
Reviewable status: 0 of 1 approvals obtained (waiting on @lina128, @mattsoulanille, and @qjia7)
tfjs-backend-wasm/src/backend_wasm.ts, line 164 at r1 (raw file):
Previously, lina128 (Na Li) wrote…
Is PThread in both wasm types? Is it better to use a type guard, like:
function isWasmThreaded(object: BackendWasmModule|BackendWasmThreadedSimdModule): object is BackendWasmThreadedSimdModule { return object != null && 'PThread' in object; } if (isWasmThreaded(this.wasm)) { this.wasm.PThread.terminateAllThreads(); }
PThread is only in the BackendWasmThreadedSimdModule type (and it only actually appears in the JS object when the simd threaded build is run). We could use a type guard if that's the recommended style, but tsserver knows that this.wasm is a BackendWasmThreadedSimdModule after I check if ('PThread' in this.wasm), so I don't think a type guard is necessary.
tfjs-backend-wasm/src/cc/BUILD, line 58 at r1 (raw file):
Previously, qjia7 (Jiajia Qin) wrote…
Is it doable that we register a flag
WASM_PTHREAD_POOL_SIZEinflags_wasm.tsand initialized with'Math.min(4, Math.max(1, (navigator.hardwareConcurrency || 1) / 2))'. Then in here andbackend.cc, we both use the flag value?
This is a good idea, but I think it will have to be done in a separate PR after this fix is out. Feel free to create a feature request.
lina128
left a comment
There was a problem hiding this comment.
Thank you Matt. This is great! LGTM
Reviewable status:
complete! 1 of 1 approvals obtained (waiting on @lina128, @mattsoulanille, and @qjia7)
lina128
left a comment
There was a problem hiding this comment.
Reviewable status:
complete! 1 of 1 approvals obtained (waiting on @lina128, @mattsoulanille, and @qjia7)
tfjs-backend-wasm/src/cc/BUILD, line 58 at r1 (raw file):
Previously, mattsoulanille (Matthew Soulanille) wrote…
This is a good idea, but I think it will have to be done in a separate PR after this fix is out. Feel free to create a feature request.
Hi Jiajia, the flag you proposed is different than this one, #4942, right? For both flags, do you expect them to be tunable?
|
Fixes #4934: |
Yes. #4942 is just a get method. I hope we can control the threads number or pool size. Do you know the relationship between the |
|
They should be same. When pthread_create() is called, if we need to create a new Web Worker, then that requires returning the main event loop. That is, you cannot call pthread_create and then keep running code synchronously that expects the worker to start running - it will only run after you return to the event loop. This is a violation of POSIX behavior and will break common code which creates a thread and immediately joins it or otherwise synchronously waits to observe an effect such as a memory write. There are several solutions to this: |

Without the linker option
PTHREAD_POOL_SIZEset, the wasm backend does not create any webworkers and spins forever waiting for them to be created (crashing the browser). This PR setsPTHREAD_POOL_SIZEtoMath.min(4, Math.max(1, (navigator.hardwareConcurrency || 1) / 2)), meaning it will use half of the logical cores available up to a max of 4 (or only 1 ifnavigator.hardwareConcurrencyis not available). See this comment for details on why this works. This option apparently supersedes the value passed topthreadpool_createinbackend.cc, but its current setting should match the value computed in the.ccfile. Fixes #4932This PR also fixes a bug where webworkers created by the wasm backend were not removed when
disposeis called. Fixes #4796.Fixes #4934 as well.
To see the logs from the Cloud Build CI, please join either our discussion or announcement mailing list.
This change is