Implement -s SINGLE_FILE_BINARY_ENCODE=1 option#25599
Implement -s SINGLE_FILE_BINARY_ENCODE=1 option#25599juj merged 21 commits intoemscripten-core:mainfrom
Conversation
…as binary-encoded form instead of base64 form in SINGLE_FILE mode. Continuation of emscripten-core#21478.
sbc100
left a comment
There was a problem hiding this comment.
Very nice!
Do you have some rough number you could put in the PR description?
How about adding a codesize check that asserts the overall result is smaller?
Assuming no unforseen issues do you expect we could just make this setting always-enabled one day? (and remove the old base64 fallback?)
Added.
Added.
Maybe one day. For now I think it's good to have the opt-out to give people the fallback if issues come up. |
src/preamble.js
Outdated
| #elif SINGLE_FILE | ||
| return base64Decode('<<< WASM_BINARY_DATA >>>'); | ||
| #elif AUDIO_WORKLET || !EXPORT_ES6 | ||
| // For an Audio Worklet, we cannot use `new URL()`. |
There was a problem hiding this comment.
Its hard to see why the locateFile path is only used in EXPORT_ES6 mode... do you know why EXPORT_ES6 is here?
There was a problem hiding this comment.
I don't know. It is indeed a little bit odd code flow, which is why I cleaned the ifdefs a bit here as a drive-by. Something to ponder further in a separate PR.
sbc100
left a comment
There was a problem hiding this comment.
Maybe worth hinting in the docs that the plan is to remove this new setting if nobody uses it.
src/preamble.js
Outdated
| #endif | ||
|
|
||
| #if SINGLE_FILE && SINGLE_FILE_BINARY_ENCODE && !WASM2JS | ||
| #include "binaryDecode.js" |
There was a problem hiding this comment.
Do we not want to support this in MINIMAL_RUNTIME too? If so maybe this could go in runtime_common.js instead?
There was a problem hiding this comment.
Thanks, we do. I lost this when resurrecting the code. Added back, and updated test to cover minimal runtime as well.
|
Are there any cases where |
| o[i] = bin.charCodeAt(i) - 1; | ||
| } | ||
| return o; | ||
| } |
There was a problem hiding this comment.
Since this is only 5 lines i wonder if its worth creating a completely new flie? Maybe just inline into runtime_common.js?
base64 encoding encodes each byte uniformly with +33% overhead. Binary encoding encodes compatible bytes in the original byte (+0% overhead), but incompatible bytes (09h, 0Ch, 21h, 5Bh, 80h-FFh) will be encoded as two bytes, so with +100% overhead. If the distribution of input bytes were uniform, this comes out to a +49.61% overhead. So a larger size in uncompressed form. But the trick here is that a) since all those +100% code points are byte-delimited, they compress efficiently with gzip and brotli compression, and |
|
|
||
| function findWasmBinary() { | ||
| return locateFile('a.out.wasm'); | ||
| // For an Audio Worklet, we cannot use `new URL()`. |
There was a problem hiding this comment.
Maybe the new location of this comment is misplaced?
There was a problem hiding this comment.
Hmm the code size before was relying on the comment being swallowed by virtue of being at the end of an #ifdef.. moved it back up there.
This is an automatic change generated by tools/maint/rebaseline_tests.py. The following (1) test expectation files were updated by running the tests with `--rebaseline`: ``` codesize/test_unoptimized_code_size.json: 180812 => 180812 [+0 bytes / +0.00%] Average change: +0.00% (+0.00% - +0.00%) ```
|
@juj @sbc100 This caused a regression I bisected to here, reported on the mailing list: https://groups.google.com/g/emscripten-discuss/c/E_HmYqXGjN8 |
|
@juj can you take a look at this early next week? If you don't have time, or there is no easy fix I suggest we just flip the default for |
|
Thanks for the ping - posted PR #25704 to address this. |
Implement -s SINGLE_FILE_BINARY_ENCODE=1 option to embed Wasm binary as binary-encoded form instead of base64 form in SINGLE_FILE mode. Continuation of emscripten-core#21478. For comparison of code size, see emscripten-core#21426 (comment).
Implement -s SINGLE_FILE_BINARY_ENCODE=1 option to embed Wasm binary as binary-encoded form instead of base64 form in SINGLE_FILE mode. Continuation of #21478.
For comparison of code size, see #21426 (comment).