Conversation

@sbc100
Collaborator

@sbc100 sbc100 commented Jul 14, 2025

Setting this to zero doesn't provide much value to anyone, since the conditional usage is relatively cheap (compared to the fallback) and the API is available in all browsers these days.

We still need the fallback for things like audio worklets and JS shell environments.
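
For context, here is a minimal sketch of the trade-off described above: the conditional use of TextDecoder is a single guard, while the fallback is a hand-written decode loop. This is illustrative only, not Emscripten's actual library code, and the names (utf8ToString, UTF8Decoder) are placeholders.

// Sketch only: illustrates the conditional usage vs. the fallback, not the
// exact Emscripten implementation.
var UTF8Decoder = typeof TextDecoder != 'undefined' ? new TextDecoder() : undefined;

function utf8ToString(bytes) {
  if (UTF8Decoder) {
    // The "conditional usage": one truthiness check plus a native decode.
    return UTF8Decoder.decode(bytes);
  }
  // The fallback, still needed for audio worklets and JS shell environments:
  // a hand-written UTF-8 decoder (simplified here to code points <= 0xFFFF).
  var str = '';
  for (var i = 0; i < bytes.length;) {
    var b0 = bytes[i++];
    if (b0 < 0x80) { str += String.fromCharCode(b0); continue; }
    var b1 = bytes[i++] & 0x3f;
    if ((b0 & 0xe0) == 0xc0) { str += String.fromCharCode(((b0 & 0x1f) << 6) | b1); continue; }
    var b2 = bytes[i++] & 0x3f;
    str += String.fromCharCode(((b0 & 0x0f) << 12) | (b1 << 6) | b2);
  }
  return str;
}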

@sbc100 sbc100 requested review from RReverser and kripken July 14, 2025 21:35
@sbc100 sbc100 force-pushed the TEXTDECODER_zero branch from 75c8efd to fe847ae on July 14, 2025 21:37
Collaborator Author

@sbc100 sbc100 left a comment

The code size changes here are because of the removal of the comments in the output (the only tests affected were the unoptimized output tests, which include comments).

@sbc100 sbc100 force-pushed the TEXTDECODER_zero branch from fe847ae to 7685ad0 on July 14, 2025 22:51
Member

@kripken kripken left a comment

After this, I wonder if we can enable mode 2 when ENVIRONMENT / the browser versions imply that?

@sbc100
Collaborator Author

sbc100 commented Jul 14, 2025

After this, I wonder if we can enable mode 2 when ENVIRONMENT / the browser versions imply that?

Yes, I was thinking that too. Or at least we could default to it. I do wonder about the short-string optimization, where we currently opt out of using the TextDecoder for short strings... I wonder whether that is really still an optimization on modern browsers.
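
The short-string opt-out being questioned looks roughly like the following sketch; the 16-byte threshold and the decodeUtf8InJs helper are illustrative placeholders, not necessarily what Emscripten ships today.

// Sketch of the short-string heuristic; threshold and helper name are
// placeholders for illustration.
var UTF8Decoder = typeof TextDecoder != 'undefined' ? new TextDecoder() : undefined;

function utf8ToStringWithHeuristic(heap, idx, endPtr) {
  if (endPtr - idx > 16 && UTF8Decoder) {
    // Longer strings: hand off to the native decoder.
    return UTF8Decoder.decode(heap.subarray(idx, endPtr));
  }
  // Short strings: decode in JS, on the assumption that the per-call overhead
  // of TextDecoder.decode() outweighs its benefit for tiny inputs.
  return decodeUtf8InJs(heap, idx, endPtr); // hypothetical JS decode loop
}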

@sbc100 sbc100 force-pushed the TEXTDECODER_zero branch from 7685ad0 to 3cf19b4 on July 14, 2025 23:53
@sbc100 sbc100 merged commit b490c44 into emscripten-core:main Jul 14, 2025
4 of 13 checks passed
@sbc100 sbc100 deleted the TEXTDECODER_zero branch July 14, 2025 23:53
@arsnyder16
Contributor

@sbc100 @kripken Just want to point out that TEXTDECODER=0 was being used to work around some bugs that haven't really been fully understood. Maybe those bugs have disappeared, maybe they still exist, but the workaround no longer exists.

#18034 (comment)

#16994 (comment)

@sbc100
Collaborator Author

sbc100 commented Jul 15, 2025

@sbc100 @kripken Just want to point out that TEXTDECODER=0 was being used to work around some bugs that haven't really been fully understood. Maybe those bugs have disappeared, maybe they still exist, but the workaround no longer exists.

#18034 (comment)

#16994 (comment)

Oh wow, I didn't remember those. It looks like #16994 can now be closed. I'm not sure about #18034... that looks very odd, and is hopefully fixed.

@arsnyder16
Contributor

I believe #18034 is likely fixed indirectly by #22844. Thanks! I am good with going ahead with this and can remove the TEXTDECODER=0 workaround once I update to 4.0.12+.

@juj
Collaborator

juj commented Aug 31, 2025

I believe this causes a regression for browsers that are unable to use TextDecoder.decode() on SharedArrayBuffers, because Emscripten is unable to detect that (unless it did a test decode and caught an exception).
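
One way to detect that, along the lines juj mentions, would be a one-time probe such as the sketch below (an assumption about how it could be done, not something Emscripten currently does):

// Sketch: probe whether TextDecoder accepts views on a SharedArrayBuffer by
// attempting a tiny test decode and catching the failure.
var textDecoderWorksOnSAB = false;
if (typeof TextDecoder != 'undefined' && typeof SharedArrayBuffer != 'undefined') {
  try {
    new TextDecoder().decode(new Uint8Array(new SharedArrayBuffer(4)));
    textDecoderWorksOnSAB = true;
  } catch (e) {
    // Browsers that reject SAB-backed views end up here; fall back to copying.
  }
}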

@juj
Collaborator

juj commented Aug 31, 2025

Setting this to zero doesn't provide much value to anyone, since the conditional usage is relatively cheap (compared to the fallback) and the API is available in all browsers these days.

Users who want to ensure that their code works on earlier browser versions could set TEXTDECODER=0 and be sure that they wouldn't hit browser bugs.

I am a bit surprised; what was the benefit of removing TEXTDECODER=0 support?

@sbc100
Collaborator Author

sbc100 commented Sep 2, 2025

Setting this to zero doesn't provide much value to anyone, since the conditional usage is relatively cheap (compared to the fallback) and the API is available in all browsers these days.

Users who want to ensure that their code works on earlier browser versions could set TEXTDECODER=0 and be sure that they wouldn't hit browser bugs.

Do such bugs exist?

Couldn't such a user just do TextDecoder = undefined in a pre-js file?

I am a bit surprised; what was the benefit of removing TEXTDECODER=0 support?

It's quite a bit of complexity being removed, and in particular #if/#else conditions, which have a pretty high complexity cost in Emscripten as they make the code hard to reason about (IMHO).

IIRC it also lays the groundwork for followup simplifications, although I don't remember now what they were.
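
For readers unfamiliar with the JS library preprocessor, the removed branching was along these lines (an illustrative sketch, not the literal diff from this PR):

// Illustrative only: the general shape of the #if/#else branching that the
// TEXTDECODER setting required in the JS library code.
#if TEXTDECODER == 2
  // Mode 2: TextDecoder is assumed to exist; no JS fallback is emitted.
  return UTF8Decoder.decode(heapOrArray.subarray(idx, endPtr));
#else
#if TEXTDECODER
  // Mode 1: use TextDecoder when present, otherwise fall through.
  if (UTF8Decoder) {
    return UTF8Decoder.decode(heapOrArray.subarray(idx, endPtr));
  }
#endif
  // Mode 0 (and the fallback for mode 1): hand-written JS decode loop here.
#endif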

@juj
Collaborator

juj commented Sep 2, 2025

Do such bugs exist?

Yes, there is a very long range of browsers that, after adding support for SharedArrayBuffer, didn't yet support TextDecoder on SAB views. E.g. maybe two to three years of Firefox releases at least, and probably a similar span of Chrome releases.

Couldn't such a user just do TextDecoder = undefined in a pre-js file?

The user would have to discover that the bug exists in the old browsers first. And then that would mean that any other use of TextDecoder in site-wide code in the same JS context would not get access to TextDecoder either, even in non-Emscripten content.

Though maybe it won't become a problem for users. For us at least this won't be an issue; we are looking towards WebGPU-only browsers for the future.

@sbc100
Collaborator Author

sbc100 commented Sep 2, 2025

Do such bugs exist?

Yes, there is a very long range of browsers that, after adding support for SharedArrayBuffer, didn't yet support TextDecoder on SAB views. E.g. maybe two to three years of Firefox releases at least, and probably a similar span of Chrome releases.

I thought TextDecoder was simply/officially not supported with SAB, and we have code for dealing with that, don't we?

e.g.

return UTF8Decoder.decode(heapOrArray.buffer ? {{{ getUnsharedTextDecoderView('heapOrArray', 'idx', 'endPtr') }}} : new Uint8Array(heapOrArray.slice(idx, endPtr)))

Are there browsers that do support this and wouldn't need the call to getUnsharedTextDecoderView here?

Couldn't such a user just do TextDecoder = undefined in a pre-js file?

The user would have to discover that the bug exists in the old browsers first. And then that would mean that any other use of TextDecoder in site-wide code in the same JS context would not get access to TextDecoder either, even in non-Emscripten content.

True, asking users to do TextDecoder = undefined is probably not reasonable.

Better for us to handle any TextDecoder bugs if we find any. If there are TextDecoder bugs that we don't handle, I'm happy to add workarounds for them.

Though maybe it won't become a problem for users. For us at least this won't be an issue; we are looking towards WebGPU-only browsers for the future.

@sbc100
Collaborator Author

sbc100 commented Sep 2, 2025

(Like the workaround we already have for the SAB limitation)

@juj
Collaborator

juj commented Sep 2, 2025

and we have code for dealing with that, don't we?

Ohhh. I didn't recall that at all. Then never mind, nothing to see here. (Performance might be appalling with the {{{ getUnsharedTextDecoderView('heapOrArray', 'idx', 'endPtr') }}} construct though.)
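
(For context, the cost being alluded to comes from the copy: on a shared heap the construct has to produce a non-shared buffer before decoding, roughly as in this fragment, which is a sketch rather than the exact macro expansion.)

// Rough idea only. heap: a Uint8Array view on a SharedArrayBuffer;
// UTF8Decoder: a TextDecoder instance. TypedArray.prototype.slice() allocates
// a fresh, non-shared ArrayBuffer and copies the bytes, so every decode of a
// shared heap pays for a copy.
var unsharedCopy = heap.slice(idx, endPtr);
var str = UTF8Decoder.decode(unsharedCopy);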

@RReverser
Collaborator

I thought TextDecoder was simply/officially not supported with SAB, and we have code for dealing with that, don't we?

FWIW that's not entirely correct. The spec has allowed TextDecoder to work with SABs for a while now; it's just that none of the browsers have bothered to prioritise and implement it, so we are stuck with the slow clone workaround.

I wonder if Emscripten could nudge someone to help it happen at Chrome.

Better for us to handle any TextDecoder bugs if we find any. If there are TextDecoder bugs that we don't handle, I'm happy to add workarounds for them.

I saw this bug in wasm-bindgen today and thought of this thread: wasm-bindgen/wasm-bindgen#4471

I suspect Emscripten will run into the same issue after decoding 2GB worth of strings, as we also use a shared TextDecoder instance (like everyone else). Might be worth testing and adding a workaround.

@juj
Collaborator

juj commented Sep 4, 2025

Uhh, what a bug.

I think it would be best to restore TEXTDECODER=0 then.

@RReverser
Collaborator

RReverser commented Sep 4, 2025

I think it's worth adding a workaround anyway; counting how much we've decoded (or catching the error) is not that hard.
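
A workaround of that shape could look like this sketch; the byte budget and the recycle-on-error behaviour are assumptions for illustration, not a committed design:

// Sketch: recycle the shared TextDecoder after a cumulative byte budget, and
// also on error, to sidestep decoder-lifetime bugs like the one linked above.
var UTF8Decoder = new TextDecoder();
var decodedBytes = 0;
var DECODER_BYTE_BUDGET = 0x7fffffff; // ~2GB; illustrative value

function decodeWithRecycling(view) {
  if (decodedBytes + view.length > DECODER_BYTE_BUDGET) {
    UTF8Decoder = new TextDecoder();
    decodedBytes = 0;
  }
  try {
    var str = UTF8Decoder.decode(view);
    decodedBytes += view.length;
    return str;
  } catch (e) {
    // If the long-lived decoder got into a bad state, retry with a fresh one.
    UTF8Decoder = new TextDecoder();
    decodedBytes = view.length;
    return UTF8Decoder.decode(view);
  }
}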

@sbc100
Collaborator Author

sbc100 commented Sep 4, 2025

I thought TextDecoder was simply/officially not supported with SAB, and we have code for dealing with that, don't we?

FWIW that's not entirely correct. The spec has allowed TextDecoder to work with SABs for a while now; it's just that none of the browsers have bothered to prioritise and implement it, so we are stuck with the slow clone workaround.

Oh, so one day, once browsers start implementing it, we can probe for support for this? Is there some way to track this feature, or to know when one of the browsers implements it?

I wonder if Emscripten could nudge someone to help it happen at Chrome.

Better for us to handle any TextDecoder bugs if we find any. If there are TextDecoder bugs that we don't handle, I'm happy to add workarounds for them.

I saw this bug in wasm-bindgen today and thought of this thread: wasm-bindgen/wasm-bindgen#4471

I suspect Emscripten will run into the same issue after decoding 2GB worth of strings, as we also use a shared TextDecoder instance (like everyone else). Might be worth testing and adding a workaround.

@sbc100
Collaborator Author

sbc100 commented Sep 4, 2025

I guess we need to write a test that decodes 3GB of data now. Great. Thanks, Safari.

@RReverser
Collaborator

Oh, so one day, once browsers start implementing it, we can probe for support for this? Is there some way to track this feature, or to know when one of the browsers implements it?

Yup, you can see AllowSharedBufferSource in the spec: https://encoding.spec.whatwg.org/#textdecoder

For individual bugs, you'd have to track them down in the corresponding browser bug trackers, but e.g. it's already implemented in Node.js: nodejs/node#32203

I guess we need to write a test that decodes 3GB of data now. Great. Thanks, Safari.

IIUC, it's also enough to feed the same small chunk of data in a loop until you reach a total of 3GB; you don't necessarily need to allocate that much.
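
For example, such a test might be shaped like this sketch (names and sizes are illustrative):

// Sketch: decode the same small chunk repeatedly until the cumulative total
// passes ~3GB, without ever allocating 3GB at once.
var decoder = new TextDecoder();
var chunk = new Uint8Array(1 << 20).fill(0x41); // 1 MiB of 'A'
var total = 0;
var target = 3 * 1024 * 1024 * 1024; // 3GB cumulative
while (total < target) {
  if (decoder.decode(chunk).length !== chunk.length) {
    throw new Error('TextDecoder output went wrong after ' + total + ' bytes');
  }
  total += chunk.length;
}
console.log('decoded ' + total + ' bytes cumulatively without issues');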
