If a message containing a lone surrogate (e.g. half of a split emoji) is sent, the Anthropic API returns a 400:
API ERROR 400: invalid_request_error: The request body is not valid JSON:
no low surrogate in string: line 1 column 174077 (char 174076)
The message is already appended to the conversation history before the API call, so every subsequent request also fails with the same error. The session is permanently corrupted and requires a fresh session to recover.
Reproduction: send a message that contains a lone surrogate. The API returns 400, and every later message in the same session also returns 400.
Impact: Unrecoverable session corruption. All context and conversation history in the session is lost.
Guidance
Two layers:
1. Sanitise before sending (defensive):
Strip lone surrogates from the message text before it reaches the SDK. A lone surrogate is never intentional user input -- it's always corruption from a bug or bad paste.
Strip silently -- no need to reject the whole message. Optionally log: "stripped N invalid character(s)".
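As an illustration of layer 1, here is a minimal sketch of a sanitiser. The names `stripLoneSurrogates`, `SanitiseResult`, and `prepareMessage` are hypothetical and not part of the codebase or the SDK; the sketch assumes the message is available as a plain string just before it is handed to the SDK.

```typescript
// Hypothetical helper: names and return shape are assumptions for illustration.
interface SanitiseResult {
  text: string;     // input with lone surrogates removed
  stripped: number; // how many code units were dropped
}

function stripLoneSurrogates(input: string): SanitiseResult {
  let text = "";
  let stripped = 0;
  for (let i = 0; i < input.length; i++) {
    const unit = input.charCodeAt(i);
    const isHigh = unit >= 0xd800 && unit <= 0xdbff;
    const isLow = unit >= 0xdc00 && unit <= 0xdfff;
    if (isHigh) {
      const next = input.charCodeAt(i + 1); // NaN when i is the last index
      if (next >= 0xdc00 && next <= 0xdfff) {
        text += input.slice(i, i + 2); // valid surrogate pair: keep both units
        i++;
      } else {
        stripped++; // high surrogate with no low surrogate after it
      }
    } else if (isLow) {
      stripped++; // low surrogate that was not consumed by a preceding high
    } else {
      text += input.charAt(i);
    }
  }
  return { text, stripped };
}

// Hypothetical call site: sanitise silently, log only when something changed.
function prepareMessage(raw: string): string {
  const { text, stripped } = stripLoneSurrogates(raw);
  if (stripped > 0) {
    console.warn(`stripped ${stripped} invalid character(s)`);
  }
  return text;
}
```

Walking code units by hand keeps valid pairs (and therefore whole emoji) intact and drops only the orphaned halves. On runtimes that provide `String.prototype.toWellFormed`, that method is an alternative, but it substitutes U+FFFD instead of removing the code unit.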
2. Fix deleteBackward (root cause prevention):
#140 fixed moveLeft/moveRight to use Intl.Segmenter for grapheme-cluster navigation, but deleteBackward still deletes by single code unit. Pressing backspace on an emoji deletes one surrogate and leaves the other orphaned in the text.
Apply the same Intl.Segmenter pattern: delete the entire last grapheme cluster, not just one code unit.
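A rough sketch of what a grapheme-aware deleteBackward could look like follows. The editor's real state shape and function signature are not shown in this issue, so `EditorState` (a text buffer plus a cursor expressed as a UTF-16 code unit offset) is an assumption, as is returning a new state rather than mutating in place.

```typescript
// Assumed state shape: the real editor's types are not shown in the issue.
interface EditorState {
  text: string;
  cursor: number; // UTF-16 code unit offset into text
}

const graphemeSegmenter = new Intl.Segmenter(undefined, { granularity: "grapheme" });

function deleteBackward(state: EditorState): EditorState {
  const end = Math.min(state.cursor, state.text.length);
  if (end <= 0) {
    return state; // nothing before the cursor to delete
  }
  const before = state.text.slice(0, end);
  // Locate the grapheme cluster that contains the last code unit before the cursor.
  const last = graphemeSegmenter.segment(before).containing(end - 1);
  const start = last ? last.index : end - 1;
  return {
    text: state.text.slice(0, start) + state.text.slice(end),
    cursor: start,
  };
}
```

Because the whole cluster is removed, backspacing over an emoji deletes both surrogates together, and multi-code-point sequences such as an emoji with a skin-tone modifier go in one keypress.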
Both layers are needed: layer 1 catches any future source of lone surrogates (paste, external input, other editor operations), layer 2 prevents the most common way to create them.
Related