fix: validate Unicode codepoints in utf8_encode() by hobostay · Pull Request #4 · ghostty-org/ghostling

hobostay · 2026-03-23T07:13:11Z

Summary

Add validation to utf8_encode() to ensure codepoints are within valid Unicode range (U+0000 to U+10FFFF)
Replace invalid codepoints with Unicode replacement character U+FFFD
Prevents generation of malformed UTF-8 sequences

Details

The Unicode standard defines the maximum valid codepoint as U+10FFFF, and RFC 3629 restricts UTF-8 to that range. The current utf8_encode() function accepts any 32-bit value >= 0x10000 and encodes it as a 4-byte sequence, which produces invalid UTF-8 for values > 0x10FFFF.

This fix validates the input codepoint and replaces out-of-range values with U+FFFD () before encoding, ensuring the output is always valid UTF-8.
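The validation step described above can be sketched roughly as follows. This is a minimal illustration, not the merged code: the function name `utf8_encode_checked`, its signature, and the two macro names are assumptions made for the example, since the actual utf8_encode() signature is not shown in this PR description.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the patch. */
#define UTF8_MAX_CODEPOINT 0x10FFFFu
#define UTF8_REPLACEMENT   0xFFFDu

/*
 * Encode one codepoint into out (which must hold at least 4 bytes) and
 * return the number of bytes written. Values above U+10FFFF are replaced
 * with U+FFFD before encoding, so the 4-byte branch never sees them.
 */
size_t utf8_encode_checked(uint32_t cp, uint8_t out[4])
{
    if (cp > UTF8_MAX_CODEPOINT) {
        cp = UTF8_REPLACEMENT;
    }

    if (cp < 0x80) {
        /* 1 byte: 0xxxxxxx */
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {
        /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
    out[0] = (uint8_t)(0xF0 | (cp >> 18));
    out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (uint8_t)(0x80 | (cp & 0x3F));
    return 4;
}
```

With this shape, U+10FFFF still encodes as F4 8F BF BF, while 0x110000 and anything larger come out as the three bytes EF BF BD. The sketch mirrors only the upper-bound check described here; RFC 3629 also prohibits the surrogate range U+D800 to U+DFFF, which this example does not handle.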

Test plan

Code compiles without warnings
Follows RFC 3629 UTF-8 encoding rules
Maintains backward compatibility for valid codepoints
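One way the backward-compatibility item could be checked mechanically is to compare an unvalidated encoder against a validated one over every codepoint up to U+10FFFF. The sketch below is an assumption-laden illustration: `encode_raw` is modeled on the old behavior as described in Details, `encode_checked` stands in for the patched function, and neither name nor signature comes from the repository. In a real check, `encode_raw` would be the pre-patch function rather than a shared helper.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the pre-patch encoder: no range validation. */
size_t encode_raw(uint32_t cp, uint8_t out[4])
{
    if (cp < 0x80) {
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (uint8_t)(0xF0 | (cp >> 18));
    out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (uint8_t)(0x80 | (cp & 0x3F));
    return 4;
}

/* Hypothetical stand-in for the patched encoder: clamp, then encode. */
size_t encode_checked(uint32_t cp, uint8_t out[4])
{
    return encode_raw(cp > 0x10FFFFu ? 0xFFFDu : cp, out);
}

/* Count codepoints in U+0000..U+10FFFF whose two encodings differ. */
uint32_t count_mismatches(void)
{
    uint32_t mismatches = 0;
    for (uint32_t cp = 0; cp <= 0x10FFFFu; cp++) {
        uint8_t a[4] = {0};
        uint8_t b[4] = {0};
        size_t na = encode_raw(cp, a);
        size_t nb = encode_checked(cp, b);
        if (na != nb || memcmp(a, b, na) != 0) {
            mismatches++;
        }
    }
    return mismatches;
}
```

A result of zero from `count_mismatches()` would indicate that the validation leaves every in-range codepoint untouched, which is the property the test plan item describes.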

🤖 Generated with Claude Code

The Unicode standard defines the maximum valid codepoint as U+10FFFF. Encoding a value above this limit produces a malformed UTF-8 sequence. This patch adds validation that replaces out-of-range codepoints with the Unicode replacement character U+FFFD. This follows RFC 3629, which restricts UTF-8 to the range U+0000 to U+10FFFF so that it matches the range reachable by UTF-16 and stays consistent with the Unicode standard.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

mitchellh merged commit d6e707a into ghostty-org:main Mar 23, 2026
2 checks passed

mitchellh merged 1 commit into ghostty-org:main from hobostay:fix/utf8-codepoint-validation
