Cursor navigation skips inside emoji and multi-codepoint characters #138

@bananabot9000

Symptoms

When navigating through text containing emoji (e.g. 🍌) or other multi-codepoint characters using arrow keys:

  1. It takes two right-arrow presses to move past a single emoji
  2. On the second press, the cursor disappears -- it's positioned "inside" the character where the terminal can't render it
  3. Same behaviour in reverse with left-arrow

Reproduction:

  1. Type hello 🍌 世界 in the editor
  2. Use left-arrow to move back through the text
  3. Observe: cursor jumps into the middle of the emoji and becomes invisible for one keypress

Expected: One arrow press moves past the entire visible character. Cursor never lands inside a character.

Guidance

The editor cursor moves by UTF-16 code units, which is how JavaScript indexes strings. Emoji like 🍌 are surrogate pairs (2 code units), so '🍌'.length is 2 and the cursor has an intermediate position between the two surrogates that doesn't correspond to any visible character boundary.
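To make the intermediate position concrete, here is a small sketch. The helper name `isInsideSurrogatePair` is made up for illustration and is not part of the editor's code; it only detects the surrogate-pair case, not wider grapheme clusters.

```javascript
// 🍌 is U+1F34C, written as an escape so the code units are unambiguous.
const banana = '\u{1F34C}';
const text = 'a' + banana + 'b';

console.log(banana.length);      // 2 (UTF-16 code units)
console.log([...banana].length); // 1 (code point)

// Hypothetical helper: true when `offset` falls between a high surrogate
// and the low surrogate that follows it.
function isInsideSurrogatePair(str, offset) {
  if (offset <= 0 || offset >= str.length) return false;
  const prev = str.charCodeAt(offset - 1);
  const next = str.charCodeAt(offset);
  return prev >= 0xd800 && prev <= 0xdbff && next >= 0xdc00 && next <= 0xdfff;
}

console.log(isInsideSurrogatePair(text, 1)); // false (before the emoji)
console.log(isInsideSurrogatePair(text, 2)); // true  (the "invisible" position)
console.log(isInsideSurrogatePair(text, 3)); // false (after the emoji)
```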

This is separate from rendering width (#133, #137) -- the string displays correctly at 2 columns. The issue is that cursor movement doesn't skip the full grapheme cluster.

Fix: Use Intl.Segmenter (built into Node 16+, no dependency needed) to identify grapheme cluster boundaries, and move the cursor by whole clusters instead of individual code units:

const segmenter = new Intl.Segmenter('en', { granularity: 'grapheme' });
// segments give you the boundaries to snap the cursor to

This also handles ZWJ sequences (👨‍👩‍👧 = 8 code units, 1 grapheme), flag emoji (🇦🇺 = 4 code units, 1 grapheme), and combining marks.
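As a rough sketch of what "snapping to boundaries" could look like, the snippet below collects the valid cursor offsets for a line and looks up the next or previous one. The function names `graphemeBoundaries`, `nextBoundary`, and `prevBoundary` are hypothetical, not existing editor code.

```javascript
const segmenter = new Intl.Segmenter('en', { granularity: 'grapheme' });

// Returns every code-unit offset where the cursor may legally sit.
function graphemeBoundaries(line) {
  const boundaries = [];
  for (const { index } of segmenter.segment(line)) boundaries.push(index);
  boundaries.push(line.length); // end of line is always a valid position
  return boundaries;
}

// First boundary strictly after `col`, or end of line.
function nextBoundary(line, col) {
  const next = graphemeBoundaries(line).find((b) => b > col);
  return next === undefined ? line.length : next;
}

// Last boundary strictly before `col`, or start of line.
function prevBoundary(line, col) {
  const before = graphemeBoundaries(line).filter((b) => b < col);
  return before.length ? before[before.length - 1] : 0;
}

const family = '\u{1F468}\u200D\u{1F469}\u200D\u{1F467}'; // 👨‍👩‍👧
const flag = '\u{1F1E6}\u{1F1FA}';                        // 🇦🇺
const accented = 'e\u0301';                               // e + combining acute

console.log(family.length, graphemeBoundaries(family));     // 8 [ 0, 8 ]
console.log(flag.length, graphemeBoundaries(flag));         // 4 [ 0, 4 ]
console.log(accented.length, graphemeBoundaries(accented)); // 2 [ 0, 2 ]
```

Recomputing the boundary list on every keypress is linear in the line length, which is probably fine for typical editor lines; caching per line would be an option if it ever shows up in profiling.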

Key files: Editor input handling -- wherever arrow key left/right adjusts editor.cursor.col.
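A minimal sketch of how the arrow-key handlers might change, assuming the editor keeps its text as an array of lines and a cursor of the shape `{ row, col }`. That state shape and the names `moveCursorLeft` / `moveCursorRight` are assumptions for illustration; only `editor.cursor.col` comes from this issue. It uses `Segments.prototype.containing()` to find the cluster under a given code unit, which avoids building a full boundary list.

```javascript
const graphemes = new Intl.Segmenter('en', { granularity: 'grapheme' });

// Assumed state shape: { lines: string[], cursor: { row, col } }.
function moveCursorRight(editor) {
  const line = editor.lines[editor.cursor.row];
  // containing() returns undefined at or past end of line, so col stays put.
  const seg = graphemes.segment(line).containing(editor.cursor.col);
  if (seg) editor.cursor.col = seg.index + seg.segment.length;
}

function moveCursorLeft(editor) {
  const line = editor.lines[editor.cursor.row];
  const col = Math.min(editor.cursor.col, line.length);
  if (col === 0) return;
  // Look up the cluster that owns the code unit just before the cursor
  // and jump to its start.
  const seg = graphemes.segment(line).containing(col - 1);
  editor.cursor.col = seg.index;
}

// Usage: walk left through "hello 🍌 世界" from the end of the line.
const demo = { lines: ['hello \u{1F34C} \u4E16\u754C'], cursor: { row: 0, col: 11 } };
const visited = [];
for (let i = 0; i < 5; i++) {
  moveCursorLeft(demo);
  visited.push(demo.cursor.col);
}
console.log(visited); // [ 10, 9, 8, 6, 5 ]
```

Both functions also recover a cursor that is already stuck inside a cluster (for example `col` 7 above): moving right lands on 8, moving left lands on 6.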

Labels: bug (Something isn't working)