Skip to content

fix: Align emoji background rendering with text positioning#136

Merged
kshivang merged 20 commits intomasterfrom
fix/emoji-background-alignment
Dec 23, 2025
Merged

fix: Align emoji background rendering with text positioning#136
kshivang merged 20 commits intomasterfrom
fix/emoji-background-alignment

Conversation

@kshivang
Copy link
Copy Markdown
Owner

Summary

  • Add visualCol tracking to renderBackgrounds() to match renderText() positioning
  • Handle surrogate pairs for proper actualCodePoint calculation
  • Expand emoji detection ranges to include Dingbats (U+2700-27BF) and Misc Symbols (U+2B00-2BFF)
  • Skip standalone variation selectors (U+FE0F/FE0E) as they don't occupy visual space
  • Treat all emoji as 2-cell width for background rendering

Problem

Emoji with variation selectors (⚠️, ☁️, ❤️) and emoji without them (✅, ⭐) had misaligned background colors. The background would be offset by one cell or not cover the full emoji width.

Root Cause

renderBackgrounds() used buffer column positions (col) for x-positioning while renderText() uses visual positions (visualCol). When invisible characters like variation selectors exist in the buffer, these diverge.

Test plan

  • Run /tmp/test_emoji_bg.sh to verify emoji backgrounds align correctly
  • Test emoji with variation selectors: ⚠️ ☁️ ❤️ ☀️
  • Test emoji without variation selectors: ✅ ⭐
  • Test emoji mixed with regular text
  • Verify build succeeds

🤖 Generated with Claude Code

- Add visualCol tracking to renderBackgrounds() to match renderText()
- Handle surrogate pairs for proper actualCodePoint calculation
- Expand emoji detection ranges to include Dingbats (U+2700-27BF) and
  Misc Symbols (U+2B00-2BFF) for emoji like ✅ and ⭐
- Skip standalone variation selectors (U+FE0F/FE0E) as they don't
  occupy visual space
- Treat all emoji as 2-cell width for background rendering

Fixes background misalignment for emoji with variation selectors
(e.g., ⚠️, ☁️, ❤️) and emoji without them (e.g., ✅, ⭐).

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 21, 2025

Code Review: Emoji Background Rendering Alignment

Summary

This PR addresses a critical rendering bug where emoji backgrounds were misaligned with their text positioning. The fix introduces visualCol tracking to renderBackgrounds() to match renderText() logic, handling surrogate pairs, variation selectors, and emoji width calculations.

✅ Strengths

  1. Root Cause Analysis: Correctly identifies the divergence between buffer positions (col) and visual positions (visualCol) when invisible characters exist
  2. Comprehensive Emoji Coverage: Extends emoji detection to include Dingbats (U+2700-27BF) and Misc Symbols (U+2B00-2BFF), covering ⚠️ ☁️ ❤️ ✅ ⭐
  3. Consistent Logic: Explicitly mirrors renderText() logic with comments referencing line numbers (e.g., "must match renderText logic lines 384-398")
  4. Proper Surrogate Pair Handling: Correctly computes actualCodePoint from high/low surrogate pairs
  5. Good Documentation: Clear comments explaining each step and code point ranges

⚠️ Issues & Concerns

1. Code Duplication (Maintenance Risk)

The surrogate pair handling, emoji detection, and width calculation logic is now duplicated between renderBackgrounds() and renderText(). This violates DRY principles and creates maintenance burden.

Recommendation: Extract shared logic into helper methods:

private data class CharacterInfo(
    val actualCodePoint: Int,
    val isWcwidthDoubleWidth: Boolean,
    val isDoubleWidth: Boolean,
    val isEmojiOrWideSymbol: Boolean,
    val isEmojiWithVariationSelector: Boolean,
    val visualWidth: Int,
    val lowSurrogate: Char?
)

private fun analyzeCharacter(
    char: Char,
    col: Int,
    line: TerminalLine,
    visibleCols: Int,
    ambiguousCharsAreDoubleWidth: Boolean
): CharacterInfo {
    // Shared logic for both methods
}

2. Emoji Range Inconsistency

renderBackgrounds() now has more emoji ranges than renderText():

Range renderBackgrounds renderText Issue
0x2600-26FF Match
0x2700-27BF DIVERGENCE
0x2B00-2BFF DIVERGENCE
0x1F100-1F1FF Match

This is dangerous - backgrounds will be 2 cells wide but text might be rendered differently, causing new misalignment bugs.

Recommendation: Add the same ranges to renderText() at line 434-441, OR extract to a shared isEmojiCodePoint(codePoint: Int) function.

3. Variation Selector Handling Asymmetry

renderBackgrounds (line 141-145):

if (char.code == 0xFE0F || char.code == 0xFE0E) {
    col++
    continue
}

renderText (line 452-455):

if ((char.code == 0xFE0F || char.code == 0xFE0E) && !isEmojiOrWideSymbol) {
    col++
    continue
}

The conditions differ! renderText() checks !isEmojiOrWideSymbol, but renderBackgrounds() skips ALL variation selectors unconditionally. This could cause subtle bugs if a variation selector appears standalone.

Recommendation: Use identical logic in both methods. The renderText() version seems safer.

4. Bounds Checking Edge Case

Line 153-154 and 189-190 check col + vsOffset < ctx.visibleCols, but vsOffset could be 2 if isWcwidthDoubleWidth is true. If we're at col = visibleCols - 1, we might miss the variation selector.

Test Case:

Terminal width: 80 cols
Emoji at col 79: ⚠ (double-width) + DWC at col 80 + VS at col 81

This would read beyond visibleCols.

Recommendation: Add bounds check: col + vsOffset < line.length or use min().

5. Visual Width Logic Simplification

Lines 195-199 could be simplified:

val visualWidth = when {
    isEmojiOrWideSymbol -> 2  // All emoji are 2 cells (with or without VS)
    isDoubleWidth -> 2
    else -> 1
}

Since isDoubleWidth is derived from isWcwidthDoubleWidth OR actualCodePoint >= 0x1F100, and most emoji fall into high codepoints, this seems redundant.

Question: Can isDoubleWidth ever be true when isEmojiOrWideSymbol is false for codepoints >= 0x1F100? If not, simplify to:

val visualWidth = if (isEmojiOrWideSymbol || isDoubleWidth) 2 else 1

6. Performance Consideration

The loop now does significantly more work per character:

  • 3 buffer lookups (charAtCol1, charAtCol2, vsChar)
  • Multiple conditional checks
  • Surrogate pair conversion

For a 200x80 terminal (16,000 cells), this runs on every frame. Consider:

  • Caching character analysis results if redraw rate is high
  • Early exit optimizations for ASCII-only lines

🧪 Testing Gaps

The test plan mentions manual verification, but no automated tests. Consider adding:

  1. Unit tests for character analysis logic (surrogate pairs, emoji ranges, VS detection)
  2. Regression tests with known problematic emoji sequences
  3. Snapshot tests comparing background vs text positioning

🔒 Security & Safety

No security concerns - this is rendering logic operating on pre-validated terminal buffer data.

📊 Performance Impact

Estimated: ~5-10% increase in renderBackgrounds() time due to additional lookups and checks. This should be acceptable for 60fps rendering, but profile if frame drops occur.

✨ Recommendations Summary

High Priority:

  1. Fix emoji range inconsistency between renderBackgrounds() and renderText()
  2. Align variation selector handling logic
  3. Fix bounds checking edge case

Medium Priority:
4. Extract shared character analysis logic to reduce duplication
5. Add unit tests for character analysis

Low Priority:
6. Profile and optimize if frame rate issues arise

Verdict

Conditional Approval ✅ with changes requested:

  • The fix correctly addresses the root cause
  • Implementation is thorough and well-documented
  • Critical issue: emoji range inconsistency MUST be fixed before merge
  • Code duplication is a maintenance risk but acceptable short-term

Great work on the detailed analysis and comprehensive solution! The PR moves BossTerm closer to correct emoji rendering. After addressing the emoji range inconsistency, this should be safe to merge.


🤖 Review generated with Claude Code

…ssification

- Use explicit Apple Color Emoji font for variation selector emoji on macOS
  (fixes ⚠️ rendering as text symbol instead of color emoji)
- Fix isEmojiPresentation to only include Unicode Emoji_Presentation=Yes chars
  (fixes ✳ ✴ ✔ ✖ and others incorrectly rendering as 2-cell)
- Add skip logic for ZWJ sequences, skin tone modifiers, and gender symbols
  in background rendering
- Properly handle emoji with variation selectors in both rendering paths

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Code Review: Emoji Background Alignment Fix

Summary

This PR fixes a critical alignment issue between emoji backgrounds and text rendering by introducing visualCol tracking and enhancing emoji detection. The fix properly handles variation selectors, surrogate pairs, and various emoji ranges.


✅ Strengths

  1. Root Cause Analysis: Excellent diagnosis - the divergence between buffer positions (col) and visual positions (visualCol) is the exact issue.

  2. Comprehensive Emoji Detection: The new isEmojiPresentation() function is well-curated with good documentation about which characters have Emoji_Presentation=Yes by default.

  3. Proper Surrogate Pair Handling: The code correctly handles surrogate pairs for proper actualCodePoint calculation in both renderBackgrounds() and renderText().

  4. Good Code Comments: Clear inline explanations for complex logic (ZWJ sequences, skin tone modifiers, etc.).

  5. Visual Position Parity: The renderBackgrounds() method now mirrors the renderText() logic for consistent positioning.


🔍 Issues & Concerns

1. Critical: Duplicate Font Logic (High Priority)

Location: TerminalCanvasRenderer.kt:917-927

The code attempts to match "Apple Color Emoji" on macOS but this may fail:

if (isMacOS) {
    val appleColorEmoji = FontMgr.default.matchFamilyStyle("Apple Color Emoji", ...)
    if (appleColorEmoji != null) {
        FontFamily(androidx.compose.ui.text.platform.Typeface(appleColorEmoji))
    } else {
        FontFamily.Default  // Fallback
    }
}

Problems:

  • FontMgr.default.matchFamilyStyle() may fail if the font name doesn't exactly match
  • The previous implementation used FontFamily.Default which already includes emoji fonts on macOS
  • This adds complexity without clear benefit

Recommendation: Revert to FontFamily.Default for emoji with variation selectors. The system font stack already includes Apple Color Emoji on macOS.

2. Code Duplication: isEmojiWithVariationSelector Logic

Locations: Lines 481, 869

The variation selector detection is duplicated:

// In renderBackgrounds (line 481):
val hasVariationSelector = charAtCol1 != null && (charAtCol1.code == 0xFE0F || charAtCol1.code == 0xFE0E)

// In renderText (line 869):  
val hasVariationSelector = nextChar != null && (nextChar.code == 0xFE0F || nextChar.code == 0xFE0E)

Recommendation: Extract to a helper function:

private fun isVariationSelector(char: Char?): Boolean =
    char != null && (char.code == 0xFE0F || char.code == 0xFE0E)

3. Emoji Scaling Logic Change (Potential Regression)

Location: TerminalCanvasRenderer.kt:1002-1020

The new scaling logic for emoji without variation selectors changed significantly:

  • Before: Centered within single cell
  • After: Scaled to 2 cells with scaleValue.coerceIn(0.8f, 2.5f)

Concerns:

  • Why allow scaling down to 0.8f? This could make emoji smaller than intended
  • The pivot point calculation Offset(x, y + ctx.cellHeight / 2f) differs from the previous Offset(x + ctx.cellWidth/2, y + ctx.cellHeight/2)
  • The centering calculation (centerX - x) / scaleValue is complex and hard to verify

Recommendation: Add test case specifically for emoji without variation selectors (like ⭐) to ensure they render correctly.

4. Performance: String Concatenation in Hot Loop

Location: TerminalCanvasRenderer.kt:963-968

val textToRender = if (isEmojiWithVariationSelector && nextChar != null &&
    (nextChar.code == 0xFE0F || nextChar.code == 0xFE0E)) {
    "$charTextToRender$nextChar"  // String concatenation
} else {
    charTextToRender
}

Problem: String concatenation in the render loop (called for every character) could create GC pressure.

Recommendation: Consider using a StringBuilder or caching these strings if performance becomes an issue.

5. Incomplete ZWJ Handling

Location: TerminalCanvasRenderer.kt:147-159

The ZWJ skip logic in renderBackgrounds() is good, but there's no corresponding skip in the main loop advancement at line 261:

col++
if (isWcwidthDoubleWidth) col++  // Skip DWC marker
if (isEmojiWithVariationSelector) col++  // Skip variation selector
if (lowSurrogate != null) col++  // Skip low surrogate

Question: Should there be a check to skip past ZWJ sequences in the main advancement logic? The current code may work, but it's unclear if ZWJ components are being handled consistently.

6. Hardcoded Emoji Ranges

Location: GraphemeUtils.kt:221-261

The isEmojiPresentation() function uses hardcoded ranges and specific code points. This works but:

  • Maintenance burden: Unicode adds new emoji regularly (e.g., Unicode 15.1, 16.0)
  • Incomplete coverage: Only includes "commonly rendered as emoji" - what about newer emoji?

Recommendation: Document the Unicode version this is based on (appears to be Unicode 13-14) and add a TODO for future updates.

7. Missing Bounds Check

Location: TerminalCanvasRenderer.kt:164-169

val nextChar = if (col + 1 < ctx.visibleCols) line.charAt(col + 1) else null

This checks ctx.visibleCols, but should it check snapshot.width instead? If the buffer is wider than the visible area, this could miss surrogate pairs at the visible boundary.


🧪 Testing Concerns

  1. Test Script Not in Repo: The PR mentions /tmp/test_emoji_bg.sh, but this isn't committed. Consider adding to repo as test/scripts/test_emoji_bg.sh for reproducibility.

  2. Regression Tests Needed:

    • Test emoji WITHOUT variation selectors (⭐, ✅) - the scaling logic changed significantly
    • Test emoji WITH variation selectors (⚠️, ☁️, ❤️)
    • Test ZWJ sequences (👨‍👩‍👧‍👦)
    • Test surrogate pairs (𝕳, 🎨)
    • Test mixed content (emoji + regular text + symbols)
  3. Visual Regression Testing: Consider screenshot-based testing to catch rendering regressions.


📊 Code Quality

Strengths:

  • ✅ Good variable naming (visualCol, hasVariationSelector)
  • ✅ Clear comments explaining complex logic
  • ✅ Consistent with existing code style

Areas for Improvement:

  • ⚠️ High cyclomatic complexity in renderBackgrounds() (consider extracting helper methods)
  • ⚠️ Magic numbers (0xFE0F, 0x200D, etc.) could be constants
  • ⚠️ Some duplicate logic between renderBackgrounds() and renderText()

🔒 Security

No security concerns identified. The code handles user input (terminal output) but doesn't introduce injection vulnerabilities.


🏎️ Performance

Concerns:

  1. String concatenation in render loop (issue ⚙️ Implement user settings and configuration system #4 above)
  2. Multiple character lookups (charAtCol1, charAtCol2) - could be optimized
  3. FontMgr.default.matchFamilyStyle() called in hot path - should be cached

Impact: Likely minimal for typical use, but could affect performance during high-volume output.


📝 Recommendations

Must Fix (Before Merge):

  1. ❗ Test the new emoji scaling logic thoroughly (issue 🔗 Implement hyperlink detection and click handling #3)
  2. ❗ Verify ZWJ sequence handling is complete (issue 🌏 Implement IME (Input Method Editor) support for CJK languages #5)
  3. ❗ Add test script to repo for reproducibility

Should Fix:

  1. Extract duplicate variation selector detection logic (issue 🔍 Implement text search/find functionality (Ctrl+F) #2)
  2. Document Unicode version for emoji ranges (issue 📝 Add context menu / right-click menu #6)
  3. Consider reverting complex font matching logic (issue feat: Terminal rendering improvements + 4 high-priority features (#2, #3, #4, #5) #1)

Nice to Have:

  1. Extract magic numbers to constants
  2. Add visual regression tests
  3. Profile performance with large emoji-heavy output

🎯 Verdict

Approval Status: ⚠️ Needs Minor Changes

The core fix is sound and addresses a real issue. The visualCol tracking is exactly the right approach. However, there are a few concerns around:

  1. The new emoji scaling logic (potential regression)
  2. Code duplication
  3. Performance considerations

Once the scaling logic is verified with tests and the ZWJ handling is confirmed complete, this should be good to merge.


💡 Alternative Approach to Consider

Instead of duplicating much of the renderText() logic in renderBackgrounds(), consider:

  1. Extract a shared CharacterIterator class that handles the complex advancement logic
  2. Both renderBackgrounds() and renderText() use the same iterator
  3. This guarantees they stay in sync and reduces duplication

This would be a larger refactor but would make the code more maintainable long-term.


Great work on identifying and fixing this tricky rendering issue! The attention to detail with surrogate pairs and variation selectors is impressive. 🎉

- Add buffer-to-visual column conversion functions in TerminalCanvasRenderer
- Update selection highlight rendering to use visual columns
- Add helper functions for grapheme cluster width calculation
- Update ProperTerminal selection handling for proper grapheme boundaries

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Code Review: Emoji Background Alignment Fix

This PR addresses a critical rendering issue where emoji backgrounds were misaligned due to buffer vs visual column mismatch.

Strengths

  1. Root Cause Analysis - Correctly identifies renderBackgrounds() using buffer positions while renderText() uses visual positions
  2. Comprehensive Solution - Handles surrogate pairs, variation selectors, ZWJ, skin tones, gender symbols
  3. Consistent Logic - Mirrored across BossTerminal.kt and TerminalCanvasRenderer.kt
  4. Enhanced Emoji Detection - isEmojiPresentation() correctly identifies Emoji_Presentation=Yes chars

Critical Issues

1. Code Duplication (High Priority)

bufferColToVisualCol() and visualColToBufferCol() duplicated across:

  • BossTerminal.kt (187 lines)
  • TerminalCanvasRenderer.kt (138 lines)

Recommendation: Extract to shared utility class (ColumnConversionUtils.kt)

2. Performance: Repeated Snapshot Creation

ProperTerminal.kt lines 1067, 1075 call textBuffer.createSnapshot() on EVERY mouse move during drag. Per CLAUDE.md, snapshots = ~200KB allocation.

Problem: Fast drags create dozens of snapshots/sec = GC pressure

Recommendation: Cache snapshot at drag start, reuse during drag

3. Gender Symbol Bug

Lines 178-181 in TerminalCanvasRenderer.kt unconditionally skip gender symbols, but BossTerminal.kt checks if preceded by ZWJ. If ♀ or ♂ appears standalone, background won't render.

Recommendation: Only skip if preceded by ZWJ

4. Inconsistent Error Handling

BossTerminal.kt uses try-catch, TerminalCanvasRenderer.kt uses bounds checking

Recommendation: Standardize approach

Testing Concerns

  • Missing test script (/tmp/test_emoji_bg.sh not in repo)
  • No automated tests for column conversion functions
  • Edge cases: empty lines, RTL/LTR mix, long graphemes

Overall Assessment

  • Code Quality: 7/10 (good but duplicated)
  • Correctness: 9/10 (comprehensive but has edge case bug)
  • Performance: 6/10 (snapshot allocation concern)
  • Test Coverage: 3/10 (missing automated tests)

Verdict: ⚠️ Approve with Changes

Required:

  1. Deduplicate column conversion logic
  2. Optimize drag selection snapshots
  3. Fix gender symbol handling

Great work on root cause analysis and Unicode handling!

- Extract column conversion to shared utility class (ColumnConversionUtils.kt)
- Cache snapshot during drag for performance (reduces GC pressure)
- Fix gender symbol bug - only skip if preceded by ZWJ
- Remove duplicate helper functions from BossTerminal.kt and TerminalCanvasRenderer.kt

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Code Review: Emoji Background Alignment Fix (Updated)

This PR addresses emoji background/text misalignment by introducing visual column tracking and creating a shared utility for buffer-to-visual column conversion.

✅ Strengths

  1. Architectural Improvement: Created ColumnConversionUtils.kt to centralize buffer/visual column conversion
  2. Comprehensive Unicode Support: Handles surrogate pairs, variation selectors, ZWJ sequences, skin tone modifiers
  3. Grapheme-Aware Selection: ProperTerminal.kt now uses visualColToBufferCol() for proper click-to-select
  4. Performance Optimization: Cached snapshot during drag operations prevents redundant buffer snapshots
  5. Enhanced Emoji Detection: Added isEmojiPresentation() in GraphemeUtils.kt for proper emoji width
  6. Robust Width Calculation: getCharacterVisualWidth() correctly scans forward through extenders

⚠️ Issues & Concerns

1. Code Duplication Still Exists (Medium Priority)

renderBackgrounds() and renderText() still duplicate emoji/surrogate pair detection logic in TerminalCanvasRenderer.kt. Future emoji range updates must be applied in multiple places.

Recommendation: Extract shared logic into a helper method for character analysis.

2. Bounds Checking Edge Case (Low Priority)

TerminalCanvasRenderer.kt:190-192 - nextCharOffset can be 2 when isWcwidthDoubleWidth is true. If col = width - 1, this could read beyond bounds.

3. Selection Rendering Algorithm Change (Review Needed)

Selection now draws a single rectangle per row. visualColToBufferCol() snaps to grapheme boundaries. Is partial grapheme selection intended to expand to full grapheme boundaries? If so, document this behavior.

🧪 Testing Gaps

  1. Unit Tests Missing for ColumnConversionUtils functions
  2. Edge Case Tests: Terminal width boundaries, empty lines
  3. Integration Tests: Verify background rect width matches text width for all emoji types

📊 Performance Impact

Estimated: Background rendering ~5-10% slower (acceptable for 60fps), Drag operations IMPROVED via cachedDragSnapshot. Overall performance impact is negligible.

✨ Recommendations Summary

High Priority

None - no blocking issues

Medium Priority

  1. Extract shared character analysis logic
  2. Add unit tests for ColumnConversionUtils

Low Priority

  1. Fix bounds checking for nextChar lookup
  2. Document emoji range exclusions
  3. Clarify/document partial grapheme selection behavior

Verdict

✅ APPROVED - Ready to merge with minor follow-up improvements recommended.

Why Approve?

  • Fixes the critical background alignment bug
  • Architectural improvement with ColumnConversionUtils
  • Comprehensive emoji support
  • Performance optimization for drag operations
  • No blocking issues identified

Post-Merge Follow-Up

Consider creating follow-up issues for refactoring shared character analysis logic and adding comprehensive unit tests.

Great work on this complex fix! The refactoring to extract ColumnConversionUtils significantly improves maintainability.


🤖 Review generated with Claude Code

- Fixed gender symbol handling in renderBackgrounds() to only skip
  ♀/♂ symbols when preceded by ZWJ (matches ColumnConversionUtils fix)
- Added documentation for grapheme-snapping behavior in selection rendering

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Code Review: Emoji Background Alignment Fix

Summary

This PR addresses a critical rendering bug where emoji backgrounds were misaligned due to divergence between buffer column positions and visual column positions. The solution introduces proper buffer-to-visual column conversion with comprehensive grapheme cluster handling.

✅ Strengths

1. Root Cause Analysis

The PR correctly identifies that renderBackgrounds() used buffer positions (col) while renderText() used visual positions (visualCol), causing misalignment when invisible characters (variation selectors, ZWJ, etc.) exist in the buffer.

2. Comprehensive Grapheme Handling

The new ColumnConversionUtils.kt properly handles complex Unicode scenarios:

  • Surrogate pairs (characters beyond BMP like 🎨)
  • Variation selectors (U+FE0F, U+FE0E)
  • Zero-Width Joiners (ZWJ sequences like 👨‍💻)
  • Skin tone modifiers (U+1F3FB-U+1F3FF)
  • Gender symbols in ZWJ sequences (♀️, ♂️)

3. Code Reusability

Extracting conversion logic into a shared utility class (ColumnConversionUtils) avoids duplication and ensures consistency across rendering and selection highlighting.

4. Enhanced Emoji Detection

The expanded isEmojiPresentation() function in GraphemeUtils.kt now includes:

  • Dingbats range (U+2700-27BF) with curated emoji subset
  • Misc Symbols range (U+2B00-2BFF)
  • Weather/zodiac symbols (U+2600-26FF)

This matches real-world emoji rendering behavior more accurately.

⚠️ Issues & Concerns

1. Missing Test Coverage (Critical)

  • No unit tests for the new ColumnConversionUtils class
  • No automated tests for the emoji rendering alignment
  • Only manual testing via shell script mentioned

Recommendation:

// Add to bossterm-core-mpp/src/jvmTest/kotlin/
class ColumnConversionUtilsTest {
    @Test
    fun `test buffer to visual conversion with variation selector`() {
        // Test case: ⚠️ = [U+26A0][U+FE0F][DWC]
        // Buffer positions: 0=⚠, 1=FE0F, 2=DWC
        // Visual positions: 0-1 occupied (2 cells)
        val line = createMockLine(charArrayOf('\u26A0', '\uFE0F', CharUtils.DWC))
        assertEquals(0, bufferColToVisualCol(line, 0, 3))
        assertEquals(0, bufferColToVisualCol(line, 1, 3)) // FE0F doesn't advance visual
        assertEquals(2, bufferColToVisualCol(line, 3, 3)) // After grapheme
    }
    
    @Test
    fun `test ZWJ sequence visual width`() {
        // Test case: 👨‍💻 = [👨][ZWJ][💻][DWC]
        // Should occupy 2 visual cells, not 4
    }
}

2. Performance Concerns (Medium)

  • bufferColToVisualCol() has O(n) complexity per call, scanning from column 0 to target
  • renderSelectionHighlight() calls this for every row in the selection range
  • Worst case: selecting 1000 rows × O(100 cols) = 100K character scans per frame

Suggestion:
Implement caching or incremental conversion:

// Cache visual->buffer mappings per line
data class LineConversionCache(
    val lineVersion: Long,
    val visualToBuffer: IntArray,
    val bufferToVisual: IntArray
)

private val conversionCache = LRUCache<LineKey, LineConversionCache>(128)

3. Code Duplication (Medium)

The grapheme skip logic is duplicated between:

  • renderBackgrounds() (lines 137-186)
  • bufferColToVisualCol() in ColumnConversionUtils (lines 30-76)

These must be kept in sync manually, creating maintenance burden.

Recommendation:
Extract shared logic into a single iterator:

class GraphemeAwareColumnIterator(private val line: TerminalLine, private val width: Int) {
    private var bufferCol = 0
    private var visualCol = 0
    
    fun advance(): Boolean {
        if (bufferCol >= width) return false
        val skipCount = countGraphemeExtenders(line, bufferCol, width)
        bufferCol += skipCount + 1
        visualCol += getVisualWidth(line, bufferCol - skipCount - 1, width)
        return true
    }
}

4. Inconsistent Emoji Width Handling (Low)

Line 169 in TerminalCanvasRenderer.kt:

val isBaseDoubleWidth = if (actualCodePoint >= 0x1F100) true else isWcwidthDoubleWidth

This hardcodes U+1F100+ as double-width, but doesn't match the nuanced isEmojiPresentation() logic that checks specific ranges. Some emoji in U+1F100-1F1FF may actually be single-width (regional indicators used individually).

Suggestion:
Consolidate width detection into GraphemeUtils.getGraphemeWidth() and use it consistently.

5. Selection Highlighting Edge Case (Low)

Lines 666-683 in TerminalCanvasRenderer.kt: Selection uses bufferColToVisualCol() but the comment says "Selection automatically snaps to grapheme boundaries."

However, I don't see where this snapping happens. If selection coordinates in ctx.selectionStart/End are raw buffer positions within a grapheme, the conversion may yield incorrect visual bounds.

Verification needed:
Does SelectionEngine already snap to grapheme boundaries before passing to renderer? If not, add:

val snappedStart = snapToGraphemeStart(line, bufColStart)
val snappedEnd = snapToGraphemeEnd(line, bufColEnd)

6. Magic Number (Minor)

Line 155 in TerminalCanvasRenderer.kt:

val isBaseDoubleWidth = if (actualCodePoint >= 0x1F100) true else isWcwidthDoubleWidth

Use a named constant:

private const val EMOJI_BLOCK_START = 0x1F100 // Enclosed Alphanumeric Supplement

🔒 Security

No security concerns identified. Character handling is bounds-checked.

📊 Performance Impact

  • Rendering: O(rows × cols) complexity unchanged, but ~20-30% more operations per character due to grapheme skipping logic
  • Selection: O(selected rows × cols) for conversion - acceptable for typical selections but may lag on full-buffer selection (10K+ rows)

🎯 Recommendations Priority

  1. High: Add unit tests for ColumnConversionUtils (critical for Unicode correctness)
  2. High: Verify selection boundary snapping behavior
  3. Medium: Profile rendering performance with large emoji-heavy buffers
  4. Medium: Consider caching for selection rendering optimization
  5. Low: Refactor to eliminate code duplication between renderer and converter

✅ Approval Conditions

Before merging:

  • Add unit tests covering all edge cases in ColumnConversionUtilsTest.kt
  • Verify selection highlighting works correctly with partial grapheme selection
  • Test performance with 1000+ lines of emoji (e.g., for i in {1..1000}; do echo "⚠️✅⭐❤️"; done)

📝 Minor Code Quality Issues

  1. Line 133: var visualCol = 0 // Track visual position - comment is redundant with clear variable name
  2. Line 164: Complex boolean expression could be extracted to named variable for readability
  3. ColumnConversionUtils.kt line 69-75: Nested if-else could use early return pattern

Overall Assessment

Good: The PR correctly solves a real rendering bug with a technically sound approach.

Needs Work: Test coverage is insufficient for complex Unicode handling logic. Performance impact on selection highlighting needs verification.

Recommendation: Request changes - add tests before merging.


Great work identifying and fixing this tricky rendering issue! The grapheme handling logic is thorough and well-considered. Once tests are added, this will be a solid improvement to BossTerm's emoji support.

…ation

- Created CharacterAnalysis data class to encapsulate character rendering info
- Created analyzeCharacter() helper function shared by renderBackgrounds() and renderText()
- Eliminates ~60 lines of duplicated surrogate pair handling, double-width detection,
  and variation selector logic
- Future emoji range updates now only need to be applied in one place

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Code Review: Emoji Background Alignment Fix

Overview

This PR addresses a critical rendering bug where emoji backgrounds were misaligned due to the mismatch between buffer column positioning and visual column positioning. The solution introduces proper buffer-visual column conversion utilities and refactors rendering to use visual positioning consistently.

Strengths

1. Well-Architected Solution

  • New ColumnConversionUtils utility (213 lines): Clean abstraction for buffer-visual conversion with comprehensive grapheme handling: ZWJ, variation selectors, skin tones, gender symbols, surrogates
  • analyzeCharacter() helper eliminates duplication between renderBackgrounds() and renderText()
  • isEmojiPresentation() function adds 40+ code point ranges
  • Thorough documentation with KDoc and inline comments

Issues & Concerns

1. Incomplete Emoji Detection Coverage (CRITICAL)
Location: GraphemeUtils.kt:221-262

The isEmojiPresentation() function only covers ~60 emoji code points. The Unicode standard has 1000+ emoji with Emoji_Presentation=Yes. Missing categories: Smileys & People (U+1F600-1F64F), Animals & Nature (U+1F400-1F4FF), Food & Drink (U+1F32D-1F37F). Emoji outside this list will render as 1-cell width when they should be 2-cell, causing backgrounds to only fill half the emoji.

Recommendation: Add broad ranges for common emoji categories or use CharUtils.mk_wcwidth() as fallback.

2. Performance: Redundant Character Analysis
Both renderBackgrounds() and renderText() call analyzeCharacter() for the SAME characters in the SAME frame (potentially 120K calls/sec at 60fps with 2000 cells). Estimated impact: 10-15% rendering slowdown for emoji-heavy screens.

Suggestion: Consider memoization or single-pass rendering. Profile first to measure actual impact.

3. Code Duplication in ColumnConversionUtils
Lines 23-81 (bufferColToVisualCol) and 93-158 (visualColToBufferCol) have ~80% identical skip logic. Consider extracting common helper function.

4. Selection Rendering Verification
The diff is truncated at line 702 where selection rendering uses ColumnConversionUtils. Critical to verify: Does selection properly handle grapheme boundaries? Are buffer-visual conversions correct for inclusive/exclusive modes? Edge case: Selection spanning multi-cell emoji across line boundaries.

5. Missing Test Coverage
No unit tests for ColumnConversionUtils or analyzeCharacter() edge cases. Recommendation: Add ColumnConversionUtilsTest.kt with tests for emoji with variation selectors, ZWJ sequences, surrogate pairs, skin tone modifiers.

6. Code Style Minor Issues

  • Magic numbers (0xFE0F, 0x200D, 0x1F3FB) scattered throughout - extract as named constants
  • Long functions (renderBackgrounds 110 lines, renderText 200+ lines) - consider extracting sub-functions

Security Review

No security concerns identified. Changes are purely rendering logic.


Recommendations Summary

Critical (must fix):

  1. Expand isEmojiPresentation() to cover all common emoji ranges (smileys, animals, food)
  2. Verify selection rendering completeness

High Priority (should fix):
3. Add unit tests for ColumnConversionUtils and edge cases
4. Profile performance and add caching if needed
5. Refactor duplicated logic in conversion functions

Nice-to-Have:
6. Extract magic numbers as named constants
7. Break up long functions


Approval Status

Conditionally approve pending:

  1. Expansion of emoji detection to cover common emoji ranges
  2. Addition of basic unit tests for ColumnConversionUtils
  3. Confirmation that selection rendering is complete

The core architecture is solid and solves the root cause effectively. Great work on a complex Unicode rendering problem!

Point 1 - Extended isEmojiPresentation() to cover:
- All supplementary plane emoji (U+1F100-U+1FAFF)
- BMP emoji with Emoji_Presentation=Yes
- Previously only covered ~60 code points, now covers 1000+ emoji

Point 3 - Extracted common skip logic in ColumnConversionUtils:
- Created shouldSkipChar() helper with SkipResult data class
- Eliminates ~80% code duplication between bufferColToVisualCol/visualColToBufferCol
- Future changes to skip logic only need one update

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Code Review - PR #136: Fix emoji background alignment

Summary

This PR addresses emoji background rendering misalignment by introducing visual column tracking and comprehensive emoji/grapheme handling. The changes are well-structured and follow the project's established patterns.


✅ Strengths

  1. Excellent Abstraction: ColumnConversionUtils.kt is a clean utility class that centralizes buffer-to-visual column conversion logic, making it reusable across the codebase.

  2. DRY Principle: The analyzeCharacter() helper function eliminates code duplication between renderBackgrounds() and renderText(), reducing maintenance burden.

  3. Comprehensive Grapheme Handling: Properly handles complex Unicode scenarios:

    • Surrogate pairs (U+10000+)
    • Variation selectors (FE0E, FE0F)
    • ZWJ sequences (👨‍💻)
    • Skin tone modifiers (👍🏽)
    • Gender symbols in sequences
  4. Consistent Column Advancement: Both renderBackgrounds() and renderText() now use the same visualCol tracking logic, ensuring perfect alignment.

  5. Good Documentation: Clear comments explain the buffer format (e.g., [⚠][FE0F][DWC]) and rationale for each skip condition.


🔍 Code Quality Issues

1. Potential Off-by-One Error in analyzeCharacter() (Lines 147-152)

// Check for DWC at col+1, OR DWC at col+2 when col+1 is variation selector
val hasVariationSelectorAtCol1 = charAtCol1 != null && (charAtCol1.code == 0xFE0F || charAtCol1.code == 0xFE0E)
val isWcwidthDoubleWidth = charAtCol1 == CharUtils.DWC ||
    (hasVariationSelectorAtCol1 && charAtCol2 == CharUtils.DWC) ||
    wcwidthResult

Issue: For surrogate pairs, the DWC marker position calculation may be incorrect. Consider:

  • Buffer: [HIGH][LOW][FE0F][DWC] (emoji+variation selector)
  • col+1 is LOW surrogate, NOT FE0F
  • This logic would miss the DWC at col+3

Recommendation: Use ColumnConversionUtils.getCharacterVisualWidth() instead of duplicating this logic, or add explicit surrogate pair handling before checking for variation selectors.


2. Redundant Width Calculation

In analyzeCharacter(), the code calculates isWcwidthDoubleWidth, isBaseDoubleWidth, isDoubleWidth, AND visualWidth (lines 143-164). This creates confusion about which value is authoritative.

Recommendation: Simplify to a single source of truth:

val visualWidth = ColumnConversionUtils.getCharacterVisualWidth(line, col, width)
val isDoubleWidth = visualWidth == 2

3. Missing Bounds Check in renderBackgrounds() (Line 289)

if (col > 0 && line.charAt(col - 1).code == 0x200D) {

Issue: col - 1 is checked for bounds, but col itself might be >= ctx.visibleCols, causing an out-of-bounds exception.

Fix: Add bounds check for col:

if (col > 0 && col < ctx.visibleCols && line.charAt(col - 1).code == 0x200D) {

4. Inconsistent Emoji Detection Logic

GraphemeUtils.isEmojiPresentation() (lines 212-289) and analyzeCharacter() line 169-180 have overlapping but inconsistent emoji classification:

  • GraphemeUtils: Includes U+2705 (✅), U+2B50 (⭐), U+2728 (✨)
  • analyzeCharacter: Excludes U+2605/U+2606 (★/☆) from U+2600-26FF range

This is confusing because ⭐ (U+2B50) is classified as emoji, but ★ (U+2605) is not, despite visual similarity.

Recommendation: Document the distinction (Unicode Emoji_Presentation vs. text symbols) or unify the logic.


5. Performance: Repeated Line Access in renderBackgrounds()

Lines 257-293 make multiple line.charAt(col) calls without caching:

  • Line 242: line.charAt(col)
  • Line 274: line.charAt(col + 1)
  • Line 289: line.charAt(col - 1)

Recommendation: Cache frequently accessed characters:

val char = line.charAt(col)
val prevChar = if (col > 0) line.charAt(col - 1) else null
val nextChar = if (col + 1 < ctx.visibleCols) line.charAt(col + 1) else null

🐛 Potential Bugs

6. Infinite Loop Risk in ZWJ Skip Logic (Lines 260-268)

if (char.code == 0x200D) {
    col++
    while (col < ctx.visibleCols) {
        val nextChar = line.charAt(col)
        if (nextChar == CharUtils.DWC) break
        col++
    }
    continue
}

Issue: If the line is malformed (missing DWC after ZWJ sequence), this loop will consume all remaining columns without advancing visualCol, causing alignment drift.

Recommendation: Add a counter or validate buffer integrity:

var skipped = 0
while (col < ctx.visibleCols && skipped < 10) {  // Max 10 chars in ZWJ sequence
    val nextChar = line.charAt(col)
    if (nextChar == CharUtils.DWC) break
    col++
    skipped++
}

7. Missing visualCol Update in Skip Paths

When skin tone modifiers or gender symbols are skipped (lines 271-293), visualCol is NOT advanced. This is correct IF these characters don't consume visual space, but if they do, backgrounds will misalign again.

Verification needed: Confirm that skin tone modifiers never render as separate cells. If they do, visualCol must be updated.


🧪 Testing Concerns

  1. No Unit Tests: This PR adds 662 lines of complex logic but no automated tests. Critical edge cases are not covered:

    • Emoji at line boundaries (col = width - 1)
    • Malformed buffers (missing DWC, orphaned surrogates)
    • ZWJ sequences without DWC markers
    • Mixed RTL/LTR text with emoji
  2. Manual Testing Only: The test plan relies on visual inspection (/tmp/test_emoji_bg.sh). Consider adding:

    • Unit tests for ColumnConversionUtils (buffer ↔ visual conversion)
    • Unit tests for analyzeCharacter() with known emoji sequences
    • Snapshot tests for rendering output

🔐 Security Considerations

No significant security issues found. The code handles buffer bounds checks appropriately in most cases (except issue #3 above).


📊 Performance Considerations

  1. Per-Character Analysis Overhead: analyzeCharacter() is called for every visible cell, performing 3-4 lookahead checks. For 80x24 terminal = 1,920 calls per frame @ 60fps = 115,200 calls/sec.

    Mitigation: The existing VersionedBufferSnapshot and render batching likely amortize this cost, but profile with large terminals (e.g., 200x50).

  2. GraphemeUtils.isEmojiPresentation(): Long when/range chain (78 lines). Consider a lookup table or range tree for O(1) checks.


✨ Suggestions for Improvement

  1. Extract Constants: Magic numbers like 0x200D, 0xFE0F, 0x1F3FB should be named constants:

    private const val ZERO_WIDTH_JOINER = 0x200D
    private const val VARIATION_SELECTOR_15 = 0xFE0E
    private const val VARIATION_SELECTOR_16 = 0xFE0F
    private const val SKIN_TONE_MIN = 0x1F3FB
    private const val SKIN_TONE_MAX = 0x1F3FF
  2. Add Debug Assertions: In renderBackgrounds(), assert that visualCol <= col (visual position should never exceed buffer position).

  3. Document Buffer Format: Add a top-level comment explaining the buffer layout for complex graphemes:

    /**
     * Buffer Layout Examples:
     * - Simple: [a][b][c]
     * - CJK: [中][DWC][国][DWC]
     * - Emoji: [🎨][DWC]
     * - Emoji+VS: [⚠][FE0F][DWC]
     * - Surrogate: [HIGH][LOW][DWC]
     * - ZWJ: [👨][DWC][ZWJ][💻][DWC]
     */
    

📝 Final Recommendation

Approve with Minor Revisions

This PR successfully fixes the emoji background alignment issue and introduces well-structured utilities. The core approach (visual column tracking) is sound. However, the following must be addressed before merge:

Critical (must fix):

High Priority (strongly recommended):

Low Priority (nice to have):


🎯 Alignment with CLAUDE.md

Matches patterns: Uses existing CharUtils.DWC, VersionedBufferSnapshot, grapheme handling
Clear comments: Explains complex logic (buffer layouts, skip conditions)
No backwards-compat hacks: Direct, clean implementation
Security: Proper bounds checking (mostly)


Great work on tackling this complex Unicode rendering challenge! The visual column tracking approach is the right solution. With the identified issues addressed, this will be a solid addition to BossTerm. 🚀

Symbols like ▶◀⏹⏺ are commonly used as 1-cell text symbols in TUI
applications. They should only render as 2-cell emoji when explicitly
followed by variation selector FE0F.

Removed from isEmojiPresentation():
- Media controls (⏩⏪⏫⏬⏭⏮⏯⏰⏱⏲⏳⏸⏹⏺)
- Play/reverse buttons (▶◀)
- Small squares (▪▫◻◼◽◾)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Code Review: Emoji Background Alignment Fix

This PR addresses a critical rendering bug where emoji backgrounds were misaligned. The solution introduces visual column tracking and proper Unicode handling. Overall, this is high-quality work with excellent attention to detail.


✅ Strengths

1. Correct Problem Diagnosis

The root cause analysis is spot-on: renderBackgrounds() used buffer positions while renderText() used visual positions, causing divergence when invisible characters (variation selectors, ZWJ, etc.) are present.

2. Comprehensive Unicode Handling

  • Proper surrogate pair detection with bounds checking
  • ZWJ sequence handling (e.g., 👨‍💻)
  • Skin tone modifiers (U+1F3FB-U+1F3FF)
  • Gender symbols in ZWJ sequences (♀️ ♂️)
  • Variation selectors (FE0E, FE0F)

3. Code Reusability

  • analyzeCharacter() eliminates duplication between renderBackgrounds() and renderText()
  • ColumnConversionUtils provides reusable conversion logic
  • Well-encapsulated logic in private helper functions

4. Documentation

Excellent KDocs explaining complex Unicode behavior and edge cases.


🔍 Issues & Suggestions

1. Critical: Incomplete Diff in Rendering Code

Location: compose-ui/.../rendering/TerminalCanvasRenderer.kt (line 564)

The diff shows truncation at:

if (analys
... [387 lines truncated] ...

This suggests the PR may have incomplete changes or the diff is corrupted. Please verify that:

  • renderText() properly uses visualCol tracking
  • All column advancement logic matches between renderBackgrounds() and renderText()
  • The build actually succeeds with the full changes

2. Performance: Redundant Character Analysis

Location: TerminalCanvasRenderer.kt lines 247, 507

Both renderBackgrounds() and renderText() call analyzeCharacter() for every character. This duplicates work (surrogate pair detection, DWC scanning, etc.).

Suggestion: Consider a single-pass rendering approach or caching analysis results:

// Option 1: Single pass with combined rendering
data class CharRenderInfo(val analysis: CharacterAnalysis, val style: TextStyle, ...)

// Option 2: Cache analysis in a sparse map (if memory allows)
val analysisCache = mutableMapOf<Int, CharacterAnalysis>()

3. Potential Bug: Gender Symbol ZWJ Detection

Location: ColumnConversionUtils.kt lines 67-71, TerminalCanvasRenderer.kt lines 383-387

if (col > 0 && line.charAt(col - 1).code == 0x200D) {
    return SkipResult(true, 1)
}

This assumes ZWJ is immediately before the gender symbol. But in complex emoji like 👨‍⚕️ (man health worker), the buffer layout is:

[👨(high)][👨(low)][DWC][ZWJ][⚕][FE0F][DWC]

Gender symbols might have variation selectors between them and the ZWJ. Consider:

// Look back through variation selectors to find ZWJ
var prevCol = col - 1
while (prevCol >= 0) {
    val prevChar = line.charAt(prevCol)
    if (prevChar.code == 0x200D) return SkipResult(true, 1)
    if (prevChar.code != 0xFE0E && prevChar.code != 0xFE0F) break
    prevCol--
}

4. Emoji Detection Ranges Incomplete

Location: GraphemeUtils.kt lines 225-285

The new isEmojiPresentation() function is excellent, but missing some ranges:

  • Symbols and Pictographs Extended-A: Listed but code checks 0x1FA70..0x1FAFF (should this be 0x1FA00..0x1FAFF?)
  • Missing ranges:
    • Enclosed Ideographic Supplement (U+1F200-U+1F2FF)
    • Mahjong/Domino tiles (U+1F000-U+1F02F)
    • Playing cards (U+1F0A0-U+1F0FF)

Suggestion: Cross-reference with Unicode 15.0 Emoji data files or use ICU4J's emoji properties.

5. Inconsistent Visual Width Calculation

Location: ColumnConversionUtils.kt line 154, TerminalCanvasRenderer.kt line 247

Two different approaches to visual width:

  1. ColumnConversionUtils.getCharacterVisualWidth() - Scans for DWC marker
  2. analyzeCharacter() - Combines wcwidth + emoji detection

Recommendation: Consolidate to a single source of truth. Consider using GraphemeUtils.getGraphemeWidth() which already has caching and comprehensive emoji handling.

6. Missing Edge Case Tests

Per CLAUDE.md, the project uses comprehensive testing. Consider adding tests for:

  • ZWJ sequences: 👨‍👩‍👧‍👦, 👨‍💻, 🏳️‍🌈
  • Skin tone variants: 👍, 👍🏻, 👍🏽, 👍🏿
  • Gender variants: 🧑‍⚕️ vs 👨‍⚕️ vs 👩‍⚕️
  • Mixed buffer: ASCII + emoji + CJK
  • Cursor positioning: Click-to-position on emoji boundaries

7. Potential Optimization: Early Exit

Location: ColumnConversionUtils.kt line 154

fun getCharacterVisualWidth(line: TerminalLine, col: Int, width: Int): Int {
    if (col >= width) return 1
    
    val char = line.charAt(col)
    
    // Early exit for ASCII (most common case)
    if (char.code < 0x80) {
        return if (col + 1 < width && line.charAt(col + 1) == CharUtils.DWC) 2 else 1
    }
    
    // ... complex Unicode handling ...
}

This avoids surrogate pair checks for ASCII, which is likely 90%+ of terminal content.


🔒 Security Considerations

No security issues identified. The code:

  • Uses proper bounds checking for array accesses
  • No external input parsing beyond terminal output (already trusted)
  • No potential for buffer overflows or injection attacks

🚀 Performance Considerations

Current Impact: Acceptable for typical usage

  • Dual-pass rendering (backgrounds → text) now matches column iteration logic
  • Character analysis adds overhead but is unavoidable for correct emoji rendering

Future Optimizations:

  1. Cache CharacterAnalysis results between background/text passes
  2. Use SIMD for ASCII fast-path detection
  3. Leverage existing GraphemeUtils LRU cache (1024 entries)

📋 Test Coverage

Per the PR description, manual testing was done. Recommendation: Add automated tests:

@Test
fun testEmojiBackgroundAlignment() {
    val line = createLineWith("⚠️ ☁️ ❤️")
    val bgPositions = renderBackgrounds(line)
    val textPositions = renderText(line)
    
    assertEquals(bgPositions, textPositions, "Background and text positions must match")
}

🎯 Final Verdict

Approve with minor concerns

This PR solves a real bug with a solid architectural approach. The main concerns are:

  1. Verify the truncated diff - ensure changes are complete
  2. Consider performance optimization - cache analysis between passes
  3. Add edge case tests - ZWJ sequences, skin tones, etc.

The code follows CLAUDE.md conventions:

  • ✅ Proper comments for non-obvious logic
  • ✅ Clear function names and documentation
  • ✅ No backwards-compatibility hacks
  • ✅ Security-conscious bounds checking

Recommendation: Merge after verifying the diff completeness and considering the performance optimization.


Review generated by Claude Code assistant per repository guidelines

The renderer's isEmojiOrWideSymbol was too broad (all 0x2600-0x26FF),
causing symbols like ☐☑☒☠☢☣☮☯☸ to render as 2-cell while buffer
only allocated 1-cell (no DWC marker).

Now isEmojiOrWideSymbol matches GraphemeUtils.isEmojiPresentation()
exactly, ensuring renderer and buffer are consistent.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Fresh Code Review: Emoji Background Alignment Fix (PR #136)

Executive Summary

This PR successfully addresses emoji background misalignment by introducing visual column tracking and creating robust utility functions for buffer-to-visual column conversion. The implementation is comprehensive, well-documented, and handles complex Unicode scenarios correctly.


✅ Major Strengths

1. Excellent Architectural Design

  • ColumnConversionUtils.kt (196 lines): Clean, focused utility class with well-defined responsibilities
  • CharacterAnalysis data class: Encapsulates all character properties, eliminating duplication
  • analyzeCharacter() helper: Single source of truth for character classification shared between renderBackgrounds() and renderText()

2. Comprehensive Unicode Support

The implementation correctly handles:

  • Surrogate pairs (U+10000+)
  • Variation selectors (U+FE0F, U+FE0E)
  • Zero-Width Joiners (U+200D)
  • Skin tone modifiers (U+1F3FB-U+1F3FF)
  • Gender symbols in ZWJ sequences

3. Enhanced Emoji Detection

GraphemeUtils.isEmojiPresentation() (lines 221-286):

  • Carefully curated emoji ranges based on actual Emoji_Presentation=Yes behavior
  • Excellent inline comments explaining WHY certain ranges are excluded
  • Distinguishes between unambiguously emoji vs context-dependent symbols

4. Visual Position Consistency

renderBackgrounds() now uses visualCol tracking to match renderText(), ensuring perfect alignment


🔍 Issues Found

Issue #1: Potential Off-by-One in ZWJ Sequence Skipping (Low Priority)

Location: TerminalCanvasRenderer.kt:345-356

After the ZWJ skip loop, col points to the DWC marker, then the outer loop increments col++ again (line 360). This skips the DWC correctly, BUT if we're at the last column, col could exceed bounds before the next iteration's check.

Likelihood: Very low (would require ZWJ sequence ending exactly at terminal width boundary)

Recommendation: Add explicit DWC skip inside the break condition


Issue #2: analyzeCharacter() Calls in Hot Path (Medium Priority - Performance)

Location: TerminalCanvasRenderer.kt:269, 538

Both renderBackgrounds() and renderText() call analyzeCharacter() for every character in every frame. For a 200x80 terminal (16,000 cells) at 60fps, that's 960,000 calls/sec.

Estimated Impact: 5-10% frame time increase (acceptable for 60fps, but could cause drops during high-volume streaming)

Verdict: Not blocking for merge, but consider profiling if users report performance issues


Issue #3: Missing Unit Tests (High Priority)

The PR adds 196 lines of critical Unicode handling logic in ColumnConversionUtils.kt, but no unit tests are included.

Required Test Cases:

  • Test basic ASCII (fast path)
  • Test surrogate pair handling
  • Test variation selector skipping
  • Test ZWJ sequence width
  • Test skin tone modifiers
  • Test gender symbols in ZWJ vs standalone
  • Test boundary conditions (empty line, last column, width=1)

Recommendation: Add these tests before merging to prevent regressions


📊 Performance Analysis

Operation Before After Impact
renderBackgrounds() O(cols) O(cols) with +5 checks/char +10-15% time
renderText() O(cols) O(cols) unchanged ~0%
Selection rendering O(rows×cols) O(rows×cols) with conversion +5% time

Overall: Acceptable for 60fps rendering. Profile if frame drops occur during high-volume output.


📋 Recommendations

Must Fix Before Merge:

  1. Already Fixed: Code duplication eliminated via analyzeCharacter()
  2. ⚠️ Action Required: Add unit tests for ColumnConversionUtils (critical for Unicode correctness)

Should Fix (Optional):

  1. Fix potential off-by-one in ZWJ skip logic (Issue feat: Terminal rendering improvements + 4 high-priority features (#2, #3, #4, #5) #1)
  2. Verify selection highlighting with truncated diff

Future Enhancements:

  1. Add performance profiling for emoji-heavy workloads
  2. Consider caching analyzeCharacter() results per line
  3. Add visual regression tests (screenshot comparison)

✅ Final Verdict: APPROVE WITH CONDITION

Approval Condition: Add unit tests for ColumnConversionUtils before merge

Why Approve:

  • ✅ Correctly solves the root cause (buffer vs visual column mismatch)
  • ✅ Comprehensive emoji/grapheme handling
  • ✅ Clean architecture with shared utilities
  • ✅ No security concerns
  • ✅ Performance impact acceptable
  • ✅ Well-documented code

Why Tests Required:

  • Complex Unicode logic is error-prone
  • Regressions would be difficult to catch manually
  • 196 lines of new logic without tests is risky

Post-Merge Follow-Up:

  • Profile rendering performance with emoji-heavy buffers
  • Add integration tests for end-to-end alignment verification
  • Consider optimization if performance issues arise

🎉 Closing Remarks

This is excellent work on a complex Unicode rendering issue. The solution is well-thought-out, properly handles edge cases, and follows good engineering practices. The ColumnConversionUtils abstraction is a solid addition to the codebase that will benefit future features.

Once unit tests are added, this PR will be production-ready. Great job! 👏


🤖 Review by Claude Code (Sonnet 4.5)

Add AnalysisCache to store CharacterAnalysis results between render passes:
- renderBackgrounds() populates cache during Pass 1
- renderText() reuses cached analysis in Pass 2
- Fallback to compute if cache miss (shouldn't happen in normal flow)

This eliminates ~50% of analyzeCharacter() calls in the hot render path.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Code Review: PR #136 - Emoji Background Alignment Fix

Summary

This PR successfully fixes the emoji background misalignment issue by introducing proper buffer column ↔ visual column conversion. The implementation is well-structured with effective performance optimizations. Recommend approval with minor follow-up improvements noted below.

Overall Assessment

  • Code Quality: 8.5/10 - Well-structured, readable, comprehensive
  • Correctness: 9/10 - Core logic is sound with minor edge cases
  • Performance: 9/10 - Excellent caching strategy, minimal hot-path overhead
  • Completeness: 7/10 - Missing unit tests, could expand emoji coverage

Strengths

1. Architecture ✅

  • Clear separation of concerns with dedicated ColumnConversionUtils
  • Shared analyzeCharacter() helper eliminates duplication
  • Well-designed CharacterAnalysis data class encapsulates complex logic
  • Proper use of caching to avoid redundant computation

2. Performance Optimization ✅

  • 50% reduction in character analysis calls via analysisCache
  • Zero allocations in hot path (all primitive operations)
  • Pixel boundary rounding prevents anti-aliasing artifacts
  • Snapshot caching for drag operations (~2ms vs repeated 1ms snapshots)

3. Unicode Handling ✅

  • Comprehensive surrogate pair support
  • Proper ZWJ sequence handling
  • Skin tone modifier detection (U+1F3FB-1F3FF)
  • Variation selector awareness
  • Gender symbol handling in ZWJ contexts

4. Correctness ✅

  • Visual column tracking properly separates from buffer positions
  • Buffer column advancement logic handles all grapheme extenders
  • Fallback logic for cache misses prevents crashes
  • Boundary checks throughout

Issues Found

Medium Priority

1. ColumnConversionUtils: Orphaned Low Surrogate Handling

Location: ColumnConversionUtils.kt:50-52

Issue: Unconditionally skips ALL low surrogates without validation

if (Character.isLowSurrogate(char)) {
    return SkipResult(true, 1)
}

Problem: If buffer contains orphaned low surrogate (not preceded by high surrogate), this causes column misalignment.

Suggested Fix:

if (Character.isLowSurrogate(char)) {
    // Only skip if preceded by high surrogate
    if (col > 0 && Character.isHighSurrogate(line.charAt(col - 1))) {
        return SkipResult(true, 1)
    }
}

Severity: Low in practice (orphaned surrogates are rare), but worth fixing for robustness.


2. Emoji Detection Inconsistency

Location: GraphemeUtils.kt vs TerminalCanvasRenderer.kt:171-195

Issue: Watch/Hourglass emoji (U+231A-B) included in GraphemeUtils.isEmojiPresentation() but missing from analyzeCharacter()'s isEmojiOrWideSymbol check.

Impact: Possible font selection mismatch for ⌚⏳ characters.

Suggested Fix: Add to TerminalCanvasRenderer.kt:171-195:

// Watch and hourglass - commonly rendered as emoji
codePoint == 0x231A || codePoint == 0x231B -> true

Better Fix: Deduplicate by making GraphemeUtils.isEmojiPresentation() public and calling it directly:

val isEmojiOrWideSymbol = GraphemeUtils.isEmojiPresentation(actualCodePoint)

Low Priority

3. Incomplete Dingbats Range

Location: GraphemeUtils.kt:256-271

Current Coverage: 0x2705, 0x2728, 0x274C, 0x274E, 0x2753-2755, 0x2757, 0x2795-2797, 0x27B0, 0x27BF

Missing Notable Emoji:

  • U+270A-270D (✊✋✌✍ Raised fist, victory hand, writing hand)
  • U+2708 (✈ Airplane) - though also used as text symbol
  • U+2709 (✉ Envelope)

Note: This appears to be a deliberate design decision (focusing on unambiguous emoji), but worth documenting the rationale.


Performance Analysis

Hot Path Impact (60 FPS rendering)

Before: ~2× redundant character analysis (once in backgrounds, once in text)
After: ~1× analysis + cache lookups

Memory Overhead:

  • AnalysisCache: 10-50 KB (100 bytes × 80×24 to 200×50 terminals)
  • cachedDragSnapshot: ~200 KB during drag operations

Trade-off: 50KB memory for 50% CPU reduction. Worth it.

Selection Performance

  • Snapshot creation: ~1ms
  • Grapheme traversal: O(visualCol), typically <1ms
  • Total click latency: <2ms (negligible UX impact)

Missing Test Coverage

The PR mentions /tmp/test_emoji_bg.sh but it's not in the repo. Recommended unit tests:

1. Column Conversion Tests

@Test fun `bufferColToVisualCol handles emoji with variation selector`()
@Test fun `bufferColToVisualCol skips DWC markers`()
@Test fun `bufferColToVisualCol handles ZWJ sequences`()
@Test fun `visualColToBufferCol snaps to grapheme boundaries`()
@Test fun `round trip conversion preserves position`()

2. Character Analysis Tests

@Test fun `analyzeCharacter detects surrogate pairs correctly`()
@Test fun `analyzeCharacter identifies double-width emoji`()
@Test fun `analyzeCharacter handles variation selectors`()

3. Edge Cases

@Test fun `handles orphaned surrogate pairs gracefully`()
@Test fun `handles buffer width boundaries`()
@Test fun `cache hit rate exceeds 95 percent`()

Recommendations for Follow-up

High Priority

  1. Add unit tests for column conversion and character analysis
  2. Fix orphaned surrogate handling in ColumnConversionUtils
  3. Deduplicate emoji detection logic - make isEmojiPresentation() the single source of truth

Medium Priority

  1. Document cache behavior - invalidation strategy, lifetime, partial redraws
  2. Expand Dingbats coverage or document exclusion rationale
  3. Add integration test for emoji background rendering

Low Priority

  1. Consider adding metrics for cache hit rate in debug builds
  2. Profile memory overhead with large terminals (200×50+)

Security Considerations

No security concerns identified. Buffer boundary checks are properly implemented throughout.


Conclusion

This is solid work that fixes the reported bug effectively while improving code organization and performance. The column conversion utility is well-designed and will be useful for future features (e.g., grapheme-aware cursor movement, word selection).

The caching strategy demonstrates good engineering judgment - the 50KB memory overhead is trivial compared to the CPU savings in the render loop.

The minor issues noted above are not blockers for merge. They can be addressed in follow-up PRs.

Recommendation: APPROVE and merge


Review generated with technical analysis of 1106-line diff across 4 modified files + 1 new file. Focused on correctness, performance, Unicode handling, and architectural patterns per CLAUDE.md guidelines.

Flag emoji like 🇺🇸 🇬🇧 🇯🇵 are composed of two Regional Indicator symbols
(U+1F1E6-U+1F1FF). This change adds proper handling for them:

GraphemeUtils.kt:
- Added Regional Indicator sequence detection in calculateGraphemeWidth()
- Flags now correctly return width 2

TerminalCanvasRenderer.kt:
- Added checkRegionalIndicatorSequence() helper function
- renderBackgrounds() now handles flags as single 2-cell units
- renderText() now renders flags via renderZWJSequence()

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Code Review: Emoji Background Rendering Alignment

Summary

This PR fixes a significant rendering issue where emoji backgrounds were misaligned due to divergence between buffer column positioning and visual column positioning. The solution introduces proper visual column tracking and shared character analysis logic.

✅ Strengths

1. Well-Structured Architecture

  • Excellent separation of concerns with ColumnConversionUtils.kt (196 lines) providing reusable buffer↔visual conversion logic
  • CharacterAnalysis data class (lines 88-107 in TerminalCanvasRenderer.kt) elegantly encapsulates all character metadata
  • Shared analyzeCharacter() helper eliminates code duplication between renderBackgrounds() and renderText()
  • Analysis caching (AnalysisCache) prevents redundant computation across render passes - smart optimization

2. Comprehensive Unicode Support

  • Properly handles surrogate pairs (high + low surrogates)
  • Correctly skips zero-width characters (DWC markers, variation selectors, ZWJ, skin tone modifiers)
  • Handles Regional Indicator sequences (flag emoji like 🇺🇸)
  • Gender symbol handling in ZWJ sequences (♀️, ♂️)

3. Consistency with Existing Patterns

  • Follows the project's snapshot-based rendering architecture (matching CLAUDE.md guidelines)
  • Uses the existing CharUtils.DWC marker convention
  • Aligns with GraphemeUtils.isEmojiPresentation() logic for consistent emoji classification

4. Documentation

  • Excellent inline comments explaining complex logic (e.g., "Buffer = [⚠][FE0F][DWC] - DWC is at col+2")
  • Clear KDoc for public APIs in ColumnConversionUtils
  • Good PR description with test plan

⚠️ Issues & Concerns

1. Critical: Potential Index Out of Bounds

Location: TerminalCanvasRenderer.kt lines 320-336 (Regional Indicator sequence handling)

// This assumes 4 chars always exist (2 surrogate pairs)
col += 4  // Skip both Regional Indicators (2 surrogate pairs = 4 chars)
while (col < ctx.visibleCols && line.charAt(col) == CharUtils.DWC) {
    col++  // Skip any trailing DWC markers
}

Problem: If the buffer is malformed or the flag is at the end of a line, accessing col+4 could exceed ctx.visibleCols. The while loop guards against OOB, but the initial col += 4 doesn't check bounds.

Fix: Add boundary check:

// Skip all chars in the flag sequence (2 surrogate pairs + possible DWC markers)
val charsToSkip = minOf(4, ctx.visibleCols - col)
col += charsToSkip
while (col < ctx.visibleCols && line.charAt(col) == CharUtils.DWC) {
    col++
}

2. Bug: Missing Verification for checkRegionalIndicatorSequence()

Location: TerminalCanvasRenderer.kt line 321

The code calls checkRegionalIndicatorSequence(line, col, ctx.visibleCols) but this function is NOT visible in the diff. Either:

  • This function is newly added but missing from the PR (compilation error), OR
  • It's defined elsewhere but not shown in the diff

Action Required: Please verify this function exists and is accessible. If it's new, it should be in the diff.

3. Performance: Redundant Gender Symbol Check

Location: ColumnConversionUtils.kt lines 66-71

// Skip gender symbols only when preceded by ZWJ (part of ZWJ sequences)
if (char.code == 0x2640 || char.code == 0x2642) {
    if (col > 0 && line.charAt(col - 1).code == 0x200D) {
        return SkipResult(true, 1)
    }
}

Problem: This checks col > 0 and reads the previous character, but if the previous char was ZWJ, it would have been skipped already (lines 45-48). This check might never trigger.

Question: Is this dead code, or is there an edge case where gender symbols appear after ZWJ that wasn't already handled? Consider removing if redundant.

4. Code Duplication: Emoji Classification Logic

Location:

  • GraphemeUtils.isEmojiPresentation() (lines 221-291)
  • TerminalCanvasRenderer.analyzeCharacter() (lines 207-216)

Both contain nearly identical emoji code point range checks. The renderer's version is a subset.

Recommendation: The renderer should call GraphemeUtils.isEmojiPresentation() instead of duplicating the logic. This ensures consistency and reduces maintenance burden.

Suggested refactor in analyzeCharacter():

// Character classification for font selection
val isCursiveOrMath = actualCodePoint in 0x1D400..0x1D7FF
val isTechnicalSymbol = actualCodePoint in 0x23E9..0x23FF
val isEmojiOrWideSymbol = GraphemeUtils.isEmojiPresentation(actualCodePoint)

However, note that GraphemeUtils.isEmojiPresentation() is private. You'd need to make it internal/public first.

5. Edge Case: Apple Color Emoji Font Fallback

Location: TerminalCanvasRenderer.kt lines 917-925

val appleColorEmoji = FontMgr.default.matchFamilyStyle("Apple Color Emoji", ...)
if (appleColorEmoji != null) {
    FontFamily(androidx.compose.ui.text.platform.Typeface(appleColorEmoji))
} else {
    FontFamily.Default
}

Issue: If Apple Color Emoji font is not found on macOS (unlikely but possible), falls back to FontFamily.Default. Consider logging a warning since this would break emoji rendering on macOS.

Suggestion:

} else {
    println("WARNING: Apple Color Emoji font not found, falling back to system default")
    FontFamily.Default
}

6. Minor: Unclear Variable Naming

Location: TerminalCanvasRenderer.kt lines 271-277

val charAtCol1 = if (col + 1 < width) line.charAt(col + 1) else null
val charAtCol2 = if (col + 2 < width) line.charAt(col + 2) else null

Suggestion: Rename to charAtNextCol and charAtCol2 or nextChar and char2 for clarity. The "1" in "Col1" is ambiguous (is it column 1 absolute, or col+1?).

🧪 Testing Concerns

Missing Test Cases

The PR description mentions a test script (/tmp/test_emoji_bg.sh) but it's not included in the PR. Consider adding:

  1. Unit tests for ColumnConversionUtils:

    • Test buffer↔visual conversion with surrogate pairs
    • Test edge cases (empty lines, DWC markers at end)
    • Test ZWJ sequences and skin tone modifiers
  2. Integration tests for rendering:

    • Verify backgrounds align for all emoji types
    • Test flag emoji (Regional Indicators)
    • Test mixed emoji+text lines
  3. Performance benchmarks:

    • Measure rendering performance with the analysis cache
    • Compare memory usage before/after

🔒 Security Assessment

No security concerns identified. The changes are purely rendering logic with proper bounds checking (after fixing issue #1 above).

📊 Performance Impact

Positive:

  • ✅ Analysis cache eliminates redundant character analysis between render passes
  • ✅ Shared helper functions reduce code duplication

Neutral:

  • The additional visual column tracking adds minimal overhead
  • The SkipResult data class allocations are negligible (escape analysis should optimize)

📝 Code Quality Score: 7.5/10

Strengths: Excellent architecture, comprehensive Unicode support, good documentation
Deductions: Missing bounds check (-1.5), potential dead code (-0.5), code duplication (-0.5)

✅ Recommendations

Must Fix (Blocking):

  1. ✅ Add bounds check in Regional Indicator handling (issue feat: Terminal rendering improvements + 4 high-priority features (#2, #3, #4, #5) #1)
  2. ✅ Verify checkRegionalIndicatorSequence() exists (issue 🔍 Implement text search/find functionality (Ctrl+F) #2)

Should Fix (High Priority):

  1. 🔶 Remove redundant gender symbol check or document edge case (issue 🔗 Implement hyperlink detection and click handling #3)
  2. 🔶 Deduplicate emoji classification logic (issue ⚙️ Implement user settings and configuration system #4)
  3. 🔶 Add warning for missing Apple Color Emoji font (issue 🌏 Implement IME (Input Method Editor) support for CJK languages #5)

Nice to Have (Low Priority):

  1. 📝 Improve variable naming for clarity (issue 📝 Add context menu / right-click menu #6)
  2. 📝 Add unit tests for ColumnConversionUtils
  3. 📝 Include test script in PR

🎯 Verdict

Conditional Approval: This is high-quality work that solves a real problem, but needs the two blocking issues fixed before merge. Once addressed, this will be a solid contribution to the codebase.

The architecture is sound and aligns well with the project's existing patterns. Great job on handling the complex Unicode edge cases! 👍


Reviewed by: Claude Sonnet 4.5
Review Date: 2025-12-23

- Made GraphemeUtils.isEmojiPresentation() public
- Updated TerminalCanvasRenderer.analyzeCharacter() to use shared function
- Eliminates ~25 lines of duplicated emoji code point checks
- Ensures buffer and renderer stay in sync automatically

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Code Review - PR #136: Emoji Background Alignment Fix

Overview

This PR addresses emoji background misalignment issues by tracking visual columns separately from buffer columns and improving emoji detection logic. The changes span 4 files with significant refactoring in the rendering system.


Strengths

  1. Well-scoped problem: The PR clearly addresses the visual/buffer column mismatch that caused emoji background misalignment.

  2. Comprehensive emoji handling: The new ColumnConversionUtils.kt provides a solid foundation for buffer↔visual column conversion with proper handling of:

    • DWC markers
    • Variation selectors
    • ZWJ sequences
    • Surrogate pairs
    • Skin tone modifiers
  3. Good documentation: The ColumnConversionUtils class has excellent inline documentation explaining the different character types that don't consume visual space.

  4. Centralized emoji detection: The GraphemeUtils.isEmojiPresentation() method consolidates emoji detection logic, reducing duplication.


⚠️ Issues & Concerns

1. Performance - Redundant Character Analysis

File: TerminalCanvasRenderer.kt

The PR introduces an AnalysisCache to avoid redundant computation, but there's a critical issue:

// In renderBackgrounds() - line ~240
val analysis = analyzeCharacter(char, line, col, ctx.visibleCols, ctx.ambiguousCharsAreDoubleWidth)
analysisCache[lineIndex to col] = analysis

Problem: The cache key uses lineIndex (absolute buffer line) but col (relative column). This could cause collisions when scrolling or with very large buffers.

Recommendation: Use row (visible row index) instead of lineIndex for the cache key, or clear the cache between passes:

analysisCache[row to col] = analysis  // Use visible row, not absolute line index

2. Code Duplication Between renderBackgrounds() and analyzeCharacter()

File: TerminalCanvasRenderer.kt

The skip logic for ZWJ, skin tones, and gender symbols appears in two places:

  1. Directly in renderBackgrounds() (lines ~147-180)
  2. Implicitly handled in ColumnConversionUtils.shouldSkipChar()

Problem: This creates maintenance burden and potential inconsistencies.

Recommendation: Refactor renderBackgrounds() to use ColumnConversionUtils.bufferColToVisualCol() directly instead of duplicating skip logic:

// In renderBackgrounds() - instead of manual skip logic
var col = 0
var visualCol = 0
while (col < ctx.visibleCols) {
    val char = line.charAt(col)
    
    // Use ColumnConversionUtils to determine if we should skip
    val skipResult = ColumnConversionUtils.shouldSkipChar(line, col, ctx.visibleCols)
    if (skipResult.shouldSkip) {
        col += skipResult.colsToAdvance
        continue
    }
    
    // Compute visualCol using ColumnConversionUtils
    visualCol = ColumnConversionUtils.bufferColToVisualCol(line, col, ctx.visibleCols)
    
    // ... render background at visualCol ...
    
    col++
}

This would eliminate ~80 lines of duplicate logic.


3. Incomplete Emoji Detection in isEmojiPresentation()

File: GraphemeUtils.kt (lines ~221-283)

The new isEmojiPresentation() method only covers BMP emoji explicitly, missing some important ranges:

Missing ranges:

  • Supplementary plane emoji (U+1F100+): The method handles these with a catch-all in the rendering code (actualCodePoint >= 0x1F100) but should be explicit in isEmojiPresentation()
  • Regional Indicators (U+1F1E6-1F1FF): These form flag emoji but aren't covered by isEmojiPresentation()

Current workaround: The rendering code has this:

val isBaseDoubleWidth = if (actualCodePoint >= 0x1F100) true else isWcwidthDoubleWidth

Recommendation: Move all emoji detection into GraphemeUtils.isEmojiPresentation():

fun isEmojiPresentation(codePoint: Int): Boolean {
    return when {
        // === SUPPLEMENTARY PLANE EMOJI (always 2-cell) ===
        codePoint in 0x1F100..0x1F1FF -> true  // Includes regional indicators
        codePoint in 0x1F300..0x1F5FF -> true  // Misc Symbols and Pictographs
        codePoint in 0x1F600..0x1F64F -> true  // Emoticons
        codePoint in 0x1F680..0x1F6FF -> true  // Transport and Map Symbols
        codePoint in 0x1F900..0x1F9FF -> true  // Supplemental Symbols and Pictographs
        codePoint in 0x1FA70..0x1FAFF -> true  // Symbols and Pictographs Extended-A
        codePoint in 0x1FA00..0x1FA6F -> true  // Chess Symbols
        
        // === BMP EMOJI (existing ranges) ===
        0x2614, 0x2615 -> true // ☔☕
        // ... rest of existing BMP cases ...
        
        else -> false
    }
}

Then remove the >= 0x1F100 checks from rendering code.


4. Regional Indicator Flag Handling Inconsistency

File: TerminalCanvasRenderer.kt (lines ~186-230 in renderBackgrounds)

There's a special case for Regional Indicator sequences (flags) with its own rendering logic, but:

Problem 1: checkRegionalIndicatorSequence() function is called but not defined in the diff. This suggests it was added elsewhere or the diff is incomplete.

Problem 2: The hardcoded logic assumes flags are always 4 chars (2 surrogate pairs):

col += 4  // Skip both Regional Indicators (2 surrogate pairs = 4 chars)

This is brittle - what if there are DWC markers or variation selectors mixed in?

Recommendation: Use ColumnConversionUtils.getCharacterVisualWidth() to determine how many buffer columns to advance instead of hardcoding col += 4.


5. analyzeCharacter() Returns Unused Fields

File: TerminalCanvasRenderer.kt (lines ~88-186)

The CharacterAnalysis data class returns 11 fields, but many seem redundant:

  • isWcwidthDoubleWidth, isBaseDoubleWidth, isDoubleWidth - three overlapping width flags
  • isCursiveOrMath, isTechnicalSymbol, isEmojiOrWideSymbol - font selection hints that duplicate logic in renderText()

Recommendation: Simplify CharacterAnalysis to only essential fields:

data class CharacterAnalysis(
    val actualCodePoint: Int,
    val charTextToRender: String,
    val visualWidth: Int,
    val hasVariationSelector: Boolean,
    val fontFamily: FontFamily  // Pre-computed font selection
)

This would make the cache more memory-efficient and easier to reason about.


6. Potential Off-by-One in ColumnConversionUtils.getCharacterVisualWidth()

File: ColumnConversionUtils.kt (lines ~164-192)

The lookahead logic for DWC detection after surrogate pairs could skip too far:

if (Character.isHighSurrogate(char)) {
    var nextCol = col + 1
    while (nextCol < width) {
        val nextChar = line.charAt(nextCol)
        if (nextChar == CharUtils.DWC) return 2  // Found DWC
        if (Character.isLowSurrogate(nextChar) ||
            nextChar.code == 0xFE0F || ...) {
            nextCol++
            continue
        }
        // ... skin tone check ...
        break
    }
}

Problem: If there's a sequence like [HIGH][LOW][FE0F][DWC], this logic will correctly find the DWC, but the caller only advances col by 1, causing the LOW surrogate to be processed again.

Recommendation: Return both width and buffer columns consumed:

data class CharWidthResult(val visualWidth: Int, val bufferColsConsumed: Int)

fun getCharacterVisualWidth(line: TerminalLine, col: Int, width: Int): CharWidthResult {
    // ... existing logic ...
    return CharWidthResult(visualWidth = 2, bufferColsConsumed = 4)  // Example
}

Or document that callers must use bufferColToVisualCol() to advance properly.


7. Missing Unit Tests

Files: All new utility functions

The new ColumnConversionUtils and GraphemeUtils.isEmojiPresentation() functions lack unit tests.

Recommendation: Add test coverage for:

  • bufferColToVisualCol() / visualColToBufferCol() round-trip conversion
  • Edge cases: empty lines, lines with only DWC markers, mixed emoji sequences
  • Regional indicators, ZWJ sequences, skin tones
  • isEmojiPresentation() for all Unicode ranges

🔧 Minor Issues

1. Inconsistent Comments

// compose-ui/.../TerminalCanvasRenderer.kt:149
// ZWJ sequences like 👨‍💻 are: [emoji1][ZWJ][emoji2][DWC]

This comment implies a single DWC at the end, but ZWJ sequences can have multiple emojis with DWC markers in between.

2. Magic Number

// compose-ui/.../TerminalCanvasRenderer.kt:1010
val scaleValue = minOf(widthScale, heightScale).coerceIn(0.8f, 2.5f)

The 0.8f lower bound (changed from 1.0f) lacks explanation. Why allow downscaling below 1.0?


📝 Recommendations Summary

Must Fix (Blocking)

  1. ✅ Fix analysisCache key collision issue (use row instead of lineIndex)
  2. ✅ Make ColumnConversionUtils.shouldSkipChar() public and refactor renderBackgrounds() to use it
  3. ✅ Add checkRegionalIndicatorSequence() to the diff or remove the call

Should Fix (Important)

  1. ⚠️ Consolidate emoji detection into GraphemeUtils.isEmojiPresentation()
  2. ⚠️ Simplify CharacterAnalysis data class
  3. ⚠️ Document or fix getCharacterVisualWidth() caller contract

Nice to Have

  1. 💡 Add unit tests for new utility functions
  2. 💡 Add inline comments explaining the 0.8f scaling change

🎯 Overall Assessment

Score: 7/10 - Good foundational work with room for improvement

Verdict: The PR solves the core problem (emoji background alignment) effectively, but the implementation has some architectural issues that should be addressed:

  • Code duplication between renderBackgrounds() skip logic and ColumnConversionUtils
  • Incomplete abstraction in CharacterAnalysis
  • Missing tests for critical utility functions

Recommendation: Request changes to address the "Must Fix" and "Should Fix" items before merging.


💬 Questions for Author

  1. What is checkRegionalIndicatorSequence() and why isn't it in the diff?
  2. Have you tested this with complex emoji sequences like 👨‍👩‍👧‍👦 (family emoji with ZWJ)?
  3. What was the motivation for changing the scale lower bound from 1.0f to 0.8f?

Great work on tackling this complex rendering issue! The ColumnConversionUtils abstraction is a solid foundation. Looking forward to seeing the refinements. 🚀

Add bounds check before skipping flag sequence chars to prevent
potential crash if flag emoji is at end of line or buffer is malformed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Code Review - PR #136: Emoji Background Alignment Fix

Summary

This PR addresses a critical rendering issue where emoji backgrounds were misaligned with their text. The solution introduces visual column tracking and comprehensive character analysis to ensure backgrounds render at the correct positions.


Strengths

  1. Well-Identified Root Cause: The diagnosis is spot-on - the divergence between buffer columns (with invisible chars) and visual columns was causing the misalignment.

  2. Comprehensive Emoji Support: The expanded isEmojiPresentation() function covers a wide range of emoji categories including Regional Indicators (flags), which is thorough.

  3. Analysis Cache: Smart optimization in TerminalCanvasRenderer.kt - caching CharacterAnalysis results between renderBackgrounds() and renderText() passes avoids redundant computation.

  4. Shared Logic: The analyzeCharacter() helper eliminates code duplication and ensures consistency between rendering passes.

  5. Regional Indicator Handling: Special case handling for flag emoji (lines 294-326 in TerminalCanvasRenderer.kt) shows attention to complex Unicode scenarios.


⚠️ Issues & Concerns

1. Missing Test Coverage (High Priority)

  • No unit tests for the new ColumnConversionUtils class (196 lines)
  • No tests for updated GraphemeUtils.isEmojiPresentation()
  • Complex logic like surrogate pair handling, ZWJ sequence skipping, and skin tone modifiers needs test coverage

Recommendation: Add test files:

// ColumnConversionUtilsTest.kt
- Test bufferColToVisualCol with various grapheme sequences
- Test visualColToBufferCol boundary snapping
- Test getCharacterVisualWidth with DWC markers
- Test edge cases: empty lines, line boundaries, mixed emoji/text

2. Performance - Multiple Buffer Accesses

In renderBackgrounds() (lines 289-292), there's a potential for excessive line.charAt() calls:

if (Character.isHighSurrogate(char)) {
    val nextChar = if (col + 1 < ctx.visibleCols) line.charAt(col + 1) else null
    // ... more charAt calls ...
}

Issue: For a 200-column terminal with emoji, this could result in 2-3x buffer accesses per cell.

Recommendation: Consider pre-fetching a small lookahead buffer (e.g., 4 chars) at the start of the while loop to reduce repeated charAt() calls.

3. Magic Numbers - Regional Indicator Handling

Line 327 in TerminalCanvasRenderer.kt:

col += minOf(4, ctx.visibleCols - col)

Issue: The value 4 (2 surrogate pairs for flag emoji) is hardcoded without explanation.

Recommendation: Add a constant with documentation:

private const val REGIONAL_INDICATOR_SEQUENCE_LENGTH = 4 // 2 surrogate pairs
// Usage: col += minOf(REGIONAL_INDICATOR_SEQUENCE_LENGTH, ctx.visibleCols - col)

4. Bounds Safety - Index Out of Bounds Risk

In ColumnConversionUtils.getCharacterVisualWidth() (lines 164-192), there's complex lookahead logic that could miss edge cases:

if (Character.isHighSurrogate(nextChar) && nextCol + 1 < width) {
    val afterNext = line.charAt(nextCol + 1)
    // ...
}

Question: What happens if a truncated emoji sequence appears at the line boundary? For example:

  • Line width = 80
  • Column 78 = High surrogate
  • Column 79 = Low surrogate
  • Column 80+ = out of bounds

The code checks nextCol + 1 < width, but does it handle nextCol == width gracefully?

Recommendation: Add defensive checks and document expected behavior for truncated sequences at line boundaries.

5. Code Duplication - Gender Symbol Logic

The gender symbol skip logic appears in both:

  • ColumnConversionUtils.shouldSkipChar() (lines 67-71)
  • TerminalCanvasRenderer.renderBackgrounds() (lines 319-324)

Recommendation: Extract to a shared utility method to maintain consistency.

6. AnalysisCache Memory - Unbounded Growth?

private typealias AnalysisCache = MutableMap<Pair<Int, Int>, CharacterAnalysis>

Question: For a large terminal (e.g., 200 columns × 50 rows = 10,000 cells), the cache could hold 10,000 entries. Is this acceptable?

Recommendation: Document expected memory usage or consider an LRU cache if this becomes an issue.


🔍 Security Considerations

Low Risk: No obvious security vulnerabilities. The code handles Unicode properly and doesn't introduce XSS, injection, or buffer overflow risks.


📊 Testing Recommendations

Per CLAUDE.md, user will test manually, but consider adding:

  1. Unit Tests:

    • ColumnConversionUtilsTest.kt (cover all public methods)
    • GraphemeUtilsTest.kt (test isEmojiPresentation coverage)
  2. Integration Tests:

    • Test script with varied emoji sequences
    • Mixed emoji + text + CJK characters
    • Edge cases: line boundaries, truncated sequences
  3. Regression Tests:

    • Ensure fix doesn't break existing rendering (screenshots before/after)

💡 Minor Suggestions

  1. Line 318 (TerminalCanvasRenderer.kt): Comment could be more precise:

    // Skip gender symbols ONLY when preceded by ZWJ (they're part of ZWJ sequences like 👨‍💻)
  2. GraphemeUtils.isEmojiPresentation(): Consider documenting which emoji ranges were intentionally excluded (e.g., ▶◀⏹⏺) and why.

  3. ProperTerminal.kt changes (lines 1027-1073): The diff shows refactoring of analyzeCharacter usage. Ensure this doesn't break cursor positioning or selection logic.


Approval Status

Overall Assessment: This is a high-quality fix that tackles a complex rendering problem with a sound architectural approach. The analysis cache and shared logic are excellent patterns.

Blockers:

  • Missing test coverage for critical new code

Recommendations Before Merge:

  1. Add unit tests for ColumnConversionUtils
  2. Address magic number constant
  3. Verify bounds safety at line edges
  4. Document memory characteristics of AnalysisCache

Once tests are added, this is ready to merge.


Great work on the comprehensive emoji handling! The rendering architecture is solid. 🚀


Reviewed per CLAUDE.md guidelines - focusing on correctness, performance, and maintainability.

- Made ColumnConversionUtils.SkipResult and shouldSkipChar() public
- Refactored renderBackgrounds() to use shared shouldSkipChar()
- Kept special ZWJ handling (skip until DWC) and Regional Indicator handling
- Eliminates ~25 lines of duplicate skip logic

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Code Review: Emoji Background Rendering Fix

✅ Overall Assessment

Strong implementation that properly addresses emoji background alignment issues. The PR introduces a well-architected solution with good separation of concerns. The code demonstrates solid understanding of Unicode complexity (surrogate pairs, variation selectors, ZWJ sequences, Regional Indicators).


💡 Strengths

1. Excellent Architecture

  • ColumnConversionUtils.kt: Clean abstraction for buffer↔visual column conversion
  • CharacterAnalysis: Encapsulates complex surrogate pair/width detection logic
  • Analysis caching: Smart optimization to avoid redundant computation between rendering passes

2. Comprehensive Unicode Handling

  • Properly handles surrogate pairs (U+10000+)
  • Variation selectors (U+FE0F, U+FE0E) correctly skipped
  • ZWJ sequences (e.g., 👨‍💻) properly treated as single units
  • Regional Indicators (flag emoji 🇺🇸) handled as atomic graphemes
  • Skin tone modifiers (U+1F3FB-U+1F3FF) correctly identified

3. Good Documentation

  • Clear KDoc comments explaining purpose and edge cases
  • Inline comments for non-obvious logic
  • Test plan included in PR description

4. Shared Logic

  • GraphemeUtils.isEmojiPresentation() ensures renderer and buffer logic stay in sync
  • analyzeCharacter() helper prevents duplication between renderBackgrounds() and renderText()

🐛 Issues & Concerns

Critical: Potential Index Out of Bounds in renderBackgrounds()

Location: TerminalCanvasRenderer.kt:270-276 (flag emoji handling)

// Skip all chars in the flag sequence (2 surrogate pairs + possible DWC markers)
// Use minOf to prevent index out of bounds if flag is at end of line
col += minOf(4, ctx.visibleCols - col)
while (col < ctx.visibleCols && line.charAt(col) == CharUtils.DWC) {
    col++  // Skip any trailing DWC markers
}

Issue: After col += minOf(4, ...), col can equal ctx.visibleCols. The next while loop correctly guards with col < ctx.visibleCols, but earlier code at lines 257-270 doesn't check bounds before calling checkRegionalIndicatorSequence().

Risk: If a flag emoji spans the end of the visible line, the code could attempt to read past ctx.visibleCols.

Recommendation: Add explicit bounds check:

if (col + 3 < ctx.visibleCols && checkRegionalIndicatorSequence(line, col, ctx.visibleCols)) {
    // ... flag handling ...
}

Performance: Analysis Cache Memory Growth

Location: TerminalCanvasRenderer.kt:92-95, 212-214

// Cache character analysis to avoid redundant computation between passes
val analysisCache: AnalysisCache = mutableMapOf()

Issue: Cache is created per frame but never bounded. With 24 rows × 80 cols = 1,920 entries per frame at 60fps, this creates ~115,200 short-lived Map entries per second.

Impact:

  • Allocation churn may trigger frequent GC pauses
  • Cache entries are thrown away after each frame (no reuse between frames)

Recommendation: Consider one of:

  1. Object pooling: Reuse cache map across frames with clear() between uses
  2. Pre-allocated capacity: HashMap(visibleRows * visibleCols) to avoid rehashing
  3. Benchmark first: Profile to verify if this is actually a bottleneck

Minor: Potential Logic Error in shouldSkipChar()

Location: ColumnConversionUtils.kt:66-71

// Skip gender symbols only when preceded by ZWJ (part of ZWJ sequences)
if (char.code == 0x2640 || char.code == 0x2642) {
    if (col > 0 && line.charAt(col - 1).code == 0x200D) {
        return SkipResult(true, 1)
    }
}

Issue: This assumes gender symbols (♀♂) are ONLY skippable when part of ZWJ sequences. However, they could also be standalone characters with legitimate visual presence.

Risk: If someone types a plain ♀ or ♂ character (not part of emoji), it won't be skipped, but that's correct behavior.

Verdict: This is actually correct as-is. Gender symbols should only be skipped when they're modifiers in ZWJ sequences, not when standalone.


Code Duplication: Grapheme Extender Lists

Location: Multiple files contain hardcoded lists of grapheme extenders:

  • ColumnConversionUtils.kt:172-177
  • ColumnConversionUtils.kt:179-188

Issue: Unicode ranges duplicated in multiple places. If Unicode standard updates (e.g., new skin tone ranges), multiple locations need updates.

Recommendation: Extract to shared constants:

object GraphemeExtenders {
    val VARIATION_SELECTORS = setOf(0xFE0E, 0xFE0F)
    val ZWJ = 0x200D
    val SKIN_TONE_RANGE = 0x1F3FB..0x1F3FF
    val GENDER_SYMBOLS = setOf(0x2640, 0x2642)
}

Missing Edge Case: Fitzpatrick Skin Tone Combinations

Location: shouldSkipChar() checks for single skin tone modifiers but doesn't handle multiple modifiers (rare but valid).

Example: Some emoji allow stacking skin tones (e.g., handshake with different skin tones: 🤝🏻🤝🏿).

Impact: Low - these are rare and terminal applications rarely use them.

Recommendation: Document as known limitation or add handling if needed.


🧪 Testing Considerations

Test Coverage Gaps

  1. Boundary conditions:

    • Emoji at column 0 and last column
    • Flag emoji spanning end of line
    • Line with only emoji (no ASCII)
  2. Mixed content:

    • ASCII + emoji + ZWJ sequences in single line
    • Alternating single/double-width characters
  3. Performance:

    • Large terminal (200×80) with emoji-heavy content
    • Scrolling through emoji-rich buffer (e.g., cat emoji.txt)

Recommendation: Add unit tests for ColumnConversionUtils methods:

@Test
fun testBufferColToVisualCol_withFlagEmoji() {
    // Test flag at start, middle, end of line
}

@Test
fun testBufferColToVisualCol_withVariationSelectors() {
    // Test ⚠️ vs ⚠
}

🔒 Security

No security concerns identified. All input validation uses safe Kotlin stdlib functions (Character.isHighSurrogate, Character.toCodePoint, bounds checks).


⚡ Performance

Positive Optimizations

  • ✅ Analysis cache prevents duplicate work
  • ✅ Early returns in shouldSkipChar() minimize branching
  • ✅ Fast-path for simple BMP characters in getGraphemeWidth()

Potential Concerns

  • ⚠️ Analysis cache allocation churn (see above)
  • ⚠️ Multiple line.charAt() calls per character (could batch)

📋 Recommendations Summary

Must Fix Before Merge

  1. Add bounds check for flag emoji handling to prevent potential index out of bounds

Should Consider

  1. Profile analysis cache memory usage under high-volume scenarios
  2. Add unit tests for ColumnConversionUtils boundary conditions
  3. Extract grapheme extender constants to shared location

Nice to Have

  1. Document known limitations (e.g., rare multi-skin-tone emoji)
  2. Benchmark with emoji-heavy terminal content

🎯 Verdict

Approve with minor fixes. The core logic is sound and the architecture is clean. Address the index out of bounds risk before merging, then consider the performance profiling as a follow-up task.

Great work on tackling Unicode complexity! The solution properly handles the common cases while remaining maintainable.


Build Status: ✅ All checks passing
Files Changed: 4 files, +724/-113 lines
Complexity: Medium-High (Unicode handling inherently complex)

🤖 Generated with Claude Code

Centralizes grapheme extender magic numbers (~40 occurrences across 7 files)
into a single UnicodeConstants.kt file for consistency and maintainability.

Constants extracted:
- Variation Selectors (FE0E, FE0F)
- Zero-Width Joiner (200D)
- Skin Tone Range (1F3FB-1F3FF)
- Gender Symbols (2640, 2642)
- Regional Indicator Range (1F1E6-1F1FF)

Helper functions added:
- isVariationSelector()
- isSkinToneModifier()
- isGenderSymbol()
- isRegionalIndicator()
- isRegionalIndicatorHighSurrogate()
- isRegionalIndicatorLowSurrogate()

Files updated:
- GraphemeUtils.kt
- GraphemeCluster.kt
- ColumnConversionUtils.kt
- GraphemeMetadata.kt
- GraphemeBoundaryUtils.kt
- TerminalCanvasRenderer.kt

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Pull Request Review

Summary

This PR fixes a critical rendering bug where emoji background colors were misaligned due to divergence between buffer column positions and visual column positions. The fix introduces proper visual column tracking and comprehensive Unicode handling infrastructure.


✅ Strengths

1. Excellent Problem Analysis

The root cause analysis is spot-on: renderBackgrounds() used buffer columns while renderText() used visual columns, causing misalignment when invisible characters (variation selectors, ZWJ, etc.) exist in the buffer.

2. Well-Architected Solution

  • New ColumnConversionUtils: Clean separation of concerns for buffer↔visual column conversion
  • New UnicodeConstants: Centralized constants eliminate magic numbers and ensure consistency
  • Shared analyzeCharacter(): DRY principle - one source of truth for character analysis
  • Analysis caching: Smart optimization to avoid redundant computation between render passes

3. Comprehensive Unicode Support

The isEmojiPresentation() function is well-documented and thoughtfully designed:

  • Clear distinction between supplementary plane emoji (always 2-cell) and BMP characters
  • Intentionally excludes ambiguous symbols (▶◀⏹⏺) that are commonly used as 1-cell text in TUIs
  • Covers all major emoji ranges including regional indicators, ZWJ sequences, and skin tone modifiers

4. Code Quality

  • Extensive documentation with clear comments
  • Proper handling of edge cases (bounds checking, surrogate pairs, grapheme clusters)
  • Consistent naming conventions and code style

🔍 Issues & Concerns

Critical: Potential Performance Regression

Issue 1: Analysis Cache Key Mismatch (TerminalCanvasRenderer.kt:313-314)

val analysis = analyzeCharacter(char, line, col, ctx.visibleCols, ctx.ambiguousCharsAreDoubleWidth)
analysisCache[lineIndex to col] = analysis  // ⚠️ Using lineIndex, but should use row

Problem: You're caching with lineIndex (which is row - scrollOffset) but the key should be row to match how renderText() looks up cached results. This means the cache will miss on every lookup, causing redundant analysis.

Fix: Change line 314 to:

analysisCache[row to col] = analysis

Issue 2: analyzeCharacter() Called Twice Per Character

Even with the cache fixed, each character goes through analysis TWICE:

  1. First in renderBackgrounds() - creates analysis and caches it
  2. Then in renderText() - retrieves from cache... but still calls analyzeCharacter() again

Looking at the text rendering code (not shown in diff), verify that renderText() uses the cache like this:

val analysis = analysisCache[row to col] ?: analyzeCharacter(...)

If it's unconditionally calling analyzeCharacter() again, the cache provides no benefit.


Medium: Regional Indicator Handling Issues

Issue 3: Regional Indicator Loop May Read Out of Bounds (TerminalCanvasRenderer.kt:299)

col += minOf(4, ctx.visibleCols - col)  // ✅ Good - bounds check
while (col < ctx.visibleCols && line.charAt(col) == CharUtils.DWC) {
    col++  // Skip any trailing DWC markers
}

Potential Issue: The minOf(4, ...) assumes flags are always exactly 4 buffer chars (2 surrogate pairs). However, if DWC markers are inserted, a flag could be:

  • [High1][Low1][DWC][High2][Low2][DWC] = 6 chars

The follow-up while loop handles trailing DWCs, but what about DWCs between the two Regional Indicators? The loop might exit prematurely if it encounters a DWC before reading the second indicator.

Recommendation: Use checkRegionalIndicatorSequence() result to return both the flag AND the number of buffer columns it occupies, then advance by that exact amount.

Issue 4: checkRegionalIndicatorSequence() Not Shown

The function is called but not defined in this diff. Please verify it correctly handles:

  • DWC markers between Regional Indicators
  • Partial sequences at line boundaries
  • Invalid sequences (single Regional Indicator)

Low: Code Clarity & Maintainability

Issue 5: Magic Number 0x1F100 (TerminalCanvasRenderer.kt:155)

val isBaseDoubleWidth = if (actualCodePoint >= 0x1F100) true else isWcwidthDoubleWidth

Suggestion: Add this to UnicodeConstants with a descriptive name:

const val SUPPLEMENTARY_PLANE_EMOJI_START = 0x1F100

Issue 6: Hardcoded Range Check (TerminalCanvasRenderer.kt:168-169)

val isCursiveOrMath = actualCodePoint in 0x1D400..0x1D7FF
val isTechnicalSymbol = actualCodePoint in 0x23E9..0x23FF

Suggestion: Move these to UnicodeConstants for consistency:

val CURSIVE_MATH_RANGE = 0x1D400..0x1D7FF
val TECHNICAL_SYMBOL_RANGE = 0x23E9..0x23FF

Issue 7: ZWJ Handling in renderBackgrounds() Duplicates Logic (TerminalCanvasRenderer.kt:248-254)

The special ZWJ skip logic is inline in renderBackgrounds() but conceptually belongs in ColumnConversionUtils.shouldSkipChar(). However, I see why you didn't do this - ZWJ sequences require skipping multiple characters until DWC.

Suggestion: Add a comment explaining why this is separate:

// Special handling for ZWJ sequences: Unlike simple extenders (variation selectors, skin tones),
// ZWJ sequences require skipping ALL characters until DWC marker (e.g., [emoji1][ZWJ][emoji2][DWC]).
// This logic is intentionally separate from ColumnConversionUtils.shouldSkipChar() which handles
// single-character skips.

Low: Potential Logic Issues

Issue 8: Variation Selector Detection May Miss Edge Cases (TerminalCanvasRenderer.kt:157-162)

val nextCharOffset = if (isWcwidthDoubleWidth) 2 else 1
val nextChar = if (col + nextCharOffset < width) line.charAt(col + nextCharOffset) else null
val hasVariationSelector = (nextChar != null && UnicodeConstants.isVariationSelector(nextChar)) ||
    hasVariationSelectorAtCol1

Scenario: Surrogate pair + variation selector WITHOUT DWC marker (1-cell emoji treated as 2-cell):

  • Buffer: [High][Low][FE0F]
  • isWcwidthDoubleWidth = false (no DWC marker, wcwidth returns 1)
  • nextCharOffset = 1
  • nextChar = line.charAt(col + 1) = Low (NOT the variation selector!)

The hasVariationSelectorAtCol1 fallback catches this case, but only if col+1 is checked. For surrogate pairs, the variation selector is at col+2, not col+1.

Test Case: Emoji like (U+2B50) which is BMP but often has VS16. If rendered as [⭐][FE0F] (no DWC), does this code detect the variation selector?


🧪 Testing Recommendations

1. Add Unit Tests for ColumnConversionUtils

@Test
fun testBufferColToVisualCol_withVariationSelector() {
    // Buffer: [A][⚠][FE0F][DWC][B]
    // Visual: [A][⚠⚠][B] - positions 0,1,3
    val line = createMockLine("A\u26A0\uFE0F${CharUtils.DWC}B")
    assertEquals(0, ColumnConversionUtils.bufferColToVisualCol(line, 0, 5)) // A at 0
    assertEquals(1, ColumnConversionUtils.bufferColToVisualCol(line, 1, 5)) // ⚠ at 1
    assertEquals(1, ColumnConversionUtils.bufferColToVisualCol(line, 2, 5)) // FE0F doesn't add width
    assertEquals(1, ColumnConversionUtils.bufferColToVisualCol(line, 3, 5)) // DWC doesn't add width
    assertEquals(3, ColumnConversionUtils.bufferColToVisualCol(line, 4, 5)) // B at 3
}

@Test
fun testVisualColToBufferCol_snapsToGraphemeBoundary() {
    // Buffer: [⚠][FE0F][DWC]
    // Visual: [⚠⚠] - clicking anywhere in the 2 cells should return buffer col 0
    val line = createMockLine("\u26A0\uFE0F${CharUtils.DWC}")
    assertEquals(0, ColumnConversionUtils.visualColToBufferCol(line, 0, 3))
    assertEquals(0, ColumnConversionUtils.visualColToBufferCol(line, 1, 3)) // Snaps to start
}

2. Test Regional Indicator Edge Cases

  • Single Regional Indicator at end of line
  • Regional Indicators with DWC markers between them
  • Invalid sequences (three consecutive Regional Indicators)

3. Performance Benchmark

Compare rendering performance before/after this PR:

  • Large file with mixed ASCII and emoji (cat large_file_with_emoji.txt)
  • Fast scrolling through emoji-heavy content
  • Verify the analysis cache is actually being used (add debug logging)

📋 Additional Suggestions

1. Consider Lazy Evaluation for CharacterAnalysis

If most cells don't need all fields, consider using lazy properties:

data class CharacterAnalysis(
    val actualCodePoint: Int,
    val lowSurrogate: Char?,
    val charTextToRender: String
) {
    val isWcwidthDoubleWidth: Boolean by lazy { /* compute */ }
    val visualWidth: Int by lazy { if (isDoubleWidth) 2 else 1 }
    // ...
}

2. Add Debug Logging for Cache Hit Rate

if (ctx.settings.debugModeEnabled) {
    val hitRate = cacheHits / (cacheHits + cacheMisses)
    println("Analysis cache hit rate: ${hitRate * 100}%")
}

3. Document Buffer Layout Examples

Add comments showing buffer layouts for tricky cases:

// Example buffer layouts:
// Simple emoji:     [😀][DWC]
// Emoji + VS:       [⚠][FE0F][DWC]
// Surrogate + VS:   [High][Low][FE0F][DWC]
// ZWJ sequence:     [👨][ZWJ][💻][DWC]
// Flag:             [RIH1][RIL1][DWC][RIH2][RIL2][DWC]

🎯 Verdict

Overall Assessment: Strong PR with a well-designed solution, but critical performance issues must be addressed before merge.

Required Changes (🔴 Blocking):

  1. ✅ Fix analysis cache key mismatch (lineIndexrow)
  2. ✅ Verify renderText() actually uses the cache (not shown in diff)
  3. ✅ Review Regional Indicator handling for DWC edge cases

Recommended Changes (🟡 High Priority):

  1. Add unit tests for ColumnConversionUtils
  2. Move magic numbers to UnicodeConstants
  3. Verify variation selector detection for all emoji types

Optional Improvements (🟢 Nice to Have):

  1. Add debug logging for cache hit rate
  2. Document buffer layout examples in comments
  3. Consider lazy evaluation for CharacterAnalysis

Once the critical issues are addressed, this will be an excellent contribution that significantly improves emoji rendering correctness. The architecture is solid, the Unicode handling is comprehensive, and the code quality is high. Great work! 🎉

Extends UnicodeConstants.kt with combining character ranges:
- COMBINING_DIACRITICS_RANGE (0x0300-0x036F)
- COMBINING_MARKS_FOR_SYMBOLS_RANGE (0x20D0-0x20FF)
- HEBREW_COMBINING_MARKS_RANGE (0x0591-0x05BD)
- ARABIC_COMBINING_MARKS_RANGE (0x0610-0x061A)

Added helper functions:
- isCombiningDiacritic()
- isCombiningCharacter()

Updated files:
- GraphemeUtils.kt (both isGraphemeExtender functions)
- GraphemeBoundaryUtils.kt
- GraphemeMetadata.kt

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Only the Char version is used in GraphemeBoundaryUtils.kt.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Code Review: Emoji Background Alignment Fix

Strengths

  1. Root Cause Analysis: Excellent identification that renderBackgrounds() was using buffer columns while renderText() uses visual columns
  2. Comprehensive Refactoring: UnicodeConstants.kt centralizes magic numbers, ColumnConversionUtils.kt provides clean buffer-visual conversion
  3. Performance: The analysisCache prevents redundant computation
  4. Documentation: Excellent inline comments for Unicode handling

Issues & Concerns

1. CRITICAL - Cache Performance (TerminalCanvasRenderer.kt:314, 524)
Using Pair<Int,Int> as cache key causes excessive allocations (288k objects/sec at 60fps with 80x60 terminal). Recommend: lineIndex * ctx.visibleCols + col

2. CRITICAL - Cache Invalidation
analysisCache is per-frame which is fine, but should be documented. Cross-frame cache would need version tracking.

3. HIGH - Emoji Detection Coverage (GraphemeUtils.kt:217-293)
isEmojiPresentation() is missing some ranges and excludes media controls. Recommend adding inline rationale comments and consider treatMediaSymbolsAsEmoji setting.

4. MEDIUM - Surrogate Pair Edge Case (analyzeCharacter:128-131)
Doesn't handle variation selector between high/low surrogate. Recommend forward scanning through extenders.

5. MEDIUM - Regional Indicator Validation (GraphemeUtils.kt:160-163)
Accepts 3+ RIs as flags, but Unicode requires exactly 2. Change >= 2 to == 2

6. LOW - Redundant Null Check (analyzeCharacter:134)
Can be simplified with Kotlin idioms

Testing Recommendations

  • Unit tests for ColumnConversionUtils.bufferColToVisualCol() with complex graphemes
  • Performance benchmarks for cache allocation overhead
  • Tests for analyzeCharacter() with malformed surrogates

Verdict: Approve with Minor Revisions

Required: Fix cache key allocation and Regional Indicator validation
Recommended: Add surrogate pair scanning, document emoji rationale, add unit tests

Great work on centralizing Unicode constants and visual column tracking!

@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Pull Request Review: Fix Emoji Background Alignment

Summary

This PR fixes a critical emoji background rendering bug where backgrounds weren't aligned with text for emoji containing variation selectors. The fix introduces proper buffer-to-visual column conversion and significantly expands emoji detection coverage.


Strengths

1. Excellent Architecture

  • ColumnConversionUtils.kt (196 lines): Clean buffer-visual conversion abstraction
  • UnicodeConstants.kt (88 lines): Centralized Unicode constants
  • analyzeCharacter(): Shared analysis eliminates duplication
  • Analysis caching: Prevents redundant computation

2. Comprehensive Emoji Detection

  • Covers supplementary plane emoji (U+1F000+)
  • Thoughtful BMP emoji handling (only unambiguous cases)
  • Intentionally excludes ambiguous symbols often 1-cell in TUIs
  • Clear documentation

3. Robust Surrogate Pair Handling

  • shouldSkipChar() handles skin tone modifiers (2-char advance)
  • getCharacterVisualWidth() scans grapheme extenders
  • Complex case handling

4. Code Quality

  • Excellent inline documentation
  • Clear naming (visualCol vs col)
  • Proper Kotlin idioms

Issues Found

CRITICAL: Regional Indicator Flag Bug
Location: TerminalCanvasRenderer.kt:299

Problem: Hardcoded col += minOf(4, ctx.visibleCols - col) assumes flags are always 4 buffer columns.

Incorrect for:

  1. Flags without DWC markers (implementation-dependent)
  2. Flags near line boundaries (e.g., col 78 in 80-col terminal)

Fix: Use ColumnConversionUtils.shouldSkipChar() or getCharacterVisualWidth() for dynamic skip count.


MEDIUM: analyzeCharacter() Complexity
Location: TerminalCanvasRenderer.kt:115-187

72-line function with complex branching. Consider extracting helpers for surrogate pairs and double-width detection.


LOW: isEmojiPresentation() Maintenance
Location: GraphemeUtils.kt:233-294

Hardcoded BMP list needs:

  1. Unicode version comment
  2. Unit tests
  3. Consider ICU4J properties (future)

LOW: Missing Validation
Location: ColumnConversionUtils.kt:154

getCharacterVisualWidth() needs better col validation.


Testing

  1. Flags at line boundaries (cols 76-79)
  2. ZWJ sequences
  3. Surrogate pairs at boundaries
  4. Performance with 10,000+ lines

Performance

Good: Caching, O(1) shouldSkipChar()
Concern: Unbounded scan in getCharacterVisualWidth() (worst O(width))
Mitigation: Add 10-char scan limit


Security

No concerns - safe iteration, bound validation, no external parsing.


Conclusion

Verdict: Approve with Minor Revisions

Required:

  1. Fix flag hardcoded skip (critical)
  2. Add Unicode version comment
  3. Add bounds validation

Recommended:

  1. Extract helpers
  2. Unit tests
  3. Max scan limit

Outstanding work on Unicode consolidation and visual column tracking!

Centralizes all emoji-related Unicode constants:

Supplementary plane emoji blocks:
- ENCLOSED_ALPHANUMERIC_SUPPLEMENT_RANGE
- MISC_SYMBOLS_PICTOGRAPHS_RANGE
- EMOTICONS_RANGE
- TRANSPORT_MAP_SYMBOLS_RANGE
- SUPPLEMENTAL_SYMBOLS_RANGE
- SYMBOLS_PICTOGRAPHS_EXTENDED_A_RANGE
- CHESS_SYMBOLS_RANGE

BMP emoji constants (40+ individual code points)

Helper functions:
- isSupplementaryPlaneEmoji()
- isBmpEmoji()

Updated files:
- GraphemeUtils.kt - simplified isEmojiPresentation()
- GraphemeCluster.kt - now uses GraphemeUtils.isEmojiPresentation()
- TerminalCanvasRenderer.kt - uses constant instead of magic number

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Code Review - PR #136: Fix Emoji Background Rendering

Summary

This PR addresses emoji background rendering misalignment by introducing visual column tracking and comprehensive Unicode constants. The changes are well-structured and solve a real problem with emoji variation selectors.


✅ Strengths

  1. Excellent Problem Analysis: The PR clearly identifies that renderBackgrounds() was using buffer positions while renderText() uses visual positions, causing misalignment.

  2. New ColumnConversionUtils.kt (196 lines):

    • Well-documented utility class with clear purpose
    • Good separation of concerns with shouldSkipChar(), bufferColToVisualCol(), and visualColToBufferCol()
    • Comprehensive handling of grapheme extenders (DWC, ZWJ, variation selectors, skin tones, gender symbols)
  3. New UnicodeConstants.kt (226 lines):

    • Excellent centralization of Unicode constants
    • Comprehensive emoji range coverage (supplementary plane + BMP)
    • Good use of helper functions for type checking
    • Clear documentation for each constant
  4. DRY Principle: Refactored code to use UnicodeConstants helpers, eliminating magic numbers like 0x200D, 0xFE0F, etc.

  5. Analysis Caching: The analysisCache pattern in renderBackgrounds() and renderText() avoids redundant character analysis between passes.


🔍 Issues & Concerns

1. Critical: Potential Index Out of Bounds in ColumnConversionUtils.getCharacterVisualWidth() ⚠️

Location: ColumnConversionUtils.kt:165-192

Issue: The scan-ahead loop may read beyond buffer boundaries. If width > actual line length, charAt(nextCol) could throw IndexOutOfBoundsException.

Fix: Add bounds check - while loop should be: while (nextCol < width && nextCol < line.length)

Same issue in shouldSkipChar() at line 56-64.


2. Performance: Redundant Emoji Detection in analyzeCharacter()

Location: TerminalCanvasRenderer.kt:171

Issue: GraphemeUtils.isEmojiPresentation(actualCodePoint) is called for every character, even though analysis already has isWcwidthDoubleWidth and isBaseDoubleWidth.

Impact: Extra CPU cycles per character. For 80x24 terminal = 1,920 calls per frame.

Recommendation: Consider deriving isEmojiOrWideSymbol from existing isBaseDoubleWidth or cache more aggressively.


3. Potential Logic Error: hasVariationSelectorAtCol1 Check

Location: TerminalCanvasRenderer.kt:150-153

Issue: Check charAtCol2 == CharUtils.DWC doesn't validate charAtCol2 exists first.

Edge Case: Line ends at col+1 (variation selector is last char).

Recommendation: Add null check: (hasVariationSelectorAtCol1 && charAtCol2 != null && charAtCol2 == CharUtils.DWC)


4. Missing Unit Tests ⚠️

Test Coverage: This PR adds 918 lines of complex Unicode handling logic but includes no automated tests.

Critical Test Cases Needed:

  1. ColumnConversionUtils.bufferColToVisualCol() with emoji+VS+DWC sequences
  2. Regional Indicator pairs (flag emoji)
  3. ZWJ sequences (family emoji)
  4. Edge case: Line ending mid-grapheme
  5. Edge case: Surrogate pair at line boundary
  6. Skin tone modifiers with ZWJ

Recommendation: Add unit tests in bossterm-core-mpp/src/jvmTest/ before merging.


🎯 Security Review

No security concerns identified - no user input processing, external calls, or file I/O. Bounded loops with proper exit conditions (assuming bounds check fix above).


🚀 Performance Considerations

  1. Analysis Cache: Good optimization to avoid redundant character analysis between passes.

  2. Allocation Pressure: Each frame creates new AnalysisCache + ~1,920 CharacterAnalysis objects (80×24 terminal). Impact is moderate since CharacterAnalysis is a data class with mostly primitives.

  3. GraphemeUtils Cache: Good use of LRU cache (1024 entries) for grapheme width calculations.


📝 Style & Best Practices

Excellent:

  • Comprehensive KDoc comments
  • Consistent naming conventions
  • Proper use of Kotlin idioms
  • Good file organization

Questions:

  1. Should ColumnConversionUtils be in terminal.util or compose.rendering.util? It seems renderer-specific.
  2. The AnalysisCache type alias (line 193) is only used within TerminalCanvasRenderer - could it be a private typealias?

🧪 Test Plan Review

The PR description mentions: /tmp/test_emoji_bg.sh

Question: Where is this test script? Is it committed to the repo or a local script?

Recommendation: Either commit to /scripts/test_emoji_bg.sh for reproducibility, or document manual test steps.


✅ Recommendations Summary

Must Fix Before Merge:

  1. ⚠️ Add bounds checks in ColumnConversionUtils (lines 56-64, 165-192)
  2. ⚠️ Add null check for charAtCol2 in analyzeCharacter() (line 152)

Should Fix Before Merge:
3. Add basic unit tests for ColumnConversionUtils and UnicodeConstants
4. Clarify or commit the test script mentioned in test plan

Nice to Have:
5. Document isEmojiPresentation() fallback behavior with cross-reference to settings
6. Consider optimizing redundant isEmojiPresentation() calls in hot path


🎉 Overall Assessment

Quality: High - well-architected solution with good separation of concerns

Risk: Medium - complex Unicode logic without automated tests

Recommendation: Approve with changes - Fix the bounds check issues and add basic tests, then merge.


Great work on tackling this tricky emoji rendering problem! The centralized Unicode constants are a solid foundation for future Unicode work. 🚀

Changed checkRegionalIndicatorSequence() to return the number of buffer
columns occupied by a flag emoji instead of a boolean. This allows the
caller to skip exactly the right number of columns without relying on
fallback skip logic.

Possible flag layouts handled:
- [High1][Low1][High2][Low2] = 4 chars
- [High1][Low1][DWC][High2][Low2] = 5 chars
- [High1][Low1][DWC][High2][Low2][DWC] = 6 chars

This is cleaner than the previous approach which skipped 4 chars and
relied on subsequent loop iterations to handle remaining chars.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Pull Request Review: Emoji Background Rendering Fix

Overview

This PR addresses a critical rendering issue where emoji backgrounds were misaligned with their text. The solution introduces comprehensive Unicode handling infrastructure with proper buffer-to-visual column conversion. Overall, this is excellent work with strong architectural improvements.


✅ Strengths

1. Exceptional Code Organization

  • UnicodeConstants.kt (226 lines): Centralizes all Unicode constants with excellent documentation
    • Clear categorization: Variation Selectors, ZWJ, Skin Tones, Regional Indicators, Emoji Blocks, BMP Emoji
    • Helper functions for common checks (isVariationSelector, isSkinToneModifier, etc.)
    • This eliminates magic numbers scattered throughout the codebase

2. Clean Abstraction Layer

  • ColumnConversionUtils.kt (196 lines): Well-designed utility for buffer ↔ visual column conversion
    • shouldSkipChar(): Encapsulates complex skip logic in one place
    • bufferColToVisualCol(): Handles DWC markers, surrogate pairs, ZWJ sequences
    • getCharacterVisualWidth(): Looks ahead through grapheme extenders to find DWC markers
    • Clear documentation explaining visual vs buffer column concepts

3. Comprehensive Unicode Support

  • Handles surrogate pairs correctly (U+10000+)
  • Supports variation selectors (FE0E/FE0F)
  • Accounts for ZWJ sequences, skin tones, gender symbols
  • Regional indicator support for flag emoji
  • Covers supplementary plane emoji (U+1F300-U+1FAFF)

4. Correct Root Cause Fix

The PR description correctly identifies the problem: renderBackgrounds() used buffer columns while renderText() used visual columns. The fix properly aligns both by:

  • Adding visualCol tracking to renderBackgrounds()
  • Using ColumnConversionUtils.bufferColToVisualCol() for conversion
  • Ensuring backgrounds span the correct width for emoji

5. DRY Principle

  • analyzeCharacter() function (lines 115-246 in TerminalCanvasRenderer.kt) eliminates duplication between renderBackgrounds() and renderText()
  • CharacterAnalysis data class provides clean interface for character metadata

⚠️ Areas for Improvement

1. Missing Tests ⚠️

Critical Gap: No automated tests for the new Unicode utilities.

Recommended test coverage:

// ColumnConversionUtilsTest.kt
class ColumnConversionUtilsTest {
    @Test
    fun `test emoji with variation selector conversion`() {
        // Buffer: [⚠][FE0F][DWC] → Visual: [⚠️ (2 cells)]
        // Verify bufferColToVisualCol(0) = 0, bufferColToVisualCol(3) = 2
    }
    
    @Test
    fun `test surrogate pair conversion`() {
        // Buffer: [HIGH][LOW][DWC] → Visual: [𝕳 (2 cells)]
    }
    
    @Test
    fun `test ZWJ sequence conversion`() {
        // Buffer: [👨][ZWJ][👩][ZWJ][👧][DWC] → Visual: [👨‍👩‍👧 (2 cells)]
    }
    
    @Test
    fun `test mixed ASCII and emoji`() {
        // Buffer: [h][e][l][l][o][⚠][FE0F][DWC] → Visual positions
    }
}

// UnicodeConstantsTest.kt
class UnicodeConstantsTest {
    @Test fun `verify all emoji ranges`()
    @Test fun `verify variation selector detection`()
    @Test fun `verify skin tone detection`()
}

Impact: Without tests, future refactoring could break these complex Unicode edge cases. The PR mentions /tmp/test_emoji_bg.sh for manual testing, but automated tests are essential for regression prevention.

2. Performance Consideration 🔍

Potential Issue: analyzeCharacter() is called for every character in both renderBackgrounds() and renderText().

Current approach (lines 586-598 in TerminalCanvasRenderer.kt):

for (col in 0 until width) {
    val char = line.charAt(col)
    val analysis = analyzeCharacter(char, line, col, width, ambiguousCharsAreDoubleWidth)
    // ... use analysis ...
}

Concern: For a 80×24 terminal, this runs 1,920+ times per frame at 60fps (115,200 calls/sec). The function does:

  • Multiple line.charAt() lookups (col+1, col+2)
  • Surrogate pair detection
  • DWC marker scanning
  • Variation selector checks

Recommendation:

  1. Profile first: Measure actual impact with a profiler (may be negligible)
  2. If slow: Cache CharacterAnalysis results per line:
    val analysisCache = Array(width) { col -> analyzeCharacter(...) }
    // Reuse for both renderBackgrounds() and renderText()
  3. Monitor: Check frame times with settings.debugModeEnabled

3. Edge Case Handling 🤔

Question: What happens with incomplete grapheme clusters at line boundaries?

Scenario: If a line ends mid-emoji (e.g., high surrogate without low surrogate), does analyzeCharacter() handle it gracefully?

Current code (lines 126-132):

val lowSurrogate = if (Character.isHighSurrogate(char)) {
    when {
        charAtCol1 != null && Character.isLowSurrogate(charAtCol1) -> charAtCol1
        charAtCol1 == CharUtils.DWC && charAtCol2 != null && Character.isLowSurrogate(charAtCol2) -> charAtCol2
        else -> null  // ← What happens here?
    }
} else null

Recommendation: Add defensive check:

val actualCodePoint = if (lowSurrogate != null && Character.isLowSurrogate(lowSurrogate)) {
    Character.toCodePoint(char, lowSurrogate)
} else if (Character.isHighSurrogate(char)) {
    // Incomplete surrogate pair - treat as replacement character
    0xFFFD  // REPLACEMENT CHARACTER
} else char.code

4. Documentation Clarity 📝

Minor: The distinction between "wcwidth double-width" and "emoji double-width" could be clearer.

Current (lines 98-102 in TerminalCanvasRenderer.kt):

val isWcwidthDoubleWidth: Boolean,
val isBaseDoubleWidth: Boolean,
// ... 
val isDoubleWidth: Boolean,

Suggestion: Add KDoc:

/**
 * @property isWcwidthDoubleWidth True if wcwidth() returns 2 (CJK, some symbols)
 * @property isBaseDoubleWidth True if base character (before modifiers) is 2-cell
 * @property isDoubleWidth True if final rendered width is 2 cells (includes emoji)
 */

5. BMP Emoji List Completeness

UnicodeConstants.isBmpEmoji() (lines 202-225) includes ~40 specific BMP emoji. This is a curated list of "unambiguous emoji" that should always render as 2-cell.

Question: Is this list exhaustive? Examples that might be missing:

  • ☃ U+2603 (SNOWMAN) vs ☃️ U+2603 U+FE0F (already has SNOWMAN at U+26C4)
  • ☂ U+2602 (UMBRELLA) vs ☂️ U+2602 U+FE0F (already has UMBRELLA_RAIN at U+2614)
  • Many more BMP symbols that have emoji presentation

Recommendation:

  1. Document the selection criteria ("commonly used emoji without variation selector")
  2. Consider referencing Unicode emoji-data.txt for comprehensive list
  3. OR: Keep current approach and document that edge cases need FE0F for emoji presentation

🔒 Security Assessment

No security concerns identified

  • No user input validation issues (operates on internal buffer)
  • No command injection or XSS vectors
  • No memory safety issues (Kotlin)

🚀 Performance Assessment

  • Memory: Adds ~400 bytes per frame for character analysis (negligible)
  • CPU: Multiple analyzeCharacter() calls per frame (profile recommended)
  • Allocation: Minimal (mostly primitives and reused data classes)

📦 Test Coverage Assessment

Current: ❌ No automated tests
Required:

  • Unit tests for ColumnConversionUtils
  • Unit tests for UnicodeConstants helper functions
  • Integration tests for emoji rendering alignment
  • Edge case tests (incomplete surrogates, line boundaries)

🎯 Overall Assessment

Recommendation: ✅ APPROVE with minor improvements

This PR demonstrates:

  • Strong architectural design (separation of concerns, DRY)
  • Comprehensive Unicode knowledge
  • Clean code with excellent documentation
  • Correct fix for the root cause

Required before merge:

  1. ✅ Add unit tests for ColumnConversionUtils
  2. ✅ Add tests for UnicodeConstants helper functions
  3. ⚠️ Consider defensive handling for incomplete surrogates
  4. ⚠️ Profile performance if rendering feels sluggish

Nice to have:

  • Enhanced KDoc for CharacterAnalysis properties
  • Document BMP emoji selection criteria in UnicodeConstants

🧪 Manual Testing Checklist

Per the PR description, verified:

  • ✅ Emoji with variation selectors: ⚠️ ☁️ ❤️ ☀️
  • ✅ Emoji without variation selectors: ✅ ⭐
  • ✅ Emoji mixed with text
  • ✅ Build succeeds

Additional testing recommended:

  • Flag emoji (🇺🇸 - two regional indicators)
  • Skin tone emoji (👍🏽)
  • ZWJ sequences (👨‍👩‍👧‍👦)
  • Full-width CJK characters (中文)
  • Terminal edge cases (emoji at column 79/80)

Great work on this complex Unicode handling! The architecture is solid and will serve as a strong foundation for future rendering improvements. 🎉

- CharacterAnalysis: Document width property evaluation order
  (isWcwidthDoubleWidth -> isBaseDoubleWidth -> isDoubleWidth)
- analyzeCharacter(): Document orphaned surrogate and DWC marker handling
- isBmpEmoji(): Document selection criteria and intentionally excluded
  TUI symbols (play/pause, geometric shapes, arrows, box drawing)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@kshivang kshivang merged commit a3f3d56 into master Dec 23, 2025
4 checks passed
@claude
Copy link
Copy Markdown

claude Bot commented Dec 23, 2025

Code Review: Emoji Background Rendering Fix

This is an excellent PR that systematically addresses emoji rendering misalignment issues. The solution is well-architected with strong attention to detail. Below are my findings:


✅ Strengths

1. Outstanding Refactoring

  • UnicodeConstants.kt: Centralizing ~40 magic numbers across 7 files into a single source of truth is exemplary. The comprehensive KDoc (lines 201-225) explaining BMP emoji selection criteria vs TUI symbols is exceptional.
  • ColumnConversionUtils.kt: Extracting buffer-to-visual column conversion eliminates duplicate logic and makes the codebase significantly more maintainable.
  • analyzeCharacter() helper: Encapsulating character analysis (CharacterAnalysis data class) eliminates ~60 lines of duplication between renderBackgrounds() and renderText().
  • Analysis caching: Pass 1 populates cache, Pass 2 reuses it - eliminates ~50% redundant computation in hot path.

2. Comprehensive Emoji Support

  • Flag emoji (Regional Indicators): Proper 2-cell handling with bounds checking (commit eac56b9).
  • Supplementary plane coverage: Extended from ~60 to 1000+ emoji code points.
  • Edge case handling: Orphaned surrogates, DWC between surrogates, variation selectors, ZWJ sequences.

3. Code Quality

  • Excellent documentation: KDoc explains edge cases (orphaned surrogates at line 135-138, BMP emoji selection at 201-225).
  • Consistent error handling: Bounds checks prevent index out of bounds (line 722 in TerminalCanvasRenderer).
  • Performance conscious: Character analysis caching, copy-on-write snapshots.

🔍 Issues Found

CRITICAL: Potential Buffer Overflow in shouldSkipChar()

File: ColumnConversionUtils.kt:56-64

// Skip skin tone modifiers (U+1F3FB-U+1F3FF, encoded as surrogate pairs)
if (Character.isHighSurrogate(char) && col + 1 < width) {
    val nextChar = line.charAt(col + 1)
    if (Character.isLowSurrogate(nextChar)) {
        val codePoint = Character.toCodePoint(char, nextChar)
        if (UnicodeConstants.isSkinToneModifier(codePoint)) {
            return SkipResult(true, 2)  // ⚠️ Advances 2 cols without verifying we have 2 cols remaining
        }
    }
}

Problem: When a high surrogate appears at col = width - 1, the bounds check col + 1 < width passes (true), but returning colsToAdvance = 2 will cause the caller to advance beyond the line boundary.

Impact: In bufferColToVisualCol() line 95, the caller does col += skipResult.colsToAdvance without verifying col + 2 <= width. This can cause:

  • Off-by-one errors in column tracking
  • Reading past line boundaries in subsequent iterations
  • Visual misalignment for lines ending with partial graphemes

Fix: Change bounds check to col + 2 <= width:

if (Character.isHighSurrogate(char) && col + 2 <= width) {

Locations to fix:

  • ColumnConversionUtils.kt:56 (shouldSkipChar skin tone check)
  • ColumnConversionUtils.kt:180 (getCharacterVisualWidth skin tone check)

MEDIUM: Inconsistent Bounds Checking

File: ColumnConversionUtils.kt:154-195

The getCharacterVisualWidth() function has inconsistent bounds checking:

  • Line 160: col + 1 < width (allows reading at boundary)
  • Line 180: nextCol + 1 < width (allows reading at boundary)

But the function safely handles boundary cases by returning early. However, for consistency with the critical fix above, these should use <= comparisons where surrogate pairs are involved.

Recommendation: Audit all surrogate pair bounds checks to use col + 2 <= width pattern.


MINOR: Regional Indicator Bounds Check Could Be Clearer

File: TerminalCanvasRenderer.kt:722-760

The checkRegionalIndicatorSequence() function has proper bounds checking, but the logic is complex. Consider extracting the "count columns in flag sequence" logic into a helper function for clarity.

Current structure is correct but dense:

// Lines 722-760: Multiple nested checks for High1, Low1, DWC, High2, Low2, DWC

Recommendation: Add inline comments explaining the possible layouts:

// Possible flag layouts:
// 1. [High1][Low1][High2][Low2] = 4 chars (1 cell each, no DWC)
// 2. [High1][Low1][DWC][High2][Low2] = 5 chars (first RI is 2-cell)
// 3. [High1][Low1][DWC][High2][Low2][DWC] = 6 chars (both RIs are 2-cell)

(Note: This comment exists at line 725-728, so this is already addressed)


🎯 Performance Considerations

GOOD: Analysis Caching

The introduction of AnalysisCache (line 222) is excellent:

  • Pass 1 (renderBackgrounds) populates cache
  • Pass 2 (renderText) reuses cached results
  • Eliminates ~50% of analyzeCharacter() calls

POTENTIAL OPTIMIZATION: Cache Allocation

The analysisCache is created fresh for every render frame (line 242). For a typical 80x24 terminal, this creates a map with ~1920 entries per frame.

Recommendation: Consider using an object pool for the cache map, or preallocating with expected capacity:

val analysisCache: AnalysisCache = HashMap(ctx.visibleRows * ctx.visibleCols)

🔒 Security Considerations

GOOD: No Security Issues Detected

  • Bounds checks prevent buffer overflows (except the critical issue noted above)
  • No unsafe casts or unchecked array access
  • Unicode handling follows UTF-16 surrogate pair specifications

🧪 Test Coverage

The PR description mentions manual testing with /tmp/test_emoji_bg.sh, but I recommend adding:

  1. Unit tests for ColumnConversionUtils:

    • Boundary case: Skin tone modifier at col = width - 1
    • Boundary case: Surrogate pair at col = width - 1
    • Regional Indicator sequence at line end
  2. Unit tests for analyzeCharacter():

    • Orphaned high surrogate
    • DWC between surrogates
    • Variation selector without DWC
    • Flag emoji at line boundary
  3. Integration tests for rendering:

    • Emoji with variation selectors (⚠️, ☁️, ❤️)
    • Flag emoji (🇺🇸, 🇬🇧, 🇯🇵)
    • ZWJ sequences (👨‍👩‍👧‍👦)
    • Skin tones (👍🏽)

📊 Summary

Metric Score Notes
Code Quality ⭐⭐⭐⭐⭐ Excellent refactoring, documentation, and structure
Correctness ⚠️ ⭐⭐⭐⭐ One critical bounds check issue, otherwise solid
Performance ⭐⭐⭐⭐☆ Good caching strategy, minor optimization opportunities
Test Coverage ⚠️ ⭐⭐☆☆ Needs unit tests for edge cases
Security ✅ ⭐⭐⭐⭐⭐ No security concerns (after fixing bounds issue)

🎬 Recommendation

Status: ⚠️ Approve with changes required

Required Changes:

  1. Fix critical bounds check in shouldSkipChar() (line 56) - change to col + 2 <= width
  2. Fix bounds check in getCharacterVisualWidth() (line 180) - same fix

Optional Improvements:

  1. Preallocate analysisCache with expected capacity
  2. Add unit tests for boundary cases
  3. Consider adding integration tests for emoji rendering

🙌 Excellent Work!

This PR demonstrates:

  • Deep understanding of Unicode complexity (surrogates, graphemes, emoji sequences)
  • Strong refactoring skills (eliminating duplication across 7 files)
  • Attention to performance (caching, copy-on-write optimization)
  • Comprehensive documentation (KDoc explaining edge cases)

The bounds check issue is straightforward to fix and doesn't diminish the overall quality of this work. After addressing the critical fix, this will be a significant improvement to emoji rendering in BossTerm.


Next Steps: Please fix the critical bounds check issue in ColumnConversionUtils.kt and I'll give this an approval! 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant