Skip to content

fix(notifications): use safe UTF-8 slicing for session IDs#32

Closed
echobt wants to merge 2 commits intomainfrom
fix/utf8-notification-session-id
Closed

fix(notifications): use safe UTF-8 slicing for session IDs#32
echobt wants to merge 2 commits intomainfrom
fix/utf8-notification-session-id

Conversation

@echobt
Copy link
Contributor

@echobt echobt commented Feb 4, 2026

Summary

Fixes #5274 - Notification session ID slicing panics on multi-byte UTF-8.

Problem

Session ID string handling uses byte-based slicing which can panic on multi-byte UTF-8 characters when the slice boundary falls in the middle of a character.

Solution

Used safe string slicing with char boundaries - iterates through character indices to find the last valid UTF-8 boundary at or before position 8, then uses .get() for safe slicing.

@greptile-apps
Copy link

greptile-apps bot commented Feb 4, 2026

Greptile Overview

Greptile Summary

This PR replaces unsafe byte-based string slicing with UTF-8 character boundary aware slicing to prevent panics when slicing multi-byte UTF-8 characters.

Changes Made:

  • import_cmd.rs: Replaced all direct byte slicing ([start..], [..end]) with safe .get() methods that return Option<&str>, gracefully handling invalid boundaries by continuing to the next iteration
  • notification.rs: Replaced [..8.min(len)] with character-aware slicing logic using char_indices() to find valid UTF-8 boundaries

Issues Found:

  • Logic bug in notification.rs:69 where the condition checks if character start index is <8 rather than checking if character end position is ≤8, allowing characters that straddle the 8-byte boundary to be included (e.g., a 4-byte emoji starting at byte 7 would extend the result to 11 bytes)

Confidence Score: 2/5

  • This PR has a logic bug in notification.rs that violates the intended 8-byte truncation constraint
  • The import_cmd.rs changes are correct and robust, but notification.rs contains a subtle logic error where multi-byte characters can cause the result to exceed the intended 8-byte limit
  • src/cortex-cli/src/utils/notification.rs requires attention due to the boundary check logic error

Important Files Changed

Filename Overview
src/cortex-cli/src/utils/notification.rs Replaced unsafe byte slicing with UTF-8 aware slicing, but has logic bug that can exceed 8-byte limit when multi-byte chars straddle the boundary
src/cortex-cli/src/import_cmd.rs Correctly replaced all unsafe byte slicing with safe .get() methods that gracefully handle invalid UTF-8 boundaries

Sequence Diagram

sequenceDiagram
    participant Caller
    participant send_task_notification
    participant String Slicing
    participant send_notification
    
    Caller->>send_task_notification: session_id, success
    send_task_notification->>String Slicing: Truncate session_id to ~8 bytes
    Note over String Slicing: char_indices().take_while(idx < 8)<br/>map to char end positions<br/>get last position
    String Slicing-->>send_task_notification: short_id
    send_task_notification->>send_notification: title, body, urgency
    send_notification-->>Caller: Result
Loading

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

// Use safe UTF-8 slicing - find the last valid char boundary at or before position 8
let short_id = session_id
.char_indices()
.take_while(|(idx, _)| *idx < 8)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logic bug: take_while(|(idx, _)| *idx < 8) takes chars whose start index is <8, but doesn't check if they exceed 8 bytes when included. If a 4-byte char starts at index 7, it extends to byte 11.

Suggested change
.take_while(|(idx, _)| *idx < 8)
.take_while(|(idx, ch)| idx + ch.len_utf8() <= 8)
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/cortex-cli/src/utils/notification.rs
Line: 69:69

Comment:
Logic bug: `take_while(|(idx, _)| *idx < 8)` takes chars whose start index is <8, but doesn't check if they exceed 8 bytes when included. If a 4-byte char starts at index 7, it extends to byte 11.

```suggestion
        .take_while(|(idx, ch)| idx + ch.len_utf8() <= 8)
```

How can I resolve this? If you propose a fix, please make it concise.

echobt added a commit that referenced this pull request Feb 4, 2026
This PR consolidates the following UTF-8 safety fixes:
- #31: Use safe UTF-8 slicing in import command base64 extraction
- #32: Use safe UTF-8 slicing for session IDs in notifications
- #33: Use char-aware string truncation for UTF-8 safety in resume
- #35: Use safe UTF-8 slicing for session IDs in lock command
- #37: Validate UTF-8 boundaries in mention parsing

All changes ensure safe string operations that respect UTF-8 boundaries:
- Replaced direct byte slicing with char-aware methods
- Added floor_char_boundary checks before slicing
- Prevents panics from slicing multi-byte characters
@echobt
Copy link
Contributor Author

echobt commented Feb 4, 2026

Consolidated into #70 - fix: consolidated UTF-8 safety improvements for string slicing

@echobt echobt closed this Feb 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant