fix(resume): use char-aware string truncation for UTF-8 safety #33
Greptile Overview
Greptile Summary
Fixed critical UTF-8 safety bugs that caused panics when a string was sliced in the middle of a multi-byte character. Replaced panic-prone byte-based string slicing with char-aware methods across the three files listed in the table below.
All changes maintain original functionality while preventing runtime panics on multi-byte UTF-8 input (emoji, CJK characters, etc.). Comprehensive unit tests added for UTF-8 edge cases.
Confidence Score: 5/5
| Filename | Overview |
|---|---|
| src/cortex-resume/src/resume_picker.rs | Fixed UTF-8 panic in truncate_string by using char-based iteration instead of byte slicing, with comprehensive tests |
| src/cortex-cli/src/utils/notification.rs | Replaced byte-based string slicing with char-aware boundary detection for session ID truncation |
| src/cortex-cli/src/import_cmd.rs | Added safe slicing with .get() to prevent panics when extracting base64 data from message content |
Sequence Diagram
sequenceDiagram
participant User
participant ResumePicker as Resume Picker UI
participant Notification as Notification System
participant ImportCmd as Import Command
participant TruncateString as truncate_string()
participant SafeSlicing as Safe UTF-8 Slicing
Note over ResumePicker,SafeSlicing: String Truncation Flow (resume_picker.rs)
User->>ResumePicker: Display session with path/title
ResumePicker->>TruncateString: truncate_string(text, width)
TruncateString->>TruncateString: Count chars (not bytes)
alt char_count <= width
TruncateString-->>ResumePicker: Return original string
else width > 3
TruncateString->>TruncateString: Take (width-3) chars safely
TruncateString-->>ResumePicker: Return truncated + "..."
else
TruncateString->>TruncateString: Take width chars
TruncateString-->>ResumePicker: Return truncated string
end
ResumePicker-->>User: Display safe truncated text
Note over Notification,SafeSlicing: Session ID Truncation (notification.rs)
User->>Notification: Task completes (session_id)
Notification->>SafeSlicing: Truncate session_id to 8 chars
SafeSlicing->>SafeSlicing: char_indices().take_while(idx < 8)
SafeSlicing->>SafeSlicing: Find last valid char boundary
SafeSlicing->>SafeSlicing: Use .get(..end) for safe slice
SafeSlicing-->>Notification: Return safe truncated ID
Notification-->>User: Show desktop notification
Note over ImportCmd,SafeSlicing: Base64 Extraction (import_cmd.rs)
User->>ImportCmd: Import messages with embedded images
ImportCmd->>ImportCmd: Find "data:image/" pattern
ImportCmd->>SafeSlicing: Slice at data_uri_start with .get()
alt Valid byte boundary
SafeSlicing-->>ImportCmd: Return Some(slice)
ImportCmd->>SafeSlicing: Slice at base64_start with .get()
alt Valid byte boundary
SafeSlicing-->>ImportCmd: Return Some(base64_data)
ImportCmd->>ImportCmd: Validate base64 encoding
else Invalid boundary
SafeSlicing-->>ImportCmd: Return None
ImportCmd->>ImportCmd: Skip (continue)
end
else Invalid boundary
SafeSlicing-->>ImportCmd: Return None
ImportCmd->>ImportCmd: Skip message (continue)
end
ImportCmd-->>User: Safe import without panics
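The base64 extraction branch of the diagram can be sketched roughly as follows. This is a minimal illustration, not the code from import_cmd.rs: the function name `extract_base64_payload`, the `;base64,` marker handling, and the rule for where the payload ends are all assumptions. Indices returned by `str::find` already lie on char boundaries, so the `.get()` calls here are purely defensive; they turn any invalid range into `None` instead of a panic.

```rust
/// Hypothetical helper (not from the PR): returns the base64 payload of the
/// first `data:image/...;base64,` URI found in `content`, or `None`.
fn extract_base64_payload(content: &str) -> Option<&str> {
    // Locate the start of the data URI.
    let data_uri_start = content.find("data:image/")?;
    // `.get()` yields None rather than panicking on an invalid range.
    let rest = content.get(data_uri_start..)?;

    // Assumed marker separating the media type from the payload.
    let marker = ";base64,";
    let base64_start = rest.find(marker)? + marker.len();
    let tail = rest.get(base64_start..)?;

    // Assumption: the payload ends at the first character that is not part
    // of the standard base64 alphabet (or at the end of the string).
    let end = tail
        .find(|c: char| !(c.is_ascii_alphanumeric() || c == '+' || c == '/' || c == '='))
        .unwrap_or(tail.len());
    tail.get(..end)
}
```

A caller can `continue` to the next message whenever this returns `None`, which mirrors the "Skip" branches in the diagram.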
3 files reviewed, no comments
This PR consolidates the following UTF-8 safety fixes:
- #31: Use safe UTF-8 slicing in import command base64 extraction
- #32: Use safe UTF-8 slicing for session IDs in notifications
- #33: Use char-aware string truncation for UTF-8 safety in resume
- #35: Use safe UTF-8 slicing for session IDs in lock command
- #37: Validate UTF-8 boundaries in mention parsing

All changes ensure safe string operations that respect UTF-8 boundaries:
- Replaced direct byte slicing with char-aware methods
- Added floor_char_boundary checks before slicing
- Prevents panics from slicing multi-byte characters
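To illustrate the boundary check mentioned above, here is a hedged sketch of truncating a session ID to at most 8 bytes without splitting a character. The names `floor_boundary` and `short_session_id` are hypothetical, and `floor_boundary` is a hand-written stand-in built on `str::is_char_boundary`, not the helper the PR actually uses.

```rust
/// Hypothetical stand-in for a floor-to-char-boundary helper: returns the
/// largest index `<= max` that lies on a char boundary of `s`.
fn floor_boundary(s: &str, max: usize) -> usize {
    if max >= s.len() {
        return s.len();
    }
    let mut idx = max;
    // Index 0 is always a char boundary, so this loop terminates.
    while !s.is_char_boundary(idx) {
        idx -= 1;
    }
    idx
}

/// Hypothetical helper: shortens a session ID to at most 8 bytes for display,
/// backing off to the previous char boundary when byte 8 falls mid-character.
fn short_session_id(session_id: &str) -> &str {
    let end = floor_boundary(session_id, 8);
    // `.get()` returns None instead of panicking; fall back to the full ID.
    session_id.get(..end).unwrap_or(session_id)
}
```

With this sketch, an ASCII ID is cut at exactly 8 bytes, while an ID whose eighth byte sits inside a multi-byte character is cut slightly shorter rather than panicking.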
Consolidated into #70 - fix: consolidated UTF-8 safety improvements for string slicing
Summary
Fixes #5288 - Path truncation panics on multi-byte UTF-8.
Problem
String truncation used byte-based slicing, which panics whenever the cut point falls inside a multi-byte character.
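A small sketch of why byte offsets are hazardous: in UTF-8 a character may occupy several bytes, so an arbitrary byte index is not always a valid slice point. The helper name `byte_prefix` is hypothetical and exists only for illustration.

```rust
/// Hypothetical helper: tries to take the first `n` bytes of `s`.
/// Returns None where the direct slice `&s[..n]` would panic, i.e. when
/// `n` is past the end or falls inside a multi-byte character.
fn byte_prefix(s: &str, n: usize) -> Option<&str> {
    s.get(..n)
}

/// Number of bytes versus number of chars in `s`; the two differ as soon
/// as the string contains any non-ASCII character.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}
```

For example, each character of "日本語" takes 3 bytes, so byte offsets 1 and 2 are not char boundaries and `&s[..2]` would panic, whereas `byte_prefix(s, 2)` simply yields `None`.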
Solution
Use char-based iteration for safe truncation.
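The truncation flow described in the sequence diagram can be sketched as follows. This is a reconstruction from the diagram's three branches, not the actual body of `truncate_string` in resume_picker.rs, so the signature and return type are assumptions.

```rust
/// Sketch of char-aware truncation, assuming `width` is a limit in chars.
/// Mirrors the diagram: return the input if it fits, otherwise keep
/// `width - 3` chars plus "...", or just `width` chars when `width <= 3`.
fn truncate_string(text: &str, width: usize) -> String {
    // Count chars (Unicode scalar values), not bytes.
    let char_count = text.chars().count();
    if char_count <= width {
        text.to_string()
    } else if width > 3 {
        // Iterating over chars can never split a multi-byte character.
        let mut out: String = text.chars().take(width - 3).collect();
        out.push_str("...");
        out
    } else {
        // Too narrow for an ellipsis: keep the first `width` chars only.
        text.chars().take(width).collect()
    }
}
```

Note that this counts Unicode scalar values rather than terminal display columns, so wide characters such as CJK may still occupy more screen cells than `width`.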