-
Notifications
You must be signed in to change notification settings - Fork 3
fix(cli): use safe UTF-8 slicing in import command base64 extraction #31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Greptile OverviewGreptile SummaryReplaced direct byte-offset string slicing with safe
Confidence Score: 4/5
|
| Filename | Overview |
|---|---|
| src/cortex-cli/src/import_cmd.rs | Replaced direct byte-offset string slicing with safe .get() calls to prevent panics on UTF-8 boundaries, but silently skips validation when slicing fails |
Sequence Diagram
sequenceDiagram
participant User
participant ImportCmd
participant Validation as validate_export_messages
participant SafeSlice as String.get()
User->>ImportCmd: import command with JSON
ImportCmd->>Validation: validate_export_messages(messages)
loop For each message
Validation->>Validation: Find data:image/ marker
alt Marker found
Validation->>SafeSlice: content.get(data_uri_start..)
alt Valid UTF-8 boundary
SafeSlice-->>Validation: Some(substring)
Validation->>Validation: Find base64 marker in substring
Validation->>SafeSlice: content.get(base64_start..)
alt Valid offset
SafeSlice-->>Validation: Some(remaining)
Validation->>Validation: Find end delimiter
Validation->>SafeSlice: remaining.get(..base64_end)
alt Valid range
SafeSlice-->>Validation: Some(base64_data)
Validation->>Validation: Validate base64 encoding
else Invalid range
SafeSlice-->>Validation: None
Note over Validation: Skip validation (continue)
end
else Invalid offset
SafeSlice-->>Validation: None
Note over Validation: Skip validation (continue)
end
else Invalid boundary
SafeSlice-->>Validation: None
Note over Validation: Skip validation (continue)
end
end
end
Validation-->>ImportCmd: Ok() or Error
ImportCmd-->>User: Import success/failure
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
1 file reviewed, 2 comments
| let content_after_start = match message.content.get(data_uri_start..) { | ||
| Some(s) => s, | ||
| None => continue, // Invalid byte offset, skip this message | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Silently skipping validation when .get() returns None could hide issues. If data_uri_start is a valid position found by .find(), then .get(data_uri_start..) should succeed unless there's memory corruption. Consider logging a warning or returning an error instead of continuing, since this indicates potentially malformed data that should be caught.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/cortex-cli/src/import_cmd.rs
Line: 362:365
Comment:
Silently skipping validation when `.get()` returns `None` could hide issues. If `data_uri_start` is a valid position found by `.find()`, then `.get(data_uri_start..)` should succeed unless there's memory corruption. Consider logging a warning or returning an error instead of continuing, since this indicates potentially malformed data that should be caught.
How can I resolve this? If you propose a fix, please make it concise.| let remaining = match message.content.get(base64_start..) { | ||
| Some(s) => s, | ||
| None => continue, // Invalid byte offset, skip this message | ||
| }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The arithmetic data_uri_start + base64_marker + 8 could potentially result in an out-of-bounds index if the string ends unexpectedly. While using .get() prevents panics, silently continuing on None means validation is skipped for potentially malformed data. Consider whether this should be an error instead of silently continuing.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/cortex-cli/src/import_cmd.rs
Line: 371:374
Comment:
The arithmetic `data_uri_start + base64_marker + 8` could potentially result in an out-of-bounds index if the string ends unexpectedly. While using `.get()` prevents panics, silently continuing on `None` means validation is skipped for potentially malformed data. Consider whether this should be an error instead of silently continuing.
<sub>Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!</sub>
How can I resolve this? If you propose a fix, please make it concise.This PR consolidates the following UTF-8 safety fixes: - #31: Use safe UTF-8 slicing in import command base64 extraction - #32: Use safe UTF-8 slicing for session IDs in notifications - #33: Use char-aware string truncation for UTF-8 safety in resume - #35: Use safe UTF-8 slicing for session IDs in lock command - #37: Validate UTF-8 boundaries in mention parsing All changes ensure safe string operations that respect UTF-8 boundaries: - Replaced direct byte slicing with char-aware methods - Added floor_char_boundary checks before slicing - Prevents panics from slicing multi-byte characters
|
Consolidated into #70 - fix: consolidated UTF-8 safety improvements for string slicing |
Summary
Fixes #5284 - Import command base64 extraction panics on multi-byte UTF-8.
Problem
Base64 data extraction uses byte offsets for string slicing that can fall inside multi-byte UTF-8 sequences.
Solution
Replaced direct slicing with safe .get() calls and boundary validation.
Testing