fix(cli): use safe UTF-8 slicing in import command base64 extraction #31

echobt · 2026-02-04T14:44:54Z

Summary

Fixes #5284 - Import command base64 extraction panics on multi-byte UTF-8.

Problem

Base64 data extraction slices strings at byte offsets that can fall inside a multi-byte UTF-8 sequence. Rust's index slicing (content[start..]) panics when a range boundary is not on a character boundary.

Solution

Replaced direct slicing with safe .get() calls and boundary validation.
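
To illustrate the difference the solution relies on, here is a minimal sketch (not the code from this PR; `safe_tail` is a hypothetical helper name). `str::get` returns `None` in exactly the cases where index slicing would panic: an offset past the end of the string, or one that lands inside a multi-byte character.

```rust
/// Returns the part of `s` starting at byte offset `start`, or `None` when
/// `start` is out of bounds or falls inside a multi-byte UTF-8 sequence.
/// (`safe_tail` is an illustrative name, not a function from the PR.)
fn safe_tail(s: &str, start: usize) -> Option<&str> {
    // `&s[start..]` would panic in the cases where this returns `None`.
    s.get(start..)
}
```

For the string `"\u{e9}!"`, where the first character occupies two bytes, offset 1 yields `None` while offsets 0, 2, and 3 yield `Some(..)`.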

Testing

Verified with cargo check -p cortex-cli

greptile-apps · 2026-02-04T14:48:10Z

Greptile Overview

Greptile Summary

Replaced direct byte-offset string slicing with safe .get() method calls in base64 data extraction logic to prevent panics when an offset does not fall on a UTF-8 character boundary.

- Changed three instances of direct slicing (content[start..]) to safe .get(start..) calls in the validate_export_messages function
- When .get() returns None, validation is skipped for that message (the loop continues to the next one) instead of panicking
- Affects validation of base64-encoded image data in both message content and tool call arguments
- The fix is defensive: while .find() should always return offsets on valid UTF-8 boundaries, using .get() adds an extra safety layer
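
The three `.get()` calls listed above can be sketched as one small function. This is an approximation under stated assumptions, not the body of validate_export_messages: the function name `extract_base64_payload` is hypothetical, and the end delimiter (a double quote, or end of input) is a guess, since the PR text does not say which delimiter is used. Returning `None` here corresponds to the "skip this message" path.

```rust
/// Extracts the base64 payload of the first `data:image/...;base64,` URI in
/// `content`. Hypothetical sketch: the `"` end delimiter is an assumption.
fn extract_base64_payload(content: &str) -> Option<&str> {
    const MARKER: &str = ";base64,";

    // 1. Locate the data URI and take the tail with a checked slice.
    let data_uri_start = content.find("data:image/")?;
    let after_start = content.get(data_uri_start..)?;

    // 2. Locate the base64 marker inside that tail and convert the relative
    //    position back into an absolute byte offset.
    let marker_pos = after_start.find(MARKER)?;
    let base64_start = data_uri_start + marker_pos + MARKER.len();
    let remaining = content.get(base64_start..)?;

    // 3. Cut the payload at the (assumed) end delimiter, again checked.
    let base64_end = remaining.find('"').unwrap_or(remaining.len());
    remaining.get(..base64_end)
}
```

Every step uses `?` on an `Option`, so a missing marker and an unusable offset are handled the same way: no payload is returned and nothing panics.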

Confidence Score: 4/5

- This PR is safe to merge with low risk: it replaces panic-prone operations with safer alternatives
- The change is a straightforward safety improvement that replaces direct string slicing with .get() method calls. While the behavior changes slightly (skipping validation on None instead of panicking), this is acceptable for a defensive fix. The main consideration is whether silently skipping validation is preferable to logging a warning, but for preventing crashes this is reasonable.
- No files require special attention: the change is localized and straightforward

Important Files Changed

Filename	Overview
src/cortex-cli/src/import_cmd.rs	Replaced direct byte-offset string slicing with safe `.get()` calls to prevent panics on UTF-8 boundaries, but silently skips validation when slicing fails

Sequence Diagram

sequenceDiagram
    participant User
    participant ImportCmd
    participant Validation as validate_export_messages
    participant SafeSlice as String.get()
    
    User->>ImportCmd: import command with JSON
    ImportCmd->>Validation: validate_export_messages(messages)
    
    loop For each message
        Validation->>Validation: Find data:image/ marker
        alt Marker found
            Validation->>SafeSlice: content.get(data_uri_start..)
            alt Valid UTF-8 boundary
                SafeSlice-->>Validation: Some(substring)
                Validation->>Validation: Find base64 marker in substring
                Validation->>SafeSlice: content.get(base64_start..)
                alt Valid offset
                    SafeSlice-->>Validation: Some(remaining)
                    Validation->>Validation: Find end delimiter
                    Validation->>SafeSlice: remaining.get(..base64_end)
                    alt Valid range
                        SafeSlice-->>Validation: Some(base64_data)
                        Validation->>Validation: Validate base64 encoding
                    else Invalid range
                        SafeSlice-->>Validation: None
                        Note over Validation: Skip validation (continue)
                    end
                else Invalid offset
                    SafeSlice-->>Validation: None
                    Note over Validation: Skip validation (continue)
                end
            else Invalid boundary
                SafeSlice-->>Validation: None
                Note over Validation: Skip validation (continue)
            end
        end
    end
    
    Validation-->>ImportCmd: Ok() or Error
    ImportCmd-->>User: Import success/failure

greptile-apps

1 file reviewed, 2 comments

greptile-apps · 2026-02-04T14:48:14Z

src/cortex-cli/src/import_cmd.rs

+            let content_after_start = match message.content.get(data_uri_start..) {
+                Some(s) => s,
+                None => continue, // Invalid byte offset, skip this message
+            };


Silently skipping validation when .get() returns None could hide issues. If data_uri_start is a valid position found by .find(), then .get(data_uri_start..) should succeed unless there's memory corruption. Consider logging a warning or returning an error instead of continuing, since this indicates potentially malformed data that should be caught.
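
One way to act on this suggestion is to turn the `None` case into an error rather than a `continue`. The sketch below is illustrative only: `tail_or_error` is a hypothetical helper, and a plain `String` stands in for whatever error type the import command actually uses, which the PR text does not show.

```rust
/// Like a `.get(start..)` lookup, but reports an unusable offset as an error
/// instead of letting the caller skip validation silently.
/// (Hypothetical helper; the real code may use a different error type.)
fn tail_or_error(content: &str, start: usize, message_index: usize) -> Result<&str, String> {
    content.get(start..).ok_or_else(|| {
        format!(
            "message {message_index}: byte offset {start} is out of bounds \
             or not on a UTF-8 character boundary"
        )
    })
}
```

A caller inside the validation loop could then propagate the failure with `?` or log it before moving on, so a skipped message is always visible.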

Location: src/cortex-cli/src/import_cmd.rs, lines 362-365

greptile-apps · 2026-02-04T14:48:15Z

src/cortex-cli/src/import_cmd.rs

+                let remaining = match message.content.get(base64_start..) {
+                    Some(s) => s,
+                    None => continue, // Invalid byte offset, skip this message
+                };


The arithmetic data_uri_start + base64_marker + 8 could potentially result in an out-of-bounds index if the string ends unexpectedly. While using .get() prevents panics, silently continuing on None means validation is skipped for potentially malformed data. Consider whether this should be an error instead of silently continuing.
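
The offset arithmetic in question can be isolated in a small sketch. It assumes that base64_marker is the position returned by `.find(";base64,")` on the tail starting at data_uri_start, and that the `+ 8` is the byte length of that marker; both are inferred from the review text rather than confirmed by the diff. The names below are hypothetical. Under those assumptions a successful `.find()` means the marker fits entirely inside the string, so the computed offset is at most `content.len()` and a later `.get(offset..)` yields an empty tail at worst.

```rust
const BASE64_MARKER: &str = ";base64,"; // 8 ASCII bytes, hence the `+ 8`

/// Computes the absolute byte offset of the first payload byte, mirroring the
/// `data_uri_start + base64_marker + 8` arithmetic discussed in the review.
/// Returns `None` if `data_uri_start` is unusable or the marker is absent.
fn base64_payload_offset(content: &str, data_uri_start: usize) -> Option<usize> {
    let after_start = content.get(data_uri_start..)?;
    let marker_pos = after_start.find(BASE64_MARKER)?;
    Some(data_uri_start + marker_pos + BASE64_MARKER.len())
}
```

If the offsets were ever computed from a different string than the one being sliced, this guarantee would no longer hold, which is the situation where the checked `.get()` matters.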

Location: src/cortex-cli/src/import_cmd.rs, lines 371-374

This PR consolidates the following UTF-8 safety fixes:

- #31: Use safe UTF-8 slicing in import command base64 extraction
- #32: Use safe UTF-8 slicing for session IDs in notifications
- #33: Use char-aware string truncation for UTF-8 safety in resume
- #35: Use safe UTF-8 slicing for session IDs in lock command
- #37: Validate UTF-8 boundaries in mention parsing

All changes ensure safe string operations that respect UTF-8 boundaries:

- Replaced direct byte slicing with char-aware methods
- Added floor_char_boundary checks before slicing
- Prevents panics from slicing multi-byte characters
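
As a rough illustration of the boundary checks mentioned in this list, the sketch below truncates a string to a byte budget without splitting a character. It is a hand-rolled equivalent built on `str::is_char_boundary`, not the code from the consolidated PR, and `truncate_on_char_boundary` is a hypothetical name.

```rust
/// Truncates `s` to at most `max_bytes` bytes without splitting a character.
/// (Illustrative helper; not taken from the PR.)
fn truncate_on_char_boundary(s: &str, max_bytes: usize) -> &str {
    if max_bytes >= s.len() {
        return s;
    }
    let mut end = max_bytes;
    // Walk back to the nearest boundary. Index 0 is always a boundary,
    // so this loop terminates.
    while !s.is_char_boundary(end) {
        end -= 1;
    }
    // `end` is now a verified boundary, so index slicing cannot panic.
    &s[..end]
}
```

Rounding down rather than up keeps the result within the requested byte budget, which is usually what matters when shortening IDs or previews for display.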

echobt · 2026-02-04T15:41:15Z

Consolidated into #70 - fix: consolidated UTF-8 safety improvements for string slicing

fix(cli): use safe UTF-8 slicing in import command base64 extraction

912acd7

greptile-apps bot reviewed Feb 4, 2026

echobt mentioned this pull request Feb 4, 2026

fix: consolidated UTF-8 safety improvements for string slicing #70

Closed

echobt closed this Feb 4, 2026
