Skip to content

Conversation

@echobt
Copy link
Contributor

@echobt echobt commented Feb 4, 2026

Summary

This PR consolidates 5 UTF-8 safety fixes into a single, cohesive change.

Included PRs:

Changes:

  • Replaced direct byte slicing with char-aware methods
  • Added floor_char_boundary checks before slicing
  • Prevents panics from slicing multi-byte characters

Files Modified:

  • src/cortex-agents/src/mention.rs
  • src/cortex-cli/src/import_cmd.rs
  • src/cortex-cli/src/lock_cmd.rs
  • src/cortex-cli/src/utils/notification.rs
  • src/cortex-resume/src/resume_picker.rs

Closes #31, #32, #33, #35, #37

This PR consolidates the following UTF-8 safety fixes:
- #31: Use safe UTF-8 slicing in import command base64 extraction
- #32: Use safe UTF-8 slicing for session IDs in notifications
- #33: Use char-aware string truncation for UTF-8 safety in resume
- #35: Use safe UTF-8 slicing for session IDs in lock command
- #37: Validate UTF-8 boundaries in mention parsing

All changes ensure safe string operations that respect UTF-8 boundaries:
- Replaced direct byte slicing with char-aware methods
- Added floor_char_boundary checks before slicing
- Prevents panics from slicing multi-byte characters
@greptile-apps
Copy link

greptile-apps bot commented Feb 4, 2026

Greptile Overview

Greptile Summary

This PR consolidates 5 separate UTF-8 safety fixes into a unified improvement, preventing potential runtime panics when slicing strings containing multi-byte characters (emojis, CJK characters, etc.).

Key Changes:

  • mention.rs: Added safe_slice_up_to() and safe_slice_from() helper functions that check is_char_boundary() and search for valid boundaries before slicing
  • import_cmd.rs: Replaced direct slicing with .get() method for safe Option-based access when extracting base64 data from user messages
  • lock_cmd.rs: Introduced safe_char_prefix() to truncate session IDs by character count (not bytes) using char_indices()
  • notification.rs: Rewrote session ID truncation logic to use char_indices() for safe multi-byte handling
  • resume_picker.rs: Updated truncate_string() to count characters using .chars() iterator instead of byte length

Testing:
All changes include comprehensive unit tests covering ASCII, multi-byte UTF-8 (Japanese characters - 3 bytes each), emojis (4 bytes each), and mixed content.

Impact:
Eliminates panic risks in edge cases where session IDs, user inputs, or file paths contain multi-byte UTF-8 characters.

Confidence Score: 5/5

  • This PR is safe to merge with no identified risks
  • All changes follow established Rust UTF-8 safety patterns, include comprehensive test coverage for edge cases, and address real panic risks. The implementations use standard library methods (is_char_boundary, char_indices, .get()) correctly.
  • No files require special attention

Important Files Changed

Filename Overview
src/cortex-agents/src/mention.rs Added helper functions safe_slice_up_to and safe_slice_from to handle UTF-8 boundaries safely, with comprehensive tests for multi-byte characters
src/cortex-cli/src/import_cmd.rs Replaced direct string slicing with .get() method for safe UTF-8 boundary handling in base64 extraction logic
src/cortex-cli/src/lock_cmd.rs Introduced safe_char_prefix function to safely truncate session IDs by character count instead of byte count, with comprehensive tests
src/cortex-cli/src/utils/notification.rs Rewrote session ID truncation using char_indices() to safely handle multi-byte UTF-8 characters in notifications
src/cortex-resume/src/resume_picker.rs Updated truncate_string to use character-based truncation with .chars() iterator, plus comprehensive UTF-8 tests

Sequence Diagram

sequenceDiagram
    participant Client as Client Code
    participant OldAPI as Old (Unsafe) Slicing
    participant NewAPI as New (Safe) UTF-8 Slicing
    participant String as Rust String

    Note over Client,String: Before: Direct byte-based slicing
    Client->>OldAPI: &text[..pos]
    OldAPI->>String: Slice at byte position
    alt Byte pos is mid-character
        String-->>OldAPI: ❌ PANIC!
        OldAPI-->>Client: Thread crash
    else Byte pos is valid boundary
        String-->>OldAPI: ✅ Valid slice
        OldAPI-->>Client: Return slice
    end

    Note over Client,String: After: Character-boundary aware slicing
    Client->>NewAPI: safe_slice_up_to(text, pos)
    NewAPI->>String: Check is_char_boundary(pos)
    alt Position invalid
        NewAPI->>NewAPI: Search backwards for valid boundary
        NewAPI->>String: Slice at valid boundary
        String-->>NewAPI: ✅ Valid slice
    else Position valid
        NewAPI->>String: Slice at position
        String-->>NewAPI: ✅ Valid slice
    end
    NewAPI-->>Client: Always returns valid UTF-8

    Note over Client,String: Alternative: Using .get() method
    Client->>NewAPI: text.get(start..end)
    NewAPI->>String: Option-based safe access
    alt Invalid boundaries
        String-->>NewAPI: None
        NewAPI-->>Client: Handle gracefully
    else Valid boundaries
        String-->>NewAPI: Some(&str)
        NewAPI-->>Client: Use slice
    end
Loading

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 files reviewed, no comments

Edit Code Review Agent Settings | Greptile

@echobt
Copy link
Contributor Author

echobt commented Feb 4, 2026

Closing this PR to consolidate into a single mega-PR combining all bug fixes. The changes will be included in a new consolidated PR.

@echobt echobt closed this Feb 4, 2026
echobt added a commit that referenced this pull request Feb 4, 2026
This PR consolidates all bug fixes and security improvements from PRs #69-88 into a single cohesive change.

## Categories

### Security Fixes
- Path traversal prevention in MCP and session storage
- Shell injection prevention in restore scripts
- Secure random temp files for external editor
- TOCTOU race condition fixes

### TUI Improvements
- Overflow prevention for u16 conversions
- Cursor positioning fixes in selection lists
- Unicode width handling for popups
- Empty section handling in help browser

### Error Handling
- Graceful semaphore and init failure handling
- Improved error propagation in middleware
- Better client access error handling
- SystemTime operation safety

### Memory and Storage
- Cache size limits to prevent unbounded growth
- File lock cleanup for memory leak prevention
- fsync after critical writes for durability
- Bounded ToolResponseStore with automatic cleanup

### Protocol Robustness
- Buffer size limits for StreamProcessor
- ToolState transition validation
- State machine documentation

### Numeric Safety
- Saturating operations to prevent overflow/underflow
- Safe UTF-8 string slicing throughout codebase

### Tools
- Parameter alias support for backward compatibility
- Handler name consistency fixes

## Files Modified
Multiple files across cortex-tui, cortex-engine, cortex-exec, cortex-common,
cortex-protocol, cortex-storage, cortex-mcp-server, and other crates.

Closes #69, #70, #71, #73, #75, #80, #82, #87, #88
echobt added a commit that referenced this pull request Feb 4, 2026
This PR consolidates all bug fixes and security improvements from PRs #69-88 into a single cohesive change.

## Categories

### Security Fixes
- Path traversal prevention in MCP and session storage
- Shell injection prevention in restore scripts
- Secure random temp files for external editor
- TOCTOU race condition fixes

### TUI Improvements
- Overflow prevention for u16 conversions
- Cursor positioning fixes in selection lists
- Unicode width handling for popups
- Empty section handling in help browser

### Error Handling
- Graceful semaphore and init failure handling
- Improved error propagation in middleware
- Better client access error handling
- SystemTime operation safety

### Memory and Storage
- Cache size limits to prevent unbounded growth
- File lock cleanup for memory leak prevention
- fsync after critical writes for durability
- Bounded ToolResponseStore with automatic cleanup

### Protocol Robustness
- Buffer size limits for StreamProcessor
- ToolState transition validation
- State machine documentation

### Numeric Safety
- Saturating operations to prevent overflow/underflow
- Safe UTF-8 string slicing throughout codebase

### Tools
- Parameter alias support for backward compatibility
- Handler name consistency fixes

## Files Modified
Multiple files across cortex-tui, cortex-engine, cortex-exec, cortex-common,
cortex-protocol, cortex-storage, cortex-mcp-server, and other crates.

Closes #69, #70, #71, #73, #75, #80, #82, #87, #88
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant