-
Notifications
You must be signed in to change notification settings - Fork 3
fix: consolidated UTF-8 safety improvements for string slicing #70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This PR consolidates the following UTF-8 safety fixes: - #31: Use safe UTF-8 slicing in import command base64 extraction - #32: Use safe UTF-8 slicing for session IDs in notifications - #33: Use char-aware string truncation for UTF-8 safety in resume - #35: Use safe UTF-8 slicing for session IDs in lock command - #37: Validate UTF-8 boundaries in mention parsing All changes ensure safe string operations that respect UTF-8 boundaries: - Replaced direct byte slicing with char-aware methods - Added floor_char_boundary checks before slicing - Prevents panics from slicing multi-byte characters
Greptile OverviewGreptile SummaryThis PR consolidates 5 separate UTF-8 safety fixes into a unified improvement, preventing potential runtime panics when slicing strings containing multi-byte characters (emojis, CJK characters, etc.). Key Changes:
Testing: Impact: Confidence Score: 5/5
|
| Filename | Overview |
|---|---|
| src/cortex-agents/src/mention.rs | Added helper functions safe_slice_up_to and safe_slice_from to handle UTF-8 boundaries safely, with comprehensive tests for multi-byte characters |
| src/cortex-cli/src/import_cmd.rs | Replaced direct string slicing with .get() method for safe UTF-8 boundary handling in base64 extraction logic |
| src/cortex-cli/src/lock_cmd.rs | Introduced safe_char_prefix function to safely truncate session IDs by character count instead of byte count, with comprehensive tests |
| src/cortex-cli/src/utils/notification.rs | Rewrote session ID truncation using char_indices() to safely handle multi-byte UTF-8 characters in notifications |
| src/cortex-resume/src/resume_picker.rs | Updated truncate_string to use character-based truncation with .chars() iterator, plus comprehensive UTF-8 tests |
Sequence Diagram
sequenceDiagram
participant Client as Client Code
participant OldAPI as Old (Unsafe) Slicing
participant NewAPI as New (Safe) UTF-8 Slicing
participant String as Rust String
Note over Client,String: Before: Direct byte-based slicing
Client->>OldAPI: &text[..pos]
OldAPI->>String: Slice at byte position
alt Byte pos is mid-character
String-->>OldAPI: ❌ PANIC!
OldAPI-->>Client: Thread crash
else Byte pos is valid boundary
String-->>OldAPI: ✅ Valid slice
OldAPI-->>Client: Return slice
end
Note over Client,String: After: Character-boundary aware slicing
Client->>NewAPI: safe_slice_up_to(text, pos)
NewAPI->>String: Check is_char_boundary(pos)
alt Position invalid
NewAPI->>NewAPI: Search backwards for valid boundary
NewAPI->>String: Slice at valid boundary
String-->>NewAPI: ✅ Valid slice
else Position valid
NewAPI->>String: Slice at position
String-->>NewAPI: ✅ Valid slice
end
NewAPI-->>Client: Always returns valid UTF-8
Note over Client,String: Alternative: Using .get() method
Client->>NewAPI: text.get(start..end)
NewAPI->>String: Option-based safe access
alt Invalid boundaries
String-->>NewAPI: None
NewAPI-->>Client: Handle gracefully
else Valid boundaries
String-->>NewAPI: Some(&str)
NewAPI-->>Client: Use slice
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5 files reviewed, no comments
|
Closing this PR to consolidate into a single mega-PR combining all bug fixes. The changes will be included in a new consolidated PR. |
This PR consolidates all bug fixes and security improvements from PRs #69-88 into a single cohesive change. ## Categories ### Security Fixes - Path traversal prevention in MCP and session storage - Shell injection prevention in restore scripts - Secure random temp files for external editor - TOCTOU race condition fixes ### TUI Improvements - Overflow prevention for u16 conversions - Cursor positioning fixes in selection lists - Unicode width handling for popups - Empty section handling in help browser ### Error Handling - Graceful semaphore and init failure handling - Improved error propagation in middleware - Better client access error handling - SystemTime operation safety ### Memory and Storage - Cache size limits to prevent unbounded growth - File lock cleanup for memory leak prevention - fsync after critical writes for durability - Bounded ToolResponseStore with automatic cleanup ### Protocol Robustness - Buffer size limits for StreamProcessor - ToolState transition validation - State machine documentation ### Numeric Safety - Saturating operations to prevent overflow/underflow - Safe UTF-8 string slicing throughout codebase ### Tools - Parameter alias support for backward compatibility - Handler name consistency fixes ## Files Modified Multiple files across cortex-tui, cortex-engine, cortex-exec, cortex-common, cortex-protocol, cortex-storage, cortex-mcp-server, and other crates. Closes #69, #70, #71, #73, #75, #80, #82, #87, #88
This PR consolidates all bug fixes and security improvements from PRs #69-88 into a single cohesive change. ## Categories ### Security Fixes - Path traversal prevention in MCP and session storage - Shell injection prevention in restore scripts - Secure random temp files for external editor - TOCTOU race condition fixes ### TUI Improvements - Overflow prevention for u16 conversions - Cursor positioning fixes in selection lists - Unicode width handling for popups - Empty section handling in help browser ### Error Handling - Graceful semaphore and init failure handling - Improved error propagation in middleware - Better client access error handling - SystemTime operation safety ### Memory and Storage - Cache size limits to prevent unbounded growth - File lock cleanup for memory leak prevention - fsync after critical writes for durability - Bounded ToolResponseStore with automatic cleanup ### Protocol Robustness - Buffer size limits for StreamProcessor - ToolState transition validation - State machine documentation ### Numeric Safety - Saturating operations to prevent overflow/underflow - Safe UTF-8 string slicing throughout codebase ### Tools - Parameter alias support for backward compatibility - Handler name consistency fixes ## Files Modified Multiple files across cortex-tui, cortex-engine, cortex-exec, cortex-common, cortex-protocol, cortex-storage, cortex-mcp-server, and other crates. Closes #69, #70, #71, #73, #75, #80, #82, #87, #88
Summary
This PR consolidates 5 UTF-8 safety fixes into a single, cohesive change.
Included PRs:
Changes:
floor_char_boundarychecks before slicingFiles Modified:
src/cortex-agents/src/mention.rssrc/cortex-cli/src/import_cmd.rssrc/cortex-cli/src/lock_cmd.rssrc/cortex-cli/src/utils/notification.rssrc/cortex-resume/src/resume_picker.rsCloses #31, #32, #33, #35, #37