⚡️ Speed up method LocalStorageService.parse_file_path by 64% in PR #10929 (fix-image-s3) #10930
* Fix image pathing to operate with s3 storage
* [autofix.ci] apply automated fixes
* [autofix.ci] apply automated fixes (attempt 2/3)
* add test
* [autofix.ci] apply automated fixes
* ruff
* Add abstract method annotation
* [autofix.ci] apply automated fixes
* fix: use parse_file_path in get_files for S3 storage compatibility

---------

Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Co-authored-by: himavarshagoutham <himavarshajan17@gmail.com>
The optimized code achieves a **64% speedup** through two key optimizations, plus a minor control-flow change, that eliminate expensive repeated operations:
**1. Precomputed string conversion (56% time savings)**
The original code called `str(self.data_dir)` on every function call, which consumed 56.8% of execution time. The optimized version precomputes this as `self._data_dir_str` during initialization, reducing this operation to a simple attribute access (10.1% of execution time).
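The caching idea can be sketched as follows. This is a minimal illustration, not the code from `local.py`: the class name `StorageSketch` and both method names are hypothetical, and `PurePosixPath` is an assumed stand-in for whatever path type `data_dir` really is.

```python
from __future__ import annotations

from pathlib import PurePosixPath


class StorageSketch:
    """Hypothetical stand-in for the storage service; not the real class."""

    def __init__(self, data_dir: str) -> None:
        self.data_dir = PurePosixPath(data_dir)
        # Convert once at construction time. The cached string stays valid
        # only as long as self.data_dir is never reassigned.
        self._data_dir_str = str(self.data_dir)

    def strip_prefix_uncached(self, full_path: str) -> str:
        # Converts the path object to a string on every call.
        prefix = str(self.data_dir)
        if full_path.startswith(prefix):
            return full_path[len(prefix):].lstrip("/")
        return full_path

    def strip_prefix_cached(self, full_path: str) -> str:
        # Plain attribute access; no conversion work per call.
        prefix = self._data_dir_str
        if full_path.startswith(prefix):
            return full_path[len(prefix):].lstrip("/")
        return full_path
```

One trade-off of this pattern: if `data_dir` can change after initialization, the cached string has to be refreshed at the same time, otherwise the two values drift apart.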
**2. Optimized path splitting (8% time savings)**
The original code used `rsplit("/", 1)`, which scans the string and returns the pieces in a newly allocated list. The optimized version uses `rfind("/")` to locate the last slash once, then slices directly (`[:slash_index]` and `[slash_index+1:]`), which skips the intermediate list.
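A small side-by-side sketch of the two splitting strategies, assuming the function returns an empty folder component when the path contains no slash. The function names are hypothetical and the bodies are a reconstruction from the description above, not the code in the PR.

```python
from __future__ import annotations


def split_with_rsplit(path: str) -> tuple[str, str]:
    """Split at the last slash using rsplit (builds a two-element list)."""
    if "/" not in path:
        return "", path
    folder, name = path.rsplit("/", 1)
    return folder, name


def split_with_rfind(path: str) -> tuple[str, str]:
    """Split at the last slash using rfind plus slicing (no list)."""
    slash_index = path.rfind("/")
    if slash_index == -1:  # rfind returns -1 when there is no slash
        return "", path
    return path[:slash_index], path[slash_index + 1:]
```

Both variants agree on inputs such as `"a/b/c.png"`, `"c.png"`, `"a/"`, and the empty string, so the swap should be behavior-preserving under the stated assumption.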
**3. Minor control flow improvement**
The optimized version uses an if/else structure so that the stripped-path variable is assigned exactly once per call, instead of always assigning it and then conditionally reassigning it.
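The control-flow change might look roughly like this. Both functions are hypothetical reconstructions; which branch held the default assignment in the original code is an assumption.

```python
from __future__ import annotations


def strip_prefix_reassign(full_path: str, prefix: str) -> str:
    """Assign a default first, then overwrite it when the prefix matches."""
    remainder = full_path
    if full_path.startswith(prefix):
        remainder = full_path[len(prefix):].lstrip("/")
    return remainder


def strip_prefix_if_else(full_path: str, prefix: str) -> str:
    """Assign exactly once, in whichever branch applies."""
    if full_path.startswith(prefix):
        remainder = full_path[len(prefix):].lstrip("/")
    else:
        remainder = full_path
    return remainder
```

The saving here is a single name binding per call, so it is expected to matter far less than the other two changes.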
**Performance impact on test cases:**
- **Basic cases** (paths with/without data_dir): Benefit significantly from cached string conversion
- **Edge cases** (empty paths, trailing slashes): Maintain correctness while gaining speed
- **Large scale cases** (many nested folders, long paths): Double benefit from both optimizations since they avoid repeated expensive operations
The optimizations preserve all original behavior and edge case handling while reducing the most expensive operations in the hot path. Since `parse_file_path` appears to be called frequently (1116 hits in profiling), these micro-optimizations compound into meaningful performance gains.
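To check a claim like this locally, a micro-benchmark along these lines could be used. It is only a sketch: the directory name, the sample path, and the function names are made up for illustration, `PurePosixPath` is an assumed stand-in for the real `data_dir` type, and absolute timings will differ by machine and Python version.

```python
from __future__ import annotations

import timeit
from pathlib import PurePosixPath

# Hypothetical values, chosen only for the benchmark.
DATA_DIR = PurePosixPath("/tmp/hypothetical_data_dir")
DATA_DIR_STR = str(DATA_DIR)
SAMPLE_PATH = "/tmp/hypothetical_data_dir/flow/a.png"


def convert_each_call() -> bool:
    # Pays for str(DATA_DIR) on every invocation.
    return SAMPLE_PATH.startswith(str(DATA_DIR))


def use_cached() -> bool:
    # Reuses the string computed once at import time.
    return SAMPLE_PATH.startswith(DATA_DIR_STR)


def measure(number: int = 10_000, repeat: int = 5) -> dict[str, float]:
    """Return the best total time in seconds for each variant."""
    return {
        "convert_each_call": min(
            timeit.repeat(convert_each_call, number=number, repeat=repeat)
        ),
        "use_cached": min(
            timeit.repeat(use_cached, number=number, repeat=repeat)
        ),
    }


if __name__ == "__main__":
    for name, seconds in measure().items():
        print(f"{name}: {seconds:.6f} s")
```

Taking the minimum over several repeats reduces noise from other processes on the machine.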
Codecov Report
❌ Your project status has failed because the head coverage (40.02%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage.
Additional details and impacted files
@@ Coverage Diff @@
## release-1.7.0 #10930 +/- ##
=================================================
+ Coverage 32.43% 32.72% +0.29%
=================================================
Files 1367 1368 +1
Lines 63315 63497 +182
Branches 9357 9379 +22
=================================================
+ Hits 20538 20782 +244
+ Misses 41744 41674 -70
- Partials 1033 1041 +8
Closing automated codeflash PR.
⚡️ This pull request contains optimizations for PR #10929
If you approve this dependent PR, these changes will be merged into the original PR branch `fix-image-s3`.
📄 64% (0.64x) speedup for `LocalStorageService.parse_file_path` in `src/backend/base/langflow/services/storage/local.py`
⏱️ Runtime: 1.18 milliseconds → 714 microseconds (best of 110 runs)
To edit these changes, `git checkout codeflash/optimize-pr10929-2025-12-08T17.16.19` and push.