feat: Changing location, name and functions of SaveToFileComponent. by Empreiteiro · Pull Request #9868 · langflow-ai/langflow

Empreiteiro · 2025-09-15T16:39:22Z

Summary by CodeRabbit

New Features
- Added a Save to File component with a UI to store outputs to Local, AWS S3, or Google Drive.
- Supports saving DataFrame/Data/Message in multiple formats (CSV, Excel, JSON, Markdown, Text) and auto-adjusts file extensions.
- Dynamically shows required fields per storage option and can create Google Docs/Slides.
- Returns file paths or shareable links after upload.
Refactor
- Replaced the previous save-to-file component and relocated it under data components for a more streamlined workflow and validation.

coderabbitai · 2025-09-15T16:39:29Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds a new SaveToFileComponent under data/ with support for Local, AWS S3, and Google Drive destinations and various formats; removes the previous processing/ SaveToFileComponent. The new component dynamically configures UI fields per storage location and routes save operations to location-specific handlers, then uploads the saved file via the existing upload pipeline.

Changes

Cohort / File(s)	Summary
New data SaveToFile component `src/lfx/src/lfx/components/data/save_file.py`	Adds SaveToFileComponent with UI inputs for storage location (Local/AWS/Google Drive), format selection, credentials/paths, and output Message. Implements save_to_file control flow, type/format inference, local save logic, S3 upload via temporary file, Google Drive and Google Apps creation, content extraction, path/extension handling, and post-save upload.
Removed legacy processing SaveToFile `src/lfx/src/lfx/components/processing/save_file.py`	Removes prior SaveToFileComponent that handled only local saves with format options and upload; functionality superseded by the new data component supporting multiple storage backends.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant UI as Component UI
  participant C as SaveToFileComponent
  participant L as Local FS
  participant S3 as AWS S3
  participant GD as Google Drive
  participant U as Upload Service

  User->>UI: Configure inputs (location, format, path/credentials)
  UI->>C: save_to_file()
  alt Detect input type and default format
    C->>C: _get_input_type / _get_default_format
  end
  opt Adjust path
    C->>C: _adjust_file_path_with_format
  end
  alt Location = Local
    C->>L: Write file (DataFrame/Data/Message)
    C->>U: _upload_file(final_path)
  else Location = AWS S3
    C->>C: Serialize to temp file
    C->>S3: Upload to bucket/prefix
    C->>U: _upload_file(temp or reference)
  else Location = Google Drive
    C->>GD: Create file or Google Doc/Slides
    C->>U: _upload_file(reference or exported content)
  end
  C-->>UI: Message(File Path / reference)
  UI-->>User: Display result

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check	✅ Passed	The title describes relocating and changing SaveToFileComponent, which matches the changeset that removes the component from components/processing and adds an expanded implementation under components/data; therefore it is related to the main change. The title's mention of a "name" change appears misleading because the class name remains SaveToFileComponent in the diff. Overall the title is concise and conveys the primary intent but contains one inaccurate detail.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

🧪 Early access (Sonnet 4.5): enabled

We are currently testing the Sonnet 4.5 model, which is expected to improve code review quality. However, this model may lead to increased noise levels in the review comments. Please disable the early access features if the noise level causes any inconvenience.

Note:

Public repositories are always opted into early access features.
You can enable or disable early access features from the CodeRabbit UI or by updating the CodeRabbit configuration file.

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 10

🧹 Nitpick comments (6)

src/lfx/src/lfx/components/data/save_file.py (6)

388-391: Validate AWS credentials securely before use

The AWS credential validation only checks if the values exist but doesn't verify their format or validity. Invalid credentials will only fail during the boto3 call, potentially after creating temporary files.

Consider adding basic validation for AWS credentials format:

 # Validate AWS credentials
 if not getattr(self, 'aws_access_key_id', None):
     raise ValueError("AWS Access Key ID is required for S3 storage")
 if not getattr(self, 'aws_secret_access_key', None):
     raise ValueError("AWS Secret Key is required for S3 storage")
 if not getattr(self, 'bucket_name', None):
     raise ValueError("S3 Bucket Name is required for S3 storage")
+
+# Basic format validation
+if len(self.aws_access_key_id) < 16:
+    raise ValueError("AWS Access Key ID appears to be invalid")
+if len(self.aws_secret_access_key) < 20:
+    raise ValueError("AWS Secret Key appears to be invalid")

362-365: Path creation without proper error handling

Creating parent directories with mkdir could fail with permissions errors or disk space issues, but these aren't caught specifically.

Add specific error handling for directory creation:

 # Prepare file path
 file_path = Path(self.file_name).expanduser()
 if not file_path.parent.exists():
-    file_path.parent.mkdir(parents=True, exist_ok=True)
+    try:
+        file_path.parent.mkdir(parents=True, exist_ok=True)
+    except PermissionError as e:
+        raise ValueError(f"Cannot create directory {file_path.parent}: Permission denied") from e
+    except OSError as e:
+        raise ValueError(f"Cannot create directory {file_path.parent}: {e}") from e

463-465: Improve JSON parsing error message for service account key

The error message for invalid JSON could be more helpful by suggesting common issues.

Enhance the error message:

 try:
     credentials_dict = json.loads(self.service_account_key)
 except json.JSONDecodeError as e:
-    raise ValueError(f"Invalid JSON in service account key: {str(e)}")
+    raise ValueError(
+        f"Invalid JSON in service account key: {str(e)}. "
+        "Ensure you've pasted the complete JSON content from your service account key file."
+    )

248-250: Consider using a mapping for file extensions

The logic for adjusting file paths with extensions could be simplified using a mapping.

 def _adjust_file_path_with_format(self, path: Path, fmt: str) -> Path:
     """Adjust the file path to include the correct extension."""
+    FORMAT_EXTENSIONS = {
+        "excel": "xlsx",
+        # Add other format mappings if needed
+    }
     file_extension = path.suffix.lower().lstrip(".")
-    if fmt == "excel":
-        return Path(f"{path}.xlsx").expanduser() if file_extension not in ["xlsx", "xls"] else path
-    return Path(f"{path}.{fmt}").expanduser() if file_extension != fmt else path
+    
+    target_ext = FORMAT_EXTENSIONS.get(fmt, fmt)
+    if fmt == "excel" and file_extension in ["xlsx", "xls"]:
+        return path
+    if file_extension != target_ext:
+        return Path(f"{path}.{target_ext}").expanduser()
+    return path

397-401: Import boto3 at module level

Dynamic imports inside methods can cause unexpected delays and make dependency management harder.

Move the import to the top of the file and handle ImportError gracefully:

# At the top of the file
try:
    import boto3
    from botocore.exceptions import ClientError, NoCredentialsError
    HAS_BOTO3 = True
except ImportError:
    HAS_BOTO3 = False

# In the method
if not HAS_BOTO3:
    msg = "boto3 is not installed. Please install it using `uv pip install boto3`."
    raise ImportError(msg)

450-459: Same issue with Google API imports

The Google API libraries should also be imported at module level.

Apply the same pattern as suggested for boto3:

# At the top of the file
try:
    from googleapiclient.discovery import build
    from googleapiclient.http import MediaFileUpload
    from google.oauth2 import service_account
    HAS_GOOGLE_API = True
except ImportError:
    HAS_GOOGLE_API = False

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 974cbe0 and ae2c2ab.

📒 Files selected for processing (2)

src/lfx/src/lfx/components/data/save_file.py (1 hunks)
src/lfx/src/lfx/components/processing/save_file.py (0 hunks)

💤 Files with no reviewable changes (1)

src/lfx/src/lfx/components/processing/save_file.py

🧰 Additional context used

🧬 Code graph analysis (1)

src/lfx/src/lfx/components/data/save_file.py (3)

src/backend/base/langflow/api/v2/files.py (1)

upload_user_file (83-177)

src/backend/base/langflow/services/database/models/user/crud.py (1)

get_user_by_id (19-23)

src/backend/base/langflow/services/deps.py (3)

get_settings_service (111-124)

get_storage_service (88-96)

session_scope (151-173)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)

GitHub Check: Update Starter Projects
GitHub Check: Run Ruff Check and Format
GitHub Check: Ruff Style Check (3.13)

🔇 Additional comments (3)

src/lfx/src/lfx/components/data/save_file.py (3)
226-233: Good use of exact type checking

The implementation correctly uses type() instead of isinstance() to avoid inheritance issues with Message/Data types. The comment clearly explains the reasoning.

148-193: Clean implementation of dynamic UI configuration

The update_build_config method provides a clean implementation for dynamically showing/hiding fields based on storage location selection. The logic is clear and maintainable.

569-583: Verify async iterator handling in _extract_content_for_upload

ripgrep returned no matches for Message async-iterator usage; unable to confirm whether Message.text can be an async iterator—manual verification required. If Message.text can be async, convert _extract_content_for_upload to async and handle async/sync iterables; suggested change:
-def _extract_content_for_upload(self) -> str:
+async def _extract_content_for_upload(self) -> str:
     """Extract content from input for upload to cloud services."""
     if self._get_input_type() == "DataFrame":
         return self.input.to_csv(index=False)
     elif self._get_input_type() == "Data":
         if hasattr(self.input, 'data') and self.input.data:
             if isinstance(self.input.data, dict):
                 import json
                 return json.dumps(self.input.data, indent=2, ensure_ascii=False)
             return str(self.input.data)
         return str(self.input)
     elif self._get_input_type() == "Message":
-        return str(self.input.text) if self.input.text else str(self.input)
+        if self.input.text is None:
+            return str(self.input)
+        # async iterable
+        if hasattr(self.input.text, "__aiter__"):
+            parts = []
+            async for item in self.input.text:
+                parts.append(str(item))
+            return " ".join(parts)
+        # sync iterable (exclude str/bytes)
+        if hasattr(self.input.text, "__iter__") and not isinstance(self.input.text, (str, bytes)):
+            return " ".join(str(item) for item in self.input.text)
+        return str(self.input.text)
     else:
         return str(self.input)
Also verify and update all callers to await this function if it becomes async (e.g., _save_message).

…into pr/9868

sonarqubecloud · 2025-09-30T16:33:43Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

Empreiteiro added 2 commits September 15, 2025 13:35

feat: Move save_file.py to data directory

52ca516

Update save_file.py

ae2c2ab

github-actions Bot added the enhancement New feature or request label Sep 15, 2025

[autofix.ci] apply automated fixes

e10350e

github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels Sep 15, 2025

[autofix.ci] apply automated fixes (attempt 2/3)

da12af4

github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels Sep 15, 2025

coderabbitai Bot reviewed Sep 15, 2025

View reviewed changes

fix: Update import statements to use 'lfx' namespace

391ec17

github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels Sep 15, 2025

[autofix.ci] apply automated fixes

51ec9ad

github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels Sep 15, 2025

carlosrcoelho requested a review from erichare September 15, 2025 19:54

erichare approved these changes Sep 22, 2025

View reviewed changes

erichare added 2 commits September 22, 2025 09:26

Update save_file.py

ae9a080

Merge branch 'main' into write-file

cbb6115

github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels Sep 22, 2025

[autofix.ci] apply automated fixes

c059406

github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels Sep 22, 2025

Fix test location and import

50ac069

github-actions Bot removed the enhancement New feature or request label Sep 22, 2025

github-actions Bot added the enhancement New feature or request label Sep 30, 2025

Merge branch 'write-file' of https://github.com/Empreiteiro/langflow …

16cc011

…into pr/9868