
fix: Proper parsing of GCP credentials JSON #10828

Merged
erichare merged 37 commits into main from fix-write-file-gcp
Dec 3, 2025

Conversation

@erichare
Collaborator

@erichare erichare commented Dec 1, 2025

Fixes parsing of GCP JSON credentials in the Write File component.

This pull request makes the Google Drive credential parsing logic more tolerant of malformed input and adds unit tests for handling Google Cloud service account JSON keys, especially keys containing control characters or formatting irregularities. It addresses a bug where a pasted GCP service account JSON with literal newlines or unexpected whitespace failed to parse, causing errors during file uploads to Google Drive.

Credential parsing improvements:

  • Enhanced the credential parsing in _save_to_google_drive in save_file.py to use multiple fallback strategies, including relaxed JSON parsing, whitespace stripping, double-decoding, and newline replacement, to handle a wider range of malformed or pasted service account JSONs.
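
The fallback chain described above can be sketched roughly as follows. This is an illustrative sketch, not the code in save_file.py: the function name `parse_service_account_json`, the strategy labels, and the error wording are assumptions made for the example. Only the four strategies and the consolidated ValueError come from the description.

```python
import json
from typing import Any, Callable


def parse_service_account_json(raw: str) -> dict[str, Any]:
    """Hypothetical helper: try each parsing strategy in order.

    Returns the first result that is a JSON object; raises ValueError
    listing every failure if no strategy succeeds.
    """

    def double_decode(text: str) -> Any:
        # Handles a JSON string whose content is itself JSON.
        decoded = json.loads(text, strict=False)
        if isinstance(decoded, str):
            return json.loads(decoded, strict=False)
        return decoded

    strategies: list[tuple[str, Callable[[str], Any]]] = [
        # strict=False allows raw control characters inside strings.
        ("non-strict", lambda t: json.loads(t, strict=False)),
        # str.strip() also removes Unicode whitespace such as U+00A0.
        ("stripped", lambda t: json.loads(t.strip(), strict=False)),
        ("double-decoded", double_decode),
        # Turn literal backslash-n sequences into real newlines.
        ("newline-replaced", lambda t: json.loads(t.replace("\\n", "\n"), strict=False)),
    ]

    errors: list[str] = []
    for name, parse in strategies:
        try:
            result = parse(raw)
        except (json.JSONDecodeError, TypeError) as exc:
            errors.append(f"{name}: {exc}")
            continue
        if isinstance(result, dict):
            return result
        errors.append(f"{name}: expected a JSON object, got {type(result).__name__}")

    raise ValueError("Could not parse service account JSON. " + "; ".join(errors))
```

Each strategy runs against the original input rather than the output of the previous one, so the strategies stay independent and the error message records why each one failed.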

Testing and reliability:

  • Added test_google_drive_credential_parsing_with_control_characters in test_save_file_component.py to verify that service account JSONs containing literal newlines are parsed and used successfully for Google Drive uploads.
  • Added test_google_drive_credential_parsing_strategies in test_save_file_component.py to systematically test various credential parsing strategies, including normal JSON, JSONs with control characters, and those with extra whitespace, ensuring no parsing errors occur.

Summary by CodeRabbit

  • Bug Fixes

    • Improved robustness of service account credential parsing with multiple fallback strategies to handle varied JSON formats, control characters, double-encoded payloads, and whitespace anomalies — preventing parse failures and ensuring reliable Google Drive uploads.
  • Tests

    • Added tests covering credential parsing edge cases (control characters, extra whitespace, and alternate encodings) and successful upload flows.


@coderabbitai
Contributor

coderabbitai Bot commented Dec 1, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds four fallback strategies to Google Drive service-account JSON parsing in SaveToFile and adds async tests covering control characters, whitespace, and double-encoded credential formats.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Google Drive credential parsing resilience / src/lfx/src/lfx/components/files_and_knowledge/save_file.py | Enhanced _save_to_google_drive to try multiple credential JSON parsing strategies: non-strict JSON, whitespace stripping, double-decoding, and literal "\n" → newline replacement; aggregates parse errors and raises a consolidated ValueError if all fail. |
| SaveToFile component tests / src/backend/tests/unit/components/processing/test_save_file_component.py | Added two async tests: test_google_drive_credential_parsing_with_control_characters and test_google_drive_credential_parsing_strategies. Both mock GCP credential creation and Drive uploads to validate parsing across malformed formats and successful upload reporting. |
| Starter project metadata update / src/backend/base/langflow/initial_setup/starter_projects/News Aggregator.json | Updated embedded SaveToFile component code/hash to reflect the new implementation (code text and code_hash changed). |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

  • Pay special attention to:
    • Correct ordering and independence of the four parsing strategies.
    • Completeness and clarity of aggregated error messages in the consolidated ValueError.
    • Tests' mocking scopes and reset between iterations to avoid cross-test contamination.
    • Consistency between the actual implementation and the embedded code in the starter project JSON.

Possibly related PRs

Suggested labels

lgtm, size:S

Suggested reviewers

  • ogabrielluiz

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Test Quality And Coverage | ⚠️ Warning | Tests lack error scenario coverage: no pytest.raises validation for the case where all four parsing strategies fail and a ValueError with a detailed error message is raised. | Add an error scenario test that provides invalid JSON and verifies that all parsing strategies fail gracefully with a proper ValueError containing guidance to copy the entire JSON from the service account key file. |
✅ Passed checks (6 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix: Proper parsing of GCP credentials JSON' accurately and specifically summarizes the main change: fixing credential JSON parsing for Google Cloud Platform in the Write File component. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Test Coverage For New Implementations | ✅ Passed | PR includes comprehensive async test coverage for bug fix with two new test functions in test_save_file_component.py verifying four fallback JSON parsing strategies. |
| Test File Naming And Structure | ✅ Passed | Test file follows correct pytest patterns with descriptive test function names covering credential parsing edge cases including control characters and whitespace variations. |
| Excessive Mock Usage Warning | ✅ Passed | New tests appropriately mock only 2 external API dependencies while testing real JSON parsing logic, with idiomatic MagicMock chaining validating correct credential dictionaries and error handling. |
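
The error-scenario test requested by the failed check could look roughly like the sketch below. It is not taken from this PR: `parse_credentials` is a hypothetical two-strategy stand-in for the component's parsing logic, the error wording is assumed, and unittest's `assertRaisesRegex` is used in place of `pytest.raises` so the example runs with the standard library alone.

```python
import json
import unittest


def parse_credentials(raw: str) -> dict:
    """Hypothetical stand-in for the component's parser (two strategies only)."""
    errors = []
    for name, candidate in (("non-strict", raw), ("stripped", raw.strip())):
        try:
            parsed = json.loads(candidate, strict=False)
        except json.JSONDecodeError as exc:
            errors.append(f"{name}: {exc}")
            continue
        if isinstance(parsed, dict):
            return parsed
        errors.append(f"{name}: not a JSON object")
    # Assumed wording; the real message lives in save_file.py.
    raise ValueError(
        "Unable to parse service account key; copy the entire JSON "
        "from the key file. " + "; ".join(errors)
    )


class CredentialParsingErrorTest(unittest.TestCase):
    def test_invalid_json_raises_value_error(self):
        # The consolidated error should carry the user-facing guidance.
        with self.assertRaisesRegex(ValueError, "copy the entire JSON"):
            parse_credentials("{not valid json")

    def test_error_lists_every_strategy(self):
        # Every attempted strategy should be named in the message.
        with self.assertRaises(ValueError) as ctx:
            parse_credentials("{not valid json")
        message = str(ctx.exception)
        self.assertIn("non-strict", message)
        self.assertIn("stripped", message)
```

In the project's own test suite the same assertions would more naturally be written with `pytest.raises(ValueError, match=...)` against the real component.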


@github-actions github-actions Bot added bug Something isn't working and removed bug Something isn't working labels Dec 1, 2025
@github-actions
Contributor

github-actions Bot commented Dec 1, 2025

Frontend Unit Test Coverage Report

Coverage Summary

| Lines | Statements | Branches | Functions |
|---|---|---|---|
| Coverage: 15% | 15.43% (4244/27499) | 8.61% (1811/21013) | 9.69% (587/6057) |

Unit Test Results

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 1671 | 0 💤 | 0 ❌ | 0 🔥 | 21.92s ⏱️ |

@codecov

codecov Bot commented Dec 1, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 32.55%. Comparing base (b05d0eb) to head (9e776b1).
⚠️ Report is 1 commit behind head on main.

❌ Your project status has failed because the head coverage (39.99%) is below the target coverage (60.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files


@@            Coverage Diff             @@
##             main   #10828      +/-   ##
==========================================
+ Coverage   32.53%   32.55%   +0.01%     
==========================================
  Files        1370     1370              
  Lines       63514    63514              
  Branches     9391     9391              
==========================================
+ Hits        20665    20675      +10     
+ Misses      41810    41800      -10     
  Partials     1039     1039              
| Flag | Coverage Δ |
|---|---|
| backend | 51.56% <ø> (+0.06%) ⬆️ |
| frontend | 14.28% <ø> (ø) |
| lfx | 39.99% <ø> (ø) |

Flags with carried forward coverage won't be shown. Click here to find out more.
see 3 files with indirect coverage changes


coderabbitai[bot]

This comment was marked as outdated.

@github-actions github-actions Bot added bug Something isn't working and removed bug Something isn't working labels Dec 2, 2025
@Adam-Aghili Adam-Aghili added the fix-for-release PR to be merged into a release branch label Dec 2, 2025
Collaborator

@Empreiteiro Empreiteiro left a comment


@erichare The codec error persists:

Error building Component Write File:
'charmap' codec can't encode characters in position 20439-20442: character maps to <undefined>

When trying to save a plain text file, I encountered a new error:

Error building Component Write File:

Unable to access Google Drive folder '1FCkhUiNYiaBJrc979ovLWLnPaaCP-tAB'. Error: <HttpError 404 when requesting https://www.googleapis.com/drive/v3/files/1FCkhUiNYiaBJrc979ovLWLnPaaCP-tAB?fields=id%2Cname&alt=json returned "File not found: 1FCkhUiNYiaBJrc979ovLWLnPaaCP-tAB." Details: "[{'message': 'File not found: 1FCkhUiNYiaBJrc979ovLWLnPaaCP-tAB.', 'domain': 'global', 'reason': 'notFound', 'location': 'fileId', 'locationType': 'parameter'}]">. Please ensure: 1) The folder ID is correct, 2) The folder exists, 3) The service account has been granted access to this folder.

I tested the same files with a specific component for Google Drive and it worked correctly, indicating that there is no problem with the key and/or the folder.

Component I used for comparison: https://github.com/Empreiteiro/langflow-factory/blob/master/components/google/drive/google_drive_write.py

@github-actions github-actions Bot added bug Something isn't working and removed bug Something isn't working labels Dec 3, 2025
@erichare erichare requested a review from Empreiteiro December 3, 2025 16:18
Collaborator

@Empreiteiro Empreiteiro left a comment


After the new revisions, the functions are working as expected! LGTM!


Labels

  • bug: Something isn't working
  • fix-for-release: PR to be merged into a release branch
  • lgtm: This PR has been approved by a maintainer


3 participants