feat: implement JSON and CSV auto-parsing in TypeConverter#9716

Merged
italojohnny merged 17 commits into main from feat/improvement_type_convert on Sep 16, 2025
Conversation

Contributor

@italojohnny italojohnny commented Sep 5, 2025

This PR adds automatic parsing of structured text (JSON and CSV) to the Message → Data and Message → DataFrame conversions in the TypeConverter component.

Features Added

  • Auto-detection of JSON: Automatically converts simple JSON objects and object arrays
  • Auto-detection of CSV: Parses CSV strings into appropriate data structures
  • Unified conversions: Consistent logic for both Message → Data and Message → DataFrame

Behavior
When auto_parse=True:

  • JSON object: {"name": "Ana"} → Data with structured fields or DataFrame with appropriate columns
  • JSON array: [{"name": "Ana"}, {"name": "Bruno"}] → Data with array in "records" or multi-row DataFrame
  • CSV: "name,age\nAna,28\nBruno,34" → Data with array in "records" or DataFrame with CSV columns

When auto_parse=False (default):

  • The original behavior is preserved: the message content goes to the "text" field.
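
The mapping described above can be sketched with the standard library alone. This is an illustrative approximation, not the code in converter.py: `sketch_message_to_data` is a hypothetical name, plain dicts stand in for Data objects, and `csv.DictReader` stands in for whatever CSV parser the component uses, so CSV values stay strings here.

```python
import csv
import io
import json


def sketch_message_to_data(text: str, *, auto_parse: bool = False) -> dict:
    """Return a dict standing in for the payload of a Data object (hypothetical helper)."""
    if not auto_parse:
        # Default path: the content is kept verbatim in the "text" field.
        return {"text": text}

    stripped = text.strip()

    # 1. JSON: an object becomes fields, an array of objects goes under "records".
    try:
        parsed = json.loads(stripped)
    except json.JSONDecodeError:
        parsed = None
    if isinstance(parsed, dict):
        return parsed
    if isinstance(parsed, list) and parsed and all(isinstance(item, dict) for item in parsed):
        return {"records": parsed}

    # 2. CSV: a header line containing a comma plus at least one data line.
    lines = stripped.splitlines()
    if len(lines) >= 2 and "," in lines[0]:
        return {"records": list(csv.DictReader(io.StringIO(stripped)))}

    # 3. Anything else falls back to plain text.
    return {"text": text}


print(sketch_message_to_data('{"name": "Ana"}', auto_parse=True))
print(sketch_message_to_data('[{"name": "Ana"}, {"name": "Bruno"}]', auto_parse=True))
print(sketch_message_to_data("name,age\nAna,28\nBruno,34", auto_parse=True))
print(sketch_message_to_data('{"name": "Ana"}'))
```

The last call omits auto_parse and therefore keeps the JSON string untouched under "text", which is the backward-compatible path.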

How to Test
Important: To test this functionality, you must enable the "auto_parse" option in the component, as it defaults to False to maintain compatibility with existing code.
Sample Flow: TestTypeConvert.json
Screenshot 2025-09-05 at 09 27 19

Tests
Comprehensive tests have been added covering all structured data conversion scenarios.

Compatibility
Fully backward compatible: the default behavior remains unchanged.

Summary by CodeRabbit

  • New Features

    • Added an “Auto Parse” toggle to the Type Converter, enabling automatic detection and conversion of JSON and CSV text into structured data or tables.
    • Enhanced conversions from messages to data/tables, including support for single objects and arrays, with smarter handling of compact/partial records.
    • Improved support for converting existing tables without manual steps.
  • Tests

    • Added comprehensive test coverage for automatic parsing across JSON objects/arrays and CSV, validating both structured data and table outputs for various input shapes.

Add tests for Message to Data/DataFrame conversions with auto_parse enabled:
- JSON object to Data/DataFrame
- JSON array to Data/DataFrame
- CSV to Data/DataFrame
Enable auto_parse to detect and convert JSON/CSV content when converting
Message to Data or DataFrame, creating proper structured output instead
of plain text fields.
Contributor

coderabbitai Bot commented Sep 5, 2025

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

Adds an Auto Parse option to TypeConverterComponent and extends converter logic to parse Message text as JSON or CSV into structured Data/DataFrame. Introduces new parsing helpers and constants, updates component template/UI, and expands tests to cover auto-parse behavior for JSON objects/arrays and CSV.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Component template and metadata: src/backend/base/langflow/initial_setup/starter_projects/Research Translation Loop.json | Adds BoolInput auto_parse to TypeConverterComponent template; updates inputs/template; introduces MIN_CSV_LINES; updates code to support auto-parse (JSON/CSV); updates code_hash. |
| Converter implementation: src/lfx/src/lfx/components/processing/converter.py | Adds auto_parse flag to convert_to_data/convert_to_dataframe; implements parse_structured_data with JSON/CSV detection and parsing helpers; supports DataFrame passthrough; wires component input to converters. |
| Unit tests: src/backend/tests/unit/components/processing/test_type_converter_component.py | Updates assertions for baseline conversions; adds tests for auto-parse of JSON (object/array) and CSV to Data and DataFrame; introduces json and StringIO in tests. |

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant Component as TypeConverterComponent
  participant Converter as Converter Functions
  participant JSON as JSON Parser
  participant CSV as CSV Parser/Pandas

  User->>Component: Provide input (Message/Data/DF) + auto_parse
  Component->>Converter: convert_to_data / convert_to_dataframe(v, auto_parse)

  alt v is Message
    Converter->>Converter: Create Data(text=v.text)
    alt auto_parse = true
      Converter->>JSON: Try parse text
      alt JSON parsed
        JSON-->>Converter: dict or list[dict]
        Converter->>Converter: To Data/DataFrame
      else JSON not parsed
        Converter->>CSV: Heuristic check + parse
        alt CSV parsed
          CSV-->>Converter: records
          Converter->>Converter: To Data/DataFrame
        else Not parsed
          Converter->>Converter: Fallback to plain conversion
        end
      end
    else auto_parse = false
      Converter->>Converter: Plain conversion (text/columns)
    end
  else v is Data/DataFrame/dict
    Converter->>Converter: Normalize to target type
  end

  Converter-->>Component: Data or DataFrame
  Component-->>User: Output
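
The branch order in the sequence diagram (JSON first, then the CSV heuristic, then the plain fallback) can be sketched as a small parser chain for the DataFrame side. The names `try_json`, `try_csv`, and `convert_rows` are hypothetical, and lists of row dicts stand in for a DataFrame; this is a sketch of the control flow only, not the converter's code.

```python
import csv
import io
import json
from typing import Optional


def try_json(text: str) -> Optional[list]:
    """Return row dicts for a JSON object or array of objects, else None."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None
    if isinstance(parsed, dict):
        return [parsed]  # a single object becomes a single row
    if isinstance(parsed, list) and parsed and all(isinstance(r, dict) for r in parsed):
        return parsed  # one row per object
    return None


def try_csv(text: str) -> Optional[list]:
    """Return row dicts if the text passes a simple CSV heuristic, else None."""
    stripped = text.strip()
    lines = stripped.splitlines()
    if len(lines) < 2 or "," not in lines[0]:
        return None
    return list(csv.DictReader(io.StringIO(stripped)))


def convert_rows(text: str, *, auto_parse: bool) -> tuple:
    """Return (branch, rows); rows is the table a DataFrame would be built from."""
    if auto_parse:
        for branch, parser in (("json", try_json), ("csv", try_csv)):
            rows = parser(text)
            if rows is not None:
                return branch, rows
    # Plain conversion: a single "text" column holding the raw content.
    return "plain", [{"text": text}]


print(convert_rows('{"name": "Ana"}', auto_parse=True))
print(convert_rows("name,age\nAna,28", auto_parse=True))
print(convert_rows("hello", auto_parse=True))
```

Trying JSON before CSV matters because a JSON array contains commas and could otherwise be mistaken for CSV by a comma-based heuristic.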

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested labels

enhancement, size:L, lgtm

Suggested reviewers

  • rodrigosnader
  • edwinjosechittilappilly
  • carlosrcoelho

@github-actions github-actions Bot added enhancement New feature or request and removed enhancement New feature or request labels Sep 5, 2025

codecov Bot commented Sep 5, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 21.48%. Comparing base (5ba5519) to head (035a38c).
⚠️ Report is 2 commits behind head on main.

❌ Your project status has failed because the head coverage (5.81%) is below the target coverage (10.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #9716      +/-   ##
==========================================
- Coverage   22.86%   21.48%   -1.39%     
==========================================
  Files        1086     1074      -12     
  Lines       39710    39649      -61     
  Branches     5418     5418              
==========================================
- Hits         9081     8519     -562     
- Misses      30474    30986     +512     
+ Partials      155      144      -11     
| Flag | Coverage Δ |
|---|---|
| backend | 46.51% <100.00%> (+0.03%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
|---|---|
| ...e/langflow/components/knowledge_bases/ingestion.py | 76.12% <100.00%> (ø) |

... and 59 files with indirect coverage changes


Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (5)
src/lfx/src/lfx/components/processing/converter.py (3)

36-38: Prefer Message.text over v.data["text"]

Accessing v.data["text"] is brittle. Use the public text attribute.

-        data = Data(data={"text": v.data["text"]})
+        data = Data(data={"text": v.text})

63-66: Same: use Message.text to avoid key errors

-        data = Data(data={"text": v.data["text"]})
+        data = Data(data={"text": v.text})

112-120: Tighten CSV detection heuristic to reduce false positives

Also require the first data line to contain a comma and roughly match header column count.

 def _looks_like_csv(text: str) -> bool:
     """Simple heuristic to detect CSV content."""
     lines = text.strip().split("\n")
     if len(lines) < MIN_CSV_LINES:
         return False
 
-    header_line = lines[0]
-    return "," in header_line and len(lines) > 1
+    header_line = lines[0]
+    first_data = lines[1]
+    header_commas = header_line.count(",")
+    data_commas = first_data.count(",")
+    return header_commas > 0 and data_commas > 0 and abs(header_commas - data_commas) <= 1
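
To see what the tightened heuristic changes, the two variants from the diff above can be run side by side. This is a sketch under one assumption: `MIN_CSV_LINES = 2` is a guess, since the PR names the constant but this page does not show its value. The function names are also changed here so that both versions can coexist.

```python
MIN_CSV_LINES = 2  # assumed value; not shown on this page


def looks_like_csv_original(text: str) -> bool:
    """Heuristic as written in the PR: only the header line is inspected."""
    lines = text.strip().split("\n")
    if len(lines) < MIN_CSV_LINES:
        return False
    header_line = lines[0]
    return "," in header_line and len(lines) > 1


def looks_like_csv_tightened(text: str) -> bool:
    """Heuristic as suggested in the review: header and first data line must agree."""
    lines = text.strip().split("\n")
    if len(lines) < MIN_CSV_LINES:
        return False
    header_commas = lines[0].count(",")
    data_commas = lines[1].count(",")  # safe because MIN_CSV_LINES is at least 2 here
    return header_commas > 0 and data_commas > 0 and abs(header_commas - data_commas) <= 1


csv_text = "name,age\nAna,28\nBruno,34"
prose = "Hello, world\nthis is just prose"

print(looks_like_csv_original(csv_text), looks_like_csv_tightened(csv_text))
print(looks_like_csv_original(prose), looks_like_csv_tightened(prose))
```

Two lines of prose whose first line happens to contain a comma pass the original check but fail the tightened one, which is the false positive the suggestion targets.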
src/backend/tests/unit/components/processing/test_type_converter_component.py (1)

33-47: Add coverage: default off + CSV error fallback

Two gaps to solidify backward-compat and resilience:

  • Verify default auto_parse=False keeps text unparsed.
  • If CSV parsing fails, component returns original text (after code change).

Example tests you can add:

def test_message_to_data_auto_parse_default_off(component_class):
    """Auto-parse is disabled by default; keep text as-is."""
    component = component_class(input_data=Message(text='{"a":1}'), output_type="Data")
    result = component.convert_to_data()
    assert result.data == {"text": '{"a":1}'}

def test_message_with_malformed_csv_falls_back_to_text(component_class, monkeypatch):
    """Malformed CSV should not crash; fallback to original text."""
    bad_csv = "a,b\nonly_one_value"
    component = component_class(input_data=Message(text=bad_csv), output_type="Data", auto_parse=True)

    # Force pandas to raise (simulate parse error)
    import lfx.components.processing.converter as conv
    original = conv._parse_csv_to_data
    def boom(_): raise ValueError("parse error")
    monkeypatch.setattr(conv, "_parse_csv_to_data", boom)

    result = component.convert_to_data()
    assert result.data == {"text": bad_csv}

Also applies to: 119-225

src/backend/base/langflow/initial_setup/starter_projects/Research Translation Loop.json (1)

1753-1771: Expose ‘Auto Parse’ in the UI order

You added the auto_parse input (advanced=True) — good. Consider adding it to this node’s field_order so users can easily find/toggle it in the template.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 33785f8 and 6952592.

📒 Files selected for processing (3)
  • src/backend/base/langflow/initial_setup/starter_projects/Research Translation Loop.json (3 hunks)
  • src/backend/tests/unit/components/processing/test_type_converter_component.py (3 hunks)
  • src/lfx/src/lfx/components/processing/converter.py (6 hunks)
🧰 Additional context used
📓 Path-based instructions (6)
src/backend/tests/unit/components/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/backend_development.mdc)

src/backend/tests/unit/components/**/*.py: Mirror the component directory structure for unit tests in src/backend/tests/unit/components/
Use ComponentTestBaseWithClient or ComponentTestBaseWithoutClient as base classes for component unit tests
Provide file_names_mapping for backward compatibility in component tests
Create comprehensive unit tests for all new components

Files:

  • src/backend/tests/unit/components/processing/test_type_converter_component.py
{src/backend/**/*.py,tests/**/*.py,Makefile}

📄 CodeRabbit inference engine (.cursor/rules/backend_development.mdc)

{src/backend/**/*.py,tests/**/*.py,Makefile}: Run make format_backend to format Python code before linting or committing changes
Run make lint to perform linting checks on backend Python code

Files:

  • src/backend/tests/unit/components/processing/test_type_converter_component.py
src/backend/tests/unit/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/backend_development.mdc)

Test component integration within flows using create_flow, build_flow, and get_build_events utilities

Files:

  • src/backend/tests/unit/components/processing/test_type_converter_component.py
src/backend/tests/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/testing.mdc)

src/backend/tests/**/*.py: Unit tests for backend code must be located in the 'src/backend/tests/' directory, with component tests organized by component subdirectory under 'src/backend/tests/unit/components/'.
Test files should use the same filename as the component under test, with an appropriate test prefix or suffix (e.g., 'my_component.py' → 'test_my_component.py').
Use the 'client' fixture (an async httpx.AsyncClient) for API tests in backend Python tests, as defined in 'src/backend/tests/conftest.py'.
When writing component tests, inherit from the appropriate base class in 'src/backend/tests/base.py' (ComponentTestBase, ComponentTestBaseWithClient, or ComponentTestBaseWithoutClient) and provide the required fixtures: 'component_class', 'default_kwargs', and 'file_names_mapping'.
Each test in backend Python test files should have a clear docstring explaining its purpose, and complex setups or mocks should be well-commented.
Test both sync and async code paths in backend Python tests, using '@pytest.mark.asyncio' for async tests.
Mock external dependencies appropriately in backend Python tests to isolate unit tests from external services.
Test error handling and edge cases in backend Python tests, including using 'pytest.raises' and asserting error messages.
Validate input/output behavior and test component initialization and configuration in backend Python tests.
Use the 'no_blockbuster' pytest marker to skip the blockbuster plugin in tests when necessary.
Be aware of ContextVar propagation in async tests; test both direct event loop execution and 'asyncio.to_thread' scenarios to ensure proper context isolation.
Test error handling by mocking internal functions using monkeypatch in backend Python tests.
Test resource cleanup in backend Python tests by using fixtures that ensure proper initialization and cleanup of resources.
Test timeout and performance constraints in backend Python tests using 'asyncio.wait_for' and timing assertions.
Test Langflow's Messag...

Files:

  • src/backend/tests/unit/components/processing/test_type_converter_component.py
src/backend/**/*component*.py

📄 CodeRabbit inference engine (.cursor/rules/icons.mdc)

In your Python component class, set the icon attribute to a string matching the frontend icon mapping exactly (case-sensitive).

Files:

  • src/backend/tests/unit/components/processing/test_type_converter_component.py
src/backend/**/components/**/*.py

📄 CodeRabbit inference engine (.cursor/rules/icons.mdc)

In your Python component class, set the icon attribute to a string matching the frontend icon mapping exactly (case-sensitive).

Files:

  • src/backend/tests/unit/components/processing/test_type_converter_component.py
🧬 Code graph analysis (1)
src/lfx/src/lfx/components/processing/converter.py (1)
src/backend/base/langflow/services/database/models/flow/model.py (1)
  • to_data (198-207)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (56)
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 32/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 35/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 39/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 37/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 28/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 36/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 38/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 34/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 40/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 33/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 25/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 27/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 30/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 16/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 21/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 31/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 26/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 29/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 7/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 17/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 24/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 19/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 18/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 22/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 20/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 23/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 15/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 12/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 11/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 14/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 5/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 10/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 2/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 13/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 4/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 9/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 8/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 6/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 3/40
  • GitHub Check: Run Frontend Tests / Playwright Tests - Shard 1/40
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 5
  • GitHub Check: Run Backend Tests / Integration Tests - Python 3.10
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 4
  • GitHub Check: Lint Backend / Run Mypy (3.10)
  • GitHub Check: Lint Backend / Run Mypy (3.13)
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 3
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 1
  • GitHub Check: Run Backend Tests / Unit Tests - Python 3.10 - Group 2
  • GitHub Check: Lint Backend / Run Mypy (3.12)
  • GitHub Check: Lint Backend / Run Mypy (3.11)
  • GitHub Check: Test Starter Templates
  • GitHub Check: Ruff Style Check (3.13)
  • GitHub Check: Optimize new Python code in this PR
  • GitHub Check: Run Ruff Check and Format
  • GitHub Check: Update Starter Projects
  • GitHub Check: test-starter-projects
🔇 Additional comments (14)
src/lfx/src/lfx/components/processing/converter.py (3)

8-9: LGTM: minimal CSV guard

A small but sensible floor for CSV detection.


149-156: LGTM: new ‘Auto Parse’ input

Flag defaults to False, preserving existing behavior.


229-231: LGTM: component methods correctly thread through auto_parse

Also applies to: 241-243

src/backend/tests/unit/components/processing/test_type_converter_component.py (11)

1-2: LGTM: imports for JSON/CSV fixtures


38-39: LGTM: stricter equality for Data payload

Asserting the full dict guards regressions.


45-46: LGTM: exact column assertion

Ensures stable schema for non-parsed Message→DataFrame.


119-126: LGTM: JSON object → Data


127-135: LGTM: invalid JSON falls back to text


136-146: LGTM: JSON array → Data(records=...)


147-163: LGTM: CSV → Data(records=...)

Good typed expectations on integer fields.


164-180: LGTM: CSV → DataFrame with exact schema and equality

Robust check via assert_frame_equal.


181-193: LGTM: JSON object → single-row DataFrame


194-211: LGTM: JSON array → multi-row DataFrame


212-225: LGTM: compact JSON array → DataFrame subset schema
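
The typed expectations noted for the CSV tests (an age of 28 as an integer rather than the string "28") depend on the parser inferring column types. The walkthrough indicates that pandas handles CSV parsing in the converter; the sketch below only imitates that effect with the standard library, using hypothetical `coerce` and `csv_to_records` helpers that handle integers and floats and nothing else.

```python
import csv
import io


def coerce(value):
    """Convert a CSV cell to int or float when possible, else return it unchanged."""
    for cast in (int, float):
        try:
            return cast(value)
        except (ValueError, TypeError):
            continue
    return value


def csv_to_records(text: str) -> list:
    """Parse CSV text into a list of row dicts with numeric cells coerced."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: coerce(cell) for key, cell in row.items()} for row in reader]


records = csv_to_records("name,age\nAna,28\nBruno,34")
print(records)
```

Without the coercion step, `csv.DictReader` would return every cell as a string, so an assertion comparing against integer ages would fail.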

   "legacy": false,
   "metadata": {
-    "code_hash": "95e3542c4866",
+    "code_hash": "a99682150534",

💡 Verification agent

🧩 Analysis chain

Keep embedded component code in sync with the Python module

The embedded TypeConverter code mirrors the Python file. Please apply the same fixes here: default auto_parse=False, use Message.text, BOM guard, and CSV try/except fallback; otherwise the template behavior will diverge from the module.

Also applies to: 1788-1789


🏁 Script executed:

#!/usr/bin/env bash
# Locate the JSON template and show the embedded component code around the relevant lines
file=$(fd -t f "Research Translation Loop.json")
echo "Found file: $file"
sed -n '1700,1750p' "$file"
sed -n '1780,1820p' "$file"

Length of output: 11779


Sync embedded TypeConverter code with Python module (src/backend/base/langflow/initial_setup/starter_projects/Research Translation Loop.json lines 1713, 1788)
Missing module fixes—apply all four here:

  • default auto_parse=False in convert_to_data/convert_to_dataframe signatures
  • replace direct v.data["text"] access with Message.text
  • strip BOM in parse_structured_data (e.g. text = data.get_text().lstrip("\ufeff"))
  • wrap CSV parsing (_parse_csv_to_data) in a try/except fallback
🤖 Prompt for AI Agents
In src/backend/base/langflow/initial_setup/starter_projects/Research Translation
Loop.json around line 1713 (and also update the related block near line 1788),
the embedded TypeConverter code is out of sync with the Python module: update
the convert_to_data and convert_to_dataframe function signatures to include
default auto_parse=False, change any direct v.data["text"] accesses to use
Message.text (e.g., v.get_text()/v.text accessor), ensure parse_structured_data
strips a UTF-8 BOM from incoming text (e.g., text =
data.get_text().lstrip("\ufeff")), and wrap the CSV parsing helper
(_parse_csv_to_data) in a try/except that falls back gracefully on failure;
apply identical fixes at both locations so the JSON-embedded code matches the
Python module.
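
The BOM guard mentioned above matters because Python's `json.loads` refuses a string that starts with U+FEFF, which can happen when text comes from a UTF-8 file saved with a byte order mark and decoded as plain utf-8. A minimal demonstration follows, with no langflow types involved; the variable names are illustrative.

```python
import json

payload = '{"name": "Ana"}'
with_bom = "\ufeff" + payload  # what a BOM-prefixed file yields when decoded as utf-8

# Without the guard, parsing fails and auto-parse would fall back to plain text.
try:
    json.loads(with_bom)
    bom_parsed = True
except json.JSONDecodeError:
    bom_parsed = False

# With the guard, the same content parses normally.
cleaned = with_bom.lstrip("\ufeff")
result = json.loads(cleaned)

print(bom_parsed, result)
```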

Comment on lines +23 to 29
def convert_to_data(v: DataFrame | Data | Message | dict, *, auto_parse: bool) -> Data:
    """Convert input to Data type.

    Args:
        v: Input to convert (Message, Data, DataFrame, or dict)
        auto_parse: Enable automatic parsing of structured data (JSON/CSV)


🛠️ Refactor suggestion

Make auto_parse keyword optional to avoid breaking callers

These functions previously had no auto_parse param. Requiring it now breaks external callers and contradicts the PR’s “backward compatible (auto_parse defaults to False)” claim. Default it to False.

-def convert_to_data(v: DataFrame | Data | Message | dict, *, auto_parse: bool) -> Data:
+def convert_to_data(v: DataFrame | Data | Message | dict, *, auto_parse: bool = False) -> Data:

-def convert_to_dataframe(v: DataFrame | Data | Message | dict, *, auto_parse: bool) -> DataFrame:
+def convert_to_dataframe(v: DataFrame | Data | Message | dict, *, auto_parse: bool = False) -> DataFrame:

Also applies to: 42-49

🤖 Prompt for AI Agents
In src/lfx/src/lfx/components/processing/converter.py around lines 23-29 (and
likewise for the function at lines 42-49), the new required auto_parse parameter
breaks backward compatibility; change the function signatures to make auto_parse
an optional keyword argument with a default of False (e.g., auto_parse: bool =
False) so callers that don’t pass it continue to behave the same, and ensure any
internal references treat it as a boolean flag rather than a required positional
parameter.
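
The compatibility concern can be reproduced in isolation. The two functions below are hypothetical stand-ins that copy only the signature style under discussion, not the real converters, and they return a plain dict so the effect of the default is visible.

```python
def convert_required(v, *, auto_parse: bool):
    """Keyword-only parameter without a default: every caller must pass it."""
    return {"value": v, "auto_parse": auto_parse}


def convert_defaulted(v, *, auto_parse: bool = False):
    """Keyword-only parameter with a default: existing callers keep working."""
    return {"value": v, "auto_parse": auto_parse}


# A caller written before the parameter existed passes only the value.
try:
    convert_required("hello")
    legacy_call_ok = True
except TypeError:
    legacy_call_ok = False

print(legacy_call_ok)
print(convert_defaulted("hello"))
print(convert_defaulted("hello", auto_parse=True))
```

Keeping the parameter keyword-only while adding the default preserves explicit call sites (`auto_parse=True`) and still lets older single-argument calls succeed.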

Comment thread src/lfx/src/lfx/components/processing/converter.py Outdated
Collaborator

@edwinjosechittilappilly edwinjosechittilappilly left a comment

functionality LGTM

@github-actions github-actions Bot added the lgtm This PR has been approved by a maintainer label Sep 16, 2025
Collaborator

@edwinjosechittilappilly edwinjosechittilappilly left a comment

Any missing edge cases can be addressed in follow-up PRs.
Good work, @italojohnny.

fix: handle parsing errors gracefully by returning original data

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
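
The pattern this commit describes, returning the original data when parsing fails, can be sketched as follows. The strict parser is a hypothetical stand-in written with the standard library so that the example has something that can fail; the converter's actual helper and the exceptions it catches are not shown on this page.

```python
import csv
import io


def parse_csv_strict(text: str) -> list:
    """Hypothetical strict parser: raise ValueError when a row's width differs from the header."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = []
    for line_no, cells in enumerate(reader, start=2):
        if len(cells) != len(header):
            raise ValueError(f"line {line_no}: expected {len(header)} cells, got {len(cells)}")
        rows.append(dict(zip(header, cells)))
    return rows


def parse_or_original(text: str) -> dict:
    """Return parsed records, or the untouched text if parsing fails for any expected reason."""
    try:
        return {"records": parse_csv_strict(text)}
    except (ValueError, StopIteration, csv.Error):
        # Graceful fallback: hand back the original content instead of crashing.
        return {"text": text}


print(parse_or_original("a,b\n1,2"))
print(parse_or_original("a,b\nonly_one_value"))
```

The second call uses the malformed sample from the review comment and comes back as plain text rather than raising.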
@italojohnny italojohnny added this pull request to the merge queue Sep 16, 2025
Merged via the queue into main with commit fbf2a4a Sep 16, 2025
132 of 136 checks passed
@italojohnny italojohnny deleted the feat/improvement_type_convert branch September 16, 2025 18:04
lucaseduoli pushed a commit that referenced this pull request Sep 17, 2025
* test: ensure Message to Data has only 'text' key in data

* fix: ensure Message to Data conversion returns only 'text' key

* test: ensure Message to DataFrame has only 'text' column

* fix: ensure Message to DataFrame conversion returns only 'text' column

* test: add comprehensive conversion tests for structured data parsing

Add tests for Message to Data/DataFrame conversions with auto_parse enabled:
- JSON object to Data/DataFrame
- JSON array to Data/DataFrame
- CSV to Data/DataFrame

* feat: add structured data parsing for Message conversions

Enable auto_parse to detect and convert JSON/CSV content when converting
Message to Data or DataFrame, creating proper structured output instead
of plain text fields.

* chore: update starter project

* fix: update function calls after making auto_parse keyword-only

* Update src/lfx/src/lfx/components/processing/converter.py

fix: handle parsing errors gracefully by returning original data

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* [autofix.ci] apply automated fixes

* [autofix.ci] apply automated fixes (attempt 2/3)

* fix: ruff error

* [autofix.ci] apply automated fixes

---------

Co-authored-by: Edwin Jose <edwin.jose@datastax.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Labels

enhancement New feature or request lgtm This PR has been approved by a maintainer
